Search (15 results, page 1 of 1)

  • theme_ss:"Retrievalstudien"
  • year_i:[2000 TO 2010}
  1. Leininger, K.: Interindexer consistency in PsycINFO (2000) 0.04
    0.03621345 = product of:
      0.10864034 = sum of:
        0.10864034 = sum of:
          0.068250805 = weight(_text_:indexing in 2552) [ClassicSimilarity], result of:
            0.068250805 = score(doc=2552,freq=4.0), product of:
              0.19018644 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.049684696 = queryNorm
              0.3588626 = fieldWeight in 2552, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.046875 = fieldNorm(doc=2552)
          0.04038954 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
            0.04038954 = score(doc=2552,freq=2.0), product of:
              0.17398734 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049684696 = queryNorm
              0.23214069 = fieldWeight in 2552, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2552)
      0.33333334 = coord(1/3)
    
    Abstract
    Reports results of a study to examine interindexer consistency (the degree to which indexers, when assigning terms to a chosen record, will choose the same terms to reflect that record) in the PsycINFO database using 60 records that were inadvertently processed twice between 1996 and 1998. Five aspects of interindexer consistency were analysed. Two methods were used to calculate interindexer consistency: one posited by Hooper (1965) and the other by Rollin (1981). Aspects analysed were: checktag consistency (66.24% using Hooper's calculation and 77.17% using Rollin's); major-to-all term consistency (49.31% and 62.59% respectively); overall indexing consistency (49.02% and 63.32%); classification code consistency (44.17% and 45.00%); and major-to-major term consistency (43.24% and 56.09%). The average consistency across all categories was 50.4% using Hooper's method and 60.83% using Rollin's. Although comparison with previous studies is difficult due to methodological variations in the overall study of indexing consistency and the specific characteristics of the database, results generally support previous findings when trends and similar studies are analysed.
    Date
    9. 2.1997 18:44:22
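The explain tree for this record can be re-checked by hand. What follows is a minimal sketch in Python, assuming the usual classic TF-IDF structure that the tree itself spells out (tf = sqrt(freq), queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm). The function name `term_score` is hypothetical, and every number is copied from the tree for doc 2552.

```python
import math

QUERY_NORM = 0.049684696  # queryNorm, as printed in the explain tree


def term_score(freq, idf, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                   # tf(freq=4.0) = 2.0, tf(freq=2.0) = 1.4142135
    query_weight = idf * query_norm        # depends on the query term only
    field_weight = tf * idf * field_norm   # depends on the document field
    return query_weight * field_weight


# Values copied from the tree for doc 2552.
indexing = term_score(freq=4.0, idf=3.8278677, field_norm=0.046875)
twenty_two = term_score(freq=2.0, idf=3.5018296, field_norm=0.046875)

# coord(1/3): one of three top-level query clauses matched.
total = (indexing + twenty_two) * (1 / 3)

print(f"indexing={indexing:.8f} 22={twenty_two:.8f} total={total:.8f}")
```

The engine works in 32-bit floats, so the last digit or two may differ from the 0.03621345 shown in the listing.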
  2. Mansourian, Y.; Ford, N.: Web searchers' attributions of success and failure: an empirical study (2007) 0.03
    0.033813164 = product of:
      0.10143948 = sum of:
        0.10143948 = weight(_text_:systematic in 840) [ClassicSimilarity], result of:
          0.10143948 = score(doc=840,freq=4.0), product of:
            0.28397155 = queryWeight, product of:
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.049684696 = queryNorm
            0.35721707 = fieldWeight in 840, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.03125 = fieldNorm(doc=840)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - This paper reports the findings of a study designed to explore web searchers' perceptions of the causes of their search failure and success. In particular, it seeks to discover the extent to which the constructs locus of control and attribution theory might provide useful frameworks for understanding searchers' perceptions. Design/methodology/approach - A combination of inductive and deductive approaches was employed. Perceptions of failed and successful searches were derived from the inductive analysis of open-ended qualitative interviews with a sample of 37 biologists at the University of Sheffield. These perceptions were classified into "internal" and "external" attributions, and the relationships between these categories and "successful" and "failed" searches were analysed deductively to test the extent to which they might be explainable using locus of control and attribution theory interpretive frameworks. Findings - All searchers were readily able to recall "successful" and "unsuccessful" searches. In a large majority of cases (82.4 per cent), they clearly attributed each search to either internal (e.g. ability or effort) or external (e.g. luck or information not being available) factors. The pattern of such relationships was analysed, and mapped onto those that would be predicted by locus of control and attribution theory. The authors conclude that the potential of these theoretical frameworks to illuminate one's understanding of web searching, and associated training, merits further systematic study. Research limitations/implications - The findings are based on a relatively small sample of academic and research staff in a particular subject area. Importantly, also, the study can at best provide a prima facie case for further systematic study since, although the patterns of attribution behaviour accord with those predictable by locus of control and attribution theory, data relating to the predictive elements of these theories (e.g. levels of confidence and achievement) were not available. This issue is discussed, and recommendations made for further work. Originality/value - The findings provide some empirical support for the notion that locus of control and attribution theory might - subject to the limitations noted above - be potentially useful theoretical frameworks for helping us better understand web-based information seeking. If so, they could have implications particularly for better understanding of searchers' motivations, and for the design and development of more effective search training programmes.
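The idf figures in these trees are consistent with idf = 1 + ln(maxDocs / (docFreq + 1)). That formula is inferred here from the printed numbers rather than stated anywhere in the listing, so treat the sketch below as a plausibility check. The name `classic_idf` is hypothetical.

```python
import math

MAX_DOCS = 44218  # maxDocs, as printed in every idf line


def classic_idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: rarer terms get a larger value."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# docFreq values copied from the explain trees.
for term, doc_freq in [("systematic", 395), ("indexing", 2614), ("22", 3622)]:
    print(f"{term}: {classic_idf(doc_freq):.6f}")
```

"systematic" occurs in far fewer documents than "indexing" or "22", which is why a single match on it is enough to lift a record into the top three.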
  3. Debole, F.; Sebastiani, F.: ¬An analysis of the relative hardness of Reuters-21578 subsets (2005) 0.03
    0.029886894 = product of:
      0.08966068 = sum of:
        0.08966068 = weight(_text_:systematic in 3456) [ClassicSimilarity], result of:
          0.08966068 = score(doc=3456,freq=2.0), product of:
            0.28397155 = queryWeight, product of:
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.049684696 = queryNorm
            0.31573826 = fieldWeight in 3456, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3456)
      0.33333334 = coord(1/3)
    
    Abstract
    The existence, public availability, and widespread acceptance of a standard benchmark for a given information retrieval (IR) task are beneficial to research on this task, because they allow different researchers to experimentally compare their own systems by comparing the results they have obtained on this benchmark. The Reuters-21578 test collection, together with its earlier variants, has been such a standard benchmark for the text categorization (TC) task throughout the last 10 years. However, the benefits that this has brought about have somehow been limited by the fact that different researchers have "carved" different subsets out of this collection and tested their systems on one of these subsets only; systems that have been tested on different Reuters-21578 subsets are thus not readily comparable. In this article, we present a systematic, comparative experimental study of the three subsets of Reuters-21578 that have been most popular among TC researchers. The results we obtain allow us to determine the relative hardness of these subsets, thus establishing an indirect means for comparing TC systems that have been, or will be, tested on these different subsets.
  4. Ménard, E.: Image retrieval : a comparative study on the influence of indexing vocabularies (2009) 0.02
    0.020108584 = product of:
      0.060325753 = sum of:
        0.060325753 = product of:
          0.120651506 = sum of:
            0.120651506 = weight(_text_:indexing in 3250) [ClassicSimilarity], result of:
              0.120651506 = score(doc=3250,freq=18.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.6343854 = fieldWeight in 3250, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3250)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper reports on a research project that compared two different approaches for the indexing of ordinary images representing common objects: traditional indexing with controlled vocabulary and free indexing with uncontrolled vocabulary. We also compared image retrieval within two contexts: a monolingual context where the language of the query is the same as the indexing language and, secondly, a multilingual context where the language of the query is different from the indexing language. As a means of comparison in evaluating the performance of each indexing form, a simulation of the retrieval process involving 30 images was performed with 60 participants. A questionnaire was also submitted to participants in order to gather information with regard to the retrieval process and performance. The results of the retrieval simulation confirm that the retrieval is more effective and more satisfactory for the searcher when the images are indexed with the approach combining the controlled and uncontrolled vocabularies. The results also indicate that the indexing approach with controlled vocabulary is more efficient (number of queries needed to retrieve an image) than the uncontrolled vocabulary indexing approach. However, no significant differences in terms of temporal efficiency (time required to retrieve an image) were observed. Finally, the comparison of the two linguistic contexts reveals that the retrieval is more effective and more efficient (number of queries needed to retrieve an image) in the monolingual context than in the multilingual context. Furthermore, image searchers are more satisfied when the retrieval is done in a monolingual context rather than a multilingual context.
  5. Savoy, J.: Bibliographic database access using free-text and controlled vocabulary : an evaluation (2005) 0.02
    0.018768014 = product of:
      0.05630404 = sum of:
        0.05630404 = product of:
          0.11260808 = sum of:
            0.11260808 = weight(_text_:indexing in 1053) [ClassicSimilarity], result of:
              0.11260808 = score(doc=1053,freq=8.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.5920931 = fieldWeight in 1053, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1053)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper evaluates and compares the retrieval effectiveness of various search models, based on either automatic text-word indexing or on manually assigned controlled descriptors. Retrieval is from a relatively large collection of bibliographic material written in French. Moreover, for this French collection we evaluate improvements that result from combining automatic and manual indexing. First, when considering various contexts, this study reveals that the combined indexing strategy always obtains the best retrieval performance. Second, when users wish to conduct exhaustive searches with minimal effort, we demonstrate that manually assigned terms are essential. Third, the evaluations presented in this paper reveal the comparative retrieval performances that result from manual and automatic indexing in a variety of circumstances.
  6. Voorhees, E.M.; Harman, D.: Overview of the Sixth Text REtrieval Conference (TREC-6) (2000) 0.02
    0.015707046 = product of:
      0.047121134 = sum of:
        0.047121134 = product of:
          0.09424227 = sum of:
            0.09424227 = weight(_text_:22 in 6438) [ClassicSimilarity], result of:
              0.09424227 = score(doc=6438,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.5416616 = fieldWeight in 6438, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6438)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    11. 8.2001 16:22:19
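In this record the single matching term is scaled twice: by an inner coord(1/2) and then by the outer coord(1/3). A small sketch of that nesting, assuming each coord factor is simply matched clauses divided by total clauses, as the tree prints it. The helper `apply_coords` is hypothetical, and the numbers are copied from the tree for doc 6438.

```python
import math
from functools import reduce


def apply_coords(weight, coords):
    """Multiply a term weight by each coord(matched/total) factor in turn."""
    return reduce(lambda acc, c: acc * (c[0] / c[1]), coords, weight)


# Term weight for "22" in doc 6438, rebuilt from the printed factors.
idf, query_norm, field_norm = 3.5018296, 0.049684696, 0.109375
weight = (idf * query_norm) * (math.sqrt(2.0) * idf * field_norm)

# Inner coord(1/2) first, then outer coord(1/3), as nested in the tree.
score = apply_coords(weight, [(1, 2), (1, 3)])

print(f"weight={weight:.8f} score={score:.8f}")
```

Because the factors are plain multipliers, their order does not change the result. Together they reduce the raw term weight to one sixth.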
  7. Ding, C.H.Q.: ¬A probabilistic model for Latent Semantic Indexing (2005) 0.01
    0.013931636 = product of:
      0.041794907 = sum of:
        0.041794907 = product of:
          0.083589815 = sum of:
            0.083589815 = weight(_text_:indexing in 3459) [ClassicSimilarity], result of:
              0.083589815 = score(doc=3459,freq=6.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.4395151 = fieldWeight in 3459, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3459)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Latent Semantic Indexing (LSI), when applied to semantic space built on text collections, improves information retrieval, information filtering, and word sense disambiguation. A new dual probability model based on the similarity concepts is introduced to provide deeper understanding of LSI. Semantic associations can be quantitatively characterized by their statistical significance, the likelihood. Semantic dimensions containing redundant and noisy information can be separated out and should be ignored because of their negative contribution to the overall statistical significance. LSI is the optimal solution of the model. The peak in the likelihood curve indicates the existence of an intrinsic semantic dimension. The importance of LSI dimensions follows the Zipf distribution, indicating that LSI dimensions represent latent concepts. Document frequency of words follows the Zipf distribution, and the number of distinct words follows a log-normal distribution. Experiments on five standard document collections confirm and illustrate the analysis.
    Object
    Latent Semantic Indexing
  8. Lioma, C.; Ounis, I.: ¬A syntactically-based query reformulation technique for information retrieval (2008) 0.01
    0.01072458 = product of:
      0.032173738 = sum of:
        0.032173738 = product of:
          0.064347476 = sum of:
            0.064347476 = weight(_text_:indexing in 2031) [ClassicSimilarity], result of:
              0.064347476 = score(doc=2031,freq=8.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3383389 = fieldWeight in 2031, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2031)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society of Information Science, 25(5), 312-318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283-289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11-21; Yu, C., & Salton, G. (1976). Precision weighting - an effective automatic indexing method. Journal of the Association for Computing Machinery (ACM), 23(1), 76-88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We apply this finding to Information Retrieval as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples, and used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance, when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models. Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high frequency shallow syntactic fragments correspond to content-bearing lexical fragments.
  9. Dresel, R.; Hörnig, D.; Kaluza, H.; Peter, A.; Roßmann, A.; Sieber, W.: Evaluation deutscher Web-Suchwerkzeuge : Ein vergleichender Retrievaltest (2001) 0.01
    0.008975455 = product of:
      0.026926363 = sum of:
        0.026926363 = product of:
          0.053852726 = sum of:
            0.053852726 = weight(_text_:22 in 261) [ClassicSimilarity], result of:
              0.053852726 = score(doc=261,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.30952093 = fieldWeight in 261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=261)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The German search engines Abacho, Acoon, Fireball and Lycos, as well as the web catalogues Web.de and Yahoo!, are subjected to a quality test of relative recall, precision and availability. The methods of the retrieval tests are presented. On average, at a cut-off value of 25, a recall of around 22%, a precision of just under 19% and an availability of 24% are achieved.
  10. ¬The Eleventh Text Retrieval Conference, TREC 2002 (2003) 0.01
    0.008975455 = product of:
      0.026926363 = sum of:
        0.026926363 = product of:
          0.053852726 = sum of:
            0.053852726 = weight(_text_:22 in 4049) [ClassicSimilarity], result of:
              0.053852726 = score(doc=4049,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.30952093 = fieldWeight in 4049, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4049)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Proceedings of the 11th TREC conference held in Gaithersburg, Maryland (USA), November 19-22, 2002. The aim of the conference was discussion on retrieval and related information-seeking tasks for large test collections. 93 research groups used different techniques for information retrieval from the same large database. This procedure makes it possible to compare the results. The tasks are: cross-language searching, filtering, interactive searching, searching for novelty, question answering, searching for video shots, and Web searching.
  11. Savoy, J.: Cross-language information retrieval : experiments based on CLEF 2000 corpora (2003) 0.01
    0.0080434345 = product of:
      0.024130303 = sum of:
        0.024130303 = product of:
          0.048260607 = sum of:
            0.048260607 = weight(_text_:indexing in 1034) [ClassicSimilarity], result of:
              0.048260607 = score(doc=1034,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.2537542 = fieldWeight in 1034, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1034)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Search engines play an essential role in the usability of Internet-based information systems and without them the Web would be much less accessible, and at the very least would develop at a much slower rate. Given that non-English users now tend to make up the majority in this environment, our main objective is to analyze and evaluate the retrieval effectiveness of various indexing and search strategies based on test-collections written in four different languages: English, French, German, and Italian. Our second objective is to describe and evaluate various approaches that might be implemented in order to effectively access document collections written in another language. As a third objective, we will explore the underlying problems involved in searching document collections written in the four different languages, and we will suggest and evaluate different database merging strategies capable of providing the user with a single unique result list.
  12. Abdou, S.; Savoy, J.: Searching in Medline : query expansion and manual indexing evaluation (2008) 0.01
    0.0080434345 = product of:
      0.024130303 = sum of:
        0.024130303 = product of:
          0.048260607 = sum of:
            0.048260607 = weight(_text_:indexing in 2062) [ClassicSimilarity], result of:
              0.048260607 = score(doc=2062,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.2537542 = fieldWeight in 2062, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2062)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
  13. King, D.W.: Blazing new trails : in celebration of an audacious career (2000) 0.01
    0.005609659 = product of:
      0.016828977 = sum of:
        0.016828977 = product of:
          0.033657953 = sum of:
            0.033657953 = weight(_text_:22 in 1184) [ClassicSimilarity], result of:
              0.033657953 = score(doc=1184,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.19345059 = fieldWeight in 1184, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1184)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 9.1997 19:16:05
  14. Petrelli, D.: On the role of user-centred evaluation in the advancement of interactive information retrieval (2008) 0.01
    0.005609659 = product of:
      0.016828977 = sum of:
        0.016828977 = product of:
          0.033657953 = sum of:
            0.033657953 = weight(_text_:22 in 2026) [ClassicSimilarity], result of:
              0.033657953 = score(doc=2026,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.19345059 = fieldWeight in 2026, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2026)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Information processing and management. 44(2008) no.1, S.22-38
  15. Larsen, B.; Ingwersen, P.; Lund, B.: Data fusion according to the principle of polyrepresentation (2009) 0.00
    0.0044877273 = product of:
      0.013463181 = sum of:
        0.013463181 = product of:
          0.026926363 = sum of:
            0.026926363 = weight(_text_:22 in 2752) [ClassicSimilarity], result of:
              0.026926363 = score(doc=2752,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.15476047 = fieldWeight in 2752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2752)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 3.2009 18:48:28
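Several records in this listing match only on the term "22" with freq=2.0, so they share the same tf, idf, queryNorm and coord factors. Their relative order is then decided by fieldNorm alone. The sketch below illustrates this under the same classic TF-IDF assumptions the trees spell out. The function name `score_22` is hypothetical, and the fieldNorm values are copied from the trees.

```python
import math

IDF_22 = 3.5018296          # idf(docFreq=3622, maxDocs=44218), as printed
QUERY_NORM = 0.049684696    # queryNorm, as printed
COORD = (1 / 2) * (1 / 3)   # inner coord(1/2) times outer coord(1/3)


def score_22(field_norm, freq=2.0):
    """Final score of a record whose only match is the term "22"."""
    query_weight = IDF_22 * QUERY_NORM
    field_weight = math.sqrt(freq) * IDF_22 * field_norm
    return query_weight * field_weight * COORD


# fieldNorm per document id, copied from the explain trees.
field_norms = {6438: 0.109375, 261: 0.0625, 4049: 0.0625,
               1184: 0.0390625, 2026: 0.0390625, 2752: 0.03125}

for doc, norm in field_norms.items():
    print(doc, f"{score_22(norm):.9f}")
```

The score is linear in fieldNorm here: halving fieldNorm from 0.0625 to 0.03125 halves the score. This suggests these records differ in field length rather than in how well they match the query.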