Search (10 results, page 1 of 1)

  • × theme_ss:"Retrievalstudien"
  • × theme_ss:"Volltextretrieval"
  1. Blair, D.C.; Maron, M.E.: ¬An evaluation of retrieval effectiveness for a full-text document-retrieval system (1985) 0.03
    0.026390245 = product of:
      0.065975614 = sum of:
        0.046599183 = weight(_text_:system in 1345) [ClassicSimilarity], result of:
          0.046599183 = score(doc=1345,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.3479797 = fieldWeight in 1345, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.078125 = fieldNorm(doc=1345)
        0.01937643 = product of:
          0.05812929 = sum of:
            0.05812929 = weight(_text_:29 in 1345) [ClassicSimilarity], result of:
              0.05812929 = score(doc=1345,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.38865322 = fieldWeight in 1345, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1345)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Footnote
    Vgl. auch : Salton, G.: Another look ... Comm. ACM 29(1986) S.S.648-656; Blair, D.C.: Full text retrieval ... Int. Class. 13(1986) S.18-23: Blair, D.C., M.E. Maron: Full-text information retrieval ... Inf. proc. man. 26(1990) S.437-447.
  2. Blair, D.C.: Full text retrieval : Evaluation and implications (1986) 0.02
    0.020466631 = product of:
      0.05116658 = sum of:
        0.03954072 = weight(_text_:system in 2047) [ClassicSimilarity], result of:
          0.03954072 = score(doc=2047,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.29527056 = fieldWeight in 2047, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=2047)
        0.011625858 = product of:
          0.034877572 = sum of:
            0.034877572 = weight(_text_:29 in 2047) [ClassicSimilarity], result of:
              0.034877572 = score(doc=2047,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23319192 = fieldWeight in 2047, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2047)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Recently, a detailed evaluation of a large, operational full-text document retrieval system was reported in the literature. Values of precision and recall were estimated usind traditional statistical sampling methods and blind evaluation procedures. The results of this evaluation demonstrated that the system tested was retrieving less then 20% of the relevant documents when the searchers believed it was retrieving over 75% of the relevant documents. This evaluation is described including some data not reported in the original article. Also discussed are the implications which this study has for how the subjects of documents should be represented, as well as the importance of rigorous retrieval evaluations for the furtherhance of information retrieval research
    Footnote
    Vgl.: Blair, D.C., M.E. Maron: An evaluation ... Comm. ACM 28(1985) S.280-299; Salton, G.: Another look ... Comm. ACM 29(1986) S.648-656; Blair, D.C., M.E. Maron: Full-text information retrieval ... Inf. Proc. Man. 26(1990) S.437-447.
  3. Leppanen, E.: Homografiongelma tekstihaussa ja homografien disambiguoinnin vaikutukset (1996) 0.02
    0.015834149 = product of:
      0.03958537 = sum of:
        0.027959513 = weight(_text_:system in 27) [ClassicSimilarity], result of:
          0.027959513 = score(doc=27,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.20878783 = fieldWeight in 27, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=27)
        0.011625858 = product of:
          0.034877572 = sum of:
            0.034877572 = weight(_text_:29 in 27) [ClassicSimilarity], result of:
              0.034877572 = score(doc=27,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23319192 = fieldWeight in 27, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=27)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Homonymy is known to often cause false drops in free text searching in a full text database. The problem is quite common and difficult to avoid in Finnish, but nobody has examined it before. Reports on a study that examined the frequency of, and solutions to, the homonymy problem, based on searches made in a Finnish full text database containing about 55.000 newspaper articles. The results indicate that homonymy is not a very serious problem in full text searching, with only about 1 search result set out of 4 containing false drops caused by homonymy. Several other reasons for nonrelevance were much more common. However, in some set results there were a considerable number of homonymy errors, so the number seems to be very random. A study was also made into whether homonyms can be disambiguated by syntactic analysis. The result was that 75,2% of homonyms were disambiguated by this method. Verb homonyms were considerably easier to disambiguate than substantives. Although homonymy is not a very big problem it could perhaps easily be eliminated if there was a suitable syntactic analyzer in the IR system
    Date
    9.12.1997 18:33:29
  4. Bernstein, L.M.; Williamson, R.E.: Testing of a natural language retrieval system for a full text knowledge base (1984) 0.01
    0.013047772 = product of:
      0.06523886 = sum of:
        0.06523886 = weight(_text_:system in 1803) [ClassicSimilarity], result of:
          0.06523886 = score(doc=1803,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.4871716 = fieldWeight in 1803, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.109375 = fieldNorm(doc=1803)
      0.2 = coord(1/5)
    
  5. Hider, P.: ¬The search value added by professional indexing to a bibliographic database (2017) 0.01
    0.008970084 = product of:
      0.044850416 = sum of:
        0.044850416 = weight(_text_:index in 3868) [ClassicSimilarity], result of:
          0.044850416 = score(doc=3868,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.24139762 = fieldWeight in 3868, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3868)
      0.2 = coord(1/5)
    
    Abstract
    Gross et al. (2015) have demonstrated that about a quarter of hits would typically be lost to keyword searchers if contemporary academic library catalogs dropped their controlled subject headings. This paper reports on an analysis of the loss levels that would result if a bibliographic database, namely the Australian Education Index (AEI), were missing the subject descriptors and identifiers assigned by its professional indexers, employing the methodology developed by Gross and Taylor (2005), and later by Gross et al. (2015). The results indicate that AEI users would lose a similar proportion of hits per query to that experienced by library catalog users: on average, 27% of the resources found by a sample of keyword queries on the AEI database would not have been found without the subject indexing, based on the Australian Thesaurus of Education Descriptors (ATED). The paper also discusses the methodological limitations of these studies, pointing out that real-life users might still find some of the resources missed by a particular query through follow-up searches, while additional resources might also be found through iterative searching on the subject vocabulary. The paper goes on to describe a new research design, based on a before - and - after experiment, which addresses some of these limitations. It is argued that this alternative design will provide a more realistic picture of the value that professionally assigned subject indexing and controlled subject vocabularies can add to literature searching of a more scholarly and thorough kind.
  6. Hider, P.: ¬The search value added by professional indexing to a bibliographic database (2018) 0.01
    0.008970084 = product of:
      0.044850416 = sum of:
        0.044850416 = weight(_text_:index in 4300) [ClassicSimilarity], result of:
          0.044850416 = score(doc=4300,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.24139762 = fieldWeight in 4300, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4300)
      0.2 = coord(1/5)
    
    Abstract
    Gross et al. (2015) have demonstrated that about a quarter of hits would typically be lost to keyword searchers if contemporary academic library catalogs dropped their controlled subject headings. This article reports on an investigation of the search value that subject descriptors and identifiers assigned by professional indexers add to a bibliographic database, namely the Australian Education Index (AEI). First, a similar methodology to that developed by Gross et al. (2015) was applied, with keyword searches representing a range of educational topics run on the AEI database with and without its subject indexing. The results indicated that AEI users would also lose, on average, about a quarter of hits per query. Second, an alternative research design was applied in which an experienced literature searcher was asked to find resources on a set of educational topics on an AEI database stripped of its subject indexing and then asked to search for additional resources on the same topics after the subject indexing had been reinserted. In this study, the proportion of additional resources that would have been lost had it not been for the subject indexing was again found to be about a quarter of the total resources found for each topic, on average.
  7. Pirkola, A.; Jarvelin, K.: ¬The effect of anaphor and ellipsis resolution on proximity searching in a text database (1995) 0.01
    0.008069678 = product of:
      0.040348392 = sum of:
        0.040348392 = weight(_text_:context in 4088) [ClassicSimilarity], result of:
          0.040348392 = score(doc=4088,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.22896172 = fieldWeight in 4088, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4088)
      0.2 = coord(1/5)
    
    Abstract
    So far, methods for ellipsis and anaphor resolution have been developed and the effects of anaphor resolution have been analyzed in the context of statistical information retrieval of scientific abstracts. No significant improvements has been observed. Analyzes the effects of ellipsis and anaphor resolution on proximity searching in a full text database. Anaphora and ellipsis are classified on the basis of the type of their correlates / antecedents rather than, as traditional, on the basis of their own linguistic type. The classification differentiates proper names and common nouns of basic words, compound words, and phrases. The study was carried out in a newspaper article database containing 55.000 full text articles. A set of 154 keyword pairs in different categories was created. Human resolution of keyword ellipsis and anaphora was performed to identify sentences and paragraphs which would match proximity searches after resolution. Findings indicate that ellipsis and anaphor resolution is most relevant for proper name phrases and only marginal in the other keyword categories. Therefore the recall effect of restricted resolution of proper name phrases only was analyzed for keyword pairs containing at least 1 proper name phrase. Findings indicate a recall increase of 38.2% in sentence searches, and 28.8% in paragraph searches when proper name ellipsis were resolved. The recall increase was 17.6% sentence searches, and 19.8% in paragraph searches when proper name anaphora were resolved. Some simple and computationally justifiable resolution method might be developed only for proper name phrases to support keyword based full text information retrieval. Discusses elements of such a method
  8. Blair, D.C.; Maron, M.E.: Full-text information retrieval : further analysis and clarification (1990) 0.01
    0.00745587 = product of:
      0.03727935 = sum of:
        0.03727935 = weight(_text_:system in 2046) [ClassicSimilarity], result of:
          0.03727935 = score(doc=2046,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.27838376 = fieldWeight in 2046, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=2046)
      0.2 = coord(1/5)
    
    Abstract
    In 1985, an article by Blair and Maron described a detailed evaluation of the effectiveness of an operational full text retrieval system used to support the defense of a large corporate lawsuit. The following year Salton published an article which called into question the conclusions of the 1985 study. The following article briefly reviews the initial study, replies to the objections raised by the secon article, and clarifies several confusions and misunderstandings of the 1985 study
  9. Kristensen, J.: Expanding end-users' query statements for free text searching with a search-aid thesaurus (1993) 0.00
    0.0031002287 = product of:
      0.015501143 = sum of:
        0.015501143 = product of:
          0.04650343 = sum of:
            0.04650343 = weight(_text_:29 in 6621) [ClassicSimilarity], result of:
              0.04650343 = score(doc=6621,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.31092256 = fieldWeight in 6621, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6621)
          0.33333334 = coord(1/3)
      0.2 = coord(1/5)
    
    Source
    Information processing and management. 29(1993) no.6, S.733-744
  10. Sievert, M.E.; McKinin, E.J.: Why full-text misses some relevant documents : an analysis of documents not retrieved by CCML or MEDIS (1989) 0.00
    0.0023042548 = product of:
      0.011521274 = sum of:
        0.011521274 = product of:
          0.03456382 = sum of:
            0.03456382 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
              0.03456382 = score(doc=3564,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23214069 = fieldWeight in 3564, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3564)
          0.33333334 = coord(1/3)
      0.2 = coord(1/5)
    
    Date
    9. 1.1996 10:22:31