Search (2 results, page 1 of 1)

  • × theme_ss:"Computerlinguistik"
  • × theme_ss:"Retrievalalgorithmen"
  1. French, J.C.; Powell, A.L.; Schulman, E.: Using clustering strategies for creating authority files (2000) 0.00
    0.0015337638 = product of:
      0.012270111 = sum of:
        0.012270111 = product of:
          0.03681033 = sum of:
            0.03681033 = weight(_text_:problem in 4811) [ClassicSimilarity], result of:
              0.03681033 = score(doc=4811,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.28137225 = fieldWeight in 4811, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4811)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of information. Authority work, the need to discover and reconcile variant forms of strings in bibliographical entries, will become more critical in the future. Spelling variants, misspellings, and transliteration differences will all increase the difficulty of retrieving information. We investigate a number of approximate string matching techniques that have traditionally been used to help with this problem. We then introduce the notion of approximate word matching and show how it can be used to improve detection and categorization of variant forms. We demonstrate the utility of these approaches using data from the Astrophysics Data System and show how we can reduce the human effort involved in the creation of authority files
  2. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.00
    0.0015337638 = product of:
      0.012270111 = sum of:
        0.012270111 = product of:
          0.03681033 = sum of:
            0.03681033 = weight(_text_:problem in 2502) [ClassicSimilarity], result of:
              0.03681033 = score(doc=2502,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.28137225 = fieldWeight in 2502, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2502)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system will lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform weIl when applied to this problem. Detailed results and analyses are included to support our conclusions.