Search (3 results, page 1 of 1)

  • × theme_ss:"Computerlinguistik"
  • × theme_ss:"Retrievalalgorithmen"
  • × year_i:[1990 TO 2000}
  1. Frakes, W.B.: Stemming algorithms (1992) 0.01
    0.00599504 = product of:
      0.041965276 = sum of:
        0.008071727 = weight(_text_:information in 3503) [ClassicSimilarity], result of:
          0.008071727 = score(doc=3503,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1551638 = fieldWeight in 3503, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3503)
        0.033893548 = weight(_text_:retrieval in 3503) [ClassicSimilarity], result of:
          0.033893548 = score(doc=3503,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.37811437 = fieldWeight in 3503, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3503)
      0.14285715 = coord(2/14)
    
    Abstract
    Describes stemming algorithms - programs that relate morphologically similar indexing and search terms. Stemming is used to improve retrieval effectiveness and to reduce the size of indexing files. Several approaches to stemming are described - table lookup, affix removal, successor variety, and n-gram. Empirical studies of stemming are summarized. The Porter stemmer is described in detail, and a full implementation in C is presented.
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
  2. Abu-Salem, H.; Al-Omari, M.; Evens, M.W.: Stemming methodologies over individual query words for an Arabic information retrieval system (1999) 0.01
    0.005721087 = product of:
      0.04004761 = sum of:
        0.010089659 = weight(_text_:information in 3672) [ClassicSimilarity], result of:
          0.010089659 = score(doc=3672,freq=8.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.19395474 = fieldWeight in 3672, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3672)
        0.029957948 = weight(_text_:retrieval in 3672) [ClassicSimilarity], result of:
          0.029957948 = score(doc=3672,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33420905 = fieldWeight in 3672, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3672)
      0.14285715 = coord(2/14)
    
    Abstract
    Stemming is one of the most important factors that affect the performance of information retrieval systems. This article investigates how to improve the performance of an Arabic information retrieval system by imposing the retrieval method over individual words of a query depending on the importance of the WORD, the STEM, or the ROOT of the query terms in the database. This method, called Mixed Stemming, computes term importance using a weighting scheme that uses the Term Frequency (TF) and the Inverse Document Frequency (IDF), called TFxIDF. An extended version of the Arabic IRS system is designed, implemented, and evaluated to reduce the number of irrelevant documents retrieved. The results of the experiment suggest that the proposed method outperforms the Word index method using the TFxIDF weighting scheme. It also outperforms the Stem index method using the Binary weighting scheme but does not outperform the Stem index method using the TFxIDF weighting scheme, and again it outperforms the Root index method using the Binary weighting scheme but does not outperform the Root index method using the TFxIDF weighting scheme.
    Source
    Journal of the American Society for Information Science. 50(1999) no.6, pp. 524-529
  3. Brenner, E.H.: Beyond Boolean : new approaches in information retrieval; the quest for intuitive online search systems past, present & future (1995) 0.00
    0.004743375 = product of:
      0.033203624 = sum of:
        0.012233062 = weight(_text_:information in 2547) [ClassicSimilarity], result of:
          0.012233062 = score(doc=2547,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23515764 = fieldWeight in 2547, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
        0.020970564 = weight(_text_:retrieval in 2547) [ClassicSimilarity], result of:
          0.020970564 = score(doc=2547,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.23394634 = fieldWeight in 2547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
      0.14285715 = coord(2/14)
    
    Abstract
    The challenge of effectively bringing specific, relevant information from the global sea of data to our fingertips has become an increasingly difficult one. Discusses how the online information industry, founded on Boolean search systems, may be evolving to take advantage of other methods, such as 'term weighting', 'relevance ranking' and 'query by example'.
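
The score breakdowns in the listing above can be recomputed by hand. The following is a minimal sketch, assuming the relationships that the explain trees themselves show: tf is the square root of termFreq, queryWeight is idf times queryNorm, fieldWeight is tf times idf times fieldNorm, and the summed term scores are multiplied by coord. The idf formula, 1 + ln(maxDocs / (docFreq + 1)), is an assumption inferred from the printed values rather than something the listing states. The function names are hypothetical and are not part of any search engine API.

```python
import math

# Constants copied from the explain output in the listing.
MAX_DOCS = 44218
QUERY_NORM = 0.029633347


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula; it reproduces the printed idf values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight, as in the explain tree.
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def doc_score(terms, field_norm, matched, total):
    # terms: list of (termFreq, docFreq) pairs for the matching query terms.
    # coord(matched/total) scales the sum of the per-term scores.
    summed = sum(term_score(freq, df, field_norm) for freq, df in terms)
    return summed * (matched / total)


# (termFreq, docFreq) for "information" and "retrieval" in each result.
score_3503 = doc_score([(2.0, 20772), (4.0, 5836)], 0.0625, 2, 14)
score_3672 = doc_score([(8.0, 20772), (8.0, 5836)], 0.0390625, 2, 14)
score_2547 = doc_score([(6.0, 20772), (2.0, 5836)], 0.0546875, 2, 14)

for doc_id, score in ((3503, score_3503), (3672, score_3672), (2547, score_2547)):
    print(doc_id, score)
```

The printed values should agree closely with the top-level scores shown for documents 3503, 3672 and 2547. Small differences in the last digits are expected, because the listing appears to use single-precision arithmetic while this sketch uses Python floats.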