Search (5 results, page 1 of 1)

  • × theme_ss:"Computerlinguistik"
  • × author_ss:"Savoy, J."
  1. Savoy, J.: ¬A stemming procedure and stopword list for general French Corpora (1999) 0.00
    3.9466174E-4 = product of:
      0.005919926 = sum of:
        0.005919926 = product of:
          0.011839852 = sum of:
            0.011839852 = weight(_text_:information in 4314) [ClassicSimilarity], result of:
              0.011839852 = score(doc=4314,freq=2.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.23274569 = fieldWeight in 4314, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4314)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Source
    Journal of the American Society for Information Science. 50(1999) no.10, S.944-954
  2. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.00
    3.4178712E-4 = product of:
      0.0051268064 = sum of:
        0.0051268064 = product of:
          0.010253613 = sum of:
            0.010253613 = weight(_text_:information in 2950) [ClassicSimilarity], result of:
              0.010253613 = score(doc=2950,freq=6.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.20156369 = fieldWeight in 2950, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2950)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    It is important in information retrieval (IR), information extraction, or classification tasks that morphologically related forms are conflated under the same stem (using stemmer) or lemma (using morphological analyzer). To achieve this for the English language, algorithmic stemming or various morphological analysis approaches have been suggested. Based on Cross-Language Evaluation Forum test collections containing 284 queries and various IR models, this article evaluates these word-normalization proposals. Stemming improves the mean average precision significantly by around 7% while performance differences are not significant when comparing various algorithmic stemmers or algorithmic stemmers and morphological analysis. Accounting for thesaurus class numbers during indexing does not modify overall retrieval performances. Finally, we demonstrate that including a stop word list, even one containing only around 10 terms, might significantly improve retrieval performance, depending on the IR model.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.8, S.1616-1624
  3. Dolamic, L.; Savoy, J.: Retrieval effectiveness of machine translated queries (2010) 0.00
    2.79068E-4 = product of:
      0.0041860198 = sum of:
        0.0041860198 = product of:
          0.0083720395 = sum of:
            0.0083720395 = weight(_text_:information in 4102) [ClassicSimilarity], result of:
              0.0083720395 = score(doc=4102,freq=4.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.16457605 = fieldWeight in 4102, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4102)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    This article describes and evaluates various information retrieval models used to search document collections written in English through submitting queries written in various other languages, either members of the Indo-European family (English, French, German, and Spanish) or radically different language groups such as Chinese. This evaluation method involves searching a rather large number of topics (around 300) and using two commercial machine translation systems to translate across the language barriers. In this study, mean average precision is used to measure variances in retrieval effectiveness when a query language differs from the document language. Although performance differences are rather large for certain languages pairs, this does not mean that bilingual search methods are not commercially viable. Causes of the difficulties incurred when searching or during translation are analyzed and the results of concrete examples are explained.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.11, S.2266-2273
  4. Savoy, J.: Searching strategies for the Hungarian language (2008) 0.00
    1.9733087E-4 = product of:
      0.002959963 = sum of:
        0.002959963 = product of:
          0.005919926 = sum of:
            0.005919926 = weight(_text_:information in 2037) [ClassicSimilarity], result of:
              0.005919926 = score(doc=2037,freq=2.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.116372846 = fieldWeight in 2037, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2037)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Source
    Information processing and management. 44(2008) no.1, S.310-324
  5. Savoy, J.: Text representation strategies : an example with the State of the union addresses (2016) 0.00
    1.6444239E-4 = product of:
      0.0024666358 = sum of:
        0.0024666358 = product of:
          0.0049332716 = sum of:
            0.0049332716 = weight(_text_:information in 3042) [ClassicSimilarity], result of:
              0.0049332716 = score(doc=3042,freq=2.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.09697737 = fieldWeight in 3042, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3042)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.8, S.1858-1870