Search (9 results, page 1 of 1)

  • theme_ss:"Automatisches Indexieren"
  • year_i:[2000 TO 2010}
  1. Dolamic, L.; Savoy, J.: Indexing and searching strategies for the Russian language (2009) 0.04
    0.040342852 = product of:
      0.080685705 = sum of:
        0.080685705 = product of:
          0.16137141 = sum of:
            0.16137141 = weight(_text_:light in 3301) [ClassicSimilarity], result of:
              0.16137141 = score(doc=3301,freq=6.0), product of:
                0.2920221 = queryWeight, product of:
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.050563898 = queryNorm
                0.55259997 = fieldWeight in 3301, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3301)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf-idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf-idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. When applying an n-gram approach, performance differences are usually smaller than with an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.
  2. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02
    0.023977486 = product of:
      0.047954973 = sum of:
        0.047954973 = product of:
          0.095909946 = sum of:
            0.095909946 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.095909946 = score(doc=6265,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information outlook. 9(2005) no.8, pp. 22-23
  3. Hauer, M.: Automatische Indexierung (2000) 0.02
    0.02055213 = product of:
      0.04110426 = sum of:
        0.04110426 = product of:
          0.08220852 = sum of:
            0.08220852 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
              0.08220852 = score(doc=5887,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.46428138 = fieldWeight in 5887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5887)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Ed.: R. Schmidt
  4. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.01
    0.013701421 = product of:
      0.027402842 = sum of:
        0.027402842 = product of:
          0.054805685 = sum of:
            0.054805685 = weight(_text_:22 in 3581) [ClassicSimilarity], result of:
              0.054805685 = score(doc=3581,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.30952093 = fieldWeight in 3581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3581)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    24. 3.2006 12:22:02
  5. Probst, M.; Mittelbach, J.: Maschinelle Indexierung in der Sacherschließung wissenschaftlicher Bibliotheken (2006) 0.01
    0.013701421 = product of:
      0.027402842 = sum of:
        0.027402842 = product of:
          0.054805685 = sum of:
            0.054805685 = weight(_text_:22 in 1755) [ClassicSimilarity], result of:
              0.054805685 = score(doc=1755,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.30952093 = fieldWeight in 1755, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1755)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2008 12:35:19
  6. Renz, M.: Automatische Inhaltserschließung im Zeichen von Wissensmanagement (2001) 0.01
    0.011988743 = product of:
      0.023977486 = sum of:
        0.023977486 = product of:
          0.047954973 = sum of:
            0.047954973 = weight(_text_:22 in 5671) [ClassicSimilarity], result of:
              0.047954973 = score(doc=5671,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.2708308 = fieldWeight in 5671, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5671)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2001 13:14:48
  7. Newman, D.J.; Block, S.: Probabilistic topic decomposition of an eighteenth-century American newspaper (2006) 0.01
    0.011988743 = product of:
      0.023977486 = sum of:
        0.023977486 = product of:
          0.047954973 = sum of:
            0.047954973 = weight(_text_:22 in 5291) [ClassicSimilarity], result of:
              0.047954973 = score(doc=5291,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.2708308 = fieldWeight in 5291, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5291)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 17:32:00
  8. Lorenz, S.: Konzeption und prototypische Realisierung einer begriffsbasierten Texterschließung (2006) 0.01
    0.010276065 = product of:
      0.02055213 = sum of:
        0.02055213 = product of:
          0.04110426 = sum of:
            0.04110426 = weight(_text_:22 in 1746) [ClassicSimilarity], result of:
              0.04110426 = score(doc=1746,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.23214069 = fieldWeight in 1746, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1746)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2015 9:17:30
  9. Nohr, H.: Grundlagen der automatischen Indexierung : ein Lehrbuch (2003) 0.01
    0.0068507106 = product of:
      0.013701421 = sum of:
        0.013701421 = product of:
          0.027402842 = sum of:
            0.027402842 = weight(_text_:22 in 1767) [ClassicSimilarity], result of:
              0.027402842 = score(doc=1767,freq=2.0), product of:
                0.17706616 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050563898 = queryNorm
                0.15476047 = fieldWeight in 1767, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1767)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 6.2009 12:46:51
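
The score breakdowns in this listing all have the same shape: a term weight (queryWeight times fieldWeight) multiplied by two coord(1/2) factors. The sketch below recomputes that product from the leaf values shown in each tree. It assumes the usual ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf figures printed above. The function name `classic_score` and its parameter names are hypothetical, chosen for this illustration only. Python computes in double precision while the listing shows single-precision values, so the last digits can differ slightly.

```python
import math


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm,
                  coords=(0.5, 0.5)):
    """Recompute one single-term score tree from its leaf values.

    Assumes tf = sqrt(freq) and idf = 1 + ln(max_docs / (doc_freq + 1)).
    `coords` holds the coord(...) factors applied on the way up the tree.
    """
    tf = math.sqrt(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))

    query_weight = idf * query_norm        # "queryWeight, product of:"
    field_weight = tf * idf * field_norm   # "fieldWeight in <doc>, product of:"

    score = query_weight * field_weight    # "weight(_text_:term in <doc>)"
    for c in coords:                       # the two "coord(1/2)" lines
        score *= c
    return score


# Result 1 (doc 3301, term "light"): leaf values copied from the listing.
result_1 = classic_score(freq=6.0, doc_freq=372, max_docs=44218,
                         query_norm=0.050563898, field_norm=0.0390625)

# Result 2 (doc 6265, term "22"): leaf values copied from the listing.
result_2 = classic_score(freq=2.0, doc_freq=3622, max_docs=44218,
                         query_norm=0.050563898, field_norm=0.109375)

print(result_1, result_2)
```

Under these assumptions the two values come out close to the 0.040342852 and 0.023977486 listed for results 1 and 2. Results 2 to 9 share the same term, queryNorm, and termFreq, so their ranking is determined by fieldNorm alone.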