Search (4 results, page 1 of 1)

Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.03

0.02600406 = product of:
  0.05200812 = sum of:
    0.05200812 = product of:
      0.10401624 = sum of:
        0.10401624 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.10401624 = score(doc=6265,freq=2.0), product of:
            0.19203177 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.054837555 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information outlook. 9(2005) no.8, pp. 22-23
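
The score breakdown for doc 6265 can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the explain tree. The function names are illustrative and not part of any library; queryNorm, fieldNorm, and the two coord factors are copied from the output rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                   # tf(freq=2.0) -> 1.4142135
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq=3622, maxDocs=44218)
    query_weight = idf * query_norm        # queryWeight in the explain tree
    field_weight = tf * idf * field_norm   # fieldWeight in the explain tree
    return query_weight * field_weight


# Values copied from the explain output for the term "22" in doc 6265.
term_score = classic_term_score(
    freq=2.0, doc_freq=3622, max_docs=44218,
    query_norm=0.054837555, field_norm=0.109375,
)

# The two coord(1/2) factors each halve the term score.
final_score = term_score * 0.5 * 0.5

print(f"term score:  {term_score:.6f}")
print(f"final score: {final_score:.6f}")
```

The results should agree with the listed 0.10401624 and 0.02600406 to about six decimal places; small differences in the last digits are expected because Lucene computes in single precision. Only fieldNorm differs between documents matching the same term, which is why doc 5291 scores exactly half of doc 6265.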

Salton, G.: SMART System: 1961-1976 (2009) 0.02

0.016324414 = product of:
  0.032648828 = sum of:
    0.032648828 = product of:
      0.065297656 = sum of:
        0.065297656 = weight(_text_:work in 3879) [ClassicSimilarity], result of:
          0.065297656 = score(doc=3879,freq=2.0), product of:
            0.20127523 = queryWeight, product of:
              3.6703904 = idf(docFreq=3060, maxDocs=44218)
              0.054837555 = queryNorm
            0.32441974 = fieldWeight in 3879, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6703904 = idf(docFreq=3060, maxDocs=44218)
              0.0625 = fieldNorm(doc=3879)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: While a number of researchers had experimented during the 1950's on automatic indexing and retrieval in various forms, it was Gerard Salton who brought the information retrieval experimental paradigm to full fruition, with his "SMART" system. His work has been enormously influential.

Newman, D.J.; Block, S.: Probabilistic topic decomposition of an eighteenth-century American newspaper (2006) 0.01

0.01300203 = product of:
  0.02600406 = sum of:
    0.02600406 = product of:
      0.05200812 = sum of:
        0.05200812 = weight(_text_:22 in 5291) [ClassicSimilarity], result of:
          0.05200812 = score(doc=5291,freq=2.0), product of:
            0.19203177 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.054837555 = queryNorm
            0.2708308 = fieldWeight in 5291, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5291)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 7.2006 17:32:00

Mansour, N.; Haraty, R.A.; Daher, W.; Houri, M.: An auto-indexing method for Arabic text (2008) 0.01

0.012243311 = product of:
  0.024486622 = sum of:
    0.024486622 = product of:
      0.048973244 = sum of:
        0.048973244 = weight(_text_:work in 2103) [ClassicSimilarity], result of:
          0.048973244 = score(doc=2103,freq=2.0), product of:
            0.20127523 = queryWeight, product of:
              3.6703904 = idf(docFreq=3060, maxDocs=44218)
              0.054837555 = queryNorm
            0.2433148 = fieldWeight in 2103, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6703904 = idf(docFreq=3060, maxDocs=44218)
              0.046875 = fieldNorm(doc=2103)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: This work addresses the information retrieval problem of auto-indexing Arabic documents. Auto-indexing a text document refers to automatically extracting words that are suitable for building an index for the document. In this paper, we propose an auto-indexing method for Arabic text documents. This method is mainly based on morphological analysis and on a technique for assigning weights to words. The morphological analysis uses a number of grammatical rules to extract stem words that become candidate index words. The weight assignment technique computes weights for these words relative to the container document. The weight is based on how spread is the word in a document and not only on its rate of occurrence. The candidate index words are then sorted in descending order by weight so that information retrievers can select the more important index words. We empirically verify the usefulness of our method using several examples. For these examples, we obtained an average recall of 46% and an average precision of 64%.
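
The recall and precision figures in this abstract follow the usual set-based definitions. As a minimal sketch, assuming extracted index words are compared against a manually chosen reference set, the two measures could be computed as below. The function name and the word lists are hypothetical placeholders, not data or code from the paper.

```python
def precision_recall(extracted, relevant):
    """Return (precision, recall) for extracted index words vs. a reference set."""
    extracted_set, relevant_set = set(extracted), set(relevant)
    hits = len(extracted_set & relevant_set)  # words that are both extracted and relevant
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: 4 extracted words, 5 reference words, 3 in common.
extracted_words = ["index", "stem", "weight", "rule"]
reference_words = ["index", "stem", "word", "document", "weight"]

precision, recall = precision_recall(extracted_words, reference_words)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Averaging these per-document values over a test collection would give figures of the kind reported in the abstract.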
