Search (59 results, page 1 of 3)

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.06

0.06357585 = product of:
  0.1271517 = sum of:
    0.1271517 = product of:
      0.19072755 = sum of:
        0.13006248 = weight(_text_:n in 1952) [ClassicSimilarity], result of:
          0.13006248 = score(doc=1952,freq=4.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.67369634 = fieldWeight in 1952, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
        0.06066507 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.06066507 = score(doc=1952,freq=2.0), product of:
            0.15679733 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044775832 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Date: 16. 8.1998 12:51:22
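
The score tree above can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the figures shown for doc 1952. The function and constant names are illustrative only.

```python
import math

# Values copied from the explain output for doc 1952.
MAX_DOCS = 44218
QUERY_NORM = 0.044775832


def idf(doc_freq, max_docs):
    # Assumed classic formula; it matches idf(docFreq=1611) = 4.3116565 above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Assumed classic formula; tf(freq=4.0) = 2.0 above.
    return math.sqrt(freq)


def term_score(freq, doc_freq, field_norm):
    # score = queryWeight * fieldWeight, as in each weight(...) subtree.
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


score_n = term_score(freq=4.0, doc_freq=1611, field_norm=0.078125)
score_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.078125)

# Two of three inner clauses matched: coord(2/3); one of two outer: coord(1/2).
total = (score_n + score_22) * (2 / 3) * 0.5
print(round(score_n, 6), round(score_22, 6), round(total, 6))
```

Small deviations in the last digits are to be expected, since the displayed values were computed in single precision.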

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.04

0.036871053 = product of:
  0.07374211 = sum of:
    0.07374211 = product of:
      0.11061315 = sum of:
        0.049948085 = weight(_text_:j in 4157) [ClassicSimilarity], result of:
          0.049948085 = score(doc=4157,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.35106707 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
        0.06066507 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.06066507 = score(doc=4157,freq=2.0), product of:
            0.15679733 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044775832 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill

Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.04

0.036871053 = product of:
  0.07374211 = sum of:
    0.07374211 = product of:
      0.11061315 = sum of:
        0.049948085 = weight(_text_:j in 2759) [ClassicSimilarity], result of:
          0.049948085 = score(doc=2759,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.35106707 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
        0.06066507 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
          0.06066507 = score(doc=2759,freq=2.0), product of:
            0.15679733 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044775832 = queryNorm
            0.38690117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Date: 1. 2.2016 18:25:22
Source: Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al

Dolamic, L.; Savoy, J.: Indexing and searching strategies for the Russian language (2009) 0.03
```
0.03000176 = product of:
  0.06000352 = sum of:
    0.06000352 = product of:
      0.09000528 = sum of:
        0.024974043 = weight(_text_:j in 3301) [ClassicSimilarity], result of:
          0.024974043 = score(doc=3301,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.17553353 = fieldWeight in 3301, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3301)
        0.06503124 = weight(_text_:n in 3301) [ClassicSimilarity], result of:
          0.06503124 = score(doc=3301,freq=4.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.33684817 = fieldWeight in 3301, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3301)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)
```
Abstract

This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf-idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf-idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. When applying an n-gram approach, performance differences are usually lower than an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.

Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.03

0.029496845 = product of:
  0.05899369 = sum of:
    0.05899369 = product of:
      0.08849053 = sum of:
        0.03995847 = weight(_text_:j in 4709) [ClassicSimilarity], result of:
          0.03995847 = score(doc=4709,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.28085366 = fieldWeight in 4709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
        0.048532058 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
          0.048532058 = score(doc=4709,freq=2.0), product of:
            0.15679733 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044775832 = queryNorm
            0.30952093 = fieldWeight in 4709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
      0.6666667 = coord(2/3)
  0.5 = coord(1/2)

Date: 31. 7.1996 9:22:19

Jardine, N.; Rijsbergen, C.J. van: The use of hierarchic clustering in information retrieval (1971) 0.02

0.024524815 = product of:
  0.04904963 = sum of:
    0.04904963 = product of:
      0.14714889 = sum of:
        0.14714889 = weight(_text_:n in 5170) [ClassicSimilarity], result of:
          0.14714889 = score(doc=5170,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.76220036 = fieldWeight in 5170, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.125 = fieldNorm(doc=5170)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Garfield, E.; Sager, N.: Mechanical indexing, structural linguistics and information retrieval (1993) 0.02

0.024524815 = product of:
  0.04904963 = sum of:
    0.04904963 = product of:
      0.14714889 = sum of:
        0.14714889 = weight(_text_:n in 5900) [ClassicSimilarity], result of:
          0.14714889 = score(doc=5900,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.76220036 = fieldWeight in 5900, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.125 = fieldNorm(doc=5900)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Fuhr, N.; Knorz, G.: Retrieval test evaluation of a rule based automatic indexing (AIR/PHYS) (1984) 0.02

0.018393612 = product of:
  0.036787223 = sum of:
    0.036787223 = product of:
      0.110361665 = sum of:
        0.110361665 = weight(_text_:n in 2321) [ClassicSimilarity], result of:
          0.110361665 = score(doc=2321,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.57165027 = fieldWeight in 2321, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.09375 = fieldNorm(doc=2321)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02

0.016177353 = product of:
  0.032354705 = sum of:
    0.032354705 = product of:
      0.097064115 = sum of:
        0.097064115 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.097064115 = score(doc=402,freq=2.0), product of:
            0.15679733 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044775832 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Source: Information processing and management. 22(1986) no.6, S.465-476

Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.01

0.014155183 = product of:
  0.028310366 = sum of:
    0.028310366 = product of:
      0.0849311 = sum of:
        0.0849311 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.0849311 = score(doc=6265,freq=2.0), product of:
            0.15679733 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044775832 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Source: Information outlook. 9(2005) no.8, S.22-23

Salton, G.; Allen, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine-readable data (1994) 0.01

0.011654554 = product of:
  0.023309108 = sum of:
    0.023309108 = product of:
      0.06992732 = sum of:
        0.06992732 = weight(_text_:j in 1168) [ClassicSimilarity], result of:
          0.06992732 = score(doc=1168,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.4914939 = fieldWeight in 1168, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.109375 = fieldNorm(doc=1168)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Cohen, J.D.: Highlights: language- and domain-independent automatic indexing terms for abstracting (1995) 0.01
```
0.010729606 = product of:
  0.021459213 = sum of:
    0.021459213 = product of:
      0.064377636 = sum of:
        0.064377636 = weight(_text_:n in 1793) [ClassicSimilarity], result of:
          0.064377636 = score(doc=1793,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.33346266 = fieldWeight in 1793, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1793)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)
```
Abstract

Presents a model of drawing index terms from text. The approach uses no stop list, stemmer, or other language- or domain-specific component, allowing operation in any language or domain with only trivial modification. The method uses n-gram counts, achieving a function similar to, but more general than, a stemmer. The generated index terms, called 'highlights', are suitable for identifying the topic for perusal and selection. An extension is also described and demonstrated which selects index terms to represent a subset of documents, distinguishing them from the corpus. Presents some experimental results, showing operation in English, Spanish, German, Georgian, Russian and Japanese.
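
To illustrate why n-gram counts can act somewhat like a stemmer, here is a minimal sketch of character n-gram counting. It is not the method of the paper: the function names, the choice of trigrams, and the whitespace tokenization are assumptions made for the example.

```python
from collections import Counter


def char_ngrams(word, n):
    # All contiguous character substrings of length n within the word.
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_counts(text, n=3):
    # Count n-grams over whitespace-separated, lowercased words.
    counts = Counter()
    for word in text.lower().split():
        counts.update(char_ngrams(word, n))
    return counts


counts = ngram_counts("indexing indexed indexes")
# The three inflected forms share the n-grams of their common stem.
print(counts["ind"], counts["dex"], counts["ing"])
```

The n-grams drawn from the shared stem ("ind", "nde", "dex") accumulate counts across all three word forms, while suffix n-grams such as "ing" stay rare, with no language-specific rules involved.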

Pfeifer, U.; Fuhr, N.; Huynh, T.: Searching structured documents with the enhanced retrieval functionality of freeWAIS-sf and SFgate (1995) 0.01

0.010729606 = product of:
  0.021459213 = sum of:
    0.021459213 = product of:
      0.064377636 = sum of:
        0.064377636 = weight(_text_:n in 2214) [ClassicSimilarity], result of:
          0.064377636 = score(doc=2214,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.33346266 = fieldWeight in 2214, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2214)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Salton, G.: Future prospects for text-based information retrieval (1990) 0.01

0.009989617 = product of:
  0.019979235 = sum of:
    0.019979235 = product of:
      0.059937704 = sum of:
        0.059937704 = weight(_text_:j in 2327) [ClassicSimilarity], result of:
          0.059937704 = score(doc=2327,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.4212805 = fieldWeight in 2327, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.09375 = fieldNorm(doc=2327)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Source: Pragmatische Aspekte beim Entwurf und Betrieb von Informationssystemen: Proc. des 1. Int. Symposiums für Informationswissenschaft, Universität Konstanz, 17.-19.10.1990. Hrsg.: J. Herget u. R. Kuhlen

Salton, G.; Araya, J.: On the use of clustered file organizations in information search and retrieval (1990) 0.01

0.009989617 = product of:
  0.019979235 = sum of:
    0.019979235 = product of:
      0.059937704 = sum of:
        0.059937704 = weight(_text_:j in 2409) [ClassicSimilarity], result of:
          0.059937704 = score(doc=2409,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.4212805 = fieldWeight in 2409, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.09375 = fieldNorm(doc=2409)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Anderson, J.D.; Pérez-Carballo, J.: The nature of indexing: how humans and machines analyze messages and texts for retrieval : Part I: Research and the nature of human indexing (2001) 0.01

0.009989617 = product of:
  0.019979235 = sum of:
    0.019979235 = product of:
      0.059937704 = sum of:
        0.059937704 = weight(_text_:j in 3136) [ClassicSimilarity], result of:
          0.059937704 = score(doc=3136,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.4212805 = fieldWeight in 3136, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.09375 = fieldNorm(doc=3136)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Thirion, B.; Leroy, J.P.; Baudic, F.; Douyère, M.; Piot, J.; Darmoni, S.J.: SDI selecting, describing, and indexing : did you mean automatically? (2001) 0.01

0.009989617 = product of:
  0.019979235 = sum of:
    0.019979235 = product of:
      0.059937704 = sum of:
        0.059937704 = weight(_text_:j in 6198) [ClassicSimilarity], result of:
          0.059937704 = score(doc=6198,freq=2.0), product of:
            0.14227505 = queryWeight, product of:
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.044775832 = queryNorm
            0.4212805 = fieldWeight in 6198, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1774964 = idf(docFreq=5010, maxDocs=44218)
              0.09375 = fieldNorm(doc=6198)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Wacholder, N.; Byrd, R.J.: Retrieving information from full text using linguistic knowledge (1994) 0.01

0.009196806 = product of:
  0.018393612 = sum of:
    0.018393612 = product of:
      0.055180833 = sum of:
        0.055180833 = weight(_text_:n in 8524) [ClassicSimilarity], result of:
          0.055180833 = score(doc=8524,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.28582513 = fieldWeight in 8524, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.046875 = fieldNorm(doc=8524)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Mansour, N.; Haraty, R.A.; Daher, W.; Houri, M.: An auto-indexing method for Arabic text (2008) 0.01

0.009196806 = product of:
  0.018393612 = sum of:
    0.018393612 = product of:
      0.055180833 = sum of:
        0.055180833 = weight(_text_:n in 2103) [ClassicSimilarity], result of:
          0.055180833 = score(doc=2103,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.28582513 = fieldWeight in 2103, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.046875 = fieldNorm(doc=2103)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Fauzi, F.; Belkhatir, M.: Multifaceted conceptual image indexing on the world wide web (2013) 0.01
```
0.009196806 = product of:
  0.018393612 = sum of:
    0.018393612 = product of:
      0.055180833 = sum of:
        0.055180833 = weight(_text_:n in 2721) [ClassicSimilarity], result of:
          0.055180833 = score(doc=2721,freq=2.0), product of:
            0.19305801 = queryWeight, product of:
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.044775832 = queryNorm
            0.28582513 = fieldWeight in 2721, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3116565 = idf(docFreq=1611, maxDocs=44218)
              0.046875 = fieldNorm(doc=2721)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)
```
Abstract

In this paper, we describe a user-centered design of an automated multifaceted concept-based indexing framework which analyzes the semantics of the Web image contextual information and classifies it into five broad semantic concept facets: signal, object, abstract, scene, and relational; and identifies the semantic relationships between the concepts. An important aspect of our indexing model is that it relates to the users' levels of image descriptions. Also, a major contribution relies on the fact that the classification is performed automatically with the raw image contextual information extracted from any general webpage and is not solely based on image tags like state-of-the-art solutions. Human Language Technology techniques and an external knowledge base are used to analyze the information both syntactically and semantically. Experimental results on a human-annotated Web image collection and corresponding contextual information indicate that our method outperforms empirical frameworks employing tf-idf and location-based tf-idf weighting schemes as well as n-gram indexing in a recall/precision based evaluation framework.
