Document (#21506)

Author
Srinivasan, P.
Title
Thesaurus construction
Source
Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
Imprint
Englewood Cliffs, NJ : Prentice Hall
Year
1992
Pages
pp. 161-218
Abstract
Thesauri are valuable structures for information retrieval systems. A thesaurus provides a precise and controlled vocabulary which serves to coordinate document indexing and document retrieval. In both indexing and retrieval, a thesaurus may be used to select the most appropriate terms. Additionally, the thesaurus can assist the searcher in reformulating search strategies if required. The chapter first examines the important features of thesauri, which should allow the reader to differentiate between thesauri. Next, it gives a brief overview of the manual thesaurus construction process. Two major approaches to automatic thesaurus construction have been selected for detailed examination: the first is thesaurus construction from collections of documents, and the second is thesaurus construction by merging existing thesauri. These two methods were selected because they rely on statistical techniques alone and are also significantly different from each other. Programs written in the C language accompany the discussion of these approaches.
Theme
Conception and application of the thesaurus principle

Similar documents (author)

  1. Srinivasan, P.: Expert interface to Library of Congress Subject Headings (1990/91) 5.38
    5.384371 = sum of:
      5.384371 = weight(author_txt:srinivasan in 2209) [ClassicSimilarity], result of:
        5.384371 = fieldWeight in 2209, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.614993 = idf(docFreq=20, maxDocs=42596)
          0.625 = fieldNorm(doc=2209)
    
  2. Srinivasan, P.: Query expansion and MEDLINE (1996) 5.38
    5.384371 = sum of:
      5.384371 = weight(author_txt:srinivasan in 453) [ClassicSimilarity], result of:
        5.384371 = fieldWeight in 453, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.614993 = idf(docFreq=20, maxDocs=42596)
          0.625 = fieldNorm(doc=453)
    
  3. Srinivasan, P.: Intelligent information retrieval using rough set approximations (1989) 5.38
    5.384371 = sum of:
      5.384371 = weight(author_txt:srinivasan in 2595) [ClassicSimilarity], result of:
        5.384371 = fieldWeight in 2595, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.614993 = idf(docFreq=20, maxDocs=42596)
          0.625 = fieldNorm(doc=2595)
    
  4. Srinivasan, P.: On generalizing the Two-Poisson Model (1990) 5.38
    5.384371 = sum of:
      5.384371 = weight(author_txt:srinivasan in 2949) [ClassicSimilarity], result of:
        5.384371 = fieldWeight in 2949, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.614993 = idf(docFreq=20, maxDocs=42596)
          0.625 = fieldNorm(doc=2949)
    
  5. Srinivasan, P.: Optimal document-indexing vocabulary for MEDLINE (1996) 5.38
    5.384371 = sum of:
      5.384371 = weight(author_txt:srinivasan in 6703) [ClassicSimilarity], result of:
        5.384371 = fieldWeight in 6703, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.614993 = idf(docFreq=20, maxDocs=42596)
          0.625 = fieldNorm(doc=6703)
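
The author matches above all reduce to the same product, fieldWeight = tf * idf * fieldNorm. The following is a minimal sketch of that arithmetic, assuming the standard ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and not part of Lucene's API, and fieldNorm is taken as given from the explain output rather than recomputed.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as defined by Lucene's ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def field_weight(freq, doc_freq, max_docs, field_norm):
    """tf * idf * fieldNorm, the 'fieldWeight' line in the explain tree.

    field_norm is copied from the explain output (assumption: it already
    includes length normalization and any index-time boost).
    """
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # Values taken from the author_txt:srinivasan entries above.
    print(classic_idf(20, 42596))
    print(field_weight(1.0, 20, 42596, 0.625))
```

With the values listed for author_txt:srinivasan (freq=1, docFreq=20, maxDocs=42596, fieldNorm=0.625), this should agree with the listed idf of 8.614993 and score of 5.384371 to about four decimal places. Lucene computes in single precision, so the last digits may differ.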
    

Similar documents (content)

  1. Nielsen, M.L.: Future thesauri : what kind of conceptual knowledge do searchers need? (1998) 0.29
    0.29271755 = sum of:
      0.29271755 = product of:
        1.0454198 = sum of:
          0.04487838 = weight(abstract_txt:valuable in 1146) [ClassicSimilarity], result of:
            0.04487838 = score(doc=1146,freq=1.0), product of:
              0.096071295 = queryWeight, product of:
                5.979343 = idf(docFreq=292, maxDocs=42596)
                0.0160672 = queryNorm
              0.46713617 = fieldWeight in 1146, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.979343 = idf(docFreq=292, maxDocs=42596)
                0.078125 = fieldNorm(doc=1146)
          0.06511519 = weight(abstract_txt:searcher in 1146) [ClassicSimilarity], result of:
            0.06511519 = score(doc=1146,freq=1.0), product of:
              0.12312808 = queryWeight, product of:
                1.132092 = boost
                6.7691665 = idf(docFreq=132, maxDocs=42596)
                0.0160672 = queryNorm
              0.52884114 = fieldWeight in 1146, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7691665 = idf(docFreq=132, maxDocs=42596)
                0.078125 = fieldNorm(doc=1146)
          0.03429828 = weight(abstract_txt:indexing in 1146) [ClassicSimilarity], result of:
            0.03429828 = score(doc=1146,freq=1.0), product of:
              0.101179786 = queryWeight, product of:
                1.4513263 = boost
                4.338989 = idf(docFreq=1510, maxDocs=42596)
                0.0160672 = queryNorm
              0.3389835 = fieldWeight in 1146, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.338989 = idf(docFreq=1510, maxDocs=42596)
                0.078125 = fieldNorm(doc=1146)
          0.026150586 = weight(abstract_txt:retrieval in 1146) [ClassicSimilarity], result of:
            0.026150586 = score(doc=1146,freq=1.0), product of:
              0.096663736 = queryWeight, product of:
                1.7373831 = boost
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.0160672 = queryNorm
              0.2705315 = fieldWeight in 1146, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.078125 = fieldNorm(doc=1146)
          0.3011174 = weight(abstract_txt:thesauri in 1146) [ClassicSimilarity], result of:
            0.3011174 = score(doc=1146,freq=5.0), product of:
              0.31726545 = queryWeight, product of:
                3.634499 = boost
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.0160672 = queryNorm
              0.9491024 = fieldWeight in 1146, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.078125 = fieldNorm(doc=1146)
          0.24760117 = weight(abstract_txt:construction in 1146) [ClassicSimilarity], result of:
            0.24760117 = score(doc=1146,freq=2.0), product of:
              0.40711525 = queryWeight, product of:
                4.6030626 = boost
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.0160672 = queryNorm
              0.60818446 = fieldWeight in 1146, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.078125 = fieldNorm(doc=1146)
          0.32625884 = weight(abstract_txt:thesaurus in 1146) [ClassicSimilarity], result of:
            0.32625884 = score(doc=1146,freq=2.0), product of:
              0.5723088 = queryWeight, product of:
                6.903405 = boost
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.0160672 = queryNorm
              0.57007486 = fieldWeight in 1146, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.078125 = fieldNorm(doc=1146)
        0.28 = coord(7/25)
    
  2. Spiteri, L.F.: The use of facet analysis in information retrieval thesauri : an examination of selected guidelines for thesaurus construction (1997) 0.26
    0.25923142 = sum of:
      0.25923142 = product of:
        1.296157 = sum of:
          0.06003692 = weight(abstract_txt:examination in 1373) [ClassicSimilarity], result of:
            0.06003692 = score(doc=1373,freq=1.0), product of:
              0.10329049 = queryWeight, product of:
                1.0368916 = boost
                6.19993 = idf(docFreq=234, maxDocs=42596)
                0.0160672 = queryNorm
              0.58124346 = fieldWeight in 1373, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.19993 = idf(docFreq=234, maxDocs=42596)
                0.09375 = fieldNorm(doc=1373)
          0.0313807 = weight(abstract_txt:retrieval in 1373) [ClassicSimilarity], result of:
            0.0313807 = score(doc=1373,freq=1.0), product of:
              0.096663736 = queryWeight, product of:
                1.7373831 = boost
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.0160672 = queryNorm
              0.3246378 = fieldWeight in 1373, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.09375 = fieldNorm(doc=1373)
          0.3613409 = weight(abstract_txt:thesauri in 1373) [ClassicSimilarity], result of:
            0.3613409 = score(doc=1373,freq=5.0), product of:
              0.31726545 = queryWeight, product of:
                3.634499 = boost
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.0160672 = queryNorm
              1.1389229 = fieldWeight in 1373, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.09375 = fieldNorm(doc=1373)
          0.36389792 = weight(abstract_txt:construction in 1373) [ClassicSimilarity], result of:
            0.36389792 = score(doc=1373,freq=3.0), product of:
              0.40711525 = queryWeight, product of:
                4.6030626 = boost
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.0160672 = queryNorm
              0.89384496 = fieldWeight in 1373, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.09375 = fieldNorm(doc=1373)
          0.47950056 = weight(abstract_txt:thesaurus in 1373) [ClassicSimilarity], result of:
            0.47950056 = score(doc=1373,freq=3.0), product of:
              0.5723088 = queryWeight, product of:
                6.903405 = boost
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.0160672 = queryNorm
              0.83783543 = fieldWeight in 1373, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.09375 = fieldNorm(doc=1373)
        0.2 = coord(5/25)
    
  3. Sanatjoo, A.: Development of thesaurus structure through a work-task oriented methodology 0.24
    0.24328119 = sum of:
      0.24328119 = product of:
        1.0136716 = sum of:
          0.0359027 = weight(abstract_txt:valuable in 4716) [ClassicSimilarity], result of:
            0.0359027 = score(doc=4716,freq=1.0), product of:
              0.096071295 = queryWeight, product of:
                5.979343 = idf(docFreq=292, maxDocs=42596)
                0.0160672 = queryNorm
              0.37370893 = fieldWeight in 4716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.979343 = idf(docFreq=292, maxDocs=42596)
                0.0625 = fieldNorm(doc=4716)
          0.052092154 = weight(abstract_txt:searcher in 4716) [ClassicSimilarity], result of:
            0.052092154 = score(doc=4716,freq=1.0), product of:
              0.12312808 = queryWeight, product of:
                1.132092 = boost
                6.7691665 = idf(docFreq=132, maxDocs=42596)
                0.0160672 = queryNorm
              0.4230729 = fieldWeight in 4716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7691665 = idf(docFreq=132, maxDocs=42596)
                0.0625 = fieldNorm(doc=4716)
          0.036235314 = weight(abstract_txt:retrieval in 4716) [ClassicSimilarity], result of:
            0.036235314 = score(doc=4716,freq=3.0), product of:
              0.096663736 = queryWeight, product of:
                1.7373831 = boost
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.0160672 = queryNorm
              0.37485942 = fieldWeight in 4716, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.0625 = fieldNorm(doc=4716)
          0.10773104 = weight(abstract_txt:thesauri in 4716) [ClassicSimilarity], result of:
            0.10773104 = score(doc=4716,freq=1.0), product of:
              0.31726545 = queryWeight, product of:
                3.634499 = boost
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.0160672 = queryNorm
              0.3395612 = fieldWeight in 4716, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.0625 = fieldNorm(doc=4716)
          0.19808094 = weight(abstract_txt:construction in 4716) [ClassicSimilarity], result of:
            0.19808094 = score(doc=4716,freq=2.0), product of:
              0.40711525 = queryWeight, product of:
                4.6030626 = boost
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.0160672 = queryNorm
              0.4865476 = fieldWeight in 4716, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.0625 = fieldNorm(doc=4716)
          0.58362955 = weight(abstract_txt:thesaurus in 4716) [ClassicSimilarity], result of:
            0.58362955 = score(doc=4716,freq=10.0), product of:
              0.5723088 = queryWeight, product of:
                6.903405 = boost
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.0160672 = queryNorm
              1.0197809 = fieldWeight in 4716, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.0625 = fieldNorm(doc=4716)
        0.24 = coord(6/25)
    
  4. McCulloch, E.: Thesauri: practical guidance for construction (2005) 0.24
    0.23956077 = sum of:
      0.23956077 = product of:
        0.9981699 = sum of:
          0.038838193 = weight(abstract_txt:assist in 5725) [ClassicSimilarity], result of:
            0.038838193 = score(doc=5725,freq=1.0), product of:
              0.10123909 = queryWeight, product of:
                1.0265434 = boost
                6.138055 = idf(docFreq=249, maxDocs=42596)
                0.0160672 = queryNorm
              0.38362843 = fieldWeight in 5725, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.138055 = idf(docFreq=249, maxDocs=42596)
                0.0625 = fieldNorm(doc=5725)
          0.020920468 = weight(abstract_txt:retrieval in 5725) [ClassicSimilarity], result of:
            0.020920468 = score(doc=5725,freq=1.0), product of:
              0.096663736 = queryWeight, product of:
                1.7373831 = boost
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.0160672 = queryNorm
              0.2164252 = fieldWeight in 5725, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4628031 = idf(docFreq=3628, maxDocs=42596)
                0.0625 = fieldNorm(doc=5725)
          0.04826883 = weight(abstract_txt:selected in 5725) [ClassicSimilarity], result of:
            0.04826883 = score(doc=5725,freq=1.0), product of:
              0.14744501 = queryWeight, product of:
                1.7519964 = boost
                5.2378936 = idf(docFreq=614, maxDocs=42596)
                0.0160672 = queryNorm
              0.32736835 = fieldWeight in 5725, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2378936 = idf(docFreq=614, maxDocs=42596)
                0.0625 = fieldNorm(doc=5725)
          0.24089393 = weight(abstract_txt:thesauri in 5725) [ClassicSimilarity], result of:
            0.24089393 = score(doc=5725,freq=5.0), product of:
              0.31726545 = queryWeight, product of:
                3.634499 = boost
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.0160672 = queryNorm
              0.75928193 = fieldWeight in 5725, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.0625 = fieldNorm(doc=5725)
          0.28012878 = weight(abstract_txt:construction in 5725) [ClassicSimilarity], result of:
            0.28012878 = score(doc=5725,freq=4.0), product of:
              0.40711525 = queryWeight, product of:
                4.6030626 = boost
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.0160672 = queryNorm
              0.6880822 = fieldWeight in 5725, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.0625 = fieldNorm(doc=5725)
          0.36911973 = weight(abstract_txt:thesaurus in 5725) [ClassicSimilarity], result of:
            0.36911973 = score(doc=5725,freq=4.0), product of:
              0.5723088 = queryWeight, product of:
                6.903405 = boost
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.0160672 = queryNorm
              0.64496607 = fieldWeight in 5725, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.0625 = fieldNorm(doc=5725)
        0.24 = coord(6/25)
    
  5. Hou, H.; Chen, S.: The integration of Chinese classification and thesaurus : its progress and technical features (1996) 0.22
    0.21904261 = sum of:
      0.21904261 = product of:
        1.3690164 = sum of:
          0.09701018 = weight(abstract_txt:indexing in 2319) [ClassicSimilarity], result of:
            0.09701018 = score(doc=2319,freq=2.0), product of:
              0.101179786 = queryWeight, product of:
                1.4513263 = boost
                4.338989 = idf(docFreq=1510, maxDocs=42596)
                0.0160672 = queryNorm
              0.9587901 = fieldWeight in 2319, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.338989 = idf(docFreq=1510, maxDocs=42596)
                0.15625 = fieldNorm(doc=2319)
          0.26932758 = weight(abstract_txt:thesauri in 2319) [ClassicSimilarity], result of:
            0.26932758 = score(doc=2319,freq=1.0), product of:
              0.31726545 = queryWeight, product of:
                3.634499 = boost
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.0160672 = queryNorm
              0.848903 = fieldWeight in 2319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.432979 = idf(docFreq=505, maxDocs=42596)
                0.15625 = fieldNorm(doc=2319)
          0.35016096 = weight(abstract_txt:construction in 2319) [ClassicSimilarity], result of:
            0.35016096 = score(doc=2319,freq=1.0), product of:
              0.40711525 = queryWeight, product of:
                4.6030626 = boost
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.0160672 = queryNorm
              0.8601028 = fieldWeight in 2319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5046577 = idf(docFreq=470, maxDocs=42596)
                0.15625 = fieldNorm(doc=2319)
          0.6525177 = weight(abstract_txt:thesaurus in 2319) [ClassicSimilarity], result of:
            0.6525177 = score(doc=2319,freq=2.0), product of:
              0.5723088 = queryWeight, product of:
                6.903405 = boost
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.0160672 = queryNorm
              1.1401497 = fieldWeight in 2319, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1597285 = idf(docFreq=664, maxDocs=42596)
                0.15625 = fieldNorm(doc=2319)
        0.16 = coord(4/25)
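
The content-based scores above combine three pieces per matching term (queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and their product), then sum the term scores and multiply by coord, the fraction of query terms that matched. The sketch below reproduces that arithmetic under the same ClassicSimilarity assumptions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical, and queryNorm, boost, and fieldNorm are copied from the explain output rather than derived.

```python
import math


def idf(doc_freq, max_docs):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, total_query_terms):
    """Sum of the term scores scaled by coord = matched / total query terms."""
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord


if __name__ == "__main__":
    MAX_DOCS = 42596
    QUERY_NORM = 0.0160672  # copied from the explain output
    FIELD_NORM = 0.15625    # fieldNorm(doc=2319)

    # (freq, docFreq, boost) for the four terms matched in doc 2319.
    matched = [
        (2.0, 1510, 1.4513263),  # indexing
        (1.0, 505, 3.634499),    # thesauri
        (1.0, 470, 4.6030626),   # construction
        (2.0, 664, 6.903405),    # thesaurus
    ]
    scores = [
        term_score(f, df, MAX_DOCS, FIELD_NORM, QUERY_NORM, b)
        for f, df, b in matched
    ]
    print(document_score(scores, 25))
```

For document 2319 (Hou and Chen), four of the 25 query terms match, so coord is 0.16 and the result should land close to the listed 0.21904261. Small differences in the trailing digits are expected because Lucene works in single precision.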