Document (#11988)

Editor
Schatz, B.
Author
Chen, H.
Yim, T.
Fye, D.
Title
Automatic thesaurus generation for an electronic community system
Source
Journal of the American Society for Information Science. 46(1995) no.3, p.175-193
Year
1995
Abstract
Reports an algorithmic approach to the automatic generation of thesauri for electronic community systems. The techniques used included term filtering, automatic indexing, and cluster analysis. The testbed for the research was the Worm Community System, which contains a comprehensive library of specialized community data and literature and is currently in use by molecular biologists who study the nematode worm. The resulting worm thesaurus included 2709 researchers' names, 798 gene names, 20 experimental methods, and 4302 subject descriptors. On average, each term had about 90 weighted neighbouring terms indicating relevant concepts. The thesaurus was developed as an online search aid. Tests the worm thesaurus in an experiment with 6 worm researchers of varying degrees of expertise and background. The experiment showed that the thesaurus was an excellent 'memory jogging' device and that it supported learning and serendipitous browsing. Despite some occurrences of obvious noise, the system was useful in suggesting relevant concepts for the researchers' queries and it helped improve concept recall. With a simple browsing interface, an automatic thesaurus can become a useful tool for online search and can assist researchers in exploring and traversing a dynamic and complex electronic community system.
Theme
Design and application of the thesaurus principle
Verbal documentation languages in online retrieval

Similar documents (author)

  1. Chen, Y.N.; Chen, S.J.: ¬A metadata practice of the IFLA FRBR model : a case study for the National Palace Museum in Taipei (2004) 4.36
    4.356779 = sum of:
      4.356779 = weight(author_txt:chen in 5385) [ClassicSimilarity], result of:
        4.356779 = fieldWeight in 5385, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.161416 = idf(docFreq=247, maxDocs=43254)
          0.5 = fieldNorm(doc=5385)
    
  2. Chen, C.C.; Chen, H.H.; Chen, K.H.: ¬The design of the XML/Metadata management system (2000) 4.00
    4.001957 = sum of:
      4.001957 = weight(author_txt:chen in 6634) [ClassicSimilarity], result of:
        4.001957 = fieldWeight in 6634, product of:
          1.7320508 = tf(freq=3.0), with freq of:
            3.0 = termFreq=3.0
          6.161416 = idf(docFreq=247, maxDocs=43254)
          0.375 = fieldNorm(doc=6634)
    
  3. Chen, W.Y.: Observations on cataloguing and classification (1991) 3.85
    3.850885 = sum of:
      3.850885 = weight(author_txt:chen in 4184) [ClassicSimilarity], result of:
        3.850885 = fieldWeight in 4184, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.161416 = idf(docFreq=247, maxDocs=43254)
          0.625 = fieldNorm(doc=4184)
    
  4. Chen, H.: Knowledge-based document retrieval : framework and design (1992) 3.85
    3.850885 = sum of:
      3.850885 = weight(author_txt:chen in 5283) [ClassicSimilarity], result of:
        3.850885 = fieldWeight in 5283, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.161416 = idf(docFreq=247, maxDocs=43254)
          0.625 = fieldNorm(doc=5283)
    
  5. Chen, P.S.: On inference rules of logic-based information retrieval systems (1994) 3.85
    3.850885 = sum of:
      3.850885 = weight(author_txt:chen in 6731) [ClassicSimilarity], result of:
        3.850885 = fieldWeight in 6731, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.161416 = idf(docFreq=247, maxDocs=43254)
          0.625 = fieldNorm(doc=6731)
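
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of the record, and fieldNorm is simply copied from the explain output rather than derived.

```python
import math

def classic_tf(freq):
    # ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree above.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# (doc id, termFreq, fieldNorm) taken from hits 1 to 3 of the author list.
author_hits = [(5385, 2.0, 0.5), (6634, 3.0, 0.375), (4184, 1.0, 0.625)]

for doc_id, freq, norm in author_hits:
    weight = field_weight(freq, doc_freq=247, max_docs=43254, field_norm=norm)
    print(doc_id, round(weight, 4))
```

Hits 3 to 5 share the same score because each has termFreq=1.0 and fieldNorm=0.625. Hit 1 scores higher than hit 2 despite the lower term frequency because its larger fieldNorm outweighs the smaller tf factor.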
    

Similar documents (content)

  1. Chen, H.; Ng, T.D.; Martinez, J.; Schatz, B.R.: ¬A concept space approach to addressing the vocabulary problem in scientific information retrieval : an experiment on the Worm Community System (1997) 0.41
    0.4084549 = sum of:
      0.4084549 = product of:
        1.2764215 = sum of:
          0.0803476 = weight(abstract_txt:molecular in 561) [ClassicSimilarity], result of:
            0.0803476 = score(doc=561,freq=2.0), product of:
              0.11234615 = queryWeight, product of:
                1.0396119 = boost
                8.091326 = idf(docFreq=35, maxDocs=43254)
                0.013355719 = queryNorm
              0.7151789 = fieldWeight in 561, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.091326 = idf(docFreq=35, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
          0.028669309 = weight(abstract_txt:terms in 561) [ClassicSimilarity], result of:
            0.028669309 = score(doc=561,freq=4.0), product of:
              0.056518126 = queryWeight, product of:
                1.0428001 = boost
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.013355719 = queryNorm
              0.50725865 = fieldWeight in 561, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
          0.06678414 = weight(abstract_txt:biologists in 561) [ClassicSimilarity], result of:
            0.06678414 = score(doc=561,freq=1.0), product of:
              0.12513204 = queryWeight, product of:
                1.0971763 = boost
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.013355719 = queryNorm
              0.5337094 = fieldWeight in 561, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
          0.038420845 = weight(abstract_txt:generation in 561) [ClassicSimilarity], result of:
            0.038420845 = score(doc=561,freq=1.0), product of:
              0.10905381 = queryWeight, product of:
                1.4485303 = boost
                5.636974 = idf(docFreq=418, maxDocs=43254)
                0.013355719 = queryNorm
              0.35231087 = fieldWeight in 561, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.636974 = idf(docFreq=418, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
          0.03896741 = weight(abstract_txt:experiment in 561) [ClassicSimilarity], result of:
            0.03896741 = score(doc=561,freq=1.0), product of:
              0.11008563 = queryWeight, product of:
                1.4553668 = boost
                5.663578 = idf(docFreq=407, maxDocs=43254)
                0.013355719 = queryNorm
              0.35397363 = fieldWeight in 561, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.663578 = idf(docFreq=407, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
          0.060193606 = weight(abstract_txt:automatic in 561) [ClassicSimilarity], result of:
            0.060193606 = score(doc=561,freq=1.0), product of:
              0.18534172 = queryWeight, product of:
                2.6705995 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.013355719 = queryNorm
              0.32477096 = fieldWeight in 561, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
          0.19838892 = weight(abstract_txt:thesaurus in 561) [ClassicSimilarity], result of:
            0.19838892 = score(doc=561,freq=5.0), product of:
              0.27478412 = queryWeight, product of:
                3.9825716 = boost
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.013355719 = queryNorm
              0.72198105 = fieldWeight in 561, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
          0.7646497 = weight(abstract_txt:worm in 561) [ClassicSimilarity], result of:
            0.7646497 = score(doc=561,freq=3.0), product of:
              0.7536636 = queryWeight, product of:
                6.0209627 = boost
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.013355719 = queryNorm
              1.0145769 = fieldWeight in 561, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.0625 = fieldNorm(doc=561)
        0.32 = coord(8/25)
    
  2. Tudhope, D.; Blocks, D.; Cunliffe, D.; Binding, C.: Query expansion via conceptual distance in thesaurus indexed collections (2006) 0.10
    0.096722856 = sum of:
      0.096722856 = product of:
        0.40301192 = sum of:
          0.014334654 = weight(abstract_txt:terms in 4216) [ClassicSimilarity], result of:
            0.014334654 = score(doc=4216,freq=1.0), product of:
              0.056518126 = queryWeight, product of:
                1.0428001 = boost
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.013355719 = queryNorm
              0.25362933 = fieldWeight in 4216, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.0625 = fieldNorm(doc=4216)
          0.024218159 = weight(abstract_txt:useful in 4216) [ClassicSimilarity], result of:
            0.024218159 = score(doc=4216,freq=1.0), product of:
              0.08017218 = queryWeight, product of:
                1.2419926 = boost
                4.8332295 = idf(docFreq=935, maxDocs=43254)
                0.013355719 = queryNorm
              0.30207685 = fieldWeight in 4216, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8332295 = idf(docFreq=935, maxDocs=43254)
                0.0625 = fieldNorm(doc=4216)
          0.036979772 = weight(abstract_txt:browsing in 4216) [ClassicSimilarity], result of:
            0.036979772 = score(doc=4216,freq=1.0), product of:
              0.106309585 = queryWeight, product of:
                1.4301888 = boost
                5.5655975 = idf(docFreq=449, maxDocs=43254)
                0.013355719 = queryNorm
              0.34784985 = fieldWeight in 4216, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5655975 = idf(docFreq=449, maxDocs=43254)
                0.0625 = fieldNorm(doc=4216)
          0.016341375 = weight(abstract_txt:system in 4216) [ClassicSimilarity], result of:
            0.016341375 = score(doc=4216,freq=1.0), product of:
              0.07770793 = queryWeight, product of:
                1.7292383 = boost
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.013355719 = queryNorm
              0.21029225 = fieldWeight in 4216, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.0625 = fieldNorm(doc=4216)
          0.060193606 = weight(abstract_txt:automatic in 4216) [ClassicSimilarity], result of:
            0.060193606 = score(doc=4216,freq=1.0), product of:
              0.18534172 = queryWeight, product of:
                2.6705995 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.013355719 = queryNorm
              0.32477096 = fieldWeight in 4216, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.0625 = fieldNorm(doc=4216)
          0.25094435 = weight(abstract_txt:thesaurus in 4216) [ClassicSimilarity], result of:
            0.25094435 = score(doc=4216,freq=8.0), product of:
              0.27478412 = queryWeight, product of:
                3.9825716 = boost
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.013355719 = queryNorm
              0.9132418 = fieldWeight in 4216, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.0625 = fieldNorm(doc=4216)
        0.24 = coord(6/25)
    
  3. Weiss, A.: Hop, skip, and jump : navigating the World Wide Web (1995) 0.10
    0.095690094 = sum of:
      0.095690094 = product of:
        1.1961262 = sum of:
          0.092449434 = weight(abstract_txt:browsing in 3045) [ClassicSimilarity], result of:
            0.092449434 = score(doc=3045,freq=1.0), product of:
              0.106309585 = queryWeight, product of:
                1.4301888 = boost
                5.5655975 = idf(docFreq=449, maxDocs=43254)
                0.013355719 = queryNorm
              0.8696246 = fieldWeight in 3045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5655975 = idf(docFreq=449, maxDocs=43254)
                0.15625 = fieldNorm(doc=3045)
          1.1036768 = weight(abstract_txt:worm in 3045) [ClassicSimilarity], result of:
            1.1036768 = score(doc=3045,freq=1.0), product of:
              0.7536636 = queryWeight, product of:
                6.0209627 = boost
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.013355719 = queryNorm
              1.4644157 = fieldWeight in 3045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.15625 = fieldNorm(doc=3045)
        0.08 = coord(2/25)
    
  4. Crouch, C.J.: ¬An approach to the automatic construction of global thesauri (1990) 0.09
    0.09414884 = sum of:
      0.09414884 = product of:
        0.47074416 = sum of:
          0.05763126 = weight(abstract_txt:generation in 5111) [ClassicSimilarity], result of:
            0.05763126 = score(doc=5111,freq=1.0), product of:
              0.10905381 = queryWeight, product of:
                1.4485303 = boost
                5.636974 = idf(docFreq=418, maxDocs=43254)
                0.013355719 = queryNorm
              0.5284663 = fieldWeight in 5111, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.636974 = idf(docFreq=418, maxDocs=43254)
                0.09375 = fieldNorm(doc=5111)
          0.024512066 = weight(abstract_txt:system in 5111) [ClassicSimilarity], result of:
            0.024512066 = score(doc=5111,freq=1.0), product of:
              0.07770793 = queryWeight, product of:
                1.7292383 = boost
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.013355719 = queryNorm
              0.3154384 = fieldWeight in 5111, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.09375 = fieldNorm(doc=5111)
          0.07270269 = weight(abstract_txt:researchers in 5111) [ClassicSimilarity], result of:
            0.07270269 = score(doc=5111,freq=1.0), product of:
              0.16041529 = queryWeight, product of:
                2.4845345 = boost
                4.8342986 = idf(docFreq=934, maxDocs=43254)
                0.013355719 = queryNorm
              0.45321548 = fieldWeight in 5111, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8342986 = idf(docFreq=934, maxDocs=43254)
                0.09375 = fieldNorm(doc=5111)
          0.12768991 = weight(abstract_txt:automatic in 5111) [ClassicSimilarity], result of:
            0.12768991 = score(doc=5111,freq=2.0), product of:
              0.18534172 = queryWeight, product of:
                2.6705995 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.013355719 = queryNorm
              0.6889432 = fieldWeight in 5111, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.09375 = fieldNorm(doc=5111)
          0.18820825 = weight(abstract_txt:thesaurus in 5111) [ClassicSimilarity], result of:
            0.18820825 = score(doc=5111,freq=2.0), product of:
              0.27478412 = queryWeight, product of:
                3.9825716 = boost
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.013355719 = queryNorm
              0.68493134 = fieldWeight in 5111, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.09375 = fieldNorm(doc=5111)
        0.2 = coord(5/25)
    
  5. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.08
    0.084567316 = sum of:
      0.084567316 = product of:
        0.30202612 = sum of:
          0.026817685 = weight(abstract_txt:terms in 3212) [ClassicSimilarity], result of:
            0.026817685 = score(doc=3212,freq=14.0), product of:
              0.056518126 = queryWeight, product of:
                1.0428001 = boost
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.013355719 = queryNorm
              0.47449705 = fieldWeight in 3212, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.03125 = fieldNorm(doc=3212)
          0.028966254 = weight(abstract_txt:concepts in 3212) [ClassicSimilarity], result of:
            0.028966254 = score(doc=3212,freq=8.0), product of:
              0.07169923 = queryWeight, product of:
                1.1745307 = boost
                4.570701 = idf(docFreq=1216, maxDocs=43254)
                0.013355719 = queryNorm
              0.4039967 = fieldWeight in 3212, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.570701 = idf(docFreq=1216, maxDocs=43254)
                0.03125 = fieldNorm(doc=3212)
          0.015225819 = weight(abstract_txt:relevant in 3212) [ClassicSimilarity], result of:
            0.015225819 = score(doc=3212,freq=2.0), product of:
              0.074129894 = queryWeight, product of:
                1.1942736 = boost
                4.6475306 = idf(docFreq=1126, maxDocs=43254)
                0.013355719 = queryNorm
              0.20539378 = fieldWeight in 3212, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6475306 = idf(docFreq=1126, maxDocs=43254)
                0.03125 = fieldNorm(doc=3212)
          0.033273425 = weight(abstract_txt:generation in 3212) [ClassicSimilarity], result of:
            0.033273425 = score(doc=3212,freq=3.0), product of:
              0.10905381 = queryWeight, product of:
                1.4485303 = boost
                5.636974 = idf(docFreq=418, maxDocs=43254)
                0.013355719 = queryNorm
              0.30511016 = fieldWeight in 3212, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.636974 = idf(docFreq=418, maxDocs=43254)
                0.03125 = fieldNorm(doc=3212)
          0.016341375 = weight(abstract_txt:system in 3212) [ClassicSimilarity], result of:
            0.016341375 = score(doc=3212,freq=4.0), product of:
              0.07770793 = queryWeight, product of:
                1.7292383 = boost
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.013355719 = queryNorm
              0.21029225 = fieldWeight in 3212, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.03125 = fieldNorm(doc=3212)
          0.03427238 = weight(abstract_txt:researchers in 3212) [ClassicSimilarity], result of:
            0.03427238 = score(doc=3212,freq=2.0), product of:
              0.16041529 = queryWeight, product of:
                2.4845345 = boost
                4.8342986 = idf(docFreq=934, maxDocs=43254)
                0.013355719 = queryNorm
              0.21364783 = fieldWeight in 3212, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8342986 = idf(docFreq=934, maxDocs=43254)
                0.03125 = fieldNorm(doc=3212)
          0.14712918 = weight(abstract_txt:thesaurus in 3212) [ClassicSimilarity], result of:
            0.14712918 = score(doc=3212,freq=11.0), product of:
              0.27478412 = queryWeight, product of:
                3.9825716 = boost
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.013355719 = queryNorm
              0.5354355 = fieldWeight in 3212, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                5.1660757 = idf(docFreq=670, maxDocs=43254)
                0.03125 = fieldNorm(doc=3212)
        0.28 = coord(7/25)
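
The content scores combine one more layer than the author scores: each matching term contributes queryWeight * fieldWeight, and the sum is scaled by the coordination factor (matching terms divided by query terms). The sketch below works under those assumptions and recomputes hit 3 (doc 3045) from the values printed in its explain tree. The helper names are hypothetical; queryNorm and the per-term boosts are copied from the output rather than derived, and the query size of 25 is read off the coord(n/25) lines.

```python
import math

MAX_DOCS = 43254          # maxDocs reported in every idf line
QUERY_NORM = 0.013355719  # queryNorm copied from the explain output
QUERY_TERMS = 25          # denominator of coord(n/25)

def idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost, freq, doc_freq, field_norm):
    # score = queryWeight * fieldWeight for a single matching term.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

def document_score(matches, query_terms=QUERY_TERMS):
    # Sum the per-term scores, then apply coord = matched / total query terms.
    coord = len(matches) / query_terms
    return coord * sum(term_score(*match) for match in matches)

# (boost, termFreq, docFreq, fieldNorm) for the two terms matched in doc 3045.
weiss_matches = [
    (1.4301888, 1.0, 449, 0.15625),  # abstract_txt:browsing
    (6.0209627, 1.0, 9, 0.15625),    # abstract_txt:worm
]

print(round(document_score(weiss_matches), 4))
```

Because idf enters both queryWeight and fieldWeight, a rare term such as "worm" (docFreq=9) dominates the sum, which is why doc 3045 ranks third while matching only 2 of the 25 query terms.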