Document (#11988)

Editor
Schatz, B.
Author
Chen, H.
Yim, T.
Fye, D.
Title
Automatic thesaurus generation for an electronic community system
Source
Journal of the American Society for Information Science. 46(1995) no.3, p.175-193
Year
1995
Abstract
Reports an algorithmic approach to the automatic generation of thesauri for electronic community systems. The techniques used included term filtering, automatic indexing, and cluster analysis. The testbed for the research was the Worm Community System, which contains a comprehensive library of specialized community data and literature and is currently in use by molecular biologists who study the nematode worm. The resulting worm thesaurus included 2709 researchers' names, 798 gene names, 20 experimental methods, and 4302 subject descriptors. On average, each term had about 90 weighted neighbouring terms indicating relevant concepts. The thesaurus was developed as an online search aid. Tests the worm thesaurus in an experiment with 6 worm researchers of varying degrees of expertise and background. The experiment showed that the thesaurus was an excellent 'memory jogging' device and that it supported learning and serendipitous browsing. Despite some occurrences of obvious noise, the system was useful in suggesting relevant concepts for the researchers' queries and it helped improve concept recall. With a simple browsing interface, an automatic thesaurus can become a useful tool for online search and can assist researchers in exploring and traversing a dynamic and complex electronic community system.
Theme
Design and application of the thesaurus principle
Verbal documentation languages in online retrieval

Similar documents (author)

  1. Chen, Y.N.; Chen, S.J.: ¬A metadata practice of the IFLA FRBR model : a case study for the National Palace Museum in Taipei (2004) 4.35
    4.3499155 = sum of:
      4.3499155 = weight(author_txt:chen in 3384) [ClassicSimilarity], result of:
        4.3499155 = fieldWeight in 3384, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.5 = fieldNorm(doc=3384)
    
  2. Chen, C.C.; Chen, H.H.; Chen, K.H.: ¬The design of the XML/Metadata management system (2000) 4.00
    3.9956524 = sum of:
      3.9956524 = weight(author_txt:chen in 4633) [ClassicSimilarity], result of:
        3.9956524 = fieldWeight in 4633, product of:
          1.7320508 = tf(freq=3.0), with freq of:
            3.0 = termFreq=3.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.375 = fieldNorm(doc=4633)
    
  3. Chen, W.Y.: Observations on cataloguing and classification (1991) 3.84
    3.8448186 = sum of:
      3.8448186 = weight(author_txt:chen in 4184) [ClassicSimilarity], result of:
        3.8448186 = fieldWeight in 4184, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.625 = fieldNorm(doc=4184)
    
  4. Chen, H.: Knowledge-based document retrieval : framework and design (1992) 3.84
    3.8448186 = sum of:
      3.8448186 = weight(author_txt:chen in 5283) [ClassicSimilarity], result of:
        3.8448186 = fieldWeight in 5283, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.625 = fieldNorm(doc=5283)
    
  5. Chen, P.S.: On inference rules of logic-based information retrieval systems (1994) 3.84
    3.8448186 = sum of:
      3.8448186 = weight(author_txt:chen in 6731) [ClassicSimilarity], result of:
        3.8448186 = fieldWeight in 6731, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.1517096 = idf(docFreq=255, maxDocs=44218)
          0.625 = fieldNorm(doc=6731)
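
The author-match scores above are each a single fieldWeight, i.e. the product tf * idf * fieldNorm shown in the explain trees. The following is a minimal sketch of that arithmetic, not the search engine's own code. The function name classic_field_weight is hypothetical, and the formulas tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions chosen because they reproduce the values listed (for example 1.4142135 for freq=2.0 and 6.1517096 for docFreq=255, maxDocs=44218).

```python
import math


def classic_field_weight(term_freq, doc_freq, max_docs, field_norm):
    """Recompute a fieldWeight line from an explain tree.

    Assumed formulas (they match the numbers in the listing):
      tf  = sqrt(termFreq)
      idf = 1 + ln(maxDocs / (docFreq + 1))
    """
    tf = math.sqrt(term_freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))
    return tf * idf * field_norm


# Inputs copied from hits 1 to 3 of the author list above.
hit_1 = classic_field_weight(term_freq=2.0, doc_freq=255, max_docs=44218, field_norm=0.5)
hit_2 = classic_field_weight(term_freq=3.0, doc_freq=255, max_docs=44218, field_norm=0.375)
hit_3 = classic_field_weight(term_freq=1.0, doc_freq=255, max_docs=44218, field_norm=0.625)

print(round(hit_1, 4), round(hit_2, 4), round(hit_3, 4))
```

Up to floating-point rounding, the three results agree with the listed scores 4.3499155, 3.9956524 and 3.8448186. The ranking among hits with the same idf is therefore decided by the term frequency and the field-length norm alone.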
    

Similar documents (content)

  1. Chen, H.; Ng, T.D.; Martinez, J.; Schatz, B.R.: ¬A concept space approach to addressing the vocabulary problem in scientific information retrieval : an experiment on the Worm Community System (1997) 0.41
    0.40942818 = sum of:
      0.40942818 = product of:
        1.279463 = sum of:
          0.080055095 = weight(abstract_txt:molecular in 6492) [ClassicSimilarity], result of:
            0.080055095 = score(doc=6492,freq=2.0), product of:
              0.11201132 = queryWeight, product of:
                1.0386782 = boost
                8.085969 = idf(docFreq=36, maxDocs=44218)
                0.013336713 = queryNorm
              0.7147054 = fieldWeight in 6492, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.085969 = idf(docFreq=36, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
          0.028322281 = weight(abstract_txt:terms in 6492) [ClassicSimilarity], result of:
            0.028322281 = score(doc=6492,freq=4.0), product of:
              0.0560301 = queryWeight, product of:
                1.0389048 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013336713 = queryNorm
              0.5054833 = fieldWeight in 6492, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
          0.06719099 = weight(abstract_txt:biologists in 6492) [ClassicSimilarity], result of:
            0.06719099 = score(doc=6492,freq=1.0), product of:
              0.12557021 = queryWeight, product of:
                1.0997485 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.013336713 = queryNorm
              0.53508705 = fieldWeight in 6492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
          0.038136568 = weight(abstract_txt:generation in 6492) [ClassicSimilarity], result of:
            0.038136568 = score(doc=6492,freq=1.0), product of:
              0.10845518 = queryWeight, product of:
                1.4454073 = boost
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.013336713 = queryNorm
              0.35163435 = fieldWeight in 6492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
          0.03885801 = weight(abstract_txt:experiment in 6492) [ClassicSimilarity], result of:
            0.03885801 = score(doc=6492,freq=1.0), product of:
              0.1098187 = queryWeight, product of:
                1.4544648 = boost
                5.6614056 = idf(docFreq=417, maxDocs=44218)
                0.013336713 = queryNorm
              0.35383785 = fieldWeight in 6492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6614056 = idf(docFreq=417, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
          0.060068157 = weight(abstract_txt:automatic in 6492) [ClassicSimilarity], result of:
            0.060068157 = score(doc=6492,freq=1.0), product of:
              0.1849817 = queryWeight, product of:
                2.6695893 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.013336713 = queryNorm
              0.32472485 = fieldWeight in 6492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
          0.19805215 = weight(abstract_txt:thesaurus in 6492) [ClassicSimilarity], result of:
            0.19805215 = score(doc=6492,freq=5.0), product of:
              0.2743212 = queryWeight, product of:
                3.9815793 = boost
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.013336713 = queryNorm
              0.72197175 = fieldWeight in 6492, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
          0.7687798 = weight(abstract_txt:worm in 6492) [ClassicSimilarity], result of:
            0.7687798 = score(doc=6492,freq=3.0), product of:
              0.75595653 = queryWeight, product of:
                6.0336967 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.013336713 = queryNorm
              1.016963 = fieldWeight in 6492, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0625 = fieldNorm(doc=6492)
        0.32 = coord(8/25)
    
  2. Tudhope, D.; Blocks, D.; Cunliffe, D.; Binding, C.: Query expansion via conceptual distance in thesaurus indexed collections (2006) 0.10
    0.096630014 = sum of:
      0.096630014 = product of:
        0.40262508 = sum of:
          0.014161141 = weight(abstract_txt:terms in 2215) [ClassicSimilarity], result of:
            0.014161141 = score(doc=2215,freq=1.0), product of:
              0.0560301 = queryWeight, product of:
                1.0389048 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013336713 = queryNorm
              0.25274166 = fieldWeight in 2215, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2215)
          0.024270298 = weight(abstract_txt:useful in 2215) [ClassicSimilarity], result of:
            0.024270298 = score(doc=2215,freq=1.0), product of:
              0.08024278 = queryWeight, product of:
                1.2432774 = boost
                4.839373 = idf(docFreq=950, maxDocs=44218)
                0.013336713 = queryNorm
              0.30246082 = fieldWeight in 2215, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.839373 = idf(docFreq=950, maxDocs=44218)
                0.0625 = fieldNorm(doc=2215)
          0.037181582 = weight(abstract_txt:browsing in 2215) [ClassicSimilarity], result of:
            0.037181582 = score(doc=2215,freq=1.0), product of:
              0.10663697 = queryWeight, product of:
                1.4332402 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.013336713 = queryNorm
              0.3486744 = fieldWeight in 2215, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.0625 = fieldNorm(doc=2215)
          0.016425539 = weight(abstract_txt:system in 2215) [ClassicSimilarity], result of:
            0.016425539 = score(doc=2215,freq=1.0), product of:
              0.077931374 = queryWeight, product of:
                1.7327513 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.013336713 = queryNorm
              0.21076928 = fieldWeight in 2215, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=2215)
          0.060068157 = weight(abstract_txt:automatic in 2215) [ClassicSimilarity], result of:
            0.060068157 = score(doc=2215,freq=1.0), product of:
              0.1849817 = queryWeight, product of:
                2.6695893 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.013336713 = queryNorm
              0.32472485 = fieldWeight in 2215, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=2215)
          0.25051835 = weight(abstract_txt:thesaurus in 2215) [ClassicSimilarity], result of:
            0.25051835 = score(doc=2215,freq=8.0), product of:
              0.2743212 = queryWeight, product of:
                3.9815793 = boost
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.013336713 = queryNorm
              0.91323006 = fieldWeight in 2215, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.0625 = fieldNorm(doc=2215)
        0.24 = coord(6/25)
    
  3. Weiss, A.: Hop, skip, and jump : navigating the World Wide Web (1995) 0.10
    0.096207365 = sum of:
      0.096207365 = product of:
        1.2025921 = sum of:
          0.09295395 = weight(abstract_txt:browsing in 1976) [ClassicSimilarity], result of:
            0.09295395 = score(doc=1976,freq=1.0), product of:
              0.10663697 = queryWeight, product of:
                1.4332402 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.013336713 = queryNorm
              0.871686 = fieldWeight in 1976, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.15625 = fieldNorm(doc=1976)
          1.1096382 = weight(abstract_txt:worm in 1976) [ClassicSimilarity], result of:
            1.1096382 = score(doc=1976,freq=1.0), product of:
              0.75595653 = queryWeight, product of:
                6.0336967 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.013336713 = queryNorm
              1.4678597 = fieldWeight in 1976, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.15625 = fieldNorm(doc=1976)
        0.08 = coord(2/25)
    
  4. Crouch, C.J.: ¬An approach to the automatic construction of global thesauri (1990) 0.09
    0.09368756 = sum of:
      0.09368756 = product of:
        0.4684378 = sum of:
          0.05720485 = weight(abstract_txt:generation in 4042) [ClassicSimilarity], result of:
            0.05720485 = score(doc=4042,freq=1.0), product of:
              0.10845518 = queryWeight, product of:
                1.4454073 = boost
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.013336713 = queryNorm
              0.5274515 = fieldWeight in 4042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.09375 = fieldNorm(doc=4042)
          0.024638308 = weight(abstract_txt:system in 4042) [ClassicSimilarity], result of:
            0.024638308 = score(doc=4042,freq=1.0), product of:
              0.077931374 = queryWeight, product of:
                1.7327513 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.013336713 = queryNorm
              0.3161539 = fieldWeight in 4042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.09375 = fieldNorm(doc=4042)
          0.07128202 = weight(abstract_txt:researchers in 4042) [ClassicSimilarity], result of:
            0.07128202 = score(doc=4042,freq=1.0), product of:
              0.15823106 = queryWeight, product of:
                2.4690275 = boost
                4.805261 = idf(docFreq=983, maxDocs=44218)
                0.013336713 = queryNorm
              0.45049322 = fieldWeight in 4042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.805261 = idf(docFreq=983, maxDocs=44218)
                0.09375 = fieldNorm(doc=4042)
          0.1274238 = weight(abstract_txt:automatic in 4042) [ClassicSimilarity], result of:
            0.1274238 = score(doc=4042,freq=2.0), product of:
              0.1849817 = queryWeight, product of:
                2.6695893 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.013336713 = queryNorm
              0.6888454 = fieldWeight in 4042, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.09375 = fieldNorm(doc=4042)
          0.18788879 = weight(abstract_txt:thesaurus in 4042) [ClassicSimilarity], result of:
            0.18788879 = score(doc=4042,freq=2.0), product of:
              0.2743212 = queryWeight, product of:
                3.9815793 = boost
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.013336713 = queryNorm
              0.6849226 = fieldWeight in 4042, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.09375 = fieldNorm(doc=4042)
        0.2 = coord(5/25)
    
  5. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.08
    0.08402796 = sum of:
      0.08402796 = product of:
        0.30009985 = sum of:
          0.026493069 = weight(abstract_txt:terms in 1211) [ClassicSimilarity], result of:
            0.026493069 = score(doc=1211,freq=14.0), product of:
              0.0560301 = queryWeight, product of:
                1.0389048 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013336713 = queryNorm
              0.47283638 = fieldWeight in 1211, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.03125 = fieldNorm(doc=1211)
          0.028588597 = weight(abstract_txt:concepts in 1211) [ClassicSimilarity], result of:
            0.028588597 = score(doc=1211,freq=8.0), product of:
              0.07103534 = queryWeight, product of:
                1.1697749 = boost
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.013336713 = queryNorm
              0.40245596 = fieldWeight in 1211, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.5532694 = idf(docFreq=1265, maxDocs=44218)
                0.03125 = fieldNorm(doc=1211)
          0.015083335 = weight(abstract_txt:relevant in 1211) [ClassicSimilarity], result of:
            0.015083335 = score(doc=1211,freq=2.0), product of:
              0.07362594 = queryWeight, product of:
                1.1909143 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.013336713 = queryNorm
              0.20486443 = fieldWeight in 1211, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.03125 = fieldNorm(doc=1211)
          0.033027235 = weight(abstract_txt:generation in 1211) [ClassicSimilarity], result of:
            0.033027235 = score(doc=1211,freq=3.0), product of:
              0.10845518 = queryWeight, product of:
                1.4454073 = boost
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.013336713 = queryNorm
              0.30452427 = fieldWeight in 1211, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.03125 = fieldNorm(doc=1211)
          0.016425539 = weight(abstract_txt:system in 1211) [ClassicSimilarity], result of:
            0.016425539 = score(doc=1211,freq=4.0), product of:
              0.077931374 = queryWeight, product of:
                1.7327513 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.013336713 = queryNorm
              0.21076928 = fieldWeight in 1211, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.03125 = fieldNorm(doc=1211)
          0.033602666 = weight(abstract_txt:researchers in 1211) [ClassicSimilarity], result of:
            0.033602666 = score(doc=1211,freq=2.0), product of:
              0.15823106 = queryWeight, product of:
                2.4690275 = boost
                4.805261 = idf(docFreq=983, maxDocs=44218)
                0.013336713 = queryNorm
              0.21236454 = fieldWeight in 1211, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.805261 = idf(docFreq=983, maxDocs=44218)
                0.03125 = fieldNorm(doc=1211)
          0.1468794 = weight(abstract_txt:thesaurus in 1211) [ClassicSimilarity], result of:
            0.1468794 = score(doc=1211,freq=11.0), product of:
              0.2743212 = queryWeight, product of:
                3.9815793 = boost
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.013336713 = queryNorm
              0.5354286 = fieldWeight in 1211, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.03125 = fieldNorm(doc=1211)
        0.28 = coord(7/25)
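
The content-based scores above combine three steps per hit: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by coord (matched terms divided by the 25 query terms). The sketch below recomputes the score of the first hit (doc 6492) and the third hit (doc 1976) from the boosts, document frequencies and term frequencies printed in their explain trees. The function names and the constant names are hypothetical, and the tf and idf formulas are assumptions chosen because they reproduce the listed values.

```python
import math

MAX_DOCS = 44218          # maxDocs as printed in every idf line
QUERY_NORM = 0.013336713  # queryNorm as printed in every queryWeight block


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf formula; it matches e.g. 9.394302 for docFreq=9.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(boost, doc_freq, term_freq, field_norm, query_norm=QUERY_NORM):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = sqrt(termFreq) * idf * fieldNorm
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(term_freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(matched_terms, field_norm, total_query_terms):
    # Sum the per-term scores, then scale by coord = matched / total.
    raw = sum(term_score(b, df, tf, field_norm) for b, df, tf in matched_terms)
    coord = len(matched_terms) / total_query_terms
    return raw * coord


# (boost, docFreq, termFreq) copied from the explain tree of doc 6492.
DOC_6492_TERMS = [
    (1.0386782, 36, 2.0),    # molecular
    (1.0389048, 2106, 4.0),  # terms
    (1.0997485, 22, 1.0),    # biologists
    (1.4454073, 432, 1.0),   # generation
    (1.4544648, 417, 1.0),   # experiment
    (2.6695893, 665, 1.0),   # automatic
    (3.9815793, 685, 5.0),   # thesaurus
    (6.0336967, 9, 3.0),     # worm
]

# (boost, docFreq, termFreq) copied from the explain tree of doc 1976.
DOC_1976_TERMS = [
    (1.4332402, 453, 1.0),   # browsing
    (6.0336967, 9, 1.0),     # worm
]

score_6492 = document_score(DOC_6492_TERMS, field_norm=0.0625, total_query_terms=25)
score_1976 = document_score(DOC_1976_TERMS, field_norm=0.15625, total_query_terms=25)

print(round(score_6492, 4), round(score_1976, 4))
```

Up to floating-point rounding, the results agree with the listed 0.40942818 and 0.096207365. The comparison also shows why doc 1976 ranks third despite matching only two terms: its short abstract gives a large fieldNorm (0.15625), and the rare term "worm" carries both the highest boost and the highest idf, while the coord factor of 2/25 pulls the total back down.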