Document (#4892)

Author
Haas, S.
He, S.
Title
Toward the automatic identification of sublanguage vocabulary
Source
Information processing and management. 29(1993) no.6, pp.721-744
Year
1993
Abstract
Describes a method developed for automatic identification of sublanguage vocabulary words as they occur in abstracts. Describes the sublanguage vocabulary identification procedures using abstracts from computer science and library and information science as sublanguage sources. Evaluates the results using three criteria. Discusses the practical and theoretical significance of this research and plans for further experiments.
Theme
Automatic indexing

Similar documents (author)

  1. Haas, S.W.: A feasibility study of the case hierarchy model for the construction and porting of natural language interfaces (1990) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:haas in 8071) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 8071, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=8071)
    
  2. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:haas in 6500) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 6500, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=6500)
    
  3. Haas, S.W.: A text filter for the automatic identification of empirical articles (1996) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:haas in 6798) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 6798, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=6798)
    
  4. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:haas in 7415) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 7415, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=7415)
    
  5. Haas, S.: Metadata mania : an overview (1998) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:haas in 2222) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 2222, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=2222)
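
The five author-match explanations above all reduce to the same arithmetic. As a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1))), the following Python recomputes the listed idf and fieldWeight from the inputs shown. The function names are illustrative and not part of Lucene; fieldNorm is copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as ClassicSimilarity defines it."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Inputs copied from the author_txt:haas explanations above.
author_idf = classic_idf(doc_freq=17, max_docs=44218)
author_score = field_weight(freq=1.0, doc_freq=17, max_docs=44218, field_norm=0.625)
print(author_idf, author_score)
```

The results should agree with the listed 8.806516 and 5.504072 up to single-precision rounding, since Lucene reports these values as 32-bit floats.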
    

Similar documents (content)

  1. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 0.30
    0.30277917 = sum of:
      0.30277917 = product of:
        0.84105325 = sum of:
          0.018517373 = weight(abstract_txt:method in 6500) [ClassicSimilarity], result of:
            0.018517373 = score(doc=6500,freq=2.0), product of:
              0.04654577 = queryWeight, product of:
                1.0423983 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0099207 = queryNorm
              0.3978315 = fieldWeight in 6500, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.015832825 = weight(abstract_txt:practical in 6500) [ClassicSimilarity], result of:
            0.015832825 = score(doc=6500,freq=1.0), product of:
              0.052829463 = queryWeight, product of:
                1.1105336 = boost
                4.79515 = idf(docFreq=993, maxDocs=44218)
                0.0099207 = queryNorm
              0.29969686 = fieldWeight in 6500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.79515 = idf(docFreq=993, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.016349586 = weight(abstract_txt:theoretical in 6500) [ClassicSimilarity], result of:
            0.016349586 = score(doc=6500,freq=1.0), product of:
              0.053972818 = queryWeight, product of:
                1.1224865 = boost
                4.846761 = idf(docFreq=943, maxDocs=44218)
                0.0099207 = queryNorm
              0.30292258 = fieldWeight in 6500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.846761 = idf(docFreq=943, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.042669397 = weight(abstract_txt:occur in 6500) [ClassicSimilarity], result of:
            0.042669397 = score(doc=6500,freq=1.0), product of:
              0.10230926 = queryWeight, product of:
                1.5454361 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.0099207 = queryNorm
              0.4170629 = fieldWeight in 6500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.016060274 = weight(abstract_txt:describes in 6500) [ClassicSimilarity], result of:
            0.016060274 = score(doc=6500,freq=1.0), product of:
              0.0671969 = queryWeight, product of:
                1.7712636 = boost
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.0099207 = queryNorm
              0.2390032 = fieldWeight in 6500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.016529197 = weight(abstract_txt:science in 6500) [ClassicSimilarity], result of:
            0.016529197 = score(doc=6500,freq=1.0), product of:
              0.06849861 = queryWeight, product of:
                1.7883375 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0099207 = queryNorm
              0.24130704 = fieldWeight in 6500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.060910974 = weight(abstract_txt:abstracts in 6500) [ClassicSimilarity], result of:
            0.060910974 = score(doc=6500,freq=1.0), product of:
              0.16342217 = queryWeight, product of:
                2.7622569 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.0099207 = queryNorm
              0.3727216 = fieldWeight in 6500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.12163859 = weight(abstract_txt:identification in 6500) [ClassicSimilarity], result of:
            0.12163859 = score(doc=6500,freq=2.0), product of:
              0.235459 = queryWeight, product of:
                4.0608025 = boost
                5.8446846 = idf(docFreq=347, maxDocs=44218)
                0.0099207 = queryNorm
              0.516602 = fieldWeight in 6500, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8446846 = idf(docFreq=347, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
          0.53254503 = weight(abstract_txt:sublanguage in 6500) [ClassicSimilarity], result of:
            0.53254503 = score(doc=6500,freq=1.0), product of:
              0.8738324 = queryWeight, product of:
                9.033117 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.0099207 = queryNorm
              0.6094361 = fieldWeight in 6500, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.0625 = fieldNorm(doc=6500)
        0.36 = coord(9/25)
    
  2. Losee, R.M.; Haas, S.W.: Sublanguage terms : dictionaries, usage, and automatic classification (1995) 0.28
    0.2831253 = sum of:
      0.2831253 = product of:
        1.7695332 = sum of:
          0.017892672 = weight(abstract_txt:using in 2650) [ClassicSimilarity], result of:
            0.017892672 = score(doc=2650,freq=1.0), product of:
              0.055110782 = queryWeight, product of:
                1.6040831 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0099207 = queryNorm
              0.32466736 = fieldWeight in 2650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.09375 = fieldNorm(doc=2650)
          0.024793796 = weight(abstract_txt:science in 2650) [ClassicSimilarity], result of:
            0.024793796 = score(doc=2650,freq=1.0), product of:
              0.06849861 = queryWeight, product of:
                1.7883375 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0099207 = queryNorm
              0.36196056 = fieldWeight in 2650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.09375 = fieldNorm(doc=2650)
          0.12921168 = weight(abstract_txt:abstracts in 2650) [ClassicSimilarity], result of:
            0.12921168 = score(doc=2650,freq=2.0), product of:
              0.16342217 = queryWeight, product of:
                2.7622569 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.0099207 = queryNorm
              0.79066193 = fieldWeight in 2650, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.09375 = fieldNorm(doc=2650)
          1.597635 = weight(abstract_txt:sublanguage in 2650) [ClassicSimilarity], result of:
            1.597635 = score(doc=2650,freq=4.0), product of:
              0.8738324 = queryWeight, product of:
                9.033117 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.0099207 = queryNorm
              1.8283083 = fieldWeight in 2650, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.09375 = fieldNorm(doc=2650)
        0.16 = coord(4/25)
    
  3. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.09
    0.09165199 = sum of:
      0.09165199 = product of:
        1.1456499 = sum of:
          0.0805598 = weight(abstract_txt:automatic in 4709) [ClassicSimilarity], result of:
            0.0805598 = score(doc=4709,freq=1.0), product of:
              0.12404317 = queryWeight, product of:
                2.4065506 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0099207 = queryNorm
              0.6494497 = fieldWeight in 4709, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.125 = fieldNorm(doc=4709)
          1.0650901 = weight(abstract_txt:sublanguage in 4709) [ClassicSimilarity], result of:
            1.0650901 = score(doc=4709,freq=1.0), product of:
              0.8738324 = queryWeight, product of:
                9.033117 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.0099207 = queryNorm
              1.2188722 = fieldWeight in 4709, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.125 = fieldNorm(doc=4709)
        0.08 = coord(2/25)
    
  4. Salton, G.: Automatic processing of foreign language documents (1985) 0.09
    0.08833734 = sum of:
      0.08833734 = product of:
        0.22084334 = sum of:
          0.00867011 = weight(abstract_txt:computer in 3650) [ClassicSimilarity], result of:
            0.00867011 = score(doc=3650,freq=1.0), product of:
              0.042836387 = queryWeight, product of:
                4.317879 = idf(docFreq=1601, maxDocs=44218)
                0.0099207 = queryNorm
              0.2024006 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.317879 = idf(docFreq=1601, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.010978426 = weight(abstract_txt:further in 3650) [ClassicSimilarity], result of:
            0.010978426 = score(doc=3650,freq=1.0), product of:
              0.050136782 = queryWeight, product of:
                1.0818619 = boost
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.0099207 = queryNorm
              0.2189695 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.011874618 = weight(abstract_txt:practical in 3650) [ClassicSimilarity], result of:
            0.011874618 = score(doc=3650,freq=1.0), product of:
              0.052829463 = queryWeight, product of:
                1.1105336 = boost
                4.79515 = idf(docFreq=993, maxDocs=44218)
                0.0099207 = queryNorm
              0.22477265 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.79515 = idf(docFreq=993, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.012262189 = weight(abstract_txt:theoretical in 3650) [ClassicSimilarity], result of:
            0.012262189 = score(doc=3650,freq=1.0), product of:
              0.053972818 = queryWeight, product of:
                1.1224865 = boost
                4.846761 = idf(docFreq=943, maxDocs=44218)
                0.0099207 = queryNorm
              0.22719193 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.846761 = idf(docFreq=943, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.016519867 = weight(abstract_txt:words in 3650) [ClassicSimilarity], result of:
            0.016519867 = score(doc=3650,freq=1.0), product of:
              0.06583661 = queryWeight, product of:
                1.2397306 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.0099207 = queryNorm
              0.2509222 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.01865547 = weight(abstract_txt:criteria in 3650) [ClassicSimilarity], result of:
            0.01865547 = score(doc=3650,freq=1.0), product of:
              0.071394905 = queryWeight, product of:
                1.2910029 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0099207 = queryNorm
              0.26129973 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.01265203 = weight(abstract_txt:using in 3650) [ClassicSimilarity], result of:
            0.01265203 = score(doc=3650,freq=2.0), product of:
              0.055110782 = queryWeight, product of:
                1.6040831 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0099207 = queryNorm
              0.2295745 = fieldWeight in 3650, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.012396898 = weight(abstract_txt:science in 3650) [ClassicSimilarity], result of:
            0.012396898 = score(doc=3650,freq=1.0), product of:
              0.06849861 = queryWeight, product of:
                1.7883375 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0099207 = queryNorm
              0.18098028 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.052325122 = weight(abstract_txt:automatic in 3650) [ClassicSimilarity], result of:
            0.052325122 = score(doc=3650,freq=3.0), product of:
              0.12404317 = queryWeight, product of:
                2.4065506 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0099207 = queryNorm
              0.42182994 = fieldWeight in 3650, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
          0.06450861 = weight(abstract_txt:identification in 3650) [ClassicSimilarity], result of:
            0.06450861 = score(doc=3650,freq=1.0), product of:
              0.235459 = queryWeight, product of:
                4.0608025 = boost
                5.8446846 = idf(docFreq=347, maxDocs=44218)
                0.0099207 = queryNorm
              0.2739696 = fieldWeight in 3650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8446846 = idf(docFreq=347, maxDocs=44218)
                0.046875 = fieldNorm(doc=3650)
        0.4 = coord(10/25)
    
  5. Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.08
    0.08189864 = sum of:
      0.08189864 = product of:
        0.29249513 = sum of:
          0.014450183 = weight(abstract_txt:computer in 1660) [ClassicSimilarity], result of:
            0.014450183 = score(doc=1660,freq=1.0), product of:
              0.042836387 = queryWeight, product of:
                4.317879 = idf(docFreq=1601, maxDocs=44218)
                0.0099207 = queryNorm
              0.3373343 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.317879 = idf(docFreq=1601, maxDocs=44218)
                0.078125 = fieldNorm(doc=1660)
          0.027186003 = weight(abstract_txt:experiments in 1660) [ClassicSimilarity], result of:
            0.027186003 = score(doc=1660,freq=1.0), product of:
              0.06528211 = queryWeight, product of:
                1.2344987 = boost
                5.3304167 = idf(docFreq=581, maxDocs=44218)
                0.0099207 = queryNorm
              0.41643882 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3304167 = idf(docFreq=581, maxDocs=44218)
                0.078125 = fieldNorm(doc=1660)
          0.02753311 = weight(abstract_txt:words in 1660) [ClassicSimilarity], result of:
            0.02753311 = score(doc=1660,freq=1.0), product of:
              0.06583661 = queryWeight, product of:
                1.2397306 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.0099207 = queryNorm
              0.41820365 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.078125 = fieldNorm(doc=1660)
          0.025825847 = weight(abstract_txt:using in 1660) [ClassicSimilarity], result of:
            0.025825847 = score(doc=1660,freq=3.0), product of:
              0.055110782 = queryWeight, product of:
                1.6040831 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0099207 = queryNorm
              0.46861696 = fieldWeight in 1660, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=1660)
          0.020661497 = weight(abstract_txt:science in 1660) [ClassicSimilarity], result of:
            0.020661497 = score(doc=1660,freq=1.0), product of:
              0.06849861 = queryWeight, product of:
                1.7883375 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0099207 = queryNorm
              0.3016338 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.078125 = fieldNorm(doc=1660)
          0.10069975 = weight(abstract_txt:automatic in 1660) [ClassicSimilarity], result of:
            0.10069975 = score(doc=1660,freq=4.0), product of:
              0.12404317 = queryWeight, product of:
                2.4065506 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0099207 = queryNorm
              0.81181216 = fieldWeight in 1660, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=1660)
          0.07613872 = weight(abstract_txt:abstracts in 1660) [ClassicSimilarity], result of:
            0.07613872 = score(doc=1660,freq=1.0), product of:
              0.16342217 = queryWeight, product of:
                2.7622569 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.0099207 = queryNorm
              0.46590203 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.078125 = fieldNorm(doc=1660)
        0.28 = coord(7/25)
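
The content-match explanations above add two factors to the per-term arithmetic: a queryWeight (boost * idf * queryNorm) and a coord factor (matched query terms divided by total query terms). As a minimal sketch, assuming the standard ClassicSimilarity formulas and taking boost, queryNorm, fieldNorm and the total of 25 query terms directly from the listing, the following Python recomputes the score of entry 3 (Tsujii, doc 4709). The function and variable names are illustrative, not Lucene API.

```python
import math

MAX_DOCS = 44218        # maxDocs as reported in the listing
QUERY_NORM = 0.0099207  # queryNorm as reported in the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0, query_norm=QUERY_NORM):
    """Score of one matched term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, total_query_terms):
    """Sum the matched term scores, then scale by coord."""
    matched = [term_score(field_norm=field_norm, **t) for t in terms]
    coord = len(matched) / total_query_terms
    return sum(matched) * coord


# Values copied from the explanation for doc 4709.
tsujii_terms = [
    {"freq": 1.0, "doc_freq": 665, "boost": 2.4065506},  # abstract_txt:automatic
    {"freq": 1.0, "doc_freq": 6, "boost": 9.033117},     # abstract_txt:sublanguage
]
tsujii_score = document_score(tsujii_terms, field_norm=0.125, total_query_terms=25)
print(tsujii_score)
```

The result should be close to the listed 0.09165199; small differences are expected because queryNorm is printed with limited precision. The boost default of 1.0 covers terms such as abstract_txt:computer, whose queryWeight in the listing shows no boost line.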