Document (#11720)

Author
Losee, R.M.
Haas, S.W.
Title
Sublanguage terms : dictionaries, usage, and automatic classification
Source
Journal of the American Society for Information Science. 46(1995) no.7, p.519-529
Year
1995
Abstract
The use of terms from natural and social science titles and abstracts is studied from the perspective of sublanguages and their specialized dictionaries. Explores different notions of sublanguage distinctiveness. Objective methods for separating hard and soft sciences are suggested based on measures of sublanguage use, dictionary characteristics, and sublanguage distinctiveness. Abstracts were automatically classified with a high degree of accuracy by using a formula that considers the degree of uniqueness of terms in each sublanguage. This may prove useful for text filtering or information retrieval systems.
Theme
Automatic classification

Similar documents (author)

  1. Haas, S.W.; Losee, R.M.: Looking in text windows : their size and composition (1994) 6.03
    6.0309753 = sum of:
      6.0309753 = sum of:
        2.7452266 = weight(author_txt:losee in 525) [ClassicSimilarity], result of:
          2.7452266 = score(doc=525,freq=1.0), product of:
            0.66360736 = queryWeight, product of:
              8.273647 = idf(docFreq=29, maxDocs=43254)
              0.080207355 = queryNorm
            4.1368237 = fieldWeight in 525, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.273647 = idf(docFreq=29, maxDocs=43254)
              0.5 = fieldNorm(doc=525)
        3.285749 = weight(author_txt:haas in 525) [ClassicSimilarity], result of:
          3.285749 = score(doc=525,freq=1.0), product of:
            0.748081 = queryWeight, product of:
              1.0617414 = boost
              8.784473 = idf(docFreq=17, maxDocs=43254)
              0.080207355 = queryNorm
            4.3922367 = fieldWeight in 525, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.784473 = idf(docFreq=17, maxDocs=43254)
              0.5 = fieldNorm(doc=525)
    
  2. Haas, S.W.: A feasibility study of the case hierarchy model for the construction and porting of natural language interfaces (1990) 2.05
    2.0535932 = sum of:
      2.0535932 = product of:
        4.1071863 = sum of:
          4.1071863 = weight(author_txt:haas in 71) [ClassicSimilarity], result of:
            4.1071863 = score(doc=71,freq=1.0), product of:
              0.748081 = queryWeight, product of:
                1.0617414 = boost
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.080207355 = queryNorm
              5.490296 = fieldWeight in 71, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.625 = fieldNorm(doc=71)
        0.5 = coord(1/2)
    
  3. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 2.05
    2.0535932 = sum of:
      2.0535932 = product of:
        4.1071863 = sum of:
          4.1071863 = weight(author_txt:haas in 569) [ClassicSimilarity], result of:
            4.1071863 = score(doc=569,freq=1.0), product of:
              0.748081 = queryWeight, product of:
                1.0617414 = boost
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.080207355 = queryNorm
              5.490296 = fieldWeight in 569, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.625 = fieldNorm(doc=569)
        0.5 = coord(1/2)
    
  4. Haas, S.W.: A text filter for the automatic identification of empirical articles (1996) 2.05
    2.0535932 = sum of:
      2.0535932 = product of:
        4.1071863 = sum of:
          4.1071863 = weight(author_txt:haas in 867) [ClassicSimilarity], result of:
            4.1071863 = score(doc=867,freq=1.0), product of:
              0.748081 = queryWeight, product of:
                1.0617414 = boost
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.080207355 = queryNorm
              5.490296 = fieldWeight in 867, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.625 = fieldNorm(doc=867)
        0.5 = coord(1/2)
    
  5. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 2.05
    2.0535932 = sum of:
      2.0535932 = product of:
        4.1071863 = sum of:
          4.1071863 = weight(author_txt:haas in 485) [ClassicSimilarity], result of:
            4.1071863 = score(doc=485,freq=1.0), product of:
              0.748081 = queryWeight, product of:
                1.0617414 = boost
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.080207355 = queryNorm
              5.490296 = fieldWeight in 485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.784473 = idf(docFreq=17, maxDocs=43254)
                0.625 = fieldNorm(doc=485)
        0.5 = coord(1/2)
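
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the default Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any library; all numeric inputs are copied from the listing.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw count.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, mirroring the explain tree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

MAX_DOCS = 43254
QUERY_NORM = 0.080207355  # queryNorm of the author query, from the listing

# Result 1 (doc 525): both author terms match, so the two scores are summed.
losee = term_score(1.0, 29, MAX_DOCS, QUERY_NORM, 0.5)
haas = term_score(1.0, 17, MAX_DOCS, QUERY_NORM, 0.5, boost=1.0617414)
result_1 = losee + haas

# Results 2 to 5: only "haas" matches, so coord(1/2) halves the sum.
result_2 = term_score(1.0, 17, MAX_DOCS, QUERY_NORM, 0.625, boost=1.0617414) * (1 / 2)

print(round(result_1, 4), round(result_2, 4))
```

Under these assumptions the two values agree with the listed 6.0309753 and 2.0535932 up to floating-point rounding, since Lucene computes in single precision.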
    

Similar documents (content)

  1. Haas, S.; He, S.: Toward the automatic identification of sublanguage vocabulary (1993) 0.24
    0.2379851 = sum of:
      0.2379851 = product of:
        1.9832093 = sum of:
          0.009892295 = weight(abstract_txt:from in 4891) [ClassicSimilarity], result of:
            0.009892295 = score(doc=4891,freq=1.0), product of:
              0.028461035 = queryWeight, product of:
                1.0086213 = boost
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.010148134 = queryNorm
              0.34757328 = fieldWeight in 4891, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.125 = fieldNorm(doc=4891)
          0.13853449 = weight(abstract_txt:abstracts in 4891) [ClassicSimilarity], result of:
            0.13853449 = score(doc=4891,freq=2.0), product of:
              0.13124454 = queryWeight, product of:
                2.165925 = boost
                5.9710627 = idf(docFreq=299, maxDocs=43254)
                0.010148134 = queryNorm
              1.0555447 = fieldWeight in 4891, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9710627 = idf(docFreq=299, maxDocs=43254)
                0.125 = fieldNorm(doc=4891)
          1.8347825 = weight(abstract_txt:sublanguage in 4891) [ClassicSimilarity], result of:
            1.8347825 = score(doc=4891,freq=3.0), product of:
              0.87106115 = queryWeight, product of:
                8.822611 = boost
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.010148134 = queryNorm
              2.1063762 = fieldWeight in 4891, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.125 = fieldNorm(doc=4891)
        0.12 = coord(3/25)
    
  2. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 0.17
    0.17493947 = sum of:
      0.17493947 = product of:
        0.72891444 = sum of:
          0.019281581 = weight(abstract_txt:automatically in 569) [ClassicSimilarity], result of:
            0.019281581 = score(doc=569,freq=1.0), product of:
              0.055953134 = queryWeight, product of:
                5.5136375 = idf(docFreq=473, maxDocs=43254)
                0.010148134 = queryNorm
              0.34460235 = fieldWeight in 569, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5136375 = idf(docFreq=473, maxDocs=43254)
                0.0625 = fieldNorm(doc=569)
          0.0049461476 = weight(abstract_txt:from in 569) [ClassicSimilarity], result of:
            0.0049461476 = score(doc=569,freq=1.0), product of:
              0.028461035 = queryWeight, product of:
                1.0086213 = boost
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.010148134 = queryNorm
              0.17378664 = fieldWeight in 569, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.0625 = fieldNorm(doc=569)
          0.06503346 = weight(abstract_txt:hard in 569) [ClassicSimilarity], result of:
            0.06503346 = score(doc=569,freq=4.0), product of:
              0.07927457 = queryWeight, product of:
                1.1902953 = boost
                6.562857 = idf(docFreq=165, maxDocs=43254)
                0.010148134 = queryNorm
              0.82035714 = fieldWeight in 569, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.562857 = idf(docFreq=165, maxDocs=43254)
                0.0625 = fieldNorm(doc=569)
          0.048979335 = weight(abstract_txt:abstracts in 569) [ClassicSimilarity], result of:
            0.048979335 = score(doc=569,freq=1.0), product of:
              0.13124454 = queryWeight, product of:
                2.165925 = boost
                5.9710627 = idf(docFreq=299, maxDocs=43254)
                0.010148134 = queryNorm
              0.37319142 = fieldWeight in 569, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9710627 = idf(docFreq=299, maxDocs=43254)
                0.0625 = fieldNorm(doc=569)
          0.061017808 = weight(abstract_txt:terms in 569) [ClassicSimilarity], result of:
            0.061017808 = score(doc=569,freq=7.0), product of:
              0.09093019 = queryWeight, product of:
                2.2080173 = boost
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.010148134 = queryNorm
              0.6710401 = fieldWeight in 569, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.058069 = idf(docFreq=2031, maxDocs=43254)
                0.0625 = fieldNorm(doc=569)
          0.5296561 = weight(abstract_txt:sublanguage in 569) [ClassicSimilarity], result of:
            0.5296561 = score(doc=569,freq=1.0), product of:
              0.87106115 = queryWeight, product of:
                8.822611 = boost
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.010148134 = queryNorm
              0.60805845 = fieldWeight in 569, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.0625 = fieldNorm(doc=569)
        0.24 = coord(6/25)
    
  3. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.09
    0.08553636 = sum of:
      0.08553636 = product of:
        1.0692046 = sum of:
          0.009892295 = weight(abstract_txt:from in 5778) [ClassicSimilarity], result of:
            0.009892295 = score(doc=5778,freq=1.0), product of:
              0.028461035 = queryWeight, product of:
                1.0086213 = boost
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.010148134 = queryNorm
              0.34757328 = fieldWeight in 5778, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.125 = fieldNorm(doc=5778)
          1.0593122 = weight(abstract_txt:sublanguage in 5778) [ClassicSimilarity], result of:
            1.0593122 = score(doc=5778,freq=1.0), product of:
              0.87106115 = queryWeight, product of:
                8.822611 = boost
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.010148134 = queryNorm
              1.2161169 = fieldWeight in 5778, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.125 = fieldNorm(doc=5778)
        0.08 = coord(2/25)
    
  4. Hutchins, J.: A new era in machine translation research (1995) 0.08
    0.084393926 = sum of:
      0.084393926 = product of:
        0.7032827 = sum of:
          0.0061826846 = weight(abstract_txt:from in 4915) [ClassicSimilarity], result of:
            0.0061826846 = score(doc=4915,freq=1.0), product of:
              0.028461035 = queryWeight, product of:
                1.0086213 = boost
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.010148134 = queryNorm
              0.2172333 = fieldWeight in 4915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7805862 = idf(docFreq=7289, maxDocs=43254)
                0.078125 = fieldNorm(doc=4915)
          0.035029944 = weight(abstract_txt:specialized in 4915) [ClassicSimilarity], result of:
            0.035029944 = score(doc=4915,freq=1.0), product of:
              0.07179303 = queryWeight, product of:
                1.1327366 = boost
                6.245499 = idf(docFreq=227, maxDocs=43254)
                0.010148134 = queryNorm
              0.4879296 = fieldWeight in 4915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.245499 = idf(docFreq=227, maxDocs=43254)
                0.078125 = fieldNorm(doc=4915)
          0.6620701 = weight(abstract_txt:sublanguage in 4915) [ClassicSimilarity], result of:
            0.6620701 = score(doc=4915,freq=1.0), product of:
              0.87106115 = queryWeight, product of:
                8.822611 = boost
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.010148134 = queryNorm
              0.76007307 = fieldWeight in 4915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.078125 = fieldNorm(doc=4915)
        0.12 = coord(3/25)
    
  5. Ananiadou, S.; McNaught, J.: Terms are not alone : term choice and choice terms (1995) 0.08
    0.08004154 = sum of:
      0.08004154 = product of:
        1.0005193 = sum of:
          0.07362109 = weight(abstract_txt:degree in 2860) [ClassicSimilarity], result of:
            0.07362109 = score(doc=2860,freq=1.0), product of:
              0.11859019 = queryWeight, product of:
                2.0588617 = boost
                5.6759086 = idf(docFreq=402, maxDocs=43254)
                0.010148134 = queryNorm
              0.6208025 = fieldWeight in 2860, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6759086 = idf(docFreq=402, maxDocs=43254)
                0.109375 = fieldNorm(doc=2860)
          0.9268982 = weight(abstract_txt:sublanguage in 2860) [ClassicSimilarity], result of:
            0.9268982 = score(doc=2860,freq=1.0), product of:
              0.87106115 = queryWeight, product of:
                8.822611 = boost
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.010148134 = queryNorm
              1.0641023 = fieldWeight in 2860, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.728935 = idf(docFreq=6, maxDocs=43254)
                0.109375 = fieldNorm(doc=2860)
        0.08 = coord(2/25)
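
The content ranking depends on the coord factor as much as on the summed term scores. As a minimal sketch, assuming coord is simply the number of matched query terms divided by the 25 terms in the content query, the snippet below recomputes results 2 and 3 from the per-term scores in the listing. The helper names are illustrative only.

```python
def coord(overlap, max_overlap):
    # Fraction of query terms that the document matched.
    return overlap / max_overlap

def final_score(term_scores, max_overlap):
    # Sum of per-term scores, scaled by the coordination factor.
    return sum(term_scores) * coord(len(term_scores), max_overlap)

QUERY_TERMS = 25  # denominator of coord(n/25) in the listing

# Per-term scores copied from the listing for doc 569 (Haas 1997).
haas_1997 = [0.019281581, 0.0049461476, 0.06503346,
             0.048979335, 0.061017808, 0.5296561]

# Per-term scores copied from the listing for doc 5778 (Tsujii 1995).
tsujii_1995 = [0.009892295, 1.0593122]

score_haas = final_score(haas_1997, QUERY_TERMS)
score_tsujii = final_score(tsujii_1995, QUERY_TERMS)

# Tsujii has the larger raw sum but matches only 2 of 25 terms,
# so it ranks below Haas, which matches 6 of 25.
print(round(score_haas, 4), round(score_tsujii, 4))
```

This illustrates why a document with a strong match on the single term "sublanguage" can still rank below one that matches several weaker terms.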