Document (#2315)

Author
Renouf, A.
Title
Sticking to the text : a corpus linguist's view of language
Source
Aslib proceedings. 45(1993) no.5, pp.131-136
Year
1993
Abstract
Corpus linguistics is the study of large, computer-held bodies of text. Some corpus linguists are concerned with language description for its own sake. On the corpus-linguistic continuum, the study of raw ASCII text is situated at one end and the study of heavily pre-coded text at the other. Discusses the use of word frequency to identify changes in the lexicon, of word repetition and word positioning in automatic abstracting, and of word clusters in automatic text retrieval. Compares the machine extract with manual abstracts. Abstractors and indexers may find themselves taking the original wording of the text more into account as the focus moves towards the electronic medium and away from the hard copy.
Theme
Automatic indexing
Computational linguistics

Similar documents (content)

  1. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006) 0.22
    0.21530785 = sum of:
      0.21530785 = product of:
        0.89711607 = sum of:
          0.0914129 = weight(abstract_txt:linguistics in 894) [ClassicSimilarity], result of:
            0.0914129 = score(doc=894,freq=2.0), product of:
              0.12402254 = queryWeight, product of:
                1.0088115 = boost
                6.6711674 = idf(docFreq=149, maxDocs=43556)
                0.018428449 = queryNorm
              0.7370668 = fieldWeight in 894, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6711674 = idf(docFreq=149, maxDocs=43556)
                0.078125 = fieldNorm(doc=894)
          0.032033764 = weight(abstract_txt:language in 894) [ClassicSimilarity], result of:
            0.032033764 = score(doc=894,freq=1.0), product of:
              0.09785604 = queryWeight, product of:
                1.2672681 = boost
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.018428449 = queryNorm
              0.32735604 = fieldWeight in 894, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.078125 = fieldNorm(doc=894)
          0.15732919 = weight(abstract_txt:linguists in 894) [ClassicSimilarity], result of:
            0.15732919 = score(doc=894,freq=1.0), product of:
              0.22441152 = queryWeight, product of:
                1.3570075 = boost
                8.973753 = idf(docFreq=14, maxDocs=43556)
                0.018428449 = queryNorm
              0.7010745 = fieldWeight in 894, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.973753 = idf(docFreq=14, maxDocs=43556)
                0.078125 = fieldNorm(doc=894)
          0.17421256 = weight(abstract_txt:sake in 894) [ClassicSimilarity], result of:
            0.17421256 = score(doc=894,freq=1.0), product of:
              0.24019203 = queryWeight, product of:
                1.4039091 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.018428449 = queryNorm
              0.7253053 = fieldWeight in 894, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.078125 = fieldNorm(doc=894)
          0.24277475 = weight(abstract_txt:word in 894) [ClassicSimilarity], result of:
            0.24277475 = score(doc=894,freq=3.0), product of:
              0.32982802 = queryWeight, product of:
                3.2902846 = boost
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.018428449 = queryNorm
              0.7360647 = fieldWeight in 894, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.078125 = fieldNorm(doc=894)
          0.19935289 = weight(abstract_txt:corpus in 894) [ClassicSimilarity], result of:
            0.19935289 = score(doc=894,freq=1.0), product of:
              0.41713244 = queryWeight, product of:
                3.7002125 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.018428449 = queryNorm
              0.4779127 = fieldWeight in 894, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.078125 = fieldNorm(doc=894)
        0.24 = coord(6/25)
    
  2. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.15
    0.15115684 = sum of:
      0.15115684 = product of:
        0.62982017 = sum of:
          0.06463868 = weight(abstract_txt:linguistics in 13) [ClassicSimilarity], result of:
            0.06463868 = score(doc=13,freq=1.0), product of:
              0.12402254 = queryWeight, product of:
                1.0088115 = boost
                6.6711674 = idf(docFreq=149, maxDocs=43556)
                0.018428449 = queryNorm
              0.5211849 = fieldWeight in 13, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6711674 = idf(docFreq=149, maxDocs=43556)
                0.078125 = fieldNorm(doc=13)
          0.045302585 = weight(abstract_txt:language in 13) [ClassicSimilarity], result of:
            0.045302585 = score(doc=13,freq=2.0), product of:
              0.09785604 = queryWeight, product of:
                1.2672681 = boost
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.018428449 = queryNorm
              0.46295136 = fieldWeight in 13, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.078125 = fieldNorm(doc=13)
          0.026645264 = weight(abstract_txt:study in 13) [ClassicSimilarity], result of:
            0.026645264 = score(doc=13,freq=1.0), product of:
              0.099073924 = queryWeight, product of:
                1.5617086 = boost
                3.4424734 = idf(docFreq=3786, maxDocs=43556)
                0.018428449 = queryNorm
              0.26894325 = fieldWeight in 13, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4424734 = idf(docFreq=3786, maxDocs=43556)
                0.078125 = fieldNorm(doc=13)
          0.061070938 = weight(abstract_txt:automatic in 13) [ClassicSimilarity], result of:
            0.061070938 = score(doc=13,freq=1.0), product of:
              0.15045455 = queryWeight, product of:
                1.5713661 = boost
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.018428449 = queryNorm
              0.40590954 = fieldWeight in 13, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.195642 = idf(docFreq=655, maxDocs=43556)
                0.078125 = fieldNorm(doc=13)
          0.1502351 = weight(abstract_txt:text in 13) [ClassicSimilarity], result of:
            0.1502351 = score(doc=13,freq=3.0), product of:
              0.2741763 = queryWeight, product of:
                3.674094 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.018428449 = queryNorm
              0.54795074 = fieldWeight in 13, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.078125 = fieldNorm(doc=13)
          0.2819276 = weight(abstract_txt:corpus in 13) [ClassicSimilarity], result of:
            0.2819276 = score(doc=13,freq=2.0), product of:
              0.41713244 = queryWeight, product of:
                3.7002125 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.018428449 = queryNorm
              0.67587066 = fieldWeight in 13, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.078125 = fieldNorm(doc=13)
        0.24 = coord(6/25)
    
  3. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.11
    0.114129655 = sum of:
      0.114129655 = product of:
        0.57064825 = sum of:
          0.14081459 = weight(abstract_txt:lexicon in 1375) [ClassicSimilarity], result of:
            0.14081459 = score(doc=1375,freq=4.0), product of:
              0.16654006 = queryWeight, product of:
                1.1690122 = boost
                7.730559 = idf(docFreq=51, maxDocs=43556)
                0.018428449 = queryNorm
              0.84552985 = fieldWeight in 1375, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.730559 = idf(docFreq=51, maxDocs=43556)
                0.0546875 = fieldNorm(doc=1375)
          0.11013042 = weight(abstract_txt:wording in 1375) [ClassicSimilarity], result of:
            0.11013042 = score(doc=1375,freq=1.0), product of:
              0.22441152 = queryWeight, product of:
                1.3570075 = boost
                8.973753 = idf(docFreq=14, maxDocs=43556)
                0.018428449 = queryNorm
              0.4907521 = fieldWeight in 1375, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.973753 = idf(docFreq=14, maxDocs=43556)
                0.0546875 = fieldNorm(doc=1375)
          0.018651683 = weight(abstract_txt:study in 1375) [ClassicSimilarity], result of:
            0.018651683 = score(doc=1375,freq=1.0), product of:
              0.099073924 = queryWeight, product of:
                1.5617086 = boost
                3.4424734 = idf(docFreq=3786, maxDocs=43556)
                0.018428449 = queryNorm
              0.18826026 = fieldWeight in 1375, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4424734 = idf(docFreq=3786, maxDocs=43556)
                0.0546875 = fieldNorm(doc=1375)
          0.24033476 = weight(abstract_txt:word in 1375) [ClassicSimilarity], result of:
            0.24033476 = score(doc=1375,freq=6.0), product of:
              0.32982802 = queryWeight, product of:
                3.2902846 = boost
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.018428449 = queryNorm
              0.7286669 = fieldWeight in 1375, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.0546875 = fieldNorm(doc=1375)
          0.060716797 = weight(abstract_txt:text in 1375) [ClassicSimilarity], result of:
            0.060716797 = score(doc=1375,freq=1.0), product of:
              0.2741763 = queryWeight, product of:
                3.674094 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.018428449 = queryNorm
              0.22145166 = fieldWeight in 1375, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.0546875 = fieldNorm(doc=1375)
        0.2 = coord(5/25)
    
  4. Lund, K.; Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence (1996) 0.11
    0.11082742 = sum of:
      0.11082742 = product of:
        0.6926714 = sum of:
          0.045302585 = weight(abstract_txt:language in 3702) [ClassicSimilarity], result of:
            0.045302585 = score(doc=3702,freq=2.0), product of:
              0.09785604 = queryWeight, product of:
                1.2672681 = boost
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.018428449 = queryNorm
              0.46295136 = fieldWeight in 3702, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1901574 = idf(docFreq=1792, maxDocs=43556)
                0.078125 = fieldNorm(doc=3702)
          0.24277475 = weight(abstract_txt:word in 3702) [ClassicSimilarity], result of:
            0.24277475 = score(doc=3702,freq=3.0), product of:
              0.32982802 = queryWeight, product of:
                3.2902846 = boost
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.018428449 = queryNorm
              0.7360647 = fieldWeight in 3702, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.078125 = fieldNorm(doc=3702)
          0.122666456 = weight(abstract_txt:text in 3702) [ClassicSimilarity], result of:
            0.122666456 = score(doc=3702,freq=2.0), product of:
              0.2741763 = queryWeight, product of:
                3.674094 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.018428449 = queryNorm
              0.4473999 = fieldWeight in 3702, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.078125 = fieldNorm(doc=3702)
          0.2819276 = weight(abstract_txt:corpus in 3702) [ClassicSimilarity], result of:
            0.2819276 = score(doc=3702,freq=2.0), product of:
              0.41713244 = queryWeight, product of:
                3.7002125 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.018428449 = queryNorm
              0.67587066 = fieldWeight in 3702, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.078125 = fieldNorm(doc=3702)
        0.16 = coord(4/25)
    
  5. Li, J.; Zhang, Z.; Li, X.; Chen, H.: Kernel-based learning for biomedical relation extraction (2008) 0.11
    0.110358074 = sum of:
      0.110358074 = product of:
        0.55179036 = sum of:
          0.06295968 = weight(abstract_txt:extract in 3609) [ClassicSimilarity], result of:
            0.06295968 = score(doc=3609,freq=1.0), product of:
              0.12186546 = queryWeight, product of:
                6.6128983 = idf(docFreq=158, maxDocs=43556)
                0.018428449 = queryNorm
              0.5166327 = fieldWeight in 3609, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6128983 = idf(docFreq=158, maxDocs=43556)
                0.078125 = fieldNorm(doc=3609)
          0.026645264 = weight(abstract_txt:study in 3609) [ClassicSimilarity], result of:
            0.026645264 = score(doc=3609,freq=1.0), product of:
              0.099073924 = queryWeight, product of:
                1.5617086 = boost
                3.4424734 = idf(docFreq=3786, maxDocs=43556)
                0.018428449 = queryNorm
              0.26894325 = fieldWeight in 3609, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4424734 = idf(docFreq=3786, maxDocs=43556)
                0.078125 = fieldNorm(doc=3609)
          0.14016607 = weight(abstract_txt:word in 3609) [ClassicSimilarity], result of:
            0.14016607 = score(doc=3609,freq=1.0), product of:
              0.32982802 = queryWeight, product of:
                3.2902846 = boost
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.018428449 = queryNorm
              0.42496714 = fieldWeight in 3609, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4395795 = idf(docFreq=513, maxDocs=43556)
                0.078125 = fieldNorm(doc=3609)
          0.122666456 = weight(abstract_txt:text in 3609) [ClassicSimilarity], result of:
            0.122666456 = score(doc=3609,freq=2.0), product of:
              0.2741763 = queryWeight, product of:
                3.674094 = boost
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.018428449 = queryNorm
              0.4473999 = fieldWeight in 3609, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0494018 = idf(docFreq=2063, maxDocs=43556)
                0.078125 = fieldNorm(doc=3609)
          0.19935289 = weight(abstract_txt:corpus in 3609) [ClassicSimilarity], result of:
            0.19935289 = score(doc=3609,freq=1.0), product of:
              0.41713244 = queryWeight, product of:
                3.7002125 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.018428449 = queryNorm
              0.4779127 = fieldWeight in 3609, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.078125 = fieldNorm(doc=3609)
        0.2 = coord(5/25)
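
Note on the score listings: the nested figures above appear to follow the TF-IDF scheme of Lucene's ClassicSimilarity, where tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), each term score is queryWeight times fieldWeight, and the document score is the sum of the term scores scaled by coord (matched query terms over all query terms). The following is a minimal Python sketch that recomputes two of the listed values under that assumption. The function names (idf, term_score, document_score) are illustrative and not part of any library, and small differences in the last digits are expected because the listing was presumably produced with single-precision arithmetic.

```python
import math

# Constants copied from the listings above.
MAX_DOCS = 43556
QUERY_NORM = 0.018428449
TOTAL_QUERY_TERMS = 25  # denominator of the coord(n/25) factors


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Score of one query term in one document (queryWeight * fieldWeight)."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, field_norm):
    """Sum of the term scores, scaled by the fraction of query terms matched."""
    coord = len(terms) / TOTAL_QUERY_TERMS
    return coord * sum(term_score(f, df, b, field_norm) for f, df, b in terms)


# "linguistics" in doc 894: the listing reports 0.0914129.
linguistics_894 = term_score(freq=2.0, doc_freq=149, boost=1.0088115,
                             field_norm=0.078125)

# Doc 3702 (Lund and Burgess): four matched terms as (freq, docFreq, boost).
# The listing reports 0.11082742 for the whole document.
terms_3702 = [
    (2.0, 1792, 1.2672681),  # language
    (3.0, 513, 3.2902846),   # word
    (2.0, 2063, 3.674094),   # text
    (2.0, 260, 3.7002125),   # corpus
]
score_3702 = document_score(terms_3702, field_norm=0.078125)

print(f"linguistics in 894: {linguistics_894:.7f}")
print(f"doc 3702 total:     {score_3702:.8f}")
```

Where a term's listing has no boost line (as for "extract" in doc 3609), passing boost=1.0 to term_score reproduces the listed queryWeight as idf times queryNorm.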