Document (#2315)

Author
Renouf, A.
Title
Sticking to the text : a corpus linguist's view of language
Source
Aslib proceedings. 45(1993) no.5, pp.131-136
Year
1993
Abstract
Corpus linguistics is the study of large, computer-held bodies of text. Some corpus linguists are concerned with language description for its own sake. On the corpus-linguistic continuum, the study of raw ASCII text is situated at one end, and the study of heavily pre-coded text at the other. Discusses the use of word frequency to identify changes in the lexicon, word repetition and word positioning in automatic abstracting, and word clusters in automatic text retrieval. Compares the machine extract with manual abstracts. Abstractors and indexers may find themselves taking the original wording of the text more into account as the focus moves towards the electronic medium and away from the hard copy.
Theme
Automatic indexing
Computational linguistics

Similar documents (content)

  1. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006) 0.22
    0.2154988 = sum of:
      0.2154988 = product of:
        0.8979117 = sum of:
          0.09163082 = weight(abstract_txt:linguistics in 897) [ClassicSimilarity], result of:
            0.09163082 = score(doc=897,freq=2.0), product of:
              0.124197826 = queryWeight, product of:
                1.0108527 = boost
                6.677633 = idf(docFreq=147, maxDocs=43254)
                0.018399397 = queryNorm
              0.73778117 = fieldWeight in 897, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.677633 = idf(docFreq=147, maxDocs=43254)
                0.078125 = fieldNorm(doc=897)
          0.032062944 = weight(abstract_txt:language in 897) [ClassicSimilarity], result of:
            0.032062944 = score(doc=897,freq=1.0), product of:
              0.09789831 = queryWeight, product of:
                1.26921 = boost
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.018399397 = queryNorm
              0.32751274 = fieldWeight in 897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.078125 = fieldNorm(doc=897)
          0.15688111 = weight(abstract_txt:linguists in 897) [ClassicSimilarity], result of:
            0.15688111 = score(doc=897,freq=1.0), product of:
              0.22394603 = queryWeight, product of:
                1.3573835 = boost
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.018399397 = queryNorm
              0.7005309 = fieldWeight in 897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.078125 = fieldNorm(doc=897)
          0.1737299 = weight(abstract_txt:sake in 897) [ClassicSimilarity], result of:
            0.1737299 = score(doc=897,freq=1.0), product of:
              0.23970623 = queryWeight, product of:
                1.4043344 = boost
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.018399397 = queryNorm
              0.7247617 = fieldWeight in 897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.078125 = fieldNorm(doc=897)
          0.24276178 = weight(abstract_txt:word in 897) [ClassicSimilarity], result of:
            0.24276178 = score(doc=897,freq=3.0), product of:
              0.32975852 = queryWeight, product of:
                3.2942677 = boost
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.018399397 = queryNorm
              0.7361804 = fieldWeight in 897, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.078125 = fieldNorm(doc=897)
          0.20084517 = weight(abstract_txt:corpus in 897) [ClassicSimilarity], result of:
            0.20084517 = score(doc=897,freq=1.0), product of:
              0.41913816 = queryWeight, product of:
                3.7139792 = boost
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.018399397 = queryNorm
              0.47918606 = fieldWeight in 897, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.078125 = fieldNorm(doc=897)
        0.24 = coord(6/25)
    
  2. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.15
    0.15174991 = sum of:
      0.15174991 = product of:
        0.6322913 = sum of:
          0.064792775 = weight(abstract_txt:linguistics in 4480) [ClassicSimilarity], result of:
            0.064792775 = score(doc=4480,freq=1.0), product of:
              0.124197826 = queryWeight, product of:
                1.0108527 = boost
                6.677633 = idf(docFreq=147, maxDocs=43254)
                0.018399397 = queryNorm
              0.5216901 = fieldWeight in 4480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677633 = idf(docFreq=147, maxDocs=43254)
                0.078125 = fieldNorm(doc=4480)
          0.04534385 = weight(abstract_txt:language in 4480) [ClassicSimilarity], result of:
            0.04534385 = score(doc=4480,freq=2.0), product of:
              0.09789831 = queryWeight, product of:
                1.26921 = boost
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.018399397 = queryNorm
              0.46317294 = fieldWeight in 4480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.078125 = fieldNorm(doc=4480)
          0.02685978 = weight(abstract_txt:study in 4480) [ClassicSimilarity], result of:
            0.02685978 = score(doc=4480,freq=1.0), product of:
              0.099587545 = queryWeight, product of:
                1.5678121 = boost
                3.4522913 = idf(docFreq=3723, maxDocs=43254)
                0.018399397 = queryNorm
              0.26971024 = fieldWeight in 4480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4522913 = idf(docFreq=3723, maxDocs=43254)
                0.078125 = fieldNorm(doc=4480)
          0.06106332 = weight(abstract_txt:automatic in 4480) [ClassicSimilarity], result of:
            0.06106332 = score(doc=4480,freq=1.0), product of:
              0.15041572 = queryWeight, product of:
                1.573231 = boost
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.018399397 = queryNorm
              0.4059637 = fieldWeight in 4480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1963353 = idf(docFreq=650, maxDocs=43254)
                0.078125 = fieldNorm(doc=4480)
          0.15019365 = weight(abstract_txt:text in 4480) [ClassicSimilarity], result of:
            0.15019365 = score(doc=4480,freq=3.0), product of:
              0.2740779 = queryWeight, product of:
                3.6782691 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.018399397 = queryNorm
              0.5479962 = fieldWeight in 4480, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=4480)
          0.28403795 = weight(abstract_txt:corpus in 4480) [ClassicSimilarity], result of:
            0.28403795 = score(doc=4480,freq=2.0), product of:
              0.41913816 = queryWeight, product of:
                3.7139792 = boost
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.018399397 = queryNorm
              0.67767143 = fieldWeight in 4480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.078125 = fieldNorm(doc=4480)
        0.24 = coord(6/25)
    
  3. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.11
    0.11421259 = sum of:
      0.11421259 = product of:
        0.5710629 = sum of:
          0.14142236 = weight(abstract_txt:lexicon in 90) [ClassicSimilarity], result of:
            0.14142236 = score(doc=90,freq=4.0), product of:
              0.16698968 = queryWeight, product of:
                1.1721298 = boost
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.018399397 = queryNorm
              0.8468928 = fieldWeight in 90, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.109816775 = weight(abstract_txt:wording in 90) [ClassicSimilarity], result of:
            0.109816775 = score(doc=90,freq=1.0), product of:
              0.22394603 = queryWeight, product of:
                1.3573835 = boost
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.018399397 = queryNorm
              0.49037158 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.018801847 = weight(abstract_txt:study in 90) [ClassicSimilarity], result of:
            0.018801847 = score(doc=90,freq=1.0), product of:
              0.099587545 = queryWeight, product of:
                1.5678121 = boost
                3.4522913 = idf(docFreq=3723, maxDocs=43254)
                0.018399397 = queryNorm
              0.18879718 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4522913 = idf(docFreq=3723, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.2403219 = weight(abstract_txt:word in 90) [ClassicSimilarity], result of:
            0.2403219 = score(doc=90,freq=6.0), product of:
              0.32975852 = queryWeight, product of:
                3.2942677 = boost
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.018399397 = queryNorm
              0.72878146 = fieldWeight in 90, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.060700044 = weight(abstract_txt:text in 90) [ClassicSimilarity], result of:
            0.060700044 = score(doc=90,freq=1.0), product of:
              0.2740779 = queryWeight, product of:
                3.6782691 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.018399397 = queryNorm
              0.22147004 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
        0.2 = coord(5/25)
    
  4. Lund, K.; Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence (1996) 0.11
    0.11116418 = sum of:
      0.11116418 = product of:
        0.6947762 = sum of:
          0.04534385 = weight(abstract_txt:language in 3169) [ClassicSimilarity], result of:
            0.04534385 = score(doc=3169,freq=2.0), product of:
              0.09789831 = queryWeight, product of:
                1.26921 = boost
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.018399397 = queryNorm
              0.46317294 = fieldWeight in 3169, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.192163 = idf(docFreq=1776, maxDocs=43254)
                0.078125 = fieldNorm(doc=3169)
          0.24276178 = weight(abstract_txt:word in 3169) [ClassicSimilarity], result of:
            0.24276178 = score(doc=3169,freq=3.0), product of:
              0.32975852 = queryWeight, product of:
                3.2942677 = boost
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.018399397 = queryNorm
              0.7361804 = fieldWeight in 3169, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.078125 = fieldNorm(doc=3169)
          0.1226326 = weight(abstract_txt:text in 3169) [ClassicSimilarity], result of:
            0.1226326 = score(doc=3169,freq=2.0), product of:
              0.2740779 = queryWeight, product of:
                3.6782691 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.018399397 = queryNorm
              0.44743705 = fieldWeight in 3169, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=3169)
          0.28403795 = weight(abstract_txt:corpus in 3169) [ClassicSimilarity], result of:
            0.28403795 = score(doc=3169,freq=2.0), product of:
              0.41913816 = queryWeight, product of:
                3.7139792 = boost
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.018399397 = queryNorm
              0.67767143 = fieldWeight in 3169, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.078125 = fieldNorm(doc=3169)
        0.16 = coord(4/25)
    
  5. Li, J.; Zhang, Z.; Li, X.; Chen, H.: Kernel-based learning for biomedical relation extraction (2008) 0.11
    0.11064487 = sum of:
      0.11064487 = product of:
        0.5532243 = sum of:
          0.062728226 = weight(abstract_txt:extract in 3612) [ClassicSimilarity], result of:
            0.062728226 = score(doc=3612,freq=1.0), product of:
              0.12154533 = queryWeight, product of:
                6.605941 = idf(docFreq=158, maxDocs=43254)
                0.018399397 = queryNorm
              0.51608914 = fieldWeight in 3612, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.605941 = idf(docFreq=158, maxDocs=43254)
                0.078125 = fieldNorm(doc=3612)
          0.02685978 = weight(abstract_txt:study in 3612) [ClassicSimilarity], result of:
            0.02685978 = score(doc=3612,freq=1.0), product of:
              0.099587545 = queryWeight, product of:
                1.5678121 = boost
                3.4522913 = idf(docFreq=3723, maxDocs=43254)
                0.018399397 = queryNorm
              0.26971024 = fieldWeight in 3612, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4522913 = idf(docFreq=3723, maxDocs=43254)
                0.078125 = fieldNorm(doc=3612)
          0.14015856 = weight(abstract_txt:word in 3612) [ClassicSimilarity], result of:
            0.14015856 = score(doc=3612,freq=1.0), product of:
              0.32975852 = queryWeight, product of:
                3.2942677 = boost
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.018399397 = queryNorm
              0.42503393 = fieldWeight in 3612, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4404345 = idf(docFreq=509, maxDocs=43254)
                0.078125 = fieldNorm(doc=3612)
          0.1226326 = weight(abstract_txt:text in 3612) [ClassicSimilarity], result of:
            0.1226326 = score(doc=3612,freq=2.0), product of:
              0.2740779 = queryWeight, product of:
                3.6782691 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.018399397 = queryNorm
              0.44743705 = fieldWeight in 3612, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=3612)
          0.20084517 = weight(abstract_txt:corpus in 3612) [ClassicSimilarity], result of:
            0.20084517 = score(doc=3612,freq=1.0), product of:
              0.41913816 = queryWeight, product of:
                3.7139792 = boost
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.018399397 = queryNorm
              0.47918606 = fieldWeight in 3612, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1335816 = idf(docFreq=254, maxDocs=43254)
                0.078125 = fieldNorm(doc=3612)
        0.2 = coord(5/25)
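
The score trees above all follow the same arithmetic, which can be retraced by hand. The following is a minimal sketch in Python, not the search engine's own code. It assumes the usual ClassicSimilarity definitions, which are consistent with the listed values: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is boost × idf × queryNorm, fieldWeight is tf × idf × fieldNorm, and the document score is the sum of the per-term products multiplied by the coord factor. The function and variable names are illustrative; the numbers are copied from the first entry (doc 897).

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 43254
QUERY_NORM = 0.018399397


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, query_terms):
    """Sum of term scores, scaled by coord = matched / query_terms."""
    total = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    return total * matched / query_terms


# (termFreq, docFreq, boost) for the six query terms found in doc 897.
DOC_897_TERMS = [
    (2.0, 147, 1.0108527),   # linguistics
    (1.0, 1776, 1.26921),    # language
    (1.0, 14, 1.3573835),    # linguists
    (1.0, 10, 1.4043344),    # sake
    (3.0, 509, 3.2942677),   # word
    (1.0, 254, 3.7139792),   # corpus
]

score_897 = document_score(DOC_897_TERMS, field_norm=0.078125,
                           matched=6, query_terms=25)
print(round(score_897, 4))  # 0.2155, in line with the listed 0.2154988
```

The sketch uses double precision, so it agrees with the listed single-precision values only to about six decimal places. Under the same assumptions, the other entries can be checked by swapping in their term tuples, fieldNorm, and coord values.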