Search (598 results, page 1 of 30)

  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.38
    
    Abstract
    Document representations for text classification are typically based on the classical Bag-Of-Words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for actual classification. Experimental evaluations on two well-known text corpora support our approach through consistent improvement of the results.
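The abstract's approach (Bag-of-Words features extended with concept features drawn from background knowledge, then classified by boosting weak learners) can be illustrated with a toy sketch. Everything below, including the concept map, the corpus, and the stump-based AdaBoost loop, is invented for illustration and is not the authors' code:

```python
import math
from collections import Counter

# Hypothetical background knowledge: word -> concept mapping.
CONCEPTS = {"soccer": "sport", "tennis": "sport", "loan": "finance", "bank": "finance"}

def featurize(doc):
    words = doc.lower().split()
    feats = Counter(words)                                      # classic Bag-of-Words terms
    feats.update(CONCEPTS[w] for w in words if w in CONCEPTS)   # added concept features
    return feats

def train_adaboost(docs, labels, rounds=5):
    """AdaBoost over weak decision stumps of the form 'feature present -> +/-1'."""
    vocab = set().union(*[set(f) for f in docs])
    w = [1.0 / len(docs)] * len(docs)
    ensemble = []
    for _ in range(rounds):
        best = None
        for feat in vocab:
            for polarity in (+1, -1):
                preds = [polarity if f[feat] >= 1 else -polarity for f in docs]
                err = sum(wi for wi, p, y in zip(w, preds, labels) if p != y)
                if best is None or err < best[0]:
                    best = (err, feat, polarity, preds)
        err, feat, pol, preds = best
        err = max(err, 1e-10)                     # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)   # weight of this weak learner
        ensemble.append((alpha, feat, pol))
        # Re-weight training documents toward the ones still misclassified.
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, labels, preds)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, doc):
    f = featurize(doc)
    score = sum(alpha * (pol if f[feat] >= 1 else -pol) for alpha, feat, pol in ensemble)
    return 1 if score >= 0 else -1

docs = ["soccer match today", "tennis court open", "bank loan rates", "loan interest bank"]
labels = [1, 1, -1, -1]   # +1 = sports, -1 = finance
model = train_adaboost([featurize(d) for d in docs], labels)
```

Because "soccer" and "tennis" share the concept feature "sport", a stump on that concept generalizes across documents that share no surface words.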
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
    Source
    Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK
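Each hit above carries a Lucene ClassicSimilarity relevance score (0.38 for this entry). As a minimal sketch, a single term's contribution to such a score is queryWeight x fieldWeight, with queryWeight = idf x queryNorm and fieldWeight = sqrt(tf) x idf x fieldNorm; the figures below are the idf and norm values of the kind reported in Lucene explain output for these results:

```python
import math

def classic_term_score(tf, idf, query_norm, field_norm):
    """One term's score under Lucene ClassicSimilarity (tf-idf with norms)."""
    query_weight = idf * query_norm                   # idf x queryNorm
    field_weight = math.sqrt(tf) * idf * field_norm   # sqrt(tf) x idf x fieldNorm
    return query_weight * field_weight

# Figures for a rare term occurring in 24 of 44218 documents:
idf = 1 + math.log(44218 / (24 + 1))   # ~8.478011
query_norm = 0.041294612
field_norm = 0.046875
score = classic_term_score(tf=2.0, idf=idf, query_norm=query_norm, field_norm=field_norm)
# score ~ 0.1968: one term's contribution before summation and coord scaling
```

The full document score then sums these per-term contributions and scales by a coord factor for the fraction of query terms matched.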
  2. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.35
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language- and domain-independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization.
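The thesis's own association measures are not reproduced here, but the flavor of association-based term extraction can be sketched with one classic measure, Dice's coefficient, over bigram candidates; the corpus below is invented for illustration:

```python
from collections import Counter

def bigram_dice(tokens):
    """Dice 'glue' for each adjacent word pair: 2*freq(a,b) / (freq(a)+freq(b))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {
        (a, b): 2.0 * n / (unigrams[a] + unigrams[b])
        for (a, b), n in bigrams.items()
    }

tokens = ("machine translation improves machine translation quality "
          "while machine learning drives translation").split()
glue = bigram_dice(tokens)
# ("machine", "translation") recurs relative to its parts, so it outscores
# incidental pairs; a LocalMaxs-style filter would then keep candidates whose
# glue is a local maximum among shorter and longer overlapping n-grams.
```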
    Content
    A thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
    Imprint
    Guelph, Ontario : University of Guelph
  3. Noever, D.; Ciolino, M.: The Turing deception (2022) 0.24
    
    Abstract
    This research revisits the classic Turing test and compares recent large language models such as ChatGPT for their abilities to reproduce human-level comprehension and compelling text generation. Two task challenges, summary and question answering, prompt ChatGPT to produce original content (98-99%) from a single text entry and sequential questions initially posed by Turing in 1950. We score the original and generated content against the OpenAI GPT-2 Output Detector from 2019, and establish multiple cases where the generated content proves original and undetectable (98%). The question of a machine fooling a human judge recedes in this work relative to the question of "how would one prove it?" The original contribution of the work presents a metric and simple grammatical set for understanding the writing mechanics of chatbots in evaluating their readability and statistical clarity, engagement, delivery, overall quality, and plagiarism risks. While Turing's original prose scores at least 14% below the machine-generated output, whether an algorithm displays hints of Turing's true initial thoughts (the "Lovelace 2.0" test) remains unanswerable.
    Source
    https://arxiv.org/abs/2212.06721
  4. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.12
    
    Abstract
    Relation between meaning, lexical productivity and frequency of use
    Date
    28. 2.1999 10:48:22
    Source
    Computers and the humanities. 31(1997) no.4, S.281-291
  5. Hutchins, W.J.; Somers, H.L.: An introduction to machine translation (1992) 0.09
    
    COMPASS
    Translation / Use of / Computers
    Subject
    Translation / Use of / Computers
  6. Beardon, C.; Lumsden, D.; Holmes, G.: Natural language and computational linguistics (1991) 0.08
    
    COMPASS
    Computers / Use of / Natural language
    Series
    Ellis Horwood series in computers and their applications
    Subject
    Computers / Use of / Natural language
  7. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.08
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language Systems (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
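A minimal sketch of the general technique behind such spelling aids, ranking dictionary words by edit distance to the query; the vocabulary and the misspelling below are invented, and this is not NLM's AZdict/ChemSpell code:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def suggest(word, vocabulary, max_dist=2):
    """Dictionary words within max_dist edits of the query, nearest first."""
    ranked = sorted(vocabulary, key=lambda v: edit_distance(word, v))
    return [v for v in ranked if edit_distance(word, v) <= max_dist]

vocab = ["toxicology", "toxin", "benzene", "dioxin", "arsenic"]
suggestions = suggest("toxicolgy", vocab)   # a TOXNET-style misspelling
# suggestions == ['toxicology']
```

A production system like the one described would add phonetic codes, morphological attributes, and chemical-name segmentation on top of plain edit distance.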
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  8. Computational linguistics for the new millennium : divergence or synergy? Proceedings of the International Symposium held at the Ruprecht-Karls Universität Heidelberg, 21-22 July 2000. Festschrift in honour of Peter Hellwig on the occasion of his 60th birthday (2002) 0.07
    
    Abstract
    The two seemingly conflicting tendencies, synergy and divergence, are both fundamental to the advancement of any science. Their interplay defines the demarcation line between application-oriented and theoretical research. The papers in this festschrift in honour of Peter Hellwig are geared to answering the questions that arise from this insight: where does the discipline of Computational Linguistics currently stand, what has been achieved so far, and what should be done next. Given the complexity of such questions, no simple answers can be expected. However, each of the practitioners and researchers contributes, from his or her own perspective, a piece of insight into the overall picture of today's and tomorrow's computational linguistics.
    Content
    Contents: Manfred Klenner / Henriette Visser: Introduction - Khurshid Ahmad: Writing Linguistics: When I use a word it means what I choose it to mean - Jürgen Handke: 2000 and Beyond: The Potential of New Technologies in Linguistics - Jurij Apresjan / Igor Boguslavsky / Leonid Iomdin / Leonid Tsinman: Lexical Functions in NU: Possible Uses - Hubert Lehmann: Practical Machine Translation and Linguistic Theory - Karin Haenelt: A Contextbased Approach towards Content Processing of Electronic Documents - Petr Sgall / Eva Hajicová: Are Linguistic Frameworks Comparable? - Wolfgang Menzel: Theory and Applications in Computational Linguistics - Is there Common Ground? - Robert Porzel / Michael Strube: Towards Context-adaptive Natural Language Processing Systems - Nicoletta Calzolari: Language Resources in a Multilingual Setting: The European Perspective - Piek Vossen: Computational Linguistics for Theory and Practice.
  9. Metzler, D.P.; Haas, S.W.: ¬The constituent object parser : syntactic structure matching for information retrieval (1989) 0.06
    
    Abstract
    The constituent object parser is designed to improve the precision and recall performance of information retrieval by providing more powerful matching procedures. Describes the dependency tree representations and the relationship between the intended use of the parser and its design.
    Source
    ACM transactions on information systems. 7(1989) no.3, S.292-316
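    The syntactic structure matching described in the abstract can be sketched as a recursive tree-containment test: a query dependency tree matches a document if its root matches some document node and each query child matches beneath the corresponding node. This is an illustrative toy, not the paper's actual parser; the `Node` class and `matches` function are hypothetical.

    ```python
    # Illustrative sketch of dependency-tree matching for retrieval.
    # Data structures and names are invented for this example.

    class Node:
        def __init__(self, word, children=None):
            self.word = word
            self.children = children or []

    def matches(query, doc):
        """True if the query tree occurs somewhere in the doc tree:
        the query root matches a doc node and every query child
        matches under one of that node's children."""
        def subsumes(q, d):
            if q.word != d.word:
                return False
            return all(any(subsumes(qc, dc) for dc in d.children)
                       for qc in q.children)
        def anywhere(q, d):
            return subsumes(q, d) or any(anywhere(q, c) for c in d.children)
        return anywhere(query, doc)
    ```

    A real constituent object parser would of course compare labeled dependency relations and allow partial matches for ranking, not just exact word equality.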
  10. Pereira, C.N.; Grosz, B.J.: Natural language processing (1994) 0.06
    COMPASS
    Computers / Use of / Natural language
    Subject
    Computers / Use of / Natural language
  11. Whitelock, P.; Kilby, K.: Linguistic and computational techniques in machine translation system design : 2nd ed (1995) 0.06
    COMPASS
    Linguistics / Use of / Computers
    Subject
    Linguistics / Use of / Computers
  12. Hess, M.: ¬An incrementally extensible document retrieval system based on linguistic and logical principles (1992) 0.06
    Abstract
    Most natural-language-based document retrieval systems use the syntax structures of constituent phrases of documents as index terms. Many of these systems also attempt to reduce the syntactic variability of natural language by some normalisation procedure applied to these syntax structures. However, the retrieval performance of such systems remains fairly disappointing. Some systems therefore use a meaning representation language to index and retrieve documents. In this paper, a system is presented that uses Horn Clause Logic as its meaning representation language, employs advanced techniques from Natural Language Processing to achieve incremental extensibility, and uses methods from Logic Programming to achieve robustness in the face of insufficient data.
    Source
    SIGIR '92: Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
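    The core idea of indexing by a logical meaning representation rather than by phrase structures can be sketched very roughly: documents contribute ground facts, and a query is a pattern with variables that is matched against them. This toy only unifies flat tuples; the described system uses a full Horn-clause inference engine, and all names here are invented for illustration.

    ```python
    # Toy sketch of retrieval over a fact-based meaning representation.
    # The fact base and predicates are hypothetical examples.

    facts = [
        ("doc1", ("treats", "aspirin", "headache")),
        ("doc2", ("causes", "virus", "infection")),
    ]

    def retrieve(query):
        """Return ids of documents whose fact matches the query tuple;
        strings starting with '?' act as variables that match anything."""
        def unify(pat, fact):
            return all(p.startswith("?") or p == f for p, f in zip(pat, fact))
        return [doc for doc, fact in facts
                if len(fact) == len(query) and unify(query, fact)]
    ```

    With this representation, a query like `("treats", "?x", "headache")` retrieves documents by meaning rather than by surface phrase overlap.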
  13. Wacholder, N.; Byrd, R.J.: Retrieving information from full text using linguistic knowledge (1994) 0.06
    Abstract
    Examines how techniques in the field of natural language processing can be applied to the analysis of text in information retrieval. State-of-the-art text searching programs cannot distinguish, for example, between occurrences of AIDS the illness and aids as tools, or between library school and school, nor can they equate terms such as online and on-line, which are variants of the same form. To make these distinctions, systems must incorporate knowledge about the meaning of words in context. Research in natural language processing has concentrated on the automatic 'understanding' of language: how to analyze the grammatical structure and meaning of text. Although many aspects of this research remain experimental, describes how these techniques can be used to recognize spelling variants, names, acronyms, and abbreviations.
    Source
    Proceedings of the 15th National Online Meeting 1994, New York, 10-12 May 1994. Ed. by M.E. Williams
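    The hyphenation variants mentioned in the abstract (online vs. on-line) can be conflated with a very small normalization rule, sketched below. This handles only hyphen and space variants; the case-sensitive distinctions the abstract also raises (AIDS vs. aids) need contextual knowledge and are deliberately out of scope here. The function name is illustrative.

    ```python
    import re

    def normalize(term):
        """Collapse hyphen/space spelling variants:
        'on-line', 'on line' and 'online' all map to 'online'."""
        return re.sub(r"[-\s]+", "", term.lower())
    ```

    Note that the lowercasing step would wrongly merge AIDS and aids, which is exactly the kind of distinction the abstract argues requires knowledge of word meaning in context.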
  14. Zaitseva, E.M.: Developing linguistic tools of thematic search in library information systems (2023) 0.06
    Abstract
    Within the R&D program "Information support of research by scientists and specialists on the basis of RNPLS&T Open Archive - the system of scientific knowledge aggregation", the RNPLS&T analyzes the use of linguistic tools of thematic search in modern library information systems and the prospects for their development. The author defines the key common characteristics of the e-catalogs of the largest Russian libraries revealed at the first stage of the analysis. Based on these common characteristics and a detailed comparative analysis, the author outlines and substantiates the vectors for enhancing the search interfaces of e-catalogs. The focus is on the linguistic tools of thematic search in library information systems; the key vectors suggested are: use of thematic search at different search levels with clear-cut level differentiation; use of combined functionality within the thematic search system; implementation of classification search in all e-catalogs; hierarchical representation of classifications; and use of systems for matching classification information retrieval languages and, in the longer term, classification and verbal information retrieval languages, as well as various verbal information retrieval languages. The author formulates practical recommendations to improve thematic search in library information systems.
  15. Bookstein, A.; Kulyukin, V.; Raita, T.; Nicholson, J.: Adapting measures of clumping strength to assess term-term similarity (2003) 0.05
    Abstract
    Automated information retrieval relies heavily on statistical regularities that emerge as terms are deposited to produce text. This paper examines statistical patterns expected of a pair of terms that are semantically related to each other. Guided by a conceptualization of the text generation process, we derive measures of how tightly two terms are semantically associated. Our main objective is to probe whether such measures yield reasonable results. Specifically, we examine how the tendency of a content-bearing term to clump, as quantified by previously developed measures of term clumping, is influenced by the presence of other terms. This approach allows us to present a toolkit from which a range of measures can be constructed. As an illustration, one of several suggested measures is evaluated on a large text corpus built from an on-line encyclopedia.
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.7, S.611-620
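    The notion of "clumping" can be illustrated with a simple statistic that is not the paper's measure but conveys the intuition: compare the variability of gaps between a term's occurrence positions with the mean gap. Terms whose occurrences bunch together produce a few very large gaps between bursts of tiny gaps, so the normalized gap variance is high. The function below is a hypothetical sketch under that assumption.

    ```python
    # Illustrative clumping statistic (not the measure from the paper):
    # normalized variance of inter-occurrence gaps. Bursty ("clumped")
    # terms score higher than evenly spread terms.

    def gap_clumping(positions):
        gaps = [b - a for a, b in zip(positions, positions[1:])]
        if len(gaps) < 2:
            return 0.0          # too few occurrences to measure
        mean = sum(gaps) / len(gaps)
        var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
        return var / (mean ** 2)
    ```

    For example, positions like [1, 2, 3, 100, 101, 102] (two tight bursts) score well above evenly spaced positions like [1, 20, 40, 60, 80, 100].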
  16. Hutchins, W.J.; Somers, H.L.: ¬An introduction to machine translation (1992) 0.05
    Abstract
    The translation of foreign language texts by computers was one of the first tasks that the pioneers of computing and artificial intelligence set themselves. Machine translation is again becoming an important field of research and development as the need for translations of technical and commercial documentation grows well beyond the capacity of the translation profession. This is the first textbook of machine translation, providing a full course on both the general characteristics of machine translation systems and the computational linguistic foundations of the field. The book assumes no previous knowledge of machine translation and provides the basic background in linguistics, computational linguistics, artificial intelligence, natural language processing and information science.
    COMPASS
    Translation / Use of / Computers
    Subject
    Translation / Use of / Computers
  17. Chowdhury, G.G.: Natural language processing (2002) 0.05
    Abstract
    Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language, so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform desired tasks. The foundations of NLP lie in a number of disciplines, namely computer and information sciences, linguistics, mathematics, electrical and electronic engineering, artificial intelligence and robotics, and psychology. Applications of NLP include a number of fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems. One important application area that is relatively new and has not been covered in previous ARIST chapters on NLP relates to the proliferation of the World Wide Web and digital libraries.
    Source
    Annual review of information science and technology. 37(2003), S.51-90
  18. Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.05
    Abstract
    Chronicles the early history of applying electronic computers to the task of translating natural languages, from the 1st suggestions by Warren Weaver in Mar 1947 to the 1st demonstration of a working, if limited, program in Jan 1954
    Date
    31. 7.1996 9:22:19
  19. Addison, E.R.; Wilson, H.D.; Feder, J.: ¬The impact of plain English searching on end users (1993) 0.05
    Abstract
    Commercial software products are available with plain English searching capabilities as engines for online and CD-ROM information services, and for internal text information management. With plain English interfaces, end users do not need to master the keyword-and-connector approach of the Boolean search query language. Describes plain English searching and its impact on the process of full text retrieval. Explores the issues of ease of use and reliability, and the implications for the total research process.
    Source
    Proceedings of the 14th National Online Meeting 1993, New York, 4-6 May 1993. Ed.: M.E. Williams
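    A minimal sketch of what such a plain-English front end might do behind the scenes: drop function words and AND together the remaining content terms, so the user never writes Boolean operators. Real products add stemming, phrase detection and relevance ranking; the stopword list and function name here are illustrative assumptions.

    ```python
    # Hypothetical sketch: turn a plain-English query into an implicit
    # Boolean conjunction of its content terms.

    STOPWORDS = {"the", "of", "a", "an", "on", "in", "for", "and", "what", "is"}

    def to_boolean(plain_query):
        terms = [w for w in plain_query.lower().split() if w not in STOPWORDS]
        return " AND ".join(terms)
    ```

    For instance, "the impact of plain English searching" becomes the query `impact AND plain AND english AND searching`.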
  20. Colace, F.; Santo, M. De; Greco, L.; Napoletano, P.: Weighted word pairs for query expansion (2015) 0.05
    
    Abstract
This paper proposes a novel query expansion method to improve the accuracy of text retrieval systems. Our method uses minimal relevance feedback to expand the initial query with a structured representation composed of weighted word pairs. Such a structure is obtained from the relevance feedback through a word-pair selection method based on the Probabilistic Topic Model. We compared our method with baseline query expansion schemes and methods; evaluations performed on TREC-8 demonstrated the effectiveness of the proposed method with respect to the baseline.
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
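The idea of expanding a query with weighted word pairs drawn from relevance feedback can be sketched in a few lines. The sketch below uses raw pair co-occurrence counts across feedback documents as the pair weight, which is only a crude stand-in for the paper's Probabilistic Topic Model selection; the function names and `top_k` parameter are assumptions for illustration.

```python
# Sketch: expand a query with the highest-weighted word pairs found in
# relevance-feedback documents. Pair weight = co-occurrence count, a
# simplification of the paper's topic-model-based selection.
from collections import Counter
from itertools import combinations

def weighted_pairs(feedback_docs, top_k=3):
    counts = Counter()
    for doc in feedback_docs:
        words = sorted(set(doc.lower().split()))
        counts.update(combinations(words, 2))  # all unordered pairs per doc
    return counts.most_common(top_k)

def expand_query(query, feedback_docs, top_k=3):
    extra = [w for pair, _ in weighted_pairs(feedback_docs, top_k) for w in pair]
    seen, expanded = set(query.lower().split()), query.split()
    for w in extra:
        if w not in seen:  # append only terms not already in the query
            expanded.append(w)
            seen.add(w)
    return " ".join(expanded)

docs = ["query expansion improves retrieval",
        "relevance feedback improves query expansion"]
print(expand_query("query expansion", docs))
```

Keeping the pair structure (rather than flattening to single terms, as done here for brevity) is what lets the method weight co-occurring terms jointly at retrieval time.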

Languages

Types

  • a 499
  • el 58
  • m 51
  • s 25
  • x 14
  • p 7
  • b 2
  • d 2
  • n 1
  • r 1

Subjects

Classifications