Search (439 results, page 1 of 22)

  • × theme_ss:"Computerlinguistik"
  1. Natural language processing (1996) 0.39
    0.38776773 = product of:
      0.5170236 = sum of:
        0.02946245 = weight(_text_:for in 6824) [ClassicSimilarity], result of:
          0.02946245 = score(doc=6824,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.33190575 = fieldWeight in 6824, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.125 = fieldNorm(doc=6824)
        0.25572333 = weight(_text_:computing in 6824) [ClassicSimilarity], result of:
          0.25572333 = score(doc=6824,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.9778349 = fieldWeight in 6824, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.125 = fieldNorm(doc=6824)
        0.23183785 = product of:
          0.4636757 = sum of:
            0.4636757 = weight(_text_:machinery in 6824) [ClassicSimilarity], result of:
              0.4636757 = score(doc=6824,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                1.3167021 = fieldWeight in 6824, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.125 = fieldNorm(doc=6824)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Source
    Communications of the Association for Computing Machinery. 39(1996) no.1, S.60-110
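    The indented blocks beneath each hit are Lucene explain trees for the classic TF-IDF similarity: for every matching query term, score = queryWeight * fieldWeight, with queryWeight = idf * queryNorm, fieldWeight = tf(freq) * idf * fieldNorm and tf(freq) = sqrt(freq); the term scores are summed and scaled by the coord factors. As a minimal sketch, the Python below recomputes the 0.39 score of result no.1 from the statistics printed in its tree. That the query contains four clauses (hence coord(3/4)) and that "machinery" sits inside a nested two-clause group (hence coord(1/2)) is read off the tree itself, not stated elsewhere.

      from math import sqrt

      QUERY_NORM = 0.047278564   # queryNorm, shared by all terms
      FIELD_NORM = 0.125         # fieldNorm(doc=6824)

      def term_score(freq, idf):
          # score = queryWeight * fieldWeight = (idf * queryNorm) * (sqrt(freq) * idf * fieldNorm)
          return (idf * QUERY_NORM) * (sqrt(freq) * idf * FIELD_NORM)

      s_for       = term_score(2.0, 1.8775425)        # 0.02946245
      s_computing = term_score(2.0, 5.5314693)        # 0.25572333
      s_machinery = term_score(2.0, 7.448392) * 0.5   # nested coord(1/2) -> 0.23183785

      # coord(3/4): three of the (assumed) four query clauses match doc 6824
      print(round((s_for + s_computing + s_machinery) * 0.75, 8))   # approx. 0.38776773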
  2. King, M.: Evaluating natural language processing systems (1996) 0.39
    0.38776773 = product of:
      0.5170236 = sum of:
        0.02946245 = weight(_text_:for in 6826) [ClassicSimilarity], result of:
          0.02946245 = score(doc=6826,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.33190575 = fieldWeight in 6826, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.125 = fieldNorm(doc=6826)
        0.25572333 = weight(_text_:computing in 6826) [ClassicSimilarity], result of:
          0.25572333 = score(doc=6826,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.9778349 = fieldWeight in 6826, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.125 = fieldNorm(doc=6826)
        0.23183785 = product of:
          0.4636757 = sum of:
            0.4636757 = weight(_text_:machinery in 6826) [ClassicSimilarity], result of:
              0.4636757 = score(doc=6826,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                1.3167021 = fieldWeight in 6826, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.125 = fieldNorm(doc=6826)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Source
    Communications of the Association for Computing Machinery. 39(1996) no.1, S.xx-xx
  3. Cowie, J.; Lehner, W.: Information extraction (1996) 0.39
    0.38776773 = product of:
      0.5170236 = sum of:
        0.02946245 = weight(_text_:for in 6827) [ClassicSimilarity], result of:
          0.02946245 = score(doc=6827,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.33190575 = fieldWeight in 6827, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.125 = fieldNorm(doc=6827)
        0.25572333 = weight(_text_:computing in 6827) [ClassicSimilarity], result of:
          0.25572333 = score(doc=6827,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.9778349 = fieldWeight in 6827, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.125 = fieldNorm(doc=6827)
        0.23183785 = product of:
          0.4636757 = sum of:
            0.4636757 = weight(_text_:machinery in 6827) [ClassicSimilarity], result of:
              0.4636757 = score(doc=6827,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                1.3167021 = fieldWeight in 6827, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.125 = fieldNorm(doc=6827)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Source
    Communications of the Association for Computing Machinery. 39(1996) no.1, S.xx-xx
  4. Lewis, D.D.; Sparck Jones, K.: Natural language processing for information retrieval (1996) 0.35
    0.34730545 = product of:
      0.46307394 = sum of:
        0.03645792 = weight(_text_:for in 4144) [ClassicSimilarity], result of:
          0.03645792 = score(doc=4144,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.41071242 = fieldWeight in 4144, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.109375 = fieldNorm(doc=4144)
        0.22375791 = weight(_text_:computing in 4144) [ClassicSimilarity], result of:
          0.22375791 = score(doc=4144,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.85560554 = fieldWeight in 4144, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.109375 = fieldNorm(doc=4144)
        0.20285812 = product of:
          0.40571624 = sum of:
            0.40571624 = weight(_text_:machinery in 4144) [ClassicSimilarity], result of:
              0.40571624 = score(doc=4144,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                1.1521144 = fieldWeight in 4144, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4144)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Source
    Communications of the Association for Computing Machinery. 39(1996) no.1, S.92-101
  5. Guthrie, L.; Pustejovsky, J.; Wilks, Y.; Slator, B.M.: ¬The role of lexicons in natural language processing (1996) 0.34
    0.33929676 = product of:
      0.45239568 = sum of:
        0.025779642 = weight(_text_:for in 6825) [ClassicSimilarity], result of:
          0.025779642 = score(doc=6825,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.29041752 = fieldWeight in 6825, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.109375 = fieldNorm(doc=6825)
        0.22375791 = weight(_text_:computing in 6825) [ClassicSimilarity], result of:
          0.22375791 = score(doc=6825,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.85560554 = fieldWeight in 6825, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.109375 = fieldNorm(doc=6825)
        0.20285812 = product of:
          0.40571624 = sum of:
            0.40571624 = weight(_text_:machinery in 6825) [ClassicSimilarity], result of:
              0.40571624 = score(doc=6825,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                1.1521144 = fieldWeight in 6825, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6825)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Source
    Communications of the Association for Computing Machinery. 39(1996) no.1, S.xx-xx
  6. Wiebe, J.; Hirst, G.; Horton, D.: Language use in context (1996) 0.34
    0.33929676 = product of:
      0.45239568 = sum of:
        0.025779642 = weight(_text_:for in 6828) [ClassicSimilarity], result of:
          0.025779642 = score(doc=6828,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.29041752 = fieldWeight in 6828, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.109375 = fieldNorm(doc=6828)
        0.22375791 = weight(_text_:computing in 6828) [ClassicSimilarity], result of:
          0.22375791 = score(doc=6828,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.85560554 = fieldWeight in 6828, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.109375 = fieldNorm(doc=6828)
        0.20285812 = product of:
          0.40571624 = sum of:
            0.40571624 = weight(_text_:machinery in 6828) [ClassicSimilarity], result of:
              0.40571624 = score(doc=6828,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                1.1521144 = fieldWeight in 6828, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6828)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Source
    Communications of the Association for Computing Machinery. 39(1996) no.1, S.xx-xx
  7. Czejdo, B.D.; Tucci, R.P.: ¬A dataflow graphical language for database applications (1994) 0.10
    0.095860556 = product of:
      0.19172111 = sum of:
        0.03189404 = weight(_text_:for in 559) [ClassicSimilarity], result of:
          0.03189404 = score(doc=559,freq=6.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.35929856 = fieldWeight in 559, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.078125 = fieldNorm(doc=559)
        0.15982707 = weight(_text_:computing in 559) [ClassicSimilarity], result of:
          0.15982707 = score(doc=559,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.6111468 = fieldWeight in 559, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.078125 = fieldNorm(doc=559)
      0.5 = coord(2/4)
    
    Abstract
    Discusses a graphical language for information retrieval and processing. A lot of recent activity has occurred in the area of improving access to database systems. However, current results are restricted to simple interfacing of database systems. Proposes a graphical language for specifying complex applications.
    Source
    CIT - Journal of computing and information technology. 2(1994) no.1, S.39-50
  8. Yeap, W.K.: Computing rich semantic models of text in legal domains (1998) 0.09
    0.08555527 = product of:
      0.17111054 = sum of:
        0.012889821 = weight(_text_:for in 2675) [ClassicSimilarity], result of:
          0.012889821 = score(doc=2675,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.14520876 = fieldWeight in 2675, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2675)
        0.15822072 = weight(_text_:computing in 2675) [ClassicSimilarity], result of:
          0.15822072 = score(doc=2675,freq=4.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.60500443 = fieldWeight in 2675, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2675)
      0.5 = coord(2/4)
    
    Abstract
    The richness provided in a deep semantic model of text is appealing, and yet few such models have been developed. Considers the problems with existing practical natural language processing (NLP) systems and the difficulties in developing such a model. Argues that a possible solution must focus on the reasoning process using knowledge of words rather than the use of other mechanisms, especially those that speed up the pre-processing stage. Suggests also that computing representations of text that are transcripts of judges' oral reports on Family Law cases is a challenging text area for these techniques.
  9. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.08
    0.08244937 = product of:
      0.10993249 = sum of:
        0.07509089 = product of:
          0.22527267 = sum of:
            0.22527267 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.22527267 = score(doc=562,freq=2.0), product of:
                0.40082818 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.047278564 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.015624823 = weight(_text_:for in 562) [ClassicSimilarity], result of:
          0.015624823 = score(doc=562,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.17601961 = fieldWeight in 562, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.019216778 = product of:
          0.038433556 = sum of:
            0.038433556 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.038433556 = score(doc=562,freq=2.0), product of:
                0.16556148 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.047278564 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Document representations for text classification are typically based on the classical Bag-Of-Words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for actual classification. Experimental evaluations on two well known text corpora support our approach through consistent improvement of the results.
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  10. Paice, C.D.: Method for evaluation of stemming algorithms based on error counting (1996) 0.08
    0.07866205 = product of:
      0.1573241 = sum of:
        0.02946245 = weight(_text_:for in 5799) [ClassicSimilarity], result of:
          0.02946245 = score(doc=5799,freq=8.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.33190575 = fieldWeight in 5799, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0625 = fieldNorm(doc=5799)
        0.12786166 = weight(_text_:computing in 5799) [ClassicSimilarity], result of:
          0.12786166 = score(doc=5799,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.48891744 = fieldWeight in 5799, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0625 = fieldNorm(doc=5799)
      0.5 = coord(2/4)
    
    Abstract
    Assesses the effectiveness of stemming algorithms by counting the number of identifiable errors during the stemming of words from various text samples. This entails manual groupings of the words in each sample using software developed for this purpose, stemming the words, and computing indices which represent the rate of understemming and overstemming. Presents the results for 3 stemmers (Lovins, Porter, and Paice/Husk), in each case using 3 text samples.
    Source
    Journal of the American Society for Information Science. 47(1996) no.8, S.632-649
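    The error counting described in the abstract above can be illustrated with a small sketch: given manual groupings of words that ought to share a stem, count word pairs in the same group whose stems differ (understemming) and pairs from different groups whose stems coincide (overstemming). This is a hedged reading of the abstract rather than Paice's exact index definitions; the sample groups and the truncating toy stemmer below are invented for illustration.

      from itertools import combinations

      def stemming_errors(groups, stem):
          # groups: lists of words judged to belong to the same concept
          # stem:   callable mapping a word to its stem
          understem = sum(1 for g in groups
                          for a, b in combinations(g, 2) if stem(a) != stem(b))
          overstem = sum(1 for g1, g2 in combinations(groups, 2)
                         for a in g1 for b in g2 if stem(a) == stem(b))
          return understem, overstem

      # Hypothetical concept groups and a crude 5-character truncation "stemmer"
      groups = [["general", "generally"], ["generate", "generated"], ["run", "ran"]]
      print(stemming_errors(groups, lambda w: w[:5]))
      # -> (1, 4): "run"/"ran" fail to merge; every cross-group "gener*" pair merges wrongly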
  11. Ghazzawi, N.; Robichaud, B.; Drouin, P.; Sadat, F.: Automatic extraction of specialized verbal units (2018) 0.08
    0.07562129 = product of:
      0.15124258 = sum of:
        0.015624823 = weight(_text_:for in 4094) [ClassicSimilarity], result of:
          0.015624823 = score(doc=4094,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.17601961 = fieldWeight in 4094, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=4094)
        0.13561776 = weight(_text_:computing in 4094) [ClassicSimilarity], result of:
          0.13561776 = score(doc=4094,freq=4.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.51857525 = fieldWeight in 4094, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.046875 = fieldNorm(doc=4094)
      0.5 = coord(2/4)
    
    Abstract
    This paper presents a methodology for the automatic extraction of specialized Arabic, English and French verbs of the field of computing. Since nominal terms are predominant in terminology, our interest is to explore to what extent verbs can also be part of a terminological analysis. Hence, our objective is to verify how an existing extraction tool will perform when it comes to specialized verbs in a given specialized domain. Furthermore, we want to investigate any particularities that a language can represent regarding verbal terms from the automatic extraction perspective. Our choice to operate on three different languages reflects our desire to see whether the chosen tool can perform better on one language compared to the others. Moreover, given that Arabic is a morphologically rich and complex language, we consider investigating the results yielded by the extraction tool. The extractor used for our experiment is TermoStat (Drouin 2003). So far, our results show that the extraction of verbs of computing represents certain differences in terms of quality and particularities of these units in this specialized domain between the languages under question.
  12. Litkowski, K.C.: Category development based on semantic principles (1997) 0.07
    0.065053955 = product of:
      0.13010791 = sum of:
        0.01822896 = weight(_text_:for in 1824) [ClassicSimilarity], result of:
          0.01822896 = score(doc=1824,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20535621 = fieldWeight in 1824, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1824)
        0.111878954 = weight(_text_:computing in 1824) [ClassicSimilarity], result of:
          0.111878954 = score(doc=1824,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.42780277 = fieldWeight in 1824, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1824)
      0.5 = coord(2/4)
    
    Abstract
    Describes the beginnings of computerized information retrieval and text analysis, particularly from the perspective of the use of thesauri and cataloguing systems. Describes formalisations of linguistic principles in the development of formal grammars and semantics. Presents the principles for category development, based on research in linguistic formalism continuing with ever richer grammars and semantic formalism. Describes the progress of these formalisms in the examination of the categories used in the Minnesota Contextual Content Analysis approach. Describes current research toward an integration of semantic principles into content analysis abstraction procedures for characterising the category of any text.
    Footnote
    Contribution to a symposium based on presentations made at a panel of the 7th annual Conference of the Social Science Computing Association entitled Possibilities in Computer Content Analysis of Text, Minneapolis, Minnesota, USA, 1996
  13. Wong, W.; Liu, W.; Bennamoun, M.: Ontology learning from text : a look back and into the future (2010) 0.07
    0.065053955 = product of:
      0.13010791 = sum of:
        0.01822896 = weight(_text_:for in 4733) [ClassicSimilarity], result of:
          0.01822896 = score(doc=4733,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20535621 = fieldWeight in 4733, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4733)
        0.111878954 = weight(_text_:computing in 4733) [ClassicSimilarity], result of:
          0.111878954 = score(doc=4733,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.42780277 = fieldWeight in 4733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4733)
      0.5 = coord(2/4)
    
    Abstract
    Ontologies are often viewed as the answer to the need for inter-operable semantics in modern information systems. The explosion of textual information on the "Read/Write" Web, coupled with the increasing demand for ontologies to power the Semantic Web, has made (semi-)automatic ontology learning from text a very promising research area. This, together with the advanced state of related areas such as natural language processing, has fuelled research into ontology learning over the past decade. This survey looks at how far we have come since the turn of the millennium, and discusses the remaining challenges that will define the research directions in this area in the near future.
    Content
    Pre-publication version for: ACM Computing Surveys, Vol. X, No. X, Article X, Publication date: X 2011.
  14. Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.06
    0.060029313 = product of:
      0.080039084 = sum of:
        0.012889821 = weight(_text_:for in 1616) [ClassicSimilarity], result of:
          0.012889821 = score(doc=1616,freq=8.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.14520876 = fieldWeight in 1616, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.055939477 = weight(_text_:computing in 1616) [ClassicSimilarity], result of:
          0.055939477 = score(doc=1616,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.21390139 = fieldWeight in 1616, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.011209788 = product of:
          0.022419576 = sum of:
            0.022419576 = weight(_text_:22 in 1616) [ClassicSimilarity], result of:
              0.022419576 = score(doc=1616,freq=2.0), product of:
                0.16556148 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.047278564 = queryNorm
                0.1354154 = fieldWeight in 1616, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1616)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    The information available in languages other than English on the World Wide Web is increasing significantly. According to a report from Computer Economics in 1999, 54% of Internet users are English speakers ("English Will Dominate Web for Only Three More Years," Computer Economics, July 9, 1999, http://www.computereconomics.com/new4/pr/pr990610.html). However, it is predicted that there will be only a 60% increase in Internet users among English speakers versus a 150% growth among non-English speakers over the next five years. By 2005, 57% of Internet users will be non-English speakers. A report by CNN.com in 2000 showed that the number of Internet users in China had increased from 8.9 million to 16.9 million from January to June in 2000 ("Report: China Internet users double to 17 million," CNN.com, July 2000, http://cnn.org/2000/TECH/computing/07/27/china.internet.reut/index.html). According to Nielsen/NetRatings, there was a dramatic leap from 22.5 million to 56.6 million Internet users from 2001 to 2002. China had become the second largest global at-home Internet population in 2002 (the US's Internet population was 166 million) (Robyn Greenspan, "China Pulls Ahead of Japan," Internet.com, April 22, 2002, http://cyberatlas.internet.com/big-picture/geographics/article/0,,5911_1013841,00.html). All of the evidence reveals the importance of cross-lingual research to satisfy future needs. Digital library research has focused on structural and semantic interoperability in the past. Searching and retrieving objects across variations in protocols, formats and disciplines have been widely explored (Schatz, B., & Chen, H. (1999). Digital libraries: technological advances and social impacts. IEEE Computer, Special Issue on Digital Libraries, February, 32(2), 45-50.; Chen, H., Yen, J., & Yang, C.C. (1999). International activities: development of Asian digital libraries. IEEE Computer, Special Issue on Digital Libraries, 32(2), 48-49.). However, research in crossing language boundaries, especially across European languages and Oriental languages, is still in its initial stage. In this proposal, we put our focus on cross-lingual semantic interoperability by developing automatic generation of a cross-lingual thesaurus based on an English/Chinese parallel corpus. When searchers encounter retrieval problems, professional librarians usually consult the thesaurus to identify other relevant vocabularies. In the problem of searching across language boundaries, a cross-lingual thesaurus, which is generated by co-occurrence analysis and a Hopfield network, can be used to generate additional semantically relevant terms that cannot be obtained from a dictionary. In particular, the automatically generated cross-lingual thesaurus is able to capture unknown words that do not exist in a dictionary, such as names of persons, organizations, and events. Due to Hong Kong's unique historical background, both English and Chinese are used as official languages in all legal documents. Therefore, English/Chinese cross-lingual information retrieval is critical for applications in courts and the government. In this paper, we develop an automatic thesaurus using the Hopfield network, based on a parallel corpus collected from the Web site of the Department of Justice of the Hong Kong Special Administrative Region (HKSAR) Government. Experiments are conducted to measure the precision and recall of the automatically generated English/Chinese thesaurus. The result shows that such a thesaurus is a promising tool for retrieving relevant terms, especially in a language that is not the same as that of the input term. The direct translation of the input term can also be retrieved in most cases.
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.7, S.671-682
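    A minimal sketch of the co-occurrence stage of the approach above (the Hopfield-network activation stage is omitted): from a corpus of aligned English/Chinese segments, count how often an English term and a Chinese term appear in the same aligned unit and suggest the strongest cross-language associates of an input term. The corpus, segmentation and terms below are placeholders, not data from the paper.

      from collections import Counter
      from itertools import product

      # Hypothetical aligned segments: (English terms, Chinese terms) per aligned unit
      parallel_corpus = [
          (["court", "judge"], ["法院", "法官"]),
          (["court", "appeal"], ["法院", "上訴"]),
          (["contract", "law"], ["合約", "法律"]),
      ]

      cooc = Counter()
      for en_terms, zh_terms in parallel_corpus:
          for pair in product(set(en_terms), set(zh_terms)):
              cooc[pair] += 1

      def suggest(term, k=3):
          # rank Chinese terms by raw co-occurrence with the English input term
          return Counter({zh: n for (en, zh), n in cooc.items() if en == term}).most_common(k)

      print(suggest("court"))   # e.g. [('法院', 2), ('法官', 1), ('上訴', 1)]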
  15. Proszeky, G.: Language technology tools in the translator's practice (1999) 0.06
    0.055939477 = product of:
      0.22375791 = sum of:
        0.22375791 = weight(_text_:computing in 6873) [ClassicSimilarity], result of:
          0.22375791 = score(doc=6873,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.85560554 = fieldWeight in 6873, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.109375 = fieldNorm(doc=6873)
      0.25 = coord(1/4)
    
    Source
    Journal of computing and information technology. 7(1999) no.3, S.221-227
  16. Spitkovsky, V.I.; Chang, A.X.: ¬A cross-lingual dictionary for English Wikipedia concepts (2012) 0.05
    0.054518014 = product of:
      0.10903603 = sum of:
        0.022096837 = weight(_text_:for in 336) [ClassicSimilarity], result of:
          0.022096837 = score(doc=336,freq=8.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.24892932 = fieldWeight in 336, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=336)
        0.08693919 = product of:
          0.17387839 = sum of:
            0.17387839 = weight(_text_:machinery in 336) [ClassicSimilarity], result of:
              0.17387839 = score(doc=336,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                0.4937633 = fieldWeight in 336, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.046875 = fieldNorm(doc=336)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    We present a resource for automatically associating strings of text with English Wikipedia concepts. Our machinery is bi-directional, in the sense that it uses the same fundamental probabilistic methods to map strings to empirical distributions over Wikipedia articles as it does to map article URLs to distributions over short, language-independent strings of natural language text. For maximal interoperability, we release our resource as a set of flat line-based text files, lexicographically sorted and encoded with UTF-8. These files capture joint probability distributions underlying concepts (we use the terms article, concept and Wikipedia URL interchangeably) and associated snippets of text, as well as other features that can come in handy when working with Wikipedia articles and related information.
    Content
    See also: Spitkovsky, V., P. Norvig: From words to concepts and back: dictionaries for linking text, entities and ideas. In: http://googleresearch.blogspot.de/2012/05/from-words-to-concepts-and-back.html. For the data pool see: nlp.stanford.edu/pubs/corsswikis-data.tar.bz2.
  17. Peis, E.; Herrera-Viedma, E.; Herrera, J.C.: On the evaluation of XML documents using Fuzzy linguistic techniques (2003) 0.05
    0.053472333 = product of:
      0.106944665 = sum of:
        0.0110484185 = weight(_text_:for in 2778) [ClassicSimilarity], result of:
          0.0110484185 = score(doc=2778,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.12446466 = fieldWeight in 2778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=2778)
        0.095896244 = weight(_text_:computing in 2778) [ClassicSimilarity], result of:
          0.095896244 = score(doc=2778,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.36668807 = fieldWeight in 2778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.046875 = fieldNorm(doc=2778)
      0.5 = coord(2/4)
    
    Abstract
    Recommender systems evaluate and filter the great amount of information available on the Web to assist people in their search processes. A fuzzy evaluation method of XML documents based on computing with words is presented. Given an XML document type (e.g. scientific article), we consider that its elements are not equally informative. This is indicated by the use of a DTD and by defining linguistic importance attributes for the more meaningful elements of the designed DTD. Then, the evaluation method generates linguistic recommendations from linguistic evaluation judgements provided by different recommenders on meaningful elements of the DTD.
    Source
    Challenges in knowledge representation and organization for the 21st century: Integration of knowledge across boundaries. Proceedings of the 7th ISKO International Conference Granada, Spain, July 10-13, 2002. Ed.: M. López-Huertas
  18. Dunning, T.: Statistical identification of language (1994) 0.05
    0.053472333 = product of:
      0.106944665 = sum of:
        0.0110484185 = weight(_text_:for in 3627) [ClassicSimilarity], result of:
          0.0110484185 = score(doc=3627,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.12446466 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=3627)
        0.095896244 = weight(_text_:computing in 3627) [ClassicSimilarity], result of:
          0.095896244 = score(doc=3627,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.36668807 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.046875 = fieldNorm(doc=3627)
      0.5 = coord(2/4)
    
    Abstract
    A statistically based program has been written which learns to distinguish between languages. The amount of training text that such a program needs is surprisingly small, and the amount of text needed to make an identification is also quite small. The program incorporates no linguistic presuppositions other than the assumption that text can be encoded as a string of bytes. Such a program can be used to determine which language small bits of text are in. It also shows a potential for what might be called 'statistical philology' in that it may be applied directly to phonetic transcriptions to help elucidate family trees among language dialects. A variant of this program has been shown to be useful as a quality control in biochemistry. In this application, genetic sequences are assumed to be expressions in a language peculiar to the organism from which the sequence is taken. Thus language identification becomes species identification.
    Series
    Technical report CRL MCCS-94-273, Computing Research Lab, New Mexico State University
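    As an illustration of the byte-level idea in the abstract above, the toy classifier below scores a text against per-language byte-bigram counts and picks the most likely language. It is a simplified stand-in for the statistical model of the report; the miniature training snippets are made up, and real models need far more training text.

      from collections import Counter
      from math import log

      def bigram_counts(text):
          data = text.encode("utf-8")
          return Counter(zip(data, data[1:]))   # byte-bigram frequencies

      def log_prob(counts, text, alpha=1.0):
          # add-alpha smoothed log-likelihood of the text under a bigram model
          total = sum(counts.values()) + alpha * 256 * 256
          data = text.encode("utf-8")
          return sum(log((counts[p] + alpha) / total) for p in zip(data, data[1:]))

      training = {   # made-up miniature training samples
          "en": "the quick brown fox jumps over the lazy dog and the cat",
          "de": "der schnelle braune fuchs springt ueber den faulen hund und die katze",
      }
      models = {lang: bigram_counts(s) for lang, s in training.items()}
      print(max(models, key=lambda lang: log_prob(models[lang], "the dog and the fox")))  # likely "en"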
  19. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.05
    0.045357857 = product of:
      0.090715714 = sum of:
        0.07509089 = product of:
          0.22527267 = sum of:
            0.22527267 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.22527267 = score(doc=862,freq=2.0), product of:
                0.40082818 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.047278564 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
        0.015624823 = weight(_text_:for in 862) [ClassicSimilarity], result of:
          0.015624823 = score(doc=862,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.17601961 = fieldWeight in 862, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.5 = coord(2/4)
    
    Abstract
    This research revisits the classic Turing test and compares recent large language models such as ChatGPT for their abilities to reproduce human-level comprehension and compelling text generation. Two task challenges- summary and question answering- prompt ChatGPT to produce original content (98-99%) from a single text entry and sequential questions initially posed by Turing in 1950. We score the original and generated content against the OpenAI GPT-2 Output Detector from 2019, and establish multiple cases where the generated content proves original and undetectable (98%). The question of a machine fooling a human judge recedes in this work relative to the question of "how would one prove it?" The original contribution of the work presents a metric and simple grammatical set for understanding the writing mechanics of chatbots in evaluating their readability and statistical clarity, engagement, delivery, overall quality, and plagiarism risks. While Turing's original prose scores at least 14% below the machine-generated output, whether an algorithm displays hints of Turing's true initial thoughts (the "Lovelace 2.0" test) remains unanswerable.
    Source
    https://arxiv.org/abs/2212.06721
  20. From information to knowledge : conceptual and content analysis by computer (1995) 0.04
    0.044560276 = product of:
      0.08912055 = sum of:
        0.009207015 = weight(_text_:for in 5392) [ClassicSimilarity], result of:
          0.009207015 = score(doc=5392,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.103720546 = fieldWeight in 5392, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5392)
        0.079913534 = weight(_text_:computing in 5392) [ClassicSimilarity], result of:
          0.079913534 = score(doc=5392,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.3055734 = fieldWeight in 5392, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5392)
      0.5 = coord(2/4)
    
    Content
    SCHMIDT, K.M.: Concepts - content - meaning: an introduction; DUCHASTEL, J. et al.: The SACAO project: using computation toward textual data analysis; PAQUIN, L.-C. u. L. DUPUY: An approach to expertise transfer: computer-assisted text analysis; HOGENRAAD, R., Y. BESTGEN u. J.-L. NYSTEN: Terrorist rhetoric: texture and architecture; MOHLER, P.P.: On the interaction between reading and computing: an interpretative approach to content analysis; LANCASHIRE, I.: Computer tools for cognitive stylistics; MERGENTHALER, E.: An outline of knowledge based text analysis; NAMENWIRTH, J.Z.: Ideography in computer-aided content analysis; WEBER, R.P. u. J.Z. Namenwirth: Content-analytic indicators: a self-critique; McKINNON, A.: Optimizing the aberrant frequency word technique; ROSATI, R.: Factor analysis in classical archaeology: export patterns of Attic pottery trade; PETRILLO, P.S.: Old and new worlds: ancient coinage and modern technology; DARANYI, S., S. MARJAI u.a.: Caryatids and the measurement of semiosis in architecture; ZARRI, G.P.: Intelligent information retrieval: an application in the field of historical biographical data; BOUCHARD, G., R. ROY u.a.: Computers and genealogy: from family reconstitution to population reconstruction; DEMÉLAS-BOHY, M.-D. u. M. RENAUD: Instability, networks and political parties: a political history expert system prototype; DARANYI, S., A. ABRANYI u. G. KOVACS: Knowledge extraction from ethnopoetic texts by multivariate statistical methods; FRAUTSCHI, R.L.: Measures of narrative voice in French prose fiction applied to textual samples from the enlightenment to the twentieth century; DANNENBERG, R. u.a.: A project in computer music: the musician's workbench

Types

  • a 374
  • el 40
  • m 31
  • s 18
  • x 7
  • p 6
  • d 1
  • pat 1
  • r 1
