Search (34 results, page 1 of 2)

  • language_ss:"e"
  • theme_ss:"Computerlinguistik"
  • year_i:[2000 TO 2010}
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.05
    0.054003473 = product of:
      0.08100521 = sum of:
        0.0691992 = product of:
          0.20759758 = sum of:
            0.20759758 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.20759758 = score(doc=562,freq=2.0), product of:
                0.36937886 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.043569047 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.011806009 = product of:
          0.035418026 = sum of:
            0.035418026 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.035418026 = score(doc=562,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.6666667 = coord(2/3)
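     The tree above is Lucene ClassicSimilarity "explain" output. A minimal Python sketch of the arithmetic, assuming the standard tf-idf formula with coordination factors and plugging in the numbers shown, reproduces the displayed score:

     import math

     def term_score(freq, idf, query_norm, field_norm):
         # queryWeight = idf * queryNorm; fieldWeight = sqrt(freq) * idf * fieldNorm
         return (idf * query_norm) * (math.sqrt(freq) * idf * field_norm)

     query_norm = 0.043569047
     # clause for "_text_:3a" (freq=2, idf=8.478011); coord(1/3) because 1 of 3 subclauses matched
     s_3a = term_score(2.0, 8.478011, query_norm, 0.046875) * (1 / 3)
     # clause for "_text_:22" (freq=2, idf=3.5018296); likewise coord(1/3)
     s_22 = term_score(2.0, 3.5018296, query_norm, 0.046875) * (1 / 3)
     # outer coord(2/3): 2 of 3 top-level query clauses matched
     print((s_3a + s_22) * (2 / 3))   # ~0.054003, i.e. the 0.05 shown for result 1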
    
    Content
     Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.02
    0.024116507 = product of:
      0.07234952 = sum of:
        0.07234952 = product of:
          0.10852428 = sum of:
            0.06682816 = weight(_text_:network in 1595) [ClassicSimilarity], result of:
              0.06682816 = score(doc=1595,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.3444231 = fieldWeight in 1595, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1595)
            0.041696113 = weight(_text_:29 in 1595) [ClassicSimilarity], result of:
              0.041696113 = score(doc=1595,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 1595, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1595)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
     This paper presents a method that exploits the hierarchical structure of an indexing vocabulary to guide the development and training of machine learning methods for automatic text categorization. We present the design of a hierarchical classifier based on the divide-and-conquer principle. The method is evaluated using backpropagation neural networks as the machine learning algorithm, which learn to assign MeSH categories to a subset of MEDLINE records. Comparisons with the traditional Rocchio algorithm adapted for text categorization, as well as with flat neural network classifiers, are provided. The results indicate that the use of hierarchical structures improves performance significantly.
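     As a hedged illustration of the divide-and-conquer idea (not the authors' implementation): one small classifier sits at each node of the category hierarchy and decides whether a document is routed into that branch. The toy hierarchy, cue-word classifiers and example tokens below are invented stand-ins for the trained backpropagation networks.

     class CueClassifier:
         """Stand-in for a per-node backpropagation network: fires when a cue term is present."""
         def __init__(self, cue):
             self.cue = cue
         def predict(self, tokens):
             return self.cue in tokens

     class Node:
         def __init__(self, label, clf=None, children=()):
             self.label, self.clf, self.children = label, clf, list(children)

     def classify(node, tokens, assigned):
         # divide and conquer: descend only into branches whose local classifier fires
         for child in node.children:
             if child.clf.predict(tokens):
                 assigned.append(child.label)
                 classify(child, tokens, assigned)
         return assigned

     root = Node("MeSH", children=[
         Node("Neoplasms", CueClassifier("tumor"),
              children=[Node("Breast Neoplasms", CueClassifier("breast"))]),
         Node("Viruses", CueClassifier("virus")),
     ])
     print(classify(root, {"breast", "tumor", "patients"}, []))   # ['Neoplasms', 'Breast Neoplasms']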
    Date
    11. 5.2003 18:29:44
  3. Sidhom, S.; Hassoun, M.: Morpho-syntactic parsing for a text mining environment : An NP recognition model for knowledge visualization and information retrieval (2002) 0.02
    0.020671291 = product of:
      0.062013872 = sum of:
        0.062013872 = product of:
          0.093020804 = sum of:
            0.057281278 = weight(_text_:network in 1852) [ClassicSimilarity], result of:
              0.057281278 = score(doc=1852,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.29521978 = fieldWeight in 1852, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1852)
            0.035739526 = weight(_text_:29 in 1852) [ClassicSimilarity], result of:
              0.035739526 = score(doc=1852,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.23319192 = fieldWeight in 1852, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1852)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
     Sidhom and Hassoun discuss the crucial role of NLP tools in knowledge extraction and management as well as in the design of information retrieval systems. The authors focus more specifically on morpho-syntactic issues by describing their morpho-syntactic analysis platform, which has been implemented to cover automatic indexing and information retrieval. To this end they implemented a cascaded Augmented Transition Network (ATN) and used this formalism to analyse French text descriptions of multimedia documents. An implementation of an ATN parsing automaton is briefly described. In its logical operation, the platform is considered an investigative tool for knowledge organization (based on an NP recognition model) and for the management of multiform e-documents (text, multimedia, audio, image) using their text descriptions.
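     A drastically simplified sketch of the NP recognition step (a loose stand-in for one cascaded ATN arc, not the authors' platform): a finite-state pass over pre-tagged tokens that accepts noun phrases of the shape DET ADJ* NOUN. The tagset and the example are invented for illustration.

     def recognize_np(tagged):
         """tagged: list of (word, pos) pairs; returns the recognized NP word spans."""
         nps, i = [], 0
         while i < len(tagged):
             if tagged[i][1] == "DET":
                 j = i + 1
                 while j < len(tagged) and tagged[j][1] == "ADJ":
                     j += 1
                 if j < len(tagged) and tagged[j][1] == "NOUN":
                     nps.append([w for w, _ in tagged[i:j + 1]])
                     i = j + 1
                     continue
             i += 1
         return nps

     print(recognize_np([("le", "DET"), ("petit", "ADJ"), ("document", "NOUN"), ("multimédia", "ADJ")]))
     # [['le', 'petit', 'document']]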
    Source
    Knowledge organization. 29(2002) nos.3/4, S.171-180
  4. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.02
    0.01589411 = product of:
      0.047682326 = sum of:
        0.047682326 = product of:
          0.07152349 = sum of:
            0.029782942 = weight(_text_:29 in 2541) [ClassicSimilarity], result of:
              0.029782942 = score(doc=2541,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.19432661 = fieldWeight in 2541, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
            0.041740544 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
              0.041740544 = score(doc=2541,freq=4.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27358043 = fieldWeight in 2541, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  5. Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.02
    0.015092259 = product of:
      0.045276776 = sum of:
        0.045276776 = product of:
          0.067915164 = sum of:
            0.047254648 = weight(_text_:network in 1616) [ClassicSimilarity], result of:
              0.047254648 = score(doc=1616,freq=4.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.24354391 = fieldWeight in 1616, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1616)
            0.020660516 = weight(_text_:22 in 1616) [ClassicSimilarity], result of:
              0.020660516 = score(doc=1616,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.1354154 = fieldWeight in 1616, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1616)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
     The information available in languages other than English on the World Wide Web is increasing significantly. According to a report from Computer Economics in 1999, 54% of Internet users are English speakers ("English Will Dominate Web for Only Three More Years," Computer Economics, July 9, 1999, http://www.computereconomics.com/new4/pr/pr990610.html). However, it is predicted that there will be only a 60% increase in Internet users among English speakers versus a 150% growth among non-English speakers over the next five years. By 2005, 57% of Internet users will be non-English speakers. A report by CNN.com in 2000 showed that the number of Internet users in China had increased from 8.9 million to 16.9 million from January to June in 2000 ("Report: China Internet users double to 17 million," CNN.com, July, 2000, http://cnn.org/2000/TECH/computing/07/27/china.internet.reut/index.html). According to Nielsen/NetRatings, there was a dramatic leap from 22.5 million to 56.6 million Internet users from 2001 to 2002. China had become the second largest global at-home Internet population in 2002 (the US Internet population was 166 million) (Robyn Greenspan, "China Pulls Ahead of Japan," Internet.com, April 22, 2002, http://cyberatias.internet.com/big-picture/geographics/article/0,,5911_1013841,00.html). All of this evidence reveals the importance of cross-lingual research to satisfy the needs of the near future. Digital library research has in the past focused on structural and semantic interoperability. Searching and retrieving objects across variations in protocols, formats and disciplines have been widely explored (Schatz, B., & Chen, H. (1999). Digital libraries: technological advances and social impacts. IEEE Computer, Special Issue on Digital Libraries, February, 32(2), 45-50.; Chen, H., Yen, J., & Yang, C.C. (1999). International activities: development of Asian digital libraries. IEEE Computer, Special Issue on Digital Libraries, 32(2), 48-49.). However, research in crossing language boundaries, especially between European languages and Oriental languages, is still at an initial stage. In this proposal, we put our focus on cross-lingual semantic interoperability by developing automatic generation of a cross-lingual thesaurus based on an English/Chinese parallel corpus. When searchers encounter retrieval problems, professional librarians usually consult the thesaurus to identify other relevant vocabulary. For the problem of searching across language boundaries, a cross-lingual thesaurus, generated by co-occurrence analysis and a Hopfield network, can be used to generate additional semantically relevant terms that cannot be obtained from a dictionary. In particular, the automatically generated cross-lingual thesaurus is able to capture unknown words that do not exist in a dictionary, such as names of persons, organizations, and events. Due to Hong Kong's unique historical background, both English and Chinese are used as official languages in all legal documents. Therefore, English/Chinese cross-lingual information retrieval is critical for applications in the courts and the government. In this paper, we develop an automatic thesaurus with the Hopfield network based on a parallel corpus collected from the Web site of the Department of Justice of the Hong Kong Special Administrative Region (HKSAR) Government. Experiments are conducted to measure the precision and recall of the automatically generated English/Chinese thesaurus. The results show that such a thesaurus is a promising tool for retrieving relevant terms, especially in a language different from that of the input term. The direct translation of the input term can also be retrieved in most cases.
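     As a hedged sketch of the co-occurrence/Hopfield-style expansion described above (not the authors' system): activation starts at the input term and spreads over a term-by-term co-occurrence matrix until it settles. The tiny bilingual term list and the weights are invented for illustration.

     import numpy as np

     terms = ["court", "judgment", "法院", "判決"]
     W = np.array([[0.0, 0.6, 0.8, 0.3],       # symmetric co-occurrence weights, assumed to be
                   [0.6, 0.0, 0.2, 0.7],       # estimated from an aligned English/Chinese corpus
                   [0.8, 0.2, 0.0, 0.5],
                   [0.3, 0.7, 0.5, 0.0]])

     def expand(seed, iterations=10, threshold=0.4):
         a = np.zeros(len(terms))
         a[terms.index(seed)] = 1.0
         for _ in range(iterations):
             a = np.tanh(W @ a)                # spread activation through a squashing transfer function
             a[terms.index(seed)] = 1.0        # keep the query term clamped
         return [t for t, act in zip(terms, a) if act > threshold and t != seed]

     print(expand("court"))                    # suggests related terms in both languages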
  6. Toutanova, K.; Klein, D.; Manning, C.D.; Singer, Y.: Feature-rich Part-of-Speech Tagging with a cyclic dependency network (2003) 0.01
    0.010501034 = product of:
      0.0315031 = sum of:
        0.0315031 = product of:
          0.094509296 = sum of:
            0.094509296 = weight(_text_:network in 1059) [ClassicSimilarity], result of:
              0.094509296 = score(doc=1059,freq=4.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.48708782 = fieldWeight in 1059, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1059)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
     We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
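     A minimal sketch of idea (i), conditioning on both the preceding and the following tag in a log-linear model; the feature weights are invented, and a real tagger would learn them and normalise the scores.

     weights = {
         ("prev=DT", "NN"): 1.2, ("next=VBZ", "NN"): 0.9, ("word=dog", "NN"): 2.0,
         ("prev=DT", "VB"): -0.8, ("next=VBZ", "VB"): -0.5, ("word=dog", "VB"): 0.1,
     }

     def tag_score(word, prev_tag, next_tag, tag):
         feats = [f"prev={prev_tag}", f"next={next_tag}", f"word={word}"]
         return sum(weights.get((f, tag), 0.0) for f in feats)

     def best_tag(word, prev_tag, next_tag, tagset=("NN", "VB")):
         # unnormalised log-linear score; the tag that fits both neighbouring contexts best wins
         return max(tagset, key=lambda t: tag_score(word, prev_tag, next_tag, t))

     print(best_tag("dog", "DT", "VBZ"))   # NN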
  7. Martínez, F.; Martín, M.T.; Rivas, V.M.; Díaz, M.C.; Ureña, L.A.: Using neural networks for multiword recognition in IR (2003) 0.01
    0.009000885 = product of:
      0.027002655 = sum of:
        0.027002655 = product of:
          0.081007965 = sum of:
            0.081007965 = weight(_text_:network in 2777) [ClassicSimilarity], result of:
              0.081007965 = score(doc=2777,freq=4.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.41750383 = fieldWeight in 2777, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2777)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
     In this paper, a supervised neural network has been used to classify pairs of terms as being multiwords or non-multiwords. Classification is based on the values yielded by different estimators, currently available in the literature, used as inputs for the neural network. Lists of multiwords and non-multiwords have been built to train the net. Afterwards, many other pairs of terms have been classified using the trained net. Results obtained in this classification have been used to perform information retrieval tasks. Experiments show that detecting multiwords results in better performance of the IR methods.
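     A hedged sketch of the setup described above (not the authors' network): association estimators computed for a candidate term pair become the input vector of a classifier that answers multiword / non-multiword. The counts, weights and bias are invented, and a single thresholded unit stands in for the trained neural network.

     import math

     def estimators(pair_count, count_a, count_b, n_tokens):
         pmi = math.log2(pair_count * n_tokens / (count_a * count_b))   # pointwise mutual information
         dice = 2 * pair_count / (count_a + count_b)                    # Dice coefficient
         return [pmi, dice]

     def is_multiword(features, weights=(0.4, 3.0), bias=-2.5):
         # stand-in for the trained net: weighted sum of the estimator values plus a threshold
         return sum(w * f for w, f in zip(weights, features)) + bias > 0

     feats = estimators(pair_count=40, count_a=60, count_b=55, n_tokens=100_000)
     print(feats, is_multiword(feats))   # strong association -> classified as a multiword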
  8. Mustafa El Hadi, W.: Evaluating human language technology : general applications to information access and management (2002) 0.01
    0.007942118 = product of:
      0.023826351 = sum of:
        0.023826351 = product of:
          0.07147905 = sum of:
            0.07147905 = weight(_text_:29 in 1840) [ClassicSimilarity], result of:
              0.07147905 = score(doc=1840,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.46638384 = fieldWeight in 1840, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1840)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Knowledge organization. 29(2002) nos.3/4, S.124-134
  9. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.01
    0.007870673 = product of:
      0.023612019 = sum of:
        0.023612019 = product of:
          0.07083605 = sum of:
            0.07083605 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.07083605 = score(doc=4888,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    1. 3.2013 14:56:22
  10. Griffiths, T.L.; Steyvers, M.: ¬A probabilistic approach to semantic representation (2002) 0.01
    0.0074879 = product of:
      0.0224637 = sum of:
        0.0224637 = product of:
          0.0673911 = sum of:
            0.0673911 = weight(_text_:29 in 3671) [ClassicSimilarity], result of:
              0.0673911 = score(doc=3671,freq=4.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.43971092 = fieldWeight in 3671, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3671)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 6.2015 14:55:01
    29. 6.2015 16:09:05
  11. Liu, S.; Liu, F.; Yu, C.; Meng, W.: ¬An effective approach to document retrieval via utilizing WordNet and recognizing phrases (2004) 0.01
    0.006618432 = product of:
      0.019855294 = sum of:
        0.019855294 = product of:
          0.059565883 = sum of:
            0.059565883 = weight(_text_:29 in 4078) [ClassicSimilarity], result of:
              0.059565883 = score(doc=4078,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.38865322 = fieldWeight in 4078, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4078)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    10.10.2005 10:29:08
  12. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.01
    0.0063645868 = product of:
      0.01909376 = sum of:
        0.01909376 = product of:
          0.057281278 = sum of:
            0.057281278 = weight(_text_:network in 5480) [ClassicSimilarity], result of:
              0.057281278 = score(doc=5480,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.29521978 = fieldWeight in 5480, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5480)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    (Automatic) document classification is generally defined as content-based assignment of one or more predefined categories to documents. Usually, machine learning, statistical pattern recognition, or neural network approaches are used to construct classifiers automatically. In this paper we thoroughly evaluate a wide variety of these methods on a document classification task for German text. We evaluate different feature construction and selection methods and various classifiers. Our main results are: (1) feature selection is necessary not only to reduce learning and classification time, but also to avoid overfitting (even for Support Vector Machines); (2) surprisingly, our morphological analysis does not improve classification quality compared to a letter 5-gram approach; (3) Support Vector Machines are significantly better than all other classification methods
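     A hedged sketch of the letter-5-gram feature construction that the abstract contrasts with morphological analysis; the frequency cut-off stands in for the feature-selection step, and the two documents are invented. Vectors like these would then be fed to an SVM or another of the evaluated classifiers.

     from collections import Counter

     def letter_ngrams(text, n=5):
         text = text.lower()
         return [text[i:i + n] for i in range(len(text) - n + 1)]

     docs = ["Automatische Dokumentklassifikation", "Klassifikation deutscher Texte"]
     df = Counter(g for d in docs for g in set(letter_ngrams(d)))      # document frequency per 5-gram
     vocabulary = [g for g, _ in df.most_common(100)]                  # crude feature selection by frequency
     vectors = [[letter_ngrams(d).count(g) for g in vocabulary] for d in docs]
     print(len(vocabulary), vectors[0][:5])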
  13. Warner, J.: Analogies between linguistics and information theory (2007) 0.01
    0.0053038225 = product of:
      0.015911467 = sum of:
        0.015911467 = product of:
          0.047734402 = sum of:
            0.047734402 = weight(_text_:network in 138) [ClassicSimilarity], result of:
              0.047734402 = score(doc=138,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2460165 = fieldWeight in 138, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=138)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    An analogy is established between the syntagm and paradigm from Saussurean linguistics and the message and messages for selection from the information theory initiated by Claude Shannon. The analogy is pursued both as an end in itself and for its analytic value in understanding patterns of retrieval from full-text systems. The multivalency of individual words when isolated from their syntagm is contrasted with the relative stability of meaning of multiword sequences, when searching ordinary written discourse. The syntagm is understood as the linear sequence of oral and written language. Saussure's understanding of the word, as a unit that compels recognition by the mind, is endorsed, although not regarded as final. The lesser multivalency of multiword sequences is understood as the greater determination of signification by the extended syntagm. The paradigm is primarily understood as the network of associations a word acquires when considered apart from the syntagm. The restriction of information theory to expression or signals, and its focus on the combinatorial aspects of the message, is sustained. The message in the model of communication in information theory can include sequences of written language. Shannon's understanding of the written word, as a cohesive group of letters, with strong internal statistical influences, is added to the Saussurean conception. Sequences of more than one word are regarded as weakly correlated concatenations of cohesive units.
  14. Rindflesch, T.C.; Fiszman, M.: The interaction of domain knowledge and linguistic structure in natural language processing : interpreting hypernymic propositions in biomedical text (2003) 0.01
    0.0053038225 = product of:
      0.015911467 = sum of:
        0.015911467 = product of:
          0.047734402 = sum of:
            0.047734402 = weight(_text_:network in 2097) [ClassicSimilarity], result of:
              0.047734402 = score(doc=2097,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2460165 = fieldWeight in 2097, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2097)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Interpretation of semantic propositions in free-text documents such as MEDLINE citations would provide valuable support for biomedical applications, and several approaches to semantic interpretation are being pursued in the biomedical informatics community. In this paper, we describe a methodology for interpreting linguistic structures that encode hypernymic propositions, in which a more specific concept is in a taxonomic relationship with a more general concept. In order to effectively process these constructions, we exploit underspecified syntactic analysis and structured domain knowledge from the Unified Medical Language System (UMLS). After introducing the syntactic processing on which our system depends, we focus on the UMLS knowledge that supports interpretation of hypernymic propositions. We first use semantic groups from the Semantic Network to ensure that the two concepts involved are compatible; hierarchical information in the Metathesaurus then determines which concept is more general and which more specific. A preliminary evaluation of a sample based on the semantic group Chemicals and Drugs provides 83% precision. An error analysis was conducted and potential solutions to the problems encountered are presented. The research discussed here serves as a paradigm for investigating the interaction between domain knowledge and linguistic structure in natural language processing, and could also make a contribution to research on automatic processing of discourse structure. Additional implications of the system we present include its integration in advanced semantic interpretation processors for biomedical text and its use for information extraction in specific domains. The approach has the potential to support a range of applications, including information retrieval and ontology engineering.
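     A hedged sketch of the two checks described above (the group table and hierarchy fragment are invented stand-ins, not real UMLS data): the two concepts must share a semantic group, and hierarchical information then decides which one is the more general term of the hypernymic proposition.

     SEMANTIC_GROUP = {"Ibuprofen": "Chemicals & Drugs",
                       "Anti-Inflammatory Agents": "Chemicals & Drugs",
                       "Asthma": "Disorders"}
     BROADER = {"Ibuprofen": "Anti-Inflammatory Agents"}   # child -> parent, as in a hierarchy fragment

     def interpret_hypernymic(concept_a, concept_b):
         if SEMANTIC_GROUP.get(concept_a) != SEMANTIC_GROUP.get(concept_b):
             return None                                   # incompatible semantic groups: no proposition
         if BROADER.get(concept_a) == concept_b:
             return (concept_a, "ISA", concept_b)
         if BROADER.get(concept_b) == concept_a:
             return (concept_b, "ISA", concept_a)
         return None

     print(interpret_hypernymic("Ibuprofen", "Anti-Inflammatory Agents"))
     # ('Ibuprofen', 'ISA', 'Anti-Inflammatory Agents')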
  15. Mustafa El Hadi, W.: Terminologies, ontologies and information access (2006) 0.01
    0.0052947453 = product of:
      0.015884236 = sum of:
        0.015884236 = product of:
          0.047652703 = sum of:
            0.047652703 = weight(_text_:29 in 1488) [ClassicSimilarity], result of:
              0.047652703 = score(doc=1488,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.31092256 = fieldWeight in 1488, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1488)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 2.2008 16:25:23
  16. Saeed, K.; Dardzinska, A.: Natural language processing : word recognition without segmentation (2001) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 7707) [ClassicSimilarity], result of:
              0.041696113 = score(doc=7707,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 7707, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7707)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    16.12.2001 18:29:38
  17. Chen, K.-H.: Evaluating Chinese text retrieval with multilingual queries (2002) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 1851) [ClassicSimilarity], result of:
              0.041696113 = score(doc=1851,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 1851, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1851)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Knowledge organization. 29(2002) nos.3/4, S.156-170
  18. Bowker, L.: Information retrieval in translation memory systems : assessment of current limitations and possibilities for future development (2002) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 1854) [ClassicSimilarity], result of:
              0.041696113 = score(doc=1854,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 1854, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1854)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Knowledge organization. 29(2002) nos.3/4, S.198-203
  19. Hammwöhner, R.: TransRouter revisited : Decision support in the routing of translation projects (2000) 0.00
    0.004591226 = product of:
      0.013773678 = sum of:
        0.013773678 = product of:
          0.04132103 = sum of:
            0.04132103 = weight(_text_:22 in 5483) [ClassicSimilarity], result of:
              0.04132103 = score(doc=5483,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2708308 = fieldWeight in 5483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5483)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    10.12.2000 18:22:35
  20. Schneider, J.W.; Borlund, P.: ¬A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.00
    0.004591226 = product of:
      0.013773678 = sum of:
        0.013773678 = product of:
          0.04132103 = sum of:
            0.04132103 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
              0.04132103 = score(doc=156,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2708308 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    8. 3.2007 19:55:22