Search (80 results, page 1 of 4)

  • theme_ss:"Computerlinguistik"
  1. Schneider, J.W.; Borlund, P.: A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.13
    0.12936863 = product of:
      0.19405293 = sum of:
        0.17030893 = weight(_text_:citation in 156) [ClassicSimilarity], result of:
          0.17030893 = score(doc=156,freq=8.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.725337 = fieldWeight in 156, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.023743998 = product of:
          0.047487997 = sum of:
            0.047487997 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
              0.047487997 = score(doc=156,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.2708308 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The present study investigates the ability of a bibliometric-based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
    Date
    8. 3.2007 19:55:22
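    The score breakdowns in this list follow Lucene's ClassicSimilarity (TF-IDF) explain format. As a minimal sketch, assuming the classic formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), the 0.129 score of result 1 can be reproduced in Python from the figures shown in its explanation tree:

      import math

      def idf(doc_freq, max_docs):
          # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          # queryWeight = idf * queryNorm; fieldWeight = sqrt(freq) * idf * fieldNorm
          term_idf = idf(doc_freq, max_docs)
          return (term_idf * query_norm) * (math.sqrt(freq) * term_idf * field_norm)

      # Figures taken from the explanation of result 1 (doc 156)
      qn, fn = 0.050071523, 0.0546875
      citation = term_score(8.0, 1104, 44218, qn, fn)        # ~0.1703
      term_22 = term_score(2.0, 3622, 44218, qn, fn) * 0.5   # coord(1/2), ~0.0237
      print((citation + term_22) * 2.0 / 3.0)                # coord(2/3), ~0.1294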
  2. Radev, D.R.; Joseph, M.T.; Gibson, B.; Muthukrishnan, P.: A bibliometric and network analysis of the field of computational linguistics (2016) 0.10
    0.1049328 = product of:
      0.15739919 = sum of:
        0.12042661 = weight(_text_:citation in 2764) [ClassicSimilarity], result of:
          0.12042661 = score(doc=2764,freq=4.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.51289076 = fieldWeight in 2764, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2764)
        0.036972582 = product of:
          0.073945165 = sum of:
            0.073945165 = weight(_text_:index in 2764) [ClassicSimilarity], result of:
              0.073945165 = score(doc=2764,freq=2.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.33795667 = fieldWeight in 2764, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2764)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The ACL Anthology is a large collection of research papers in computational linguistics. Citation data were obtained using text extraction from a collection of PDF files with significant manual postprocessing performed to clean up the results. Manual annotation of the references was then performed to complete the citation network. We analyzed the networks of paper citations, author citations, and author collaborations in an attempt to identify the most central papers and authors. The analysis includes general network statistics, PageRank, metrics across publication years and venues, the impact factor and h-index, as well as other measures.
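    Among the measures named in the abstract above, the h-index has a simple definition; a minimal sketch follows (the citation counts are illustrative, not taken from the paper):

      def h_index(citation_counts):
          # h = largest h such that at least h papers have >= h citations each
          counts = sorted(citation_counts, reverse=True)
          h = 0
          for rank, c in enumerate(counts, start=1):
              if c >= rank:
                  h = rank
              else:
                  break
          return h

      print(h_index([10, 8, 5, 4, 3]))  # -> 4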
  3. Garfield, E.: The relationship between mechanical indexing, structural linguistics and information retrieval (1992) 0.09
    0.09304918 = product of:
      0.13957377 = sum of:
        0.097319394 = weight(_text_:citation in 3632) [ClassicSimilarity], result of:
          0.097319394 = score(doc=3632,freq=2.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.4144783 = fieldWeight in 3632, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0625 = fieldNorm(doc=3632)
        0.042254377 = product of:
          0.084508754 = sum of:
            0.084508754 = weight(_text_:index in 3632) [ClassicSimilarity], result of:
              0.084508754 = score(doc=3632,freq=2.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.3862362 = fieldWeight in 3632, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3632)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    It is possible to locate over 60% of indexing terms used in the Current List of Medical Literature by analysing the titles of the articles. Citation indexes contain 'noise' and lack many pertinent citations. Mechanical indexing or analysis of text must begin with some linguistic technique. Discusses Harris' methods of structural linguistics, discourse analysis and transformational analysis. Provides 3 examples with references, abstracts and index entries
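    A toy illustration of the title-word coverage figure quoted in the abstract above (the helper and the sample data are hypothetical, not from Garfield's study):

      def title_coverage(index_terms, titles):
          # Fraction of assigned indexing terms that also occur in at least one title.
          text = " ".join(t.lower() for t in titles)
          hits = sum(1 for term in index_terms if term.lower() in text)
          return hits / len(index_terms)

      # Hypothetical example: 2 of 3 terms are locatable in the titles -> ~0.67
      print(title_coverage(["periodontology", "citation analysis", "thesauri"],
                           ["Citation analysis in periodontology research"]))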
  4. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.07
    0.066585906 = product of:
      0.099878855 = sum of:
        0.07952686 = product of:
          0.23858055 = sum of:
            0.23858055 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.23858055 = score(doc=562,freq=2.0), product of:
                0.42450693 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.050071523 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.020351999 = product of:
          0.040703997 = sum of:
            0.040703997 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.040703997 = score(doc=562,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf
    Date
    8. 1.2013 10:22:32
  5. Way, E.C.: Knowledge representation and metaphor (or: meaning) (1994) 0.05
    0.046260253 = product of:
      0.13878076 = sum of:
        0.13878076 = sum of:
          0.084508754 = weight(_text_:index in 771) [ClassicSimilarity], result of:
            0.084508754 = score(doc=771,freq=2.0), product of:
              0.21880072 = queryWeight, product of:
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.050071523 = queryNorm
              0.3862362 = fieldWeight in 771, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.0625 = fieldNorm(doc=771)
          0.054272 = weight(_text_:22 in 771) [ClassicSimilarity], result of:
            0.054272 = score(doc=771,freq=2.0), product of:
              0.17534193 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050071523 = queryNorm
              0.30952093 = fieldWeight in 771, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=771)
      0.33333334 = coord(1/3)
    
    Content
    Contains the following 9 chapters: The literal and the metaphoric; Views of metaphor; Knowledge representation; Representation schemes and conceptual graphs; The dynamic type hierarchy theory of metaphor; Computational approaches to metaphor; The nature and structure of semantic hierarchies; Language games, open texture and family resemblance; Programming the dynamic type hierarchy; Subject index
    Footnote
    Previously published by Kluwer in 1991 // Reviewed in: Knowledge organization 22(1995) no.1, pp.48-49 (O. Sechser)
  6. Ruge, G.: Sprache und Computer : Wortbedeutung und Termassoziation. Methoden zur automatischen semantischen Klassifikation (1995) 0.05
    0.046260253 = product of:
      0.13878076 = sum of:
        0.13878076 = sum of:
          0.084508754 = weight(_text_:index in 1534) [ClassicSimilarity], result of:
            0.084508754 = score(doc=1534,freq=2.0), product of:
              0.21880072 = queryWeight, product of:
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.050071523 = queryNorm
              0.3862362 = fieldWeight in 1534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.0625 = fieldNorm(doc=1534)
          0.054272 = weight(_text_:22 in 1534) [ClassicSimilarity], result of:
            0.054272 = score(doc=1534,freq=2.0), product of:
              0.17534193 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050071523 = queryNorm
              0.30952093 = fieldWeight in 1534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=1534)
      0.33333334 = coord(1/3)
    
    Content
    Contains the following chapters: (1) Motivation; (2) Language philosophical foundations; (3) Structural comparison of extensions; (4) Earlier approaches towards term association; (5) Experiments; (6) Spreading-activation networks or memory models; (7) Perspective. Appendices: Heads and modifiers of 'car'. Glossary. Index. Language and computer. Word semantics and term association. Methods towards an automatic semantic classification
    Footnote
    Reviewed in: Knowledge organization 22(1995) no.3/4, pp.182-184 (M.T. Rolland)
  7. Levin, M.; Krawczyk, S.; Bethard, S.; Jurafsky, D.: Citation-based bootstrapping for large-scale author disambiguation (2012) 0.05
    0.045336 = product of:
      0.136008 = sum of:
        0.136008 = weight(_text_:citation in 246) [ClassicSimilarity], result of:
          0.136008 = score(doc=246,freq=10.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.57925105 = fieldWeight in 246, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=246)
      0.33333334 = coord(1/3)
    
    Abstract
    We present a new, two-stage, self-supervised algorithm for author disambiguation in large bibliographic databases. In the first "bootstrap" stage, a collection of high-precision features is used to bootstrap a training set with positive and negative examples of coreferring authors. A supervised feature-based classifier is then trained on the bootstrap clusters and used to cluster the authors in a larger unlabeled dataset. Our self-supervised approach shares the advantages of unsupervised approaches (no need for expensive hand labels) as well as supervised approaches (a rich set of features that can be discriminatively trained). The algorithm disambiguates 54,000,000 author instances in Thomson Reuters' Web of Knowledge with B3 F1 of .807. We analyze parameters and features, particularly those from citation networks, which have not been deeply investigated in author disambiguation. The most important citation feature is self-citation, which can be approximated without expensive extraction of the full network. For the supervised stage, the minor improvement due to other citation features (increasing F1 from .748 to .767) suggests they may not be worth the trouble of extracting from databases that don't already have them. A lean feature set without expensive abstract and title features performs 130 times faster with about equal F1.
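    The B3 F1 reported above is the standard B-cubed clustering measure; a minimal sketch, assuming a flat mapping from author mentions to cluster labels (the data layout and sample values are illustrative):

      def b_cubed_f1(pred, gold):
          # pred / gold: dicts mapping each author mention to a cluster label
          precisions, recalls = [], []
          for m in pred:
              pred_cluster = {x for x in pred if pred[x] == pred[m]}
              gold_cluster = {x for x in gold if gold[x] == gold[m]}
              overlap = len(pred_cluster & gold_cluster)
              precisions.append(overlap / len(pred_cluster))
              recalls.append(overlap / len(gold_cluster))
          p = sum(precisions) / len(precisions)
          r = sum(recalls) / len(recalls)
          return 2 * p * r / (p + r)

      print(b_cubed_f1({"a": 1, "b": 1, "c": 2}, {"a": 1, "b": 2, "c": 2}))  # ~0.667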
  8. Noever, D.; Ciolino, M.: The Turing deception (2022) 0.03
    0.026508953 = product of:
      0.07952686 = sum of:
        0.07952686 = product of:
          0.23858055 = sum of:
            0.23858055 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.23858055 = score(doc=862,freq=2.0), product of:
                0.42450693 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.050071523 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    https://arxiv.org/abs/2212.06721
  9. Campe, P.: Case, semantic roles, and grammatical relations : a comprehensive bibliography (1994) 0.02
    0.02112719 = product of:
      0.06338157 = sum of:
        0.06338157 = product of:
          0.12676314 = sum of:
            0.12676314 = weight(_text_:index in 8663) [ClassicSimilarity], result of:
              0.12676314 = score(doc=8663,freq=2.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5793543 = fieldWeight in 8663, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.09375 = fieldNorm(doc=8663)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Contains references to more than 6000 publications with a subject and a language index as well as a guide to the relevant languages and language families
  10. Peters, W.; Vossen, P.; Diez-Orzas, P.; Adriaens, G.: Cross-linguistic alignment of WordNets with an inter-lingual-index (1998) 0.02
    0.02112719 = product of:
      0.06338157 = sum of:
        0.06338157 = product of:
          0.12676314 = sum of:
            0.12676314 = weight(_text_:index in 6446) [ClassicSimilarity], result of:
              0.12676314 = score(doc=6446,freq=2.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5793543 = fieldWeight in 6446, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6446)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
  11. He, Q.: A study of the strength indexes in co-word analysis (2000) 0.02
    0.02112719 = product of:
      0.06338157 = sum of:
        0.06338157 = product of:
          0.12676314 = sum of:
            0.12676314 = weight(_text_:index in 111) [ClassicSimilarity], result of:
              0.12676314 = score(doc=111,freq=8.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5793543 = fieldWeight in 111, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.046875 = fieldNorm(doc=111)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Co-word analysis is a technique for detecting the knowledge structure of scientific literature and mapping the dynamics in a research field. It is used to count the co-occurrences of term pairs, compute the strength between term pairs, and map the research field by inserting terms and their linkages into a graphical structure according to the strength values. In previous co-word studies, there are two indexes used to measure the strength between term pairs in order to identify the major areas in a research field: the inclusion index (I) and the equivalence index (E). This study will conduct two co-word analysis experiments using the two indexes, respectively, and compare the results from the two experiments. The results show that, due to the difference in their computation, index I is more likely to identify general subject areas in a research field, while index E is more likely to identify subject areas at more specific levels
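    The two strength measures named above are commonly defined as follows (a sketch using the standard co-word formulas; the paper's exact notation is not reproduced here):

      def inclusion_index(c_ij, c_i, c_j):
          # I = c_ij / min(c_i, c_j): co-occurrences relative to the rarer term
          return c_ij / min(c_i, c_j)

      def equivalence_index(c_ij, c_i, c_j):
          # E = c_ij^2 / (c_i * c_j): symmetric, discounts very frequent terms
          return c_ij ** 2 / (c_i * c_j)

      # c_ij = co-occurrence count of the pair; c_i, c_j = occurrence counts
      print(inclusion_index(8, 20, 10), equivalence_index(8, 20, 10))  # 0.8 0.32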
  12. Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.02
    0.020274874 = product of:
      0.06082462 = sum of:
        0.06082462 = weight(_text_:citation in 1853) [ClassicSimilarity], result of:
          0.06082462 = score(doc=1853,freq=2.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.25904894 = fieldWeight in 1853, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1853)
      0.33333334 = coord(1/3)
    
    Abstract
    In a scientific and technological watch (STW) task, an expert user needs to survey the evolution of research topics in his area of specialisation in order to detect interesting changes. The majority of methods proposing evaluation metrics (bibliometrics and scientometrics studies) for STW rely solely on statistical data analysis methods (co-citation analysis, co-word analysis). Such methods usually work on structured databases where the units of analysis (words, keywords) are already attributed to documents by human indexers. The advent of huge amounts of unstructured textual data has rendered necessary the integration of natural language processing (NLP) techniques to first extract meaningful units from texts. We propose a method for STW which is NLP-oriented. The method not only analyses texts linguistically in order to extract terms from them, but also uses linguistic relations (syntactic variations) as the basis for clustering. Terms and variation relations are formalised as weighted digraphs, which the clustering algorithm CPCL (Classification by Preferential Clustered Link) seeks to reduce in order to produce classes. These classes ideally represent the research topics present in the corpus. The results of the classification are subjected to validation by an expert in STW.
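    As a rough sketch of the graph formalisation described above (this is plain threshold-and-connected-components clustering over a weighted term graph, not the CPCL algorithm itself; the terms and weights are illustrative):

      def cluster_terms(variation_edges, threshold=0.5):
          # variation_edges: (term_a, term_b, weight) triples from syntactic variation
          # Keep edges above the threshold and return connected components as classes.
          adj = {}
          for a, b, w in variation_edges:
              if w >= threshold:
                  adj.setdefault(a, set()).add(b)
                  adj.setdefault(b, set()).add(a)
          seen, classes = set(), []
          for start in adj:
              if start in seen:
                  continue
              stack, component = [start], set()
              while stack:
                  node = stack.pop()
                  if node in component:
                      continue
                  component.add(node)
                  stack.extend(adj[node] - component)
              seen |= component
              classes.append(component)
          return classes

      print(cluster_terms([("term variant", "research topic", 0.8),
                           ("research topic", "topic detection", 0.6),
                           ("noun phrase", "phrase parsing", 0.2)]))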
  13. Chen, L.; Fang, H.: An automatic method for extracting innovative ideas based on the Scopus® database (2019) 0.02
    0.020274874 = product of:
      0.06082462 = sum of:
        0.06082462 = weight(_text_:citation in 5310) [ClassicSimilarity], result of:
          0.06082462 = score(doc=5310,freq=2.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.25904894 = fieldWeight in 5310, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5310)
      0.33333334 = coord(1/3)
    
    Abstract
    The novelty of knowledge claims in a research paper can be considered an evaluation criterion for papers to supplement citations. To provide a foundation for research evaluation from the perspective of innovativeness, we propose an automatic approach for extracting innovative ideas from the abstracts of technology and engineering papers. The approach extracts N-grams as candidates based on part-of-speech tagging and determines whether they are novel by checking the Scopus® database for any earlier occurrence. Moreover, we discussed the distributions of innovative ideas in different abstract structures. To improve the performance by excluding noisy N-grams, a list of stopwords and a list of research description characteristics were developed. We selected abstracts of articles published from 2011 to 2017 with the topic of semantic analysis as the experimental texts. Excluding noisy N-grams, considering the distribution of innovative ideas in abstracts, and suitably combining N-grams can effectively improve the performance of automatic innovative idea extraction. Unlike co-word and co-citation analysis, innovative-idea extraction aims to identify the differences in a paper from all previously published papers.
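    A minimal sketch of the candidate-extraction step described above, assuming simple token n-grams and an in-memory set as stand-ins for the part-of-speech tagging and the Scopus lookup used in the paper:

      import re

      STOPWORDS = {"the", "of", "and", "a", "an", "in", "to", "for"}

      def candidate_ngrams(abstract, n=2):
          # Lowercase word tokens, drop stopwords, emit contiguous n-grams.
          tokens = [t for t in re.findall(r"[a-z]+", abstract.lower())
                    if t not in STOPWORDS]
          return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

      def innovative_ideas(abstract, previously_published):
          # An n-gram counts as novel if it never occurred in earlier literature.
          return candidate_ngrams(abstract) - previously_published

      print(innovative_ideas("Semantic analysis of citation contexts",
                             {"semantic analysis"}))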
  14. Soni, S.; Lerman, K.; Eisenstein, J.: Follow the leader : documents on the leading edge of semantic change get more citations (2021) 0.02
    0.020274874 = product of:
      0.06082462 = sum of:
        0.06082462 = weight(_text_:citation in 169) [ClassicSimilarity], result of:
          0.06082462 = score(doc=169,freq=2.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.25904894 = fieldWeight in 169, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=169)
      0.33333334 = coord(1/3)
    
    Abstract
    Diachronic word embeddings, vector representations of words over time, offer remarkable insights into the evolution of language and provide a tool for quantifying sociocultural change from text documents. Prior work has used such embeddings to identify shifts in the meaning of individual words. However, simply knowing that a word has changed in meaning is insufficient to identify the instances of word usage that convey the historical meaning or the newer meaning. In this study, we link diachronic word embeddings to documents, by situating those documents as leaders or laggards with respect to ongoing semantic changes. Specifically, we propose a novel method to quantify the degree of semantic progressiveness in each word usage, and then show how these usages can be aggregated to obtain scores for each document. We analyze two large collections of documents, representing legal opinions and scientific articles. Documents that are scored as semantically progressive receive a larger number of citations, indicating that they are especially influential. Our work thus provides a new technique for identifying lexical semantic leaders and demonstrates a new link between progressive use of language and influence in a citation network.
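    A rough sketch of the scoring idea described above, not the paper's actual method: a usage vector is scored by how much closer it lies to the word's newer embedding than to its older one, and document scores are averages over usages (all names are illustrative):

      import numpy as np

      def cosine(a, b):
          return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

      def progressiveness(usage_vec, old_vec, new_vec):
          # Positive when the usage is closer to the word's newer sense vector.
          return cosine(usage_vec, new_vec) - cosine(usage_vec, old_vec)

      def document_score(usage_vecs, old_vec, new_vec):
          # Aggregate usage-level scores into one document-level score.
          return float(np.mean([progressiveness(u, old_vec, new_vec)
                                for u in usage_vecs]))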
  15. Abu-Salem, H.; Al-Omari, M.; Evens, M.W.: Stemming methodologies over individual query words for an Arabic information retrieval system (1999) 0.02
    0.019684099 = product of:
      0.059052292 = sum of:
        0.059052292 = product of:
          0.118104585 = sum of:
            0.118104585 = weight(_text_:index in 3672) [ClassicSimilarity], result of:
              0.118104585 = score(doc=3672,freq=10.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5397815 = fieldWeight in 3672, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3672)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Stemming is one of the most important factors that affect the performance of information retrieval systems. This article investigates how to improve the performance of an Arabic information retrieval system by imposing the retrieval method over individual words of a query depending on the importance of the WORD, the STEM, or the ROOT of the query terms in the database. This method, called Mixed Stemming, computes term importance using a weighting scheme that uses the Term Frequency (TF) and the Inverse Document Frequency (IDF), called TFxIDF. An extended version of the Arabic IRS system is designed, implemented, and evaluated to reduce the number of irrelevant documents retrieved. The results of the experiment suggest that the proposed method outperforms the Word index method using the TFxIDF weighting scheme. It also outperforms the Stem index method using the Binary weighting scheme but does not outperform the Stem index method using the TFxIDF weighting scheme, and again it outperforms the Root index method using the Binary weighting scheme but does not outperform the Root index method using the TFxIDF weighting scheme
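    The abstract contrasts TFxIDF and Binary weighting over word, stem, and root indexes; a minimal sketch of both weights follows (one common TFxIDF formulation, which may differ in detail from the paper's):

      import math

      def tf_idf(term_freq, doc_freq, num_docs):
          # Term frequency scaled by inverse document frequency.
          return term_freq * math.log(num_docs / doc_freq)

      def binary_weight(term_freq):
          # Binary weighting: 1 if the term occurs in the document at all, else 0.
          return 1 if term_freq > 0 else 0

      print(tf_idf(3, 50, 10000), binary_weight(3))  # ~15.9, 1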
  16. Warner, A.J.: Natural language processing (1987) 0.02
    0.018090667 = product of:
      0.054272 = sum of:
        0.054272 = product of:
          0.108544 = sum of:
            0.108544 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.108544 = score(doc=337,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  17. Mock, K.J.; Vemuri, V.R.: Information filtering via hill climbing, WordNet, and index patterns (1997) 0.02
    0.017429043 = product of:
      0.052287128 = sum of:
        0.052287128 = product of:
          0.104574256 = sum of:
            0.104574256 = weight(_text_:index in 1517) [ClassicSimilarity], result of:
              0.104574256 = score(doc=1517,freq=4.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.4779429 = fieldWeight in 1517, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1517)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The INFOS (Intelligent News Filtering Organizational System) project is designed to reduce the user's search burden by automatically categorising data as relevant or irrelevant based upon user interests. These predictions are learned automatically based upon features taken from input articles and collaborative features derived from other users. The filtering is performed by a hybrid technique that combines elements of a keyword-based hill climbing method, knowledge-based conceptual representation via WordNet, and partial parsing via index patterns. The hybrid system integrating all these approaches combines the benefits of each while maintaining robustness and scalability
  18. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.02
    0.015829332 = product of:
      0.047487997 = sum of:
        0.047487997 = product of:
          0.09497599 = sum of:
            0.09497599 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
              0.09497599 = score(doc=3164,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5416616 = fieldWeight in 3164, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3164)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
  19. Ruge, G.: A spreading activation network for automatic generation of thesaurus relationships (1991) 0.02
    0.015829332 = product of:
      0.047487997 = sum of:
        0.047487997 = product of:
          0.09497599 = sum of:
            0.09497599 = weight(_text_:22 in 4506) [ClassicSimilarity], result of:
              0.09497599 = score(doc=4506,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5416616 = fieldWeight in 4506, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4506)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    8.10.2000 11:52:22
  20. Somers, H.: Example-based machine translation : Review article (1999) 0.02
    0.015829332 = product of:
      0.047487997 = sum of:
        0.047487997 = product of:
          0.09497599 = sum of:
            0.09497599 = weight(_text_:22 in 6672) [ClassicSimilarity], result of:
              0.09497599 = score(doc=6672,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5416616 = fieldWeight in 6672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6672)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    31. 7.1996 9:22:19

Languages

  • e 63
  • d 16
  • m 1

Types

  • a 65
  • m 7
  • el 6
  • s 4
  • p 2
  • x 2
  • b 1
  • d 1