Search (540 results, page 1 of 27)

  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.18
    Abstract
    Document representations for text classification are typically based on the classical Bag-Of-Words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for actual classification. Experimental evaluations on two well known text corpora support our approach through consistent improvement of the results.
    Content
    Vgl.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
    Source
    Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK
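The entry above combines a bag-of-words representation with concept features drawn from background knowledge and then boosts weak learners over the enriched feature space. Below is a minimal sketch of that general recipe, not the authors' system: the CONCEPTS lookup, the toy documents, and the use of scikit-learn's default decision-stump AdaBoost are illustrative assumptions standing in for a real background ontology and corpus.

```python
# Hedged sketch: enrich bag-of-words features with concept pseudo-tokens from
# a toy background-knowledge lookup, then boost decision stumps over the result.
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import CountVectorizer

CONCEPTS = {"soccer": "sport", "tennis": "sport", "euro": "finance", "stock": "finance"}

def add_concepts(text):
    # Append a pseudo-token for every concept triggered by a known term.
    hits = [CONCEPTS[t] for t in text.lower().split() if t in CONCEPTS]
    return text + " " + " ".join("concept_" + c for c in hits)

docs = ["Soccer and tennis results from the weekend",
        "Stock markets rally as the euro gains",
        "Tennis open draws record crowds",
        "Euro slides against the dollar after stock losses"]
labels = [0, 1, 0, 1]          # 0 = sport, 1 = finance (illustrative)

vec = CountVectorizer()
X = vec.fit_transform(add_concepts(d) for d in docs)

# AdaBoost over decision stumps (scikit-learn's default weak learner).
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, labels)
print(clf.predict(vec.transform([add_concepts("euro exchange rates climb")])))
```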
  2. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.13
    Abstract
    This research revisits the classic Turing test and compares recent large language models such as ChatGPT for their abilities to reproduce human-level comprehension and compelling text generation. Two task challenges - summary and question answering - prompt ChatGPT to produce original content (98-99%) from a single text entry and sequential questions initially posed by Turing in 1950. We score the original and generated content against the OpenAI GPT-2 Output Detector from 2019, and establish multiple cases where the generated content proves original and undetectable (98%). The question of a machine fooling a human judge recedes in this work relative to the question of "how would one prove it?" The original contribution of the work presents a metric and simple grammatical set for understanding the writing mechanics of chatbots in evaluating their readability and statistical clarity, engagement, delivery, overall quality, and plagiarism risks. While Turing's original prose scores at least 14% below the machine-generated output, whether an algorithm displays hints of Turing's true initial thoughts (the "Lovelace 2.0" test) remains unanswerable.
    Source
    https://arxiv.org/abs/2212.06721
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.11
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
    Content
    A thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Vgl. unter: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
    Imprint
    Guelph, Ontario : University of Guelph
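The thesis above scores candidate multi-word terms with word-association ("glue") measures and selects terms with the LocalMaxs algorithm. The sketch below only ranks bigrams by a Dice association score over a toy corpus; the full LocalMaxs criterion would additionally keep a candidate only if its glue is a local maximum relative to its sub- and super-grams. The corpus string is an illustrative assumption.

```python
# Minimal sketch: rank candidate two-word terms by Dice association strength.
from collections import Counter

corpus = ("information retrieval systems support information access . "
          "multi word term extraction helps information retrieval .").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def dice(pair):
    # 2 * f(w1 w2) / (f(w1) + f(w2)): one of several possible glue measures.
    w1, w2 = pair
    return 2 * bigrams[pair] / (unigrams[w1] + unigrams[w2])

for pair in sorted(bigrams, key=dice, reverse=True)[:5]:
    print(" ".join(pair), round(dice(pair), 3))
```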
  4. Chou, C.; Chu, T.: ¬An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.08
    Abstract
    In light of AI (Artificial Intelligence) and NLP (Natural language processing) technologies, this article examines the feasibility of using AI/NLP models to enhance the subject indexing of digital resources. While BERT (Bidirectional Encoder Representations from Transformers) models are widely used in scholarly communities, the authors assess whether BERT models can be used in machine-assisted indexing in the Project Gutenberg collection, through suggesting Library of Congress subject headings filtered by certain Library of Congress Classification subclass labels. The findings of this study are informative for further research on BERT models to assist with automatic subject indexing for digital library collections.
    Source
    Cataloging and classification quarterly. 60(2022) no.8, p.807-835
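The study above asks whether transformer models can suggest Library of Congress Subject Headings for Project Gutenberg texts. As a hedged stand-in, not the authors' setup, the sketch below scores a book description against a tiny hand-picked heading list with an off-the-shelf zero-shot entailment model; the model choice, the headings, and the description are assumptions for illustration, and the paper's LCC-subclass filtering is not reproduced.

```python
# Illustrative stand-in for machine-assisted subject suggestion.
from transformers import pipeline

headings = ["Natural language processing (Computer science)",
            "Machine learning",
            "Library cataloging",
            "Text classification"]

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

description = ("An investigation of transformer language models for assigning "
               "subject headings to digitized public-domain books.")

result = classifier(description, candidate_labels=headings, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{score:.2f}  {label}")
```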
  5. Mustafa el Hadi, W.: Human language technology and its role in information access and management (2003) 0.06
    Abstract
    The role of linguistics in information access, extraction and dissemination is essential. Radical changes in the techniques of information and communication at the end of the twentieth century have had a significant effect on the function of the linguistic paradigm and its applications in all forms of communication. The introduction of new technical means has deeply changed the possibilities for the distribution of information. In this situation, what is the role of the linguistic paradigm and its practical applications, i.e., natural language processing (NLP) techniques when applied to information access? What solutions can linguistics offer in human-computer interaction, extraction and management? Many fields show the relevance of the linguistic paradigm through the various technologies that require NLP, such as document and message understanding, information detection, extraction, and retrieval, question and answer, cross-language information retrieval (CLIR), text summarization, filtering, and spoken document retrieval. This paper focuses on the central role of human language technologies in the information society, surveys the current situation, describes the benefits of the above-mentioned applications, outlines successes and challenges, and discusses solutions. It reviews the resources and means needed to advance information access and dissemination across language boundaries in the twenty-first century. Multilingualism, which is a natural result of globalization, requires more effort in the direction of language technology. The scope of human language technology (HLT) is large, so we limit our review to applications that involve multilinguality.
    Source
    Cataloging and classification quarterly. 37(2003) nos.1/2, S.131-151
  6. Dorr, B.J.: Large-scale dictionary construction for foreign language tutoring and interlingual machine translation (1997) 0.05
    Abstract
    Describes techniques for automatic construction of dictionaries for use in large-scale foreign language tutoring (FLT) and interlingual machine translation (MT) systems. The dictionaries are based on a language independent representation called lexical conceptual structure (LCS). Demonstrates that synonymous verb senses share distribution patterns. Shows how the syntax-semantics relation can be used to develop a lexical acquisition approach that contributes both toward the enrichment of existing online resources and toward the development of lexicons containing more complete information than is provided in any of these resources alone. Describes the structure of the LCS and shows how this representation is used in FLT and MT. Focuses on the problem of building LCS dictionaries for large-scale FLT and MT. Describes authoring tools for manual and semi-automatic construction of LCS dictionaries. Presents an approach that uses linguistic techniques for building word definitions automatically. The techniques have been implemented as part of a set of lexicon-development tools used in the MILT FLT project.
    Date
    31. 7.1996 9:22:19
  7. Ferret, O.; Grau, B.; Masson, N.: Utilisation d'un réseau de cooccurrences lexicales pour améliorer une analyse thématique fondée sur la distribution des mots (1999) 0.05
    Footnote
    Übers. d. Titels: Use of a network of lexical co-occurrences to improve a thematic analysis based on the distribution of words
  8. Arsenault, C.: Aggregation consistency and frequency of Chinese words and characters (2006) 0.05
    Abstract
    Purpose - Aims to measure syllable aggregation consistency of Romanized Chinese data in the title fields of bibliographic records. Also aims to verify if the term frequency distributions satisfy conventional bibliometric laws. Design/methodology/approach - Uses Cooper's interindexer formula to evaluate aggregation consistency within and between two sets of Chinese bibliographic data. Compares the term frequency distributions of polysyllabic words and monosyllabic characters (for vernacular and Romanized data) with the Lotka and the generalised Zipf theoretical distributions. The fits are tested with the Kolmogorov-Smirnov test. Findings - Finds high internal aggregation consistency within each data set but some aggregation discrepancy between sets. Shows that word (polysyllabic) distributions satisfy Lotka's law but that character (monosyllabic) distributions do not abide by the law. Research limitations/implications - The findings are limited to only two sets of bibliographic data (for aggregation consistency analysis) and to one set of data for the frequency distribution analysis. Only two bibliometric distributions are tested. Internal consistency within each database remains fairly high. Therefore the main argument against syllable aggregation does not appear to hold true. The analysis revealed that Chinese words and characters behave differently in terms of frequency distribution but that there is no noticeable difference between vernacular and Romanized data. The distribution of Romanized characters exhibits the worst case in terms of fit to either Lotka's or Zipf's laws, which indicates that Romanized data in aggregated form appear to be a preferable option. Originality/value - Provides empirical data on consistency and distribution of Romanized Chinese titles in bibliographic records.
    Source
    Journal of documentation. 62(2006) no.5, S.606-633
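The article above checks whether observed term frequencies follow Lotka- or Zipf-type distributions, with fit tested by the Kolmogorov-Smirnov statistic. Below is a minimal sketch of that kind of check, assuming a toy frequency list: fit a generalised Zipf (power-law) curve in log-log space and measure the largest gap between observed and fitted cumulative proportions. It is not the paper's exact estimation or testing procedure.

```python
# Hedged sketch: Zipf-style power-law fit plus a KS-style distance.
import numpy as np

freqs = np.array([120, 62, 40, 31, 24, 20, 17, 15, 13, 12], float)  # illustrative
ranks = np.arange(1, len(freqs) + 1)

# Fit log f = log C + a * log r (generalised Zipf) by least squares.
a, logC = np.polyfit(np.log(ranks), np.log(freqs), 1)
fitted = np.exp(logC) * ranks ** a

# Largest gap between observed and fitted cumulative proportions.
F_obs = np.cumsum(freqs) / freqs.sum()
F_fit = np.cumsum(fitted) / fitted.sum()
D = np.abs(F_obs - F_fit).max()
print(f"Zipf exponent ~ {-a:.2f}, KS distance D = {D:.3f}")
```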
  9. Morris, V.: Automated language identification of bibliographic resources (2020) 0.05
    Abstract
    This article describes experiments in the use of machine learning techniques at the British Library to assign language codes to catalog records, in order to provide information about the language of content of the resources described. In the first phase of the project, language codes were assigned to 1.15 million records with 99.7% confidence. The automated language identification tools developed will be used to contribute to future enhancement of over 4 million legacy records.
    Date
    2. 3.2020 19:04:22
    Source
    Cataloging and classification quarterly. 58(2020) no.1, S.1-27
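The article above describes assigning language codes to catalogue records with machine learning, accepting a code only at high confidence. The sketch below is an illustrative stand-in rather than the British Library's pipeline: a character n-gram Naive Bayes classifier trained on a handful of made-up titles, with the note that production use would gate the write-back on a confidence threshold.

```python
# Illustrative stand-in: character n-gram language identification for titles.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

titles = ["The history of the English language", "A survey of machine learning",
          "Die Geschichte der deutschen Sprache", "Einführung in die Informatik",
          "Histoire de la langue française", "Introduction à la linguistique"]
codes  = ["eng", "eng", "ger", "ger", "fre", "fre"]   # toy training data

model = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
                      MultinomialNB())
model.fit(titles, codes)

probs = model.predict_proba(["Grammaire du français moderne"])[0]
best = probs.argmax()
print(model.classes_[best], round(float(probs[best]), 3))
# In practice a code would only be written back above a confidence threshold
# (the article reports 99.7% confidence for 1.15 million records).
```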
  10. Meng, K.; Ba, Z.; Ma, Y.; Li, G.: ¬A network coupling approach to detecting hierarchical linkages between science and technology (2024) 0.04
    Abstract
    Detecting science-technology hierarchical linkages is beneficial for understanding deep interactions between science and technology (S&T). Previous studies have mainly focused on linear linkages between S&T but ignored their structural linkages. In this paper, we propose a network coupling approach to inspect hierarchical interactions of S&T by integrating their knowledge linkages and structural linkages. S&T knowledge networks are first enhanced with bidirectional encoder representation from transformers (BERT) knowledge alignment, and then their hierarchical structures are identified based on K-core decomposition. Hierarchical coupling preferences and strengths of the S&T networks over time are further calculated based on similarities of coupling nodes' degree distribution and similarities of coupling edges' weight distribution. Extensive experimental results indicate that our approach is feasible and robust in identifying the coupling hierarchy with superior performance compared to other isomorphism and dissimilarity algorithms. Our research extends the mindset of S&T linkage measurement by identifying patterns and paths of the interaction of S&T hierarchical knowledge.
    Source
    Journal of the Association for Information Science and Technology. 75(2023) no.2, S.167-187
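Two ingredients of the approach above are K-core decomposition, which exposes a network's hierarchical layers, and a similarity between the degree distributions of the coupled networks. The sketch below illustrates both on random toy graphs; the graphs, the histogram bins, and the cosine similarity are assumptions for illustration, not the paper's S&T data or its full coupling measure.

```python
# Hedged sketch: k-core levels plus a degree-distribution similarity.
import numpy as np
import networkx as nx

science = nx.erdos_renyi_graph(60, 0.08, seed=1)      # toy "science" network
technology = nx.erdos_renyi_graph(60, 0.10, seed=2)   # toy "technology" network

core_levels = nx.core_number(science)                 # node -> k-core level
print("max k-core (science):", max(core_levels.values()))

def degree_hist(G, bins):
    degs = [d for _, d in G.degree()]
    hist, _ = np.histogram(degs, bins=bins, density=True)
    return hist

bins = np.arange(0, 15)
hs, ht = degree_hist(science, bins), degree_hist(technology, bins)
cos = hs @ ht / (np.linalg.norm(hs) * np.linalg.norm(ht) + 1e-12)
print("degree-distribution similarity:", round(float(cos), 3))
```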
  11. Robertson, S.E.; Sparck Jones, K.: Relevance weighting of search terms (1976) 0.04
    Abstract
    Examines statistical techniques for exploiting relevance information to weight search terms. These techniques are presented as a natural extension of weighting methods using information about the distribution of index terms in documents in general. A series of relevance weighting functions is derived and is justified by theoretical considerations. In particular, it is shown that specific weighted search methods are implied by a general probabilistic theory of retrieval. Different applications of relevance weighting are illustrated by experimental results for test collections
    Source
    Journal of the American Society for Information Science. 27(1976), S.129-146
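The relevance weight derived in this line of work is usually quoted, with the standard 0.5 correction for small samples, in the form sketched below: N is the collection size, R the number of known relevant documents, n the number of documents containing the term, and r the number of relevant documents containing it. The example numbers are illustrative.

```python
# Robertson/Sparck Jones relevance weight with the usual 0.5 correction.
import math

def rsj_weight(N, R, n, r):
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

# A term occurring in 8 of 10 known relevant documents but only 100 of
# 10,000 documents overall receives a strongly positive weight.
print(round(rsj_weight(N=10_000, R=10, n=100, r=8), 2))
```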
  12. Hoenkamp, E.; Bruza, P.D.; Song, D.; Huang, Q.: ¬An effective approach to verbose queries using a limited dependencies language model (2009) 0.03
    Abstract
    Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies to more useful statistics. This is done in three steps. The term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document's initial distribution. A secondary contribution is to investigate the practical application of this representation in case the queries become increasingly verbose. In the experiments (based on Lemur's search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with or better than more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
    Source
    Second International Conference on the Theory of Information Retrieval, ICTIR 2009 Cambridge, UK, September 10-12, 2009 Proceedings. Ed.: L. Azzopardi
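The core step described above replaces a query's (or document's) raw term distribution with the stationary distribution of a Markov chain built from term co-occurrence statistics. A minimal sketch, assuming a toy three-term co-occurrence matrix, is below; the paper's estimation details and ranking experiments are not reproduced.

```python
# Hedged sketch: stationary distribution of a co-occurrence Markov chain.
import numpy as np

C = np.array([[4., 2., 1.],
              [2., 3., 2.],
              [1., 2., 5.]])          # toy co-occurrence counts for 3 terms
P = C / C.sum(axis=1, keepdims=True)  # row-stochastic transition matrix

pi = np.full(len(P), 1.0 / len(P))    # start from a uniform distribution
for _ in range(100):                  # power iteration; converges for an ergodic chain
    pi = pi @ P
pi /= pi.sum()
print(np.round(pi, 3))                # stationary term distribution used as the model
```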
  13. Mustafa el Hadi, W.: Terminology & information retrieval : new tools for new needs. Integration of knowledge across boundaries (2003) 0.03
    Abstract
    The radical changes in information and communication techniques at the end of the 20th century have significantly modified the function of terminology and its applications in all forms of communication. The introduction of new media has deeply changed the possibilities of distribution of scientific information. What, in this situation, is the role of terminology and its practical applications? What is the place for multiple functions of terminology in the communication society? What is the impact of natural language processing (NLP) techniques used in its processing and management? In this article we will focus on the possibilities NLP techniques offer and how they can be directed towards the satisfaction of the newly expressed needs.
    Source
    Challenges in knowledge representation and organization for the 21st century: Integration of knowledge across boundaries. Proceedings of the 7th ISKO International Conference Granada, Spain, July 10-13, 2002. Ed.: M. López-Huertas
  14. Mustafa el Hadi, W.; Jouis, C.: Evaluating natural language processing systems as a tool for building terminological databases (1996) 0.03
    Abstract
    Natural language processing systems use various modules in order to identify terms or concept names and the logico-semantic relations they entertain. The approaches involved in corpus analysis are either based on morpho-syntactic analysis, statistical analysis, semantic analysis, recent connectionist models or any combination of two or more of these approaches. This paper will examine the capacity of natural language processing systems to create databases from extensive textual data. We are endeavouring to evaluate the contribution of these systems, their advantages and their shortcomings.
    Source
    Knowledge organization and change: Proceedings of the Fourth International ISKO Conference, 15-18 July 1996, Library of Congress, Washington, DC. Ed.: R. Green
  15. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.03
    Abstract
    This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
    Source
    Annual review of information science and technology. 39(2005), S.3-32
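One standard instantiation of the query-likelihood approach surveyed in this chapter scores a document by the probability that its language model, smoothed toward the collection model with a Dirichlet prior, generated the query. The sketch below assumes a toy two-document collection and an illustrative smoothing parameter mu; it is one common variant, not a summary of every model the chapter covers.

```python
# Hedged sketch: query likelihood with Dirichlet-smoothed document models.
import math
from collections import Counter

docs = {"d1": "statistical language modeling for retrieval".split(),
        "d2": "speech recognition with markov models".split()}
collection = Counter(w for d in docs.values() for w in d)
coll_len = sum(collection.values())

def query_likelihood(query, doc, mu=2000):
    tf = Counter(doc)
    score = 0.0
    for w in query:
        p_coll = collection[w] / coll_len
        if p_coll == 0:          # term unseen in the whole collection: skip here
            continue
        p = (tf[w] + mu * p_coll) / (len(doc) + mu)   # Dirichlet smoothing
        score += math.log(p)
    return score

q = "language model retrieval".split()
print(sorted(docs, key=lambda d: query_likelihood(q, docs[d]), reverse=True))
```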
  16. Mustafa el Hadi, W.; Jouis, C.: Natural language processing-based systems for terminological construction and their contribution to information retrieval (1996) 0.03
    Abstract
    This paper will survey the capacity of natural language processing (NLP) systems to identify terms or concept names related to a specific field of knowledge (construction of a reference terminology) and the logico-semantic relations they entertain. The scope of our study will be limited to French language NLP systems whose purpose is automatic terms identification with textual area-grounded terms providing access keys to information
    Source
    TKE'96: Terminology and knowledge engineering. Proceedings 4th International Congress on Terminology and Knowledge Engineering, 26.-28.8.1996, Wien. Ed.: C. Galinski u. K.-D. Schmitz
  17. Chen, L.; Fang, H.: ¬An automatic method for ex-tracting innovative ideas based on the Scopus® database (2019) 0.03
    Abstract
    The novelty of knowledge claims in a research paper can be considered an evaluation criterion for papers to supplement citations. To provide a foundation for research evaluation from the perspective of innovativeness, we propose an automatic approach for extracting innovative ideas from the abstracts of technology and engineering papers. The approach extracts N-grams as candidates based on part-of-speech tagging and determines whether they are novel by checking the Scopus® database to determine whether they had ever been presented previously. Moreover, we discussed the distributions of innovative ideas in different abstract structures. To improve the performance by excluding noisy N-grams, a list of stopwords and a list of research description characteristics were developed. We selected abstracts of articles published from 2011 to 2017 with the topic of semantic analysis as the experimental texts. Excluding noisy N-grams, considering the distribution of innovative ideas in abstracts, and suitably combining N-grams can effectively improve the performance of automatic innovative idea extraction. Unlike co-word and co-citation analysis, innovative-idea extraction aims to identify the differences in a paper from all previously published papers.
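The extraction step described above pulls N-grams from abstracts, filters noise, and flags as novel those not found in earlier literature (checked against the Scopus database in the paper). The sketch below is a much-reduced stand-in: a stopword filter replaces the paper's part-of-speech tagging and curated noise lists, and a small in-memory set of phrases replaces the database check. All data are illustrative.

```python
# Hedged sketch of N-gram candidate extraction with a novelty check.
import re

STOP = {"the", "a", "an", "of", "for", "and", "to", "in", "on", "with", "we", "this"}
SEEN = {"semantic analysis", "neural network"}   # stand-in for prior literature

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

abstract = ("We propose a cross-lingual semantic drift detector "
            "based on semantic analysis of citation contexts.")
tokens = re.findall(r"[a-z][a-z-]+", abstract.lower())

candidates = [g for n in (2, 3) for g in ngrams(tokens, n)
              if g.split()[0] not in STOP and g.split()[-1] not in STOP]
novel = [g for g in candidates if g not in SEEN]
print(novel)
```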
  18. Kim, W.; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms (2001) 0.02
    Abstract
    Kim and Wilbur present three techniques for the algorithmic identification in text of content-bearing terms and phrases intended for human use as entry points or hyperlinks. Using a set of 1,075 terms from MEDLINE evaluated on a zero to four, stop word to definite content word scale, they evaluate the ranked lists of their three methods based on their placement of content words in the top ranks. Data consist of the natural language elements of 304,057 MEDLINE records from 1996, and 173,252 Wall Street Journal records from the TIPSTER collection. Phrases are extracted by breaking at punctuation marks and stop words, normalized by lower casing, replacement of nonalphanumerics with spaces, and the reduction of multiple spaces. In the "strength of context" approach each document is a vector of binary values for each word or word pair. The words or word pairs are removed from all documents, and the Robertson-Sparck Jones relevance weight for each term computed, negative weights replaced with zero, those below a randomness threshold ignored, and the remainder summed for each document, to yield a score for the document and finally to assign to the term the average document score for documents in which it occurred. The average of these word scores is assigned to the original phrase. The "frequency clumping" approach defines a random phrase as one whose distribution among documents is Poisson in character. A p-value, the probability that a phrase frequency of occurrence would be equal to, or less than, Poisson expectations, is computed, and a score assigned which is the negative log of that value. In the "database comparison" approach, if a phrase occurring in a document allows prediction that the document is in MEDLINE rather than in the Wall Street Journal, it is considered to be content-bearing for MEDLINE. The score is computed by dividing the number of occurrences of the term in MEDLINE by occurrences in the Journal, and taking the product of all these values. The one hundred top and bottom ranked phrases that occurred in at least 500 documents were collected for each method. The union set had 476 phrases. A second selection was made of two-word phrases occurring each in only three documents, with a union of 599 phrases. A judge then ranked the two sets of terms as to subject specificity on a 0 to 4 scale. Precision was the average subject specificity of the first r ranks, recall the fraction of the subject-specific phrases in the first r ranks, and eleven-point average precision was used as a summary measure. The three methods all move content-bearing terms forward in the lists, as does the use of the sum of the logs of the three methods.
    Source
    Journal of the American Society for Information Science and Technology. 52(2001) no.3, S.247-259
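One reading of the "frequency clumping" score summarized above: under a random (Poisson) scatter of tf occurrences over N documents, the phrase would be expected in about N * (1 - exp(-tf/N)) documents, while a content-bearing phrase clumps into far fewer. The sketch below computes the binomial probability of seeing the observed document frequency or fewer and returns its negative log; the exact model Kim and Wilbur use may differ, and the numbers are illustrative.

```python
# Hedged sketch of a Poisson "frequency clumping" score for a phrase.
import math
from scipy.stats import binom

def clumping_score(tf, df, N):
    lam = tf / N                       # expected occurrences per document if random
    p_doc = 1.0 - math.exp(-lam)       # P(a document contains the phrase) if random
    p_value = binom.cdf(df, N, p_doc)  # P(seeing df or fewer matching documents)
    return -math.log(max(p_value, 1e-300))

# 500 occurrences packed into 60 documents out of 100,000 is far clumpier
# than the same 500 occurrences spread over 450 documents.
print(round(clumping_score(tf=500, df=60, N=100_000), 1))
print(round(clumping_score(tf=500, df=450, N=100_000), 1))
```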
  19. Sienel, J.; Weiss, M.; Laube, M.: Sprachtechnologien für die Informationsgesellschaft des 21. Jahrhunderts (2000) 0.02
    Date
    26.12.2000 13:22:17
    Source
    Sprachtechnologie für eine dynamische Wirtschaft im Medienzeitalter - Language technologies for dynamic business in the age of the media - L'ingénierie linguistique au service de la dynamisation économique à l'ère du multimédia: Tagungsakten der XXVI. Jahrestagung der Internationalen Vereinigung Sprache und Wirtschaft e.V., 23.-25.11.2000, Fachhochschule Köln. Hrsg.: K.-D. Schmitz
  20. Warner, A.J.: Natural language processing (1987) 0.02
    Source
    Annual review of information science and technology. 22(1987), S.79-108

Types

  • a 458
  • el 56
  • m 39
  • s 21
  • x 12
  • p 7
  • b 1
  • d 1
  • n 1
  • r 1
