Search (146 results, page 1 of 8)

  • Active filter: theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.42
    [Score detail: 0.42283043 = coord(6/9) × Σ ClassicSimilarity term weights; matched terms: "3a", "2f" (in three fields), "data", "22"]
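    The term weights behind these scores follow Lucene's ClassicSimilarity (tf-idf): each matched term contributes queryWeight × fieldWeight, where tf is the square root of the term's in-document frequency and idf = 1 + ln(maxDocs/(docFreq+1)). As a worked reconstruction from the figures reported for the term "2f" (freq = 2, docFreq = 24, maxDocs = 44218, queryNorm = 0.036818076, fieldNorm = 0.046875):
    \begin{align*}
    \mathrm{tf} &= \sqrt{2} \approx 1.4142135, \qquad
    \mathrm{idf} = 1 + \ln\frac{44218}{24 + 1} \approx 8.478011,\\
    w &= \underbrace{(\mathrm{idf} \cdot \mathrm{queryNorm})}_{0.31214407 \,=\, \mathrm{queryWeight}}
      \times \underbrace{(\mathrm{tf} \cdot \mathrm{idf} \cdot \mathrm{fieldNorm})}_{0.56201804 \,=\, \mathrm{fieldWeight}}
      \approx 0.1754306.
    \end{align*}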
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
    Source
    Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK
  2. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.31
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
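    The thesis's three new association measures are not reproduced in this record; as a minimal sketch of the kind of word-association score a LocalMaxs-style extractor ranks (using the standard Dice coefficient as a stand-in, an assumption, not one of the proposed measures):
    from collections import Counter

    def dice(bigram_count, w1_count, w2_count):
        # Dice association: close to 1 when the two words mostly occur together.
        return 2.0 * bigram_count / (w1_count + w2_count)

    def score_bigrams(tokens):
        # Count unigrams and adjacent bigrams, then score each bigram.
        unigrams = Counter(tokens)
        bigrams = Counter(zip(tokens, tokens[1:]))
        return {bg: dice(c, unigrams[bg[0]], unigrams[bg[1]])
                for bg, c in bigrams.items()}

    tokens = "multi word term extraction finds multi word terms".split()
    print(sorted(score_bigrams(tokens).items(), key=lambda kv: -kv[1])[:3])
    Roughly, LocalMaxs then keeps an n-gram whose association score is a local maximum relative to its (n-1)- and (n+1)-gram neighbours.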
    Content
    A thesis presented to the University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  3. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.26
    
    Source
    https://arxiv.org/abs/2212.06721
  4. Arsenault, C.: Aggregation consistency and frequency of Chinese words and characters (2006) 0.03
    
    Abstract
    Purpose - Aims to measure syllable aggregation consistency of Romanized Chinese data in the title fields of bibliographic records. Also aims to verify if the term frequency distributions satisfy conventional bibliometric laws. Design/methodology/approach - Uses Cooper's interindexer formula to evaluate aggregation consistency within and between two sets of Chinese bibliographic data. Compares the term frequency distributions of polysyllabic words and monosyllabic characters (for vernacular and Romanized data) with the Lotka and the generalised Zipf theoretical distributions. The fits are tested with the Kolmogorov-Smirnov test. Findings - Finds high internal aggregation consistency within each data set but some aggregation discrepancy between sets. Shows that word (polysyllabic) distributions satisfy Lotka's law but that character (monosyllabic) distributions do not abide by the law. Research limitations/implications - The findings are limited to only two sets of bibliographic data (for aggregation consistency analysis) and to one set of data for the frequency distribution analysis. Only two bibliometric distributions are tested. Internal consistency within each database remains fairly high. Therefore the main argument against syllable aggregation does not appear to hold true. The analysis revealed that Chinese words and characters behave differently in terms of frequency distribution but that there is no noticeable difference between vernacular and Romanized data. The distribution of Romanized characters exhibits the worst case in terms of fit to either Lotka's or Zipf's laws, which indicates that Romanized data in aggregated form appear to be a preferable option. Originality/value - Provides empirical data on consistency and distribution of Romanized Chinese titles in bibliographic records.
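    For reference, the theoretical distributions tested have these standard forms (notation assumed; the fitted parameters are not given in this record): Lotka's law for the number of types occurring y times, the generalised Zipf (Zipf-Mandelbrot) law for frequency as a function of rank r, and the Kolmogorov-Smirnov statistic used to test the fits:
    \[
    f(y) = \frac{C}{y^{\alpha}}, \qquad
    f(r) = \frac{C'}{(r + q)^{s}}, \qquad
    D_n = \sup_x \bigl| F_n(x) - F(x) \bigr|
    \]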
  5. Polity, Y.: Vers une ergonomie linguistique (1994) 0.02
    
    Abstract
    Analyzed a special type of man-machine interaction: searching an information system with natural language. A model of full-text processing for information retrieval was proposed that considered the system's users and how they employ information. Describes how INIST (the National Institute for Scientific and Technical Information) is developing computer-assisted indexing as an aid to improving relevance when retrieving information from bibliographic data banks.
  6. Morris, V.: Automated language identification of bibliographic resources (2020) 0.02
    
    Date
    2. 3.2020 19:04:22
  7. Basili, R.; Pazienza, M.T.; Velardi, P.: ¬An empirical symbolic approach to natural language processing (1996) 0.01
    
    Abstract
    Describes and evaluates the results of a large-scale lexical learning system, ARISTO-LEX, that uses a combination of probabilistic and knowledge-based methods for the acquisition of selectional restrictions of words in sublanguages. Presents experimental data obtained from different corpora in different domains and languages, and shows that the acquired lexical data not only have practical applications in natural language processing but are also useful for a comparative analysis of sublanguages.
    Date
    6. 3.1997 16:22:15
  8. Conceptual structures : logical, linguistic, and computational issues. 8th International Conference on Conceptual Structures, ICCS 2000, Darmstadt, Germany, August 14-18, 2000 (2000) 0.01
    
    Content
    Concepts and Language: The Role of Conceptual Structure in Human Evolution (Keith Devlin) - Concepts in Linguistics - Concepts in Natural Language (Gisela Harras) - Patterns, Schemata, and Types: Author Support through Formalized Experience (Felix H. Gatzemeier) - Conventions and Notations for Knowledge Representation and Retrieval (Philippe Martin) - Conceptual Ontology: Ontology, Metadata, and Semiotics (John F. Sowa) - Pragmatically Yours (Mary Keeler) - Conceptual Modeling for Distributed Ontology Environments (Deborah L. McGuinness) - Discovery of Class Relations in Exception Structured Knowledge Bases (Hendra Suryanto, Paul Compton) - Conceptual Graphs: Perspectives: CGs Applications: Where Are We 7 Years after the First ICCS? (Michel Chein, David Genest) - The Engineering of a CC-Based System: Fundamental Issues (Guy W. Mineau) - Conceptual Graphs, Metamodeling, and Notation of Concepts (Olivier Gerbé, Guy W. Mineau, Rudolf K. Keller) - Knowledge Representation and Reasonings Based on Graph Homomorphism (Marie-Laure Mugnier) - User Modeling Using Conceptual Graphs for Intelligent Agents (James F. Baldwin, Trevor P. Martin, Aimilia Tzanavari) - Towards a Unified Querying System of Both Structured and Semi-structured Imprecise Data Using Fuzzy View (Patrice Buche, Ollivier Haemmerlé) - Formal Semantics of Conceptual Structures: The Extensional Semantics of the Conceptual Graph Formalism (Guy W. Mineau) - Semantics of Attribute Relations in Conceptual Graphs (Pavel Kocura) - Nested Concept Graphs and Triadic Power Context Families (Susanne Prediger) - Negations in Simple Concept Graphs (Frithjof Dau) - Extending the CG Model by Simulations (Jean-François Baget) - Contextual Logic and Formal Concept Analysis: Building and Structuring Description Logic Knowledge Bases: Using Least Common Subsumers and Concept Analysis (Franz Baader, Ralf Molitor) - On the Contextual Logic of Ordinal Data (Silke Pollandt, Rudolf Wille) - Boolean Concept Logic (Rudolf Wille) - Lattices of Triadic Concept Graphs (Bernd Groh, Rudolf Wille) - Formalizing Hypotheses with Concepts (Bernhard Ganter, Sergei O. Kuznetsov) - Generalized Formal Concept Analysis (Laurent Chaudron, Nicolas Maille) - A Logical Generalization of Formal Concept Analysis (Sébastien Ferré, Olivier Ridoux) - On the Treatment of Incomplete Knowledge in Formal Concept Analysis (Peter Burmeister, Richard Holzer) - Conceptual Structures in Practice: Logic-Based Networks: Concept Graphs and Conceptual Structures (Peter W. Eklund) - Conceptual Knowledge Discovery and Data Analysis (Joachim Hereth, Gerd Stumme, Rudolf Wille, Uta Wille) - CEM - A Conceptual Email Manager (Richard Cole, Gerd Stumme) - A Contextual-Logic Extension of TOSCANA (Peter Eklund, Bernd Groh, Gerd Stumme, Rudolf Wille) - A Conceptual Graph Model for W3C Resource Description Framework (Olivier Corby, Rose Dieng, Cédric Hébert) - Computational Aspects of Conceptual Structures: Computing with Conceptual Structures (Bernhard Ganter) - Symmetry and the Computation of Conceptual Structures (Robert Levinson) - An Introduction to SNePS 3 (Stuart C. Shapiro) - Composition Norm Dynamics Calculation with Conceptual Graphs (Aldo de Moor) - From PROLOG++ to PROLOG+CG: A CG Object-Oriented Logic Programming Language (Adil Kabbaj, Martin Janta-Polczynski) - A Cost-Bounded Algorithm to Control Events Generalization (Gaël de Chalendar, Brigitte Grau, Olivier Ferret)
  9. Copestake, A.: Acquisition of lexical translation relations from MRDs (1994/95) 0.01
    
    Abstract
    Presents a methodology for extracting information about lexical translation equivalences from the machine-readable versions of conventional dictionaries (MRDs), and describes a series of experiments on the semi-automatic construction of a linked multilingual lexical knowledge base for English, Dutch and Spanish. Discusses the advantages and limitations of using MRDs that this has revealed, and some strategies developed to cover gaps where no direct translation can be found.
  10. Lehrndorfer, A.: Kontrolliertes Deutsch : Linguistische und sprachpsychologische Leitlinien für eine (maschinell) kontrollierte Sprache in der Technischen Dokumentation (1996) 0.01
    
    Content
    The book comprises 8 chapters and 5 appendices: Artificial languages, sublanguages and controlled languages; Theory for language planning, especially for planning of controlled languages; The language situation of technical documentation in Germany; The Lexicon; The syntax; Perspectives; Literature
  11. Semantik, Lexikographie und Computeranwendungen : Workshop ... (Bonn) : 1995.01.27-28 (1996) 0.01
    
    Date
    14. 4.2007 10:04:22
    LCSH
    Semantics / Data processing ; Lexicography / Data processing ; Computational linguistics
    Subject
    Semantics / Data processing ; Lexicography / Data processing ; Computational linguistics
  12. Kim, S.; Ko, Y.; Oard, D.W.: Combining lexical and statistical translation evidence for cross-language information retrieval (2015) 0.01
    
    Abstract
    This article explores how best to use lexical and statistical translation evidence together for cross-language information retrieval (CLIR). Lexical translation evidence is assembled from Wikipedia and from a large machine-readable dictionary, statistical translation evidence is drawn from parallel corpora, and evidence from co-occurrence in the document language provides a basis for limiting the adverse effect of translation ambiguity. Coverage statistics for NII Testbeds and Community for Information Access Research (NTCIR) queries confirm that these resources have complementary strengths. Experiments with translation evidence from a small parallel corpus indicate that even rather rough estimates of translation probabilities can yield further improvements over a strong technique for translation weighting based on using Jensen-Shannon divergence as a term-association measure. Finally, a novel approach to posttranslation query expansion using a random walk over the Wikipedia concept link graph is shown to yield further improvements over alternative techniques for posttranslation query expansion. Evaluation results on the NTCIR-5 English-Korean test collection show statistically significant improvements over strong baselines.
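    For reference, the Jensen-Shannon divergence used here as a term-association measure has the standard symmetric form (notation assumed; P and Q would be the distributions being compared, e.g. translation-probability distributions):
    \[
    \mathrm{JS}(P \,\|\, Q) = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M),
    \qquad M = \tfrac{1}{2}(P + Q),
    \qquad \mathrm{KL}(P \,\|\, M) = \sum_i p_i \log\frac{p_i}{m_i}
    \]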
  13. Liddy, E.D.: Natural language processing for information retrieval and knowledge discovery (1998) 0.01
    
    Date
    22. 9.1997 19:16:05
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  14. Rahmstorf, G.: Concept structures for large vocabularies (1998) 0.01
    
    Abstract
    A technology is described which supports the acquisition, visualisation and manipulation of large vocabularies with associated structures. It is used for dictionary production, terminology data bases, thesauri, library classification systems etc. Essential features of the technology are a lexicographic user interface, variable word description, unlimited list of word readings, a concept language, automatic transformations of formulas into graphic structures, structure manipulation operations and retransformation into formulas. The concept language includes notations for undefined concepts. The structure of defined concepts can be constructed interactively. The technology supports the generation of large vocabularies with structures representing word senses. Concept structures and ordering systems for indexing and retrieval can be constructed separately and connected by associating relations.
    Date
    30.12.2001 19:01:22
  15. WordNet : an electronic lexical database (language, speech and communication) (1998) 0.01
    
    LCSH
    Semantics / Data processing
    Lexicology / Data processing
    English language / Data processing
    Subject
    Semantics / Data processing
    Lexicology / Data processing
    English language / Data processing
  16. Barton, G.E. Jr.; Berwick, R.C.; Ristad, E.S.: Computational complexity and natural language (1987) 0.01
    
    LCSH
    Linguistics / Data processing
    Subject
    Linguistics / Data processing
  17. Hausser, R.: Language and nonlanguage cognition (2021) 0.01
    
    Abstract
    A basic distinction in agent-based data-driven Database Semantics (DBS) is between language and nonlanguage cognition. Language cognition transfers content between agents by means of raw data. Nonlanguage cognition maps between content and raw data inside the focus agent. Recognition applies a concept type to raw data, resulting in a concept token. In language recognition, the focus agent (hearer) takes raw language data (surfaces) produced by another agent (speaker) as input, while nonlanguage recognition takes raw nonlanguage data as input. In either case, the output is a content which is stored in the agent's onboard short-term memory. Action adapts a concept type to a purpose, resulting in a token. In language action, the focus agent (speaker) produces language-dependent surfaces for another agent (hearer), while nonlanguage action produces intentions for a nonlanguage purpose. In either case, the output is raw action data. As long as the procedural implementation of place-holder values works properly, it is compatible with the DBS requirement of input-output equivalence between the natural prototype and the artificial reconstruction.
  18. Yannakoudakis, E.J.; Daraki, J.J.: Lexical clustering and retrieval of bibliographic records (1994) 0.01
    
    Abstract
    Presents a new system that enables users to retrieve catalogue entries on the basis of their lexical similarities and to cluster records in a dynamic fashion. Describes the information retrieval system developed by the Department of Informatics, Athens University of Economics and Business, Greece. The system also offers the means for cyclic retrieval of records from each cluster while allowing the user to define the field to be used in each case. The approach is based on logical keys which are derived from pertinent bibliographic fields and are used for all clustering and information retrieval functions.
  19. Xiang, R.; Chersoni, E.; Lu, Q.; Huang, C.-R.; Li, W.; Long, Y.: Lexical data augmentation for sentiment analysis (2021) 0.01
    
    Abstract
    Machine learning methods, especially deep learning models, have achieved impressive performance in various natural language processing tasks including sentiment analysis. However, deep learning models are more demanding for training data. Data augmentation techniques are widely used to generate new instances based on modifications to existing data or relying on external knowledge bases to address annotated data scarcity, which hinders the full potential of machine learning techniques. This paper presents our work using part-of-speech (POS) focused lexical substitution for data augmentation (PLSDA) to enhance the performance of machine learning algorithms in sentiment analysis. We exploit POS information to identify words to be replaced and investigate different augmentation strategies to find semantically related substitutions when generating new instances. The choice of POS tags as well as a variety of strategies such as semantic-based substitution methods and sampling methods are discussed in detail. Performance evaluation focuses on the comparison between PLSDA and two previous lexical substitution-based data augmentation methods, one of which is thesaurus-based and the other lexicon-manipulation based. Our approach is tested on five English sentiment analysis benchmarks: SST-2, MR, IMDB, Twitter, and AirRecord. Hyperparameters such as the candidate similarity threshold and the number of newly generated instances are optimized. Results show that six classifiers (SVM, LSTM, BiLSTM-AT, bidirectional encoder representations from transformers [BERT], XLNet, and RoBERTa) trained with PLSDA achieve an accuracy improvement of more than 0.6% compared to the two previous lexical substitution methods, averaged over the five benchmarks. Introducing POS constraints and well-designed augmentation strategies can improve the reliability of lexical data augmentation methods. Consequently, PLSDA significantly improves the performance of sentiment analysis algorithms.
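    As a minimal sketch of POS-focused lexical substitution (the paper's semantic-based substitution and sampling strategies are not reproduced here; the synonym table and names are hypothetical):
    import random

    # Hypothetical synonym table keyed by (word, POS); a real system would
    # draw candidates from a thesaurus or embedding neighbours instead.
    SYNONYMS = {
        ("good", "ADJ"): ["great", "fine"],
        ("movie", "NOUN"): ["film"],
    }

    def augment(tagged_tokens, p=0.3, seed=0):
        # Return a new instance with some POS-eligible words substituted.
        rng = random.Random(seed)
        out = []
        for word, pos in tagged_tokens:
            candidates = SYNONYMS.get((word, pos))
            if candidates and rng.random() < p:
                out.append((rng.choice(candidates), pos))
            else:
                out.append((word, pos))
        return out

    sentence = [("a", "DET"), ("good", "ADJ"), ("movie", "NOUN")]
    print(augment(sentence, p=1.0))  # substitutes every eligible word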
  20. Ruge, G.: Experiments on linguistically-based term associations (1992) 0.01
    
    Abstract
    Describes the hyperterm system REALIST (REtrieval Aids by LInguistic and STatistics) and its semantic component. The semantic component of REALIST generates semantic term relations such as synonyms. It takes as input a free-text database and generates as output term pairs that are semantically related with respect to their meanings in the database. In the first step, an automatic syntactic analysis provides linguistic knowledge about the terms of the database. In the second step, this knowledge is compared by statistical similarity computation. Various experiments with different similarity measures are described.
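    As a minimal sketch of the second step, assuming terms are compared by cosine similarity over counts of their syntactic contexts (the paper evaluates several similarity measures; cosine and the toy contexts below are illustrative assumptions):
    import math
    from collections import Counter

    def cosine(u, v):
        # Cosine similarity of two sparse context-count vectors.
        dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
        norm_u = math.sqrt(sum(c * c for c in u.values()))
        norm_v = math.sqrt(sum(c * c for c in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    # Hypothetical head/modifier contexts produced by the syntactic analysis.
    contexts = {
        "retrieval": Counter({"mod:information": 5, "obj-of:improve": 2}),
        "search":    Counter({"mod:information": 4, "obj-of:improve": 1}),
    }
    print(cosine(contexts["retrieval"], contexts["search"]))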

Languages

  • e 117
  • d 27
  • f 1

Types

  • a 112
  • el 16
  • m 16
  • s 9
  • x 7
  • p 3
  • d 1
