Search (460 results, page 1 of 23)

  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.15
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  2. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.12
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language- and domain-independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization.
    Content
    A thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  3. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.09
    
    Source
    https://arxiv.org/abs/2212.06721
  4. Fóris, A.: Network theory and terminology (2013) 0.05
    
    Abstract
    The paper aims to present the relations of network theory and terminology. The model of scale-free networks, which was developed recently and has since been widely applied, can be effectively used in terminology research as well. Operation based on the principle of networks is a universal characteristic of complex systems. Networks are governed by general laws. The model of scale-free networks can be viewed as a statistical-probability model, and it can be described with mathematical tools. Its main feature is that "everything is connected to everything else," that is, every node is reachable (in a few steps) starting from any other node; this phenomenon is called "the small world phenomenon." The existence of a linguistic network and the general laws of the operation of networks enable us to place issues of language use in the complex system of relations that reveal the deeper connections between phenomena with the help of networks embedded in each other. The realization of the metaphor that language also has a network structure is the basis of the classification methods of the terminological system, and likewise of the ways of creating terminology databases, which serve the purpose of providing easy and versatile accessibility to specialised knowledge.
    Date
    2. 9.2014 21:22:48
  5. Moisl, H.: Artificial neural networks and Natural Language Processing (2009) 0.04
    
    Abstract
    This entry gives an overview of work to date on natural language processing (NLP) using artificial neural networks (ANN). It is in three main parts: the first gives a brief introduction to ANNs, the second outlines some of the main issues in ANN-based NLP, and the third surveys specific application areas. Each part cites a representative selection of research literature that itself contains pointers to further reading.
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  6. Ruge, G.: Sprache und Computer : Wortbedeutung und Termassoziation. Methoden zur automatischen semantischen Klassifikation (1995) 0.04
    
    Content
    Contains the following chapters: (1) Motivation; (2) Language philosophical foundations; (3) Structural comparison of extensions; (4) Earlier approaches towards term association; (5) Experiments; (6) Spreading-activation networks or memory models; (7) Perspective. Appendices: Heads and modifiers of 'car'. Glossary. Index. English title: Language and computer. Word semantics and term association. Methods towards an automatic semantic classification.
    Footnote
    Review in: Knowledge organization 22(1995) no.3/4, S.182-184 (M.T. Rolland)
  7. Meng, K.; Ba, Z.; Ma, Y.; Li, G.: ¬A network coupling approach to detecting hierarchical linkages between science and technology (2024) 0.03
    
    Abstract
    Detecting science-technology hierarchical linkages is beneficial for understanding deep interactions between science and technology (S&T). Previous studies have mainly focused on linear linkages between S&T but ignored their structural linkages. In this paper, we propose a network coupling approach to inspect hierarchical interactions of S&T by integrating their knowledge linkages and structural linkages. S&T knowledge networks are first enhanced with bidirectional encoder representation from transformers (BERT) knowledge alignment, and then their hierarchical structures are identified based on K-core decomposition. Hierarchical coupling preferences and strengths of the S&T networks over time are further calculated based on similarities of coupling nodes' degree distribution and similarities of coupling edges' weight distribution. Extensive experimental results indicate that our approach is feasible and robust in identifying the coupling hierarchy with superior performance compared to other isomorphism and dissimilarity algorithms. Our research extends the mindset of S&T linkage measurement by identifying patterns and paths of the interaction of S&T hierarchical knowledge.
    Source
    Journal of the Association for Information Science and Technology. 75(2024) no.2, S.167-187
  8. Hofstadter, D.: Artificial neural networks today are not conscious (2022) 0.03
    
    Content
    Cf. also: Agüera y Arcas, B.: Artificial neural networks are making strides towards consciousness.
    Source
    ¬The Economist. 2022, [https://www.economist.com/by-invitation/2022/06/09/artificial-neural-networks-today-are-not-conscious-according-to-douglas-hofstadter?giftId=81ea03d7-78f3-4e84-8824-6aa9dac9ab01]
  9. Agüera y Arcas, B.: Artificial neural networks are making strides towards consciousness (2022) 0.03
    
    Content
    Cf. also: Hofstadter, D.: Artificial neural networks today are not conscious.
    Source
    ¬The Economist. 2022, [https://www.economist.com/by-invitation/2022/06/09/artificial-neural-networks-are-making-strides-towards-consciousness-according-to-blaise-aguera-y-arcas?giftId=89e08696-9884-4670-b164-df58fffdf067]
  10. Suissa, O.; Elmalech, A.; Zhitomirsky-Geffet, M.: Text analysis using deep neural networks in digital humanities and information science (2022) 0.03
    
    Abstract
    Combining computational technologies and humanities is an ongoing effort aimed at making resources such as texts, images, audio, video, and other artifacts digitally available, searchable, and analyzable. In recent years, deep neural networks (DNN) dominate the field of automatic text analysis and natural language processing (NLP), in some cases presenting a super-human performance. DNNs are the state-of-the-art machine learning algorithms solving many NLP tasks that are relevant for Digital Humanities (DH) research, such as spell checking, language detection, entity extraction, author detection, question answering, and other tasks. These supervised algorithms learn patterns from a large number of "right" and "wrong" examples and apply them to new examples. However, using DNNs for analyzing the text resources in DH research presents two main challenges: (un)availability of training data and a need for domain adaptation. This paper explores these challenges by analyzing multiple use-cases of DH studies in recent literature and their possible solutions and lays out a practical decision model for DH experts for when and how to choose the appropriate deep learning approaches for their research. Moreover, in this paper, we aim to raise awareness of the benefits of utilizing deep learning models in the DH community.
    Source
    Journal of the Association for Information Science and Technology. 73(2022) no.2, S.268-287
  11. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.03
    
    Abstract
    This paper presents a method that exploits the hierarchical structure of an indexing vocabulary to guide the development and training of machine learning methods for automatic text categorization. We present the design of a hierarchical classifier based on the divide-and-conquer principle. The method is evaluated using backpropagation neural networks as the machine learning algorithm, which learn to assign MeSH categories to a subset of MEDLINE records. Comparisons with the traditional Rocchio algorithm adapted for text categorization, as well as with flat neural network classifiers, are provided. The results indicate that the use of hierarchical structures improves performance significantly.
    Imprint
    Medford, NJ : Information Today
  12. Radev, D.R.; Joseph, M.T.; Gibson, B.; Muthukrishnan, P.: ¬A bibliometric and network analysis of the field of computational linguistics (2016) 0.03
    
    Abstract
    The ACL Anthology is a large collection of research papers in computational linguistics. Citation data were obtained using text extraction from a collection of PDF files with significant manual postprocessing performed to clean up the results. Manual annotation of the references was then performed to complete the citation network. We analyzed the networks of paper citations, author citations, and author collaborations in an attempt to identify the most central papers and authors. The analysis includes general network statistics, PageRank, metrics across publication years and venues, the impact factor and h-index, as well as other measures.
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.3, S.683-706
  13. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.02
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.
    Footnote
    Contribution to a special issue: Content architecture: exploiting and managing diverse resources: proceedings of the first national conference of the United Kingdom chapter of the International Society for Knowledge Organization (ISKO)
  14. Martínez, F.; Martín, M.T.; Rivas, V.M.; Díaz, M.C.; Ureña, L.A.: Using neural networks for multiword recognition in IR (2003) 0.02
    
    Abstract
    In this paper, a supervised neural network has been used to classify pairs of terms as being multiwords or non-multiwords. Classification is based on the values yielded by different estimators, currently available in the literature, used as inputs for the neural network. Lists of multiwords and non-multiwords have been built to train the net. Afterward, many other pairs of terms have been classified using the trained net. Results obtained in this classification have been used to perform information retrieval tasks. Experiments show that detecting multiwords results in better performance of the IR methods.
  15. Warner, A.J.: Natural language processing (1987) 0.02
    
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  16. From information to knowledge : conceptual and content analysis by computer (1995) 0.02
    
    Content
    SCHMIDT, K.M.: Concepts - content - meaning: an introduction; DUCHASTEL, J. et al.: The SACAO project: using computation toward textual data analysis; PAQUIN, L.-C. u. L. DUPUY: An approach to expertise transfer: computer-assisted text analysis; HOGENRAAD, R., Y. BESTGEN u. J.-L. NYSTEN: Terrorist rhetoric: texture and architecture; MOHLER, P.P.: On the interaction between reading and computing: an interpretative approach to content analysis; LANCASHIRE, I.: Computer tools for cognitive stylistics; MERGENTHALER, E.: An outline of knowledge based text analysis; NAMENWIRTH, J.Z.: Ideography in computer-aided content analysis; WEBER, R.P. u. J.Z. Namenwirth: Content-analytic indicators: a self-critique; McKINNON, A.: Optimizing the aberrant frequency word technique; ROSATI, R.: Factor analysis in classical archaeology: export patterns of Attic pottery trade; PETRILLO, P.S.: Old and new worlds: ancient coinage and modern technology; DARANYI, S., S. MARJAI u.a.: Caryatids and the measurement of semiosis in architecture; ZARRI, G.P.: Intelligent information retrieval: an application in the field of historical biographical data; BOUCHARD, G., R. ROY u.a.: Computers and genealogy: from family reconstitution to population reconstruction; DEMÉLAS-BOHY, M.-D. u. M. RENAUD: Instability, networks and political parties: a political history expert system prototype; DARANYI, S., A. ABRANYI u. G. KOVACS: Knowledge extraction from ethnopoetic texts by multivariate statistical methods; FRAUTSCHI, R.L.: Measures of narrative voice in French prose fiction applied to textual samples from the enlightenment to the twentieth century; DANNENBERG, R. u.a.: A project in computer music: the musician's workbench
  17. Helbig, H.: Knowledge representation and the semantics of natural language (2014) 0.02
    
    Abstract
    Natural language is not only the most important means of communication between human beings, it is also used over historical periods for the preservation of cultural achievements and their transmission from one generation to the other. During the last few decades, the flood of digitalized information has been growing tremendously. This tendency will continue with the globalisation of information societies and with the growing importance of national and international computer networks. This is one reason why the theoretical understanding and the automated treatment of communication processes based on natural language have such a decisive social and economic impact. In this context, the semantic representation of knowledge originally formulated in natural language plays a central part, because it connects all components of natural language processing systems, be they the automatic understanding of natural language (analysis), the rational reasoning over knowledge bases, or the generation of natural language expressions from formal representations. This book presents a method for the semantic representation of natural language expressions (texts, sentences, phrases, etc.) which can be used as a universal knowledge representation paradigm in the human sciences, like linguistics, cognitive psychology, or philosophy of language, as well as in computational linguistics and in artificial intelligence. It is also an attempt to close the gap between these disciplines, which to a large extent are still working separately.
  18. Levin, M.; Krawczyk, S.; Bethard, S.; Jurafsky, D.: Citation-based bootstrapping for large-scale author disambiguation (2012) 0.02
    
    Abstract
    We present a new, two-stage, self-supervised algorithm for author disambiguation in large bibliographic databases. In the first "bootstrap" stage, a collection of high-precision features is used to bootstrap a training set with positive and negative examples of coreferring authors. A supervised feature-based classifier is then trained on the bootstrap clusters and used to cluster the authors in a larger unlabeled dataset. Our self-supervised approach shares the advantages of unsupervised approaches (no need for expensive hand labels) as well as supervised approaches (a rich set of features that can be discriminatively trained). The algorithm disambiguates 54,000,000 author instances in Thomson Reuters' Web of Knowledge with B3 F1 of .807. We analyze parameters and features, particularly those from citation networks, which have not been deeply investigated in author disambiguation. The most important citation feature is self-citation, which can be approximated without expensive extraction of the full network. For the supervised stage, the minor improvement due to other citation features (increasing F1 from .748 to .767) suggests they may not be worth the trouble of extracting from databases that don't already have them. A lean feature set without expensive abstract and title features performs 130 times faster with about equal F1.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.5, S.1030-1047
  19. Griffiths, T.L.; Steyvers, M.: ¬A probabilistic approach to semantic representation (2002) 0.02
    
    Abstract
    Semantic networks produced from human data have statistical properties that cannot be easily captured by spatial representations. We explore a probabilistic approach to semantic representation that explicitly models the probability with which words occur in different contexts, and hence captures the probabilistic relationships between words. We show that this representation has statistical properties consistent with the large-scale structure of semantic networks constructed by humans, and trace the origins of these properties.
  20. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    
    Date
    15. 3.2000 10:22:37
    Source
    Journal of information science. 25(1999) no.2, S.113-131

Types

  • a 392
  • m 35
  • el 31
  • s 20
  • x 10
  • p 3
  • d 2
  • b 1