Search (6 results, page 1 of 1)

  • × language_ss:"e"
  • × theme_ss:"Computerlinguistik"
  • × type_ss:"el"
  1. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.00
    0.0026559862 = product of:
      0.023903877 = sum of:
        0.023903877 = product of:
          0.047807753 = sum of:
            0.047807753 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.047807753 = score(doc=4888,freq=2.0), product of:
                0.10297151 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02940506 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Date
    1. 3.2013 14:56:22
  2. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.00
    0.0019223152 = product of:
      0.017300837 = sum of:
        0.017300837 = product of:
          0.034601673 = sum of:
            0.034601673 = weight(_text_:web in 2861) [ClassicSimilarity], result of:
              0.034601673 = score(doc=2861,freq=8.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.36057037 = fieldWeight in 2861, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2861)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Today's conventional search engines hardly do provide the essential content relevant to the user's search query. This is because the context and semantics of the request made by the user is not analyzed to the full extent. So here the need for a semantic web search arises. SWS is upcoming in the area of web search which combines Natural Language Processing and Artificial Intelligence. The objective of the work done here is to design, develop and implement a semantic search engine- SIEU(Semantic Information Extraction in University Domain) confined to the university domain. SIEU uses ontology as a knowledge base for the information retrieval process. It is not just a mere keyword search. It is one layer above what Google or any other search engines retrieve by analyzing just the keywords. Here the query is analyzed both syntactically and semantically. The developed system retrieves the web results more relevant to the user query through keyword expansion. The results obtained here will be accurate enough to satisfy the request made by the user. The level of accuracy will be enhanced since the query is analyzed semantically. The system will be of great use to the developers and researchers who work on web. The Google results are re-ranked and optimized for providing the relevant links. For ranking an algorithm has been applied which fetches more apt results for the user query.
  3. Wong, W.; Liu, W.; Bennamoun, M.: Ontology learning from text : a look back and into the future (2010) 0.00
    0.001902995 = product of:
      0.017126955 = sum of:
        0.017126955 = product of:
          0.03425391 = sum of:
            0.03425391 = weight(_text_:web in 4733) [ClassicSimilarity], result of:
              0.03425391 = score(doc=4733,freq=4.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.35694647 = fieldWeight in 4733, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4733)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Ontologies are often viewed as the answer to the need for inter-operable semantics in modern information systems. The explosion of textual information on the "Read/Write" Web coupled with the increasing demand for ontologies to power the Semantic Web have made (semi-)automatic ontology learning from text a very promising research area. This together with the advanced state in related areas such as natural language processing have fuelled research into ontology learning over the past decade. This survey looks at how far we have come since the turn of the millennium, and discusses the remaining challenges that will define the research directions in this area in the near future.
  4. Perovsek, M.; Kranjca, J.; Erjaveca, T.; Cestnika, B.; Lavraca, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016) 0.00
    0.0016311385 = product of:
      0.014680246 = sum of:
        0.014680246 = product of:
          0.029360492 = sum of:
            0.029360492 = weight(_text_:web in 2697) [ClassicSimilarity], result of:
              0.029360492 = score(doc=2697,freq=4.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.3059541 = fieldWeight in 2697, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2697)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Text mining and natural language processing are fast growing areas of research, with numerous applications in business, science and creative industries. This paper presents TextFlows, a web-based text mining and natural language processing platform supporting workflow construction, sharing and execution. The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud. This makes TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications. The paper presents the implemented text mining and language processing modules, and describes some precomposed workflows. Their features are demonstrated on three use cases: comparison of document classifiers and of different part-of-speech taggers on a text categorization problem, and outlier detection in document corpora.
  5. Spitkovsky, V.; Norvig, P.: From words to concepts and back : dictionaries for linking text, entities and ideas (2012) 0.00
    0.001331819 = product of:
      0.011986371 = sum of:
        0.011986371 = product of:
          0.023972742 = sum of:
            0.023972742 = weight(_text_:web in 337) [ClassicSimilarity], result of:
              0.023972742 = score(doc=337,freq=6.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.24981049 = fieldWeight in 337, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning - from turning search queries into relevant results to suggesting targeted keywords for advertisers - is also Google's core competency, and important for many other tasks in information retrieval and natural language processing. We are happy to release a resource, spanning 7,560,141 concepts and 175,100,788 unique text strings, that we hope will help everyone working in these areas. How do we represent concepts? Our approach piggybacks on the unique titles of entries from an encyclopedia, which are mostly proper and common noun phrases. We consider each individual Wikipedia article as representing a concept (an entity or an idea), identified by its URL. Text strings that refer to concepts were collected using the publicly available hypertext of anchors (the text you click on in a web link) that point to each Wikipedia page, thus drawing on the vast link structure of the web. For every English article we harvested the strings associated with its incoming hyperlinks from the rest of Wikipedia, the greater web, and also anchors of parallel, non-English Wikipedia pages. Our dictionaries are cross-lingual, and any concept deemed too fine can be broadened to a desired level of generality using Wikipedia's groupings of articles into hierarchical categories. The data set contains triples, each consisting of (i) text, a short, raw natural language string; (ii) url, a related concept, represented by an English Wikipedia article's canonical location; and (iii) count, an integer indicating the number of times text has been observed connected with the concept's url. Our database thus includes weights that measure degrees of association. For example, the top two entries for football indicate that it is an ambiguous term, which is almost twice as likely to refer to what we in the US call soccer. Vgl. auch: Spitkovsky, V.I., A.X. Chang: A cross-lingual dictionary for english Wikipedia concepts. In: http://nlp.stanford.edu/pubs/crosswikis.pdf.
  6. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; Agarwal, S.; Herbert-Voss, A.; Krueger, G.; Henighan, T.; Child, R.; Ramesh, A.; Ziegler, D.M.; Wu, J.; Winter, C.; Hesse, C.; Chen, M.; Sigler, E.; Litwin, M.; Gray, S.; Chess, B.; Clark, J.; Berner, C.; McCandlish, S.; Radford, A.; Sutskever, I.; Amodei, D.: Language models are few-shot learners (2020) 0.00
    7.6892605E-4 = product of:
      0.0069203344 = sum of:
        0.0069203344 = product of:
          0.013840669 = sum of:
            0.013840669 = weight(_text_:web in 872) [ClassicSimilarity], result of:
              0.013840669 = score(doc=872,freq=2.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.14422815 = fieldWeight in 872, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=872)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.