Search (118 results, page 2 of 6)

  • × theme_ss:"Computerlinguistik"
  1. Robertson, S.E.; Sparck Jones, K.: Relevance weighting of search terms (1976) 0.03
    0.030991834 = product of:
      0.0929755 = sum of:
        0.0929755 = weight(_text_:search in 71) [ClassicSimilarity], result of:
          0.0929755 = score(doc=71,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.5321022 = fieldWeight in 71, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=71)
      0.33333334 = coord(1/3)
    
    Abstract
    Examines statistical techniques for exploiting relevance information to weight search terms. These techniques are presented as a natural extension of weighting methods using information about the distribution of index terms in documents in general. A series of relevance weighting functions is derived and is justified by theoretical considerations. In particular, it is shown that specific weighted search methods are implied by a general probabilistic theory of retrieval. Different applications of relevance weighting are illustrated by experimental results for test collections
  2. Hull, D.; Ait-Mokhtar, S.; Chuat, M.; Eisele, A.; Gaussier, E.; Grefenstette, G.; Isabelle, P.; Samulesson, C.; Segand, F.: Language technologies and patent search and classification (2001) 0.03
    0.026839714 = product of:
      0.08051914 = sum of:
        0.08051914 = weight(_text_:search in 6318) [ClassicSimilarity], result of:
          0.08051914 = score(doc=6318,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.460814 = fieldWeight in 6318, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.09375 = fieldNorm(doc=6318)
      0.33333334 = coord(1/3)
    
  3. Ding, Y.; Chowdhury, G.C.; Foo, S.: Incorporating the results of co-word analyses to increase search variety for information retrieval (2000) 0.03
    0.026839714 = product of:
      0.08051914 = sum of:
        0.08051914 = weight(_text_:search in 6328) [ClassicSimilarity], result of:
          0.08051914 = score(doc=6328,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.460814 = fieldWeight in 6328, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.09375 = fieldNorm(doc=6328)
      0.33333334 = coord(1/3)
    
  4. Metz, C.: ¬The new chatbots could change the world : can you trust them? (2022) 0.03
    0.026839714 = product of:
      0.08051914 = sum of:
        0.08051914 = weight(_text_:search in 854) [ClassicSimilarity], result of:
          0.08051914 = score(doc=854,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.460814 = fieldWeight in 854, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.09375 = fieldNorm(doc=854)
      0.33333334 = coord(1/3)
    
    Abstract
    Siri, Google Search, online marketing and your child's homework will never be the same. Then there's the misinformation problem.
  5. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.03
    0.026615571 = product of:
      0.07984671 = sum of:
        0.07984671 = product of:
          0.23954013 = sum of:
            0.23954013 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.23954013 = score(doc=862,freq=2.0), product of:
                0.4262143 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.05027291 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    https://arxiv.org/abs/2212.06721
  6. Navarretta, C.; Pedersen, B.S.; Hansen, D.H.: Language technology in knowledge-organization systems (2006) 0.02
    0.023243874 = product of:
      0.06973162 = sum of:
        0.06973162 = weight(_text_:search in 5706) [ClassicSimilarity], result of:
          0.06973162 = score(doc=5706,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.39907667 = fieldWeight in 5706, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=5706)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper describes the language technology methods developed in the Danish research project VID to extract from Danish text material relevant information for the population of knowledge organization systems (KOS) within specific corporate domains. The results achieved by applying these methods to a prototype search engine tuned to the patent and trademark domain indicate that the use of human language technology can support the construction of a linguistically based KOS and that linguistic information in search improves recall substantially without harming precision (near 90%). Finally, we describe two research experiments where (1) linguistic analysis of Danish compounds is exploited to improve search strategies on these, and (2) linguistic knowledge is used to model corporate knowledge into a language-based ontology.
  7. Frappaolo, C.: Artificial intelligence and text retrieval : a current perspective on the state of the art (1992) 0.02
    0.022366427 = product of:
      0.06709928 = sum of:
        0.06709928 = weight(_text_:search in 7097) [ClassicSimilarity], result of:
          0.06709928 = score(doc=7097,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3840117 = fieldWeight in 7097, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.078125 = fieldNorm(doc=7097)
      0.33333334 = coord(1/3)
    
    Abstract
    Brief discussion of the ways in which computerized information retrieval and database searching can be enhanced by integrating artificial intelligence with such search systems. Explores the possibility of integrating the powers and capabilities of artificial intelligence (specifically natural language processing) with text retrieval
  8. Feldman, S.: Find what I mean, not what I say : meaning-based search tools (2000) 0.02
    0.022366427 = product of:
      0.06709928 = sum of:
        0.06709928 = weight(_text_:search in 4799) [ClassicSimilarity], result of:
          0.06709928 = score(doc=4799,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3840117 = fieldWeight in 4799, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.078125 = fieldNorm(doc=4799)
      0.33333334 = coord(1/3)
    
  9. Sünkler, S.; Kerkmann, F.; Schultheiß, S.: Ok Google . the end of search as we know it : sprachgesteuerte Websuche im Test (2018) 0.02
    0.022141634 = product of:
      0.0664249 = sum of:
        0.0664249 = weight(_text_:search in 5626) [ClassicSimilarity], result of:
          0.0664249 = score(doc=5626,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.38015217 = fieldWeight in 5626, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5626)
      0.33333334 = coord(1/3)
    
    Abstract
    Voice control systems that assist users on spoken command are becoming increasingly popular with the spread of smartphones and speaker systems such as Amazon Echo or Google Home. One of the central applications is searching in web search engines. But how does "googling" work when users speak their query instead of typing it? A project team at HAW Hamburg investigated this question and, on behalf of Deutsche Telekom, examined how effectively, efficiently and satisfactorily Google Now, Apple Siri, Microsoft Cortana and Amazon Fire OS work. The study identified the strengths and weaknesses of the systems as well as success criteria for high usability. These findings led to the prototype of an optimal Voice Web Search.
  10. Ferber, R.; Wettler, M.; Rapp, R.: ¬An associative model of word selection in the generation of search queries (1995) 0.02
    0.019369897 = product of:
      0.058109686 = sum of:
        0.058109686 = weight(_text_:search in 3177) [ClassicSimilarity], result of:
          0.058109686 = score(doc=3177,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 3177, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3177)
      0.33333334 = coord(1/3)
    
    Abstract
    To generate a search query based on an end user request, a database searcher has to select appropriate search terms. These terms can either be taken from the request, or they can be added by the searcher. This selection process is simulated by an associative lexical net; the nodes of the net are the terms used in 94 records of written requests to a psychological information agency and the respective online searches. The weights connecting the nodes are calculated from the co-occurrences of these terms in the abstracts of the database PsycLit. To simulate the term selection process of a query, the nodes of all terms used in the written requests are activated, and 1 or more spreading activation cycles are performed. The result of the simulation is a ranking of the terms according to the activities of their nodes. Simulations for all 94 records show a low mean activity rank for the terms selected from the request; the mean activity rank for new terms added by the searcher is lower than the mean activity rank for those terms of the request that were not used in the query
  11. Nissim, M.; Zaninello, A.: Modeling the internal variability of multiword expressions through a pattern-based method (2013) 0.02
    0.019369897 = product of:
      0.058109686 = sum of:
        0.058109686 = weight(_text_:search in 990) [ClassicSimilarity], result of:
          0.058109686 = score(doc=990,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 990, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=990)
      0.33333334 = coord(1/3)
    
    Abstract
    The issue of internal variability of multiword expressions (MWEs) is crucial towards their identification and extraction in running text. We present a corpus-supported and computational study on Italian MWEs, aimed at defining an automatic method for modeling internal variation, exploiting frequency and part-of-speech (POS) information. We do so by deriving an XML-encoded lexicon of MWEs based on a manually compiled dictionary, which is then projected onto a large corpus. Since a search for fixed forms suffers from low recall, while an unconstrained flexible search for lemmas yields a loss in precision, we suggest a procedure aimed at maximizing precision in the identification of MWEs within a flexible search. Our method builds on the idea that internal variability can be modelled via the novel introduction of variation patterns, which work over POS patterns, and can be used as working tools for controlling precision. We also compare the performance of variation patterns to that of association measures, and explore the possibility of using variation patterns in MWE extraction in addition to identification. Finally, we suggest that corpus-derived, pattern-related information can be included in the original MWE lexicon by means of an enriched coding and the creation of an XML-based repository of patterns.
  12. Gencosman, B.C.; Ozmutlu, H.C.; Ozmutlu, S.: Character n-gram application for automatic new topic identification (2014) 0.02
    0.019369897 = product of:
      0.058109686 = sum of:
        0.058109686 = weight(_text_:search in 2688) [ClassicSimilarity], result of:
          0.058109686 = score(doc=2688,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 2688, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2688)
      0.33333334 = coord(1/3)
    
    Abstract
    The widespread availability of the Internet and the variety of Internet-based applications have resulted in a significant increase in the number of web pages. Determining the behaviors of search engine users has become a critical step in enhancing search engine performance. Search engine user behaviors can be determined by content-based or content-ignorant algorithms. Although many content-ignorant studies have been performed to automatically identify new topics, previous results have demonstrated that spelling errors can cause significant errors in topic shift estimates. In this study, we focused on minimizing the number of wrong estimates that were based on spelling errors. We developed a new hybrid algorithm combining character n-gram and neural network methodologies, and compared the experimental results with results from previous studies. For the FAST and Excite datasets, the proposed algorithm improved topic shift estimates by 6.987% and 2.639%, respectively. Moreover, we analyzed the performance of the character n-gram method in different aspects, including a comparison with the Levenshtein edit-distance method. The experimental results demonstrated that the character n-gram method outperformed the Levenshtein edit-distance method in terms of topic identification.
  13. Dolamic, L.; Savoy, J.: Retrieval effectiveness of machine translated queries (2010) 0.02
    0.018978544 = product of:
      0.056935627 = sum of:
        0.056935627 = weight(_text_:search in 4102) [ClassicSimilarity], result of:
          0.056935627 = score(doc=4102,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 4102, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4102)
      0.33333334 = coord(1/3)
    
    Abstract
    This article describes and evaluates various information retrieval models used to search document collections written in English through submitting queries written in various other languages, either members of the Indo-European family (English, French, German, and Spanish) or radically different language groups such as Chinese. This evaluation method involves searching a rather large number of topics (around 300) and using two commercial machine translation systems to translate across the language barriers. In this study, mean average precision is used to measure variances in retrieval effectiveness when a query language differs from the document language. Although performance differences are rather large for certain languages pairs, this does not mean that bilingual search methods are not commercially viable. Causes of the difficulties incurred when searching or during translation are analyzed and the results of concrete examples are explained.
  14. Warner, A.J.: Natural language processing (1987) 0.02
    0.01816343 = product of:
      0.054490287 = sum of:
        0.054490287 = product of:
          0.10898057 = sum of:
            0.10898057 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.10898057 = score(doc=337,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  15. Pritchard-Schoch, T.: Comparing natural language retrieval : Win & Freestyle (1995) 0.02
    0.017893143 = product of:
      0.053679425 = sum of:
        0.053679425 = weight(_text_:search in 2546) [ClassicSimilarity], result of:
          0.053679425 = score(doc=2546,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.30720934 = fieldWeight in 2546, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=2546)
      0.33333334 = coord(1/3)
    
    Abstract
    Reports on a comparison of 2 natural language interfaces to full text legal databases: WIN for access to WESTLAW databases and FREESTYLE for access to the LEXIS database. 30 legal issues in natural language queries were presented to identical libraries in both systems. The top 20 ranked documents from each search were analyzed and reviewed for relevance to the legal issue
  16. Pritchard-Schoch, T.: Natural language comes of age (1993) 0.02
    0.017893143 = product of:
      0.053679425 = sum of:
        0.053679425 = weight(_text_:search in 2570) [ClassicSimilarity], result of:
          0.053679425 = score(doc=2570,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.30720934 = fieldWeight in 2570, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=2570)
      0.33333334 = coord(1/3)
    
    Abstract
    Discusses natural languages and the natural language implementations of Westlaw's full-text legal documents, Westlaw Is Natural. Natural language is not artificial intelligence but a hybrid of linguistics, mathematics and statistics. Provides 3 classes of retrieval models. Explains how Westlaw processes an English query. Assesses WIN. Covers WIN enhancements; the natural language features of Congressional Quarterly's Washington Alert using a document for a query; the personal librarian front end search software and Dowquest from Dow Jones news/retrieval. Considers whether natural language encourages fuzzy thinking and whether Boolean logic will still be needed
  17. Frakes, W.B.: Stemming algorithms (1992) 0.02
    0.017893143 = product of:
      0.053679425 = sum of:
        0.053679425 = weight(_text_:search in 3503) [ClassicSimilarity], result of:
          0.053679425 = score(doc=3503,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.30720934 = fieldWeight in 3503, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=3503)
      0.33333334 = coord(1/3)
    
    Abstract
    Describes stemming algorithms - programs that relate morphologically similar indexing and search terms. Stemming is used to improve retrieval effectiveness and to reduce the size of indexing files. Several approaches to stemming are described - table lookup, affix removal, successor variety, and n-gram. Empirical studies of stemming are summarized. The Porter stemmer is described in detail, and a full implementation in C is presented
  18. Diaz, I.; Morato, J.; Lioréns, J.: ¬An algorithm for term conflation based on tree structures (2002) 0.02
    0.017893143 = product of:
      0.053679425 = sum of:
        0.053679425 = weight(_text_:search in 246) [ClassicSimilarity], result of:
          0.053679425 = score(doc=246,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.30720934 = fieldWeight in 246, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=246)
      0.33333334 = coord(1/3)
    
    Abstract
    This work presents a new stemming algorithm. This algorithm stores the stemming information in tree structures. This storage allows us to enhance the performance of the algorithm due to the reduction of the search space and the overall complexity. The final result of that stemming algorithm is a normalized concept, understanding this process as the automatic extraction of the generic form (or a lexeme) for a selected term.
  19. Soo, J.; Frieder, O.: On searching misspelled collections (2015) 0.02
    0.017893143 = product of:
      0.053679425 = sum of:
        0.053679425 = weight(_text_:search in 1862) [ClassicSimilarity], result of:
          0.053679425 = score(doc=1862,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.30720934 = fieldWeight in 1862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=1862)
      0.33333334 = coord(1/3)
    
    Abstract
    We describe an unsupervised, language-independent spelling correction search system. We compare the proposed approach with unsupervised and supervised algorithms. The described approach consistently outperforms other unsupervised efforts and nearly matches the performance of a current state-of-the-art supervised approach.
  20. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.02
    0.015893001 = product of:
      0.047679 = sum of:
        0.047679 = product of:
          0.095358 = sum of:
            0.095358 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
              0.095358 = score(doc=3164,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.5416616 = fieldWeight in 3164, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3164)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
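
The score breakdowns in the result list above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas, which are consistent with the printed numbers: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and not part of any library; queryNorm and fieldNorm are copied from the listing instead of being derived.

```python
import math

# Constants copied from the explain output above.
QUERY_NORM = 0.05027291
MAX_DOCS = 44218


def idf(doc_freq, max_docs):
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
    """Weight of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Result 1 (doc 71): term "search", freq=6, docFreq=3718, fieldNorm=0.0625,
# scaled by coord(1/3) because one of three query clauses matched.
score_71 = term_weight(6.0, 3718, MAX_DOCS, QUERY_NORM, 0.0625) * (1 / 3)

# Result 14 (doc 337): term "22", freq=2, docFreq=3622, fieldNorm=0.125,
# scaled by the inner coord(1/2) and the outer coord(1/3).
score_337 = term_weight(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.125) * (1 / 2) * (1 / 3)

print(f"doc 71:  {score_71:.9f}")
print(f"doc 337: {score_337:.9f}")
```

The computed values should agree with the listed scores (0.030991834 and 0.01816343) to about six significant digits; small differences in later digits are expected because the listing appears to use single-precision arithmetic while this sketch uses Python doubles.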

Languages

  • e 98
  • d 20
  • el 1
  • m 1

Types

  • a 97
  • el 14
  • m 7
  • s 5
  • p 3
  • x 3
  • d 1
  • r 1