Search (69 results, page 1 of 4)

  • × theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.23
    0.22707656 = product of:
      0.30276874 = sum of:
        0.071140714 = product of:
          0.21342213 = sum of:
            0.21342213 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.21342213 = score(doc=562,freq=2.0), product of:
                0.3797425 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04479146 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.21342213 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.21342213 = score(doc=562,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.018205874 = product of:
          0.036411747 = sum of:
            0.036411747 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.036411747 = score(doc=562,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Content
    Vgl.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.14
    0.14228143 = product of:
      0.28456286 = sum of:
        0.071140714 = product of:
          0.21342213 = sum of:
            0.21342213 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.21342213 = score(doc=862,freq=2.0), product of:
                0.3797425 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04479146 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
        0.21342213 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.21342213 = score(doc=862,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.5 = coord(2/4)
    
    Source
    https%3A%2F%2Farxiv.org%2Fabs%2F2212.06721&usg=AOvVaw3i_9pZm9y_dQWoHi6uv0EN
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.12
    0.115814 = product of:
      0.231628 = sum of:
        0.21342213 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.21342213 = score(doc=563,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.018205874 = product of:
          0.036411747 = sum of:
            0.036411747 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.036411747 = score(doc=563,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Content
    A Thesis presented to The University of Guelph In partial fulfilment of requirements for the degree of Master of Science in Computer Science. Vgl. Unter: http://www.inf.ufrgs.br%2F~ceramisch%2Fdownload_files%2Fpublications%2F2009%2Fp01.pdf.
    Date
    10. 1.2013 19:22:47
  4. Sparck Jones, K.: Synonymy and semantic classification (1986) 0.07
    0.06511198 = product of:
      0.26044792 = sum of:
        0.26044792 = product of:
          0.52089584 = sum of:
            0.52089584 = weight(_text_:programming in 1304) [ClassicSimilarity], result of:
              0.52089584 = score(doc=1304,freq=12.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                1.7740594 = fieldWeight in 1304, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1304)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    LCSH
    Programming languages (Electronic computers) / Syntax
    Programming languages (Electronic computers) / Semantics
    PRECIS
    Computer systems / Programming languages / Grammar
    Subject
    Programming languages (Electronic computers) / Syntax
    Programming languages (Electronic computers) / Semantics
    Computer systems / Programming languages / Grammar
  5. Way, E.C.: Knowledge representation and metaphor (oder: meaning) (1994) 0.05
    0.054668214 = product of:
      0.21867286 = sum of:
        0.21867286 = sum of:
          0.17012386 = weight(_text_:programming in 771) [ClassicSimilarity], result of:
            0.17012386 = score(doc=771,freq=2.0), product of:
              0.29361802 = queryWeight, product of:
                6.5552235 = idf(docFreq=170, maxDocs=44218)
                0.04479146 = queryNorm
              0.57940537 = fieldWeight in 771, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5552235 = idf(docFreq=170, maxDocs=44218)
                0.0625 = fieldNorm(doc=771)
          0.048548996 = weight(_text_:22 in 771) [ClassicSimilarity], result of:
            0.048548996 = score(doc=771,freq=2.0), product of:
              0.15685207 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04479146 = queryNorm
              0.30952093 = fieldWeight in 771, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=771)
      0.25 = coord(1/4)
    
    Content
    Enthält folgende 9 Kapitel: The literal and the metaphoric; Views of metaphor; Knowledge representation; Representation schemes and conceptual graphs; The dynamic type hierarchy theory of metaphor; Computational approaches to metaphor; Thenature and structure of semantic hierarchies; Language games, open texture and family resemblance; Programming the dynamic type hierarchy; Subject index
    Footnote
    Bereits 1991 bei Kluwer publiziert // Rez. in: Knowledge organization 22(1995) no.1, S.48-49 (O. Sechser)
  6. Bedathur, S.; Narang, A.: Mind your language : effects of spoken query formulation on retrieval effectiveness (2013) 0.04
    0.044713248 = product of:
      0.17885299 = sum of:
        0.17885299 = weight(_text_:engines in 1150) [ClassicSimilarity], result of:
          0.17885299 = score(doc=1150,freq=8.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.7858995 = fieldWeight in 1150, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1150)
      0.25 = coord(1/4)
    
    Abstract
    Voice search is becoming a popular mode for interacting with search engines. As a result, research has gone into building better voice transcription engines, interfaces, and search engines that better handle inherent verbosity of queries. However, when one considers its use by non- native speakers of English, another aspect that becomes important is the formulation of the query by users. In this paper, we present the results of a preliminary study that we conducted with non-native English speakers who formulate queries for given retrieval tasks. Our results show that the current search engines are sensitive in their rankings to the query formulation, and thus highlights the need for developing more robust ranking methods.
  7. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.04
    0.04266595 = product of:
      0.0853319 = sum of:
        0.06387607 = weight(_text_:engines in 2541) [ClassicSimilarity], result of:
          0.06387607 = score(doc=2541,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.2806784 = fieldWeight in 2541, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.021455828 = product of:
          0.042911656 = sum of:
            0.042911656 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
              0.042911656 = score(doc=2541,freq=4.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.27358043 = fieldWeight in 2541, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET . Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language Systems (UMLS) . The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  8. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.03
    0.03319098 = product of:
      0.13276392 = sum of:
        0.13276392 = weight(_text_:engines in 3455) [ClassicSimilarity], result of:
          0.13276392 = score(doc=3455,freq=6.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.58337915 = fieldWeight in 3455, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
      0.25 = coord(1/4)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 an the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
  9. Addison, E.R.; Wilson, H.D.; Feder, J.: ¬The impact of plain English searching on end users (1993) 0.03
    0.025550429 = product of:
      0.102201715 = sum of:
        0.102201715 = weight(_text_:engines in 5354) [ClassicSimilarity], result of:
          0.102201715 = score(doc=5354,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 5354, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=5354)
      0.25 = coord(1/4)
    
    Abstract
    Commercial software products are available with plain English searching capabilities as engines for online and CD-ROM information services, and for internal text information management. With plain English interfaces, end users do not need to master the keyword and connector approach of the Boolean search query language. Describes plain English searching and its impact on the process of full text retrieval. Explores the issues of ease of use, reliability and implications for the total research process
  10. Chandrasekar, R.; Bangalore, S.: Glean : using syntactic information in document filtering (2002) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 4257) [ClassicSimilarity], result of:
          0.09033441 = score(doc=4257,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 4257, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4257)
      0.25 = coord(1/4)
    
    Abstract
    In today's networked world, a huge amount of data is available in machine-processable form. Likewise, there are any number of search engines and specialized information retrieval (IR) programs that seek to extract relevant information from these data repositories. Most IR systems and Web search engines have been designed for speed and tend to maximize the quantity of information (recall) rather than the relevance of the information (precision) to the query. As a result, search engine users get inundated with information for practically any query, and are forced to scan a large number of potentially relevant items to get to the information of interest. The Holy Grail of IR is to somehow retrieve those and only those documents pertinent to the user's query. Polysemy and synonymy - the fact that often there are several meanings for a word or phrase, and likewise, many ways to express a conceptmake this a very hard task. While conventional IR systems provide usable solutions, there are a number of open problems to be solved, in areas such as syntactic processing, semantic analysis, and user modeling, before we develop systems that "understand" user queries and text collections. Meanwhile, we can use tools and techniques available today to improve the precision of retrieval. In particular, using the approach described in this article, we can approximate understanding using the syntactic structure and patterns of language use that is latent in documents to make IR more effective.
  11. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 604) [ClassicSimilarity], result of:
          0.09033441 = score(doc=604,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 604, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=604)
      0.25 = coord(1/4)
    
    Abstract
    Modern information retrieval systems use keywords within documents as indexing terms for search of relevant documents. As Chinese is an ideographic character-based language, the words in the texts are not delimited by white spaces. Indexing of Chinese documents is impossible without a proper segmentation algorithm. Many Chinese segmentation algorithms have been proposed in the past. Traditional segmentation algorithms cannot operate without a large dictionary or a large corpus of training data. Nowadays, the Web has become the largest corpus that is ideal for Chinese segmentation. Although most search engines have problems in segmenting texts into proper words, they maintain huge databases of documents and frequencies of character sequences in the documents. Their databases are important potential resources for segmentation. In this paper, we propose a segmentation algorithm by mining Web data with the help of search engines. On the other hand, the Romanized pinyin of Chinese language indicates boundaries of words in the text. Our algorithm is the first to utilize the Romanized pinyin to segmentation. It is the first unified segmentation algorithm for the Chinese language from different geographical areas, and it is also domain independent because of the nature of the Web. Experiments have been conducted on the datasets of a recent Chinese segmentation competition. The results show that our algorithm outperforms the traditional algorithms in terms of precision and recall. Moreover, our algorithm can effectively deal with the problems of segmentation ambiguity, new word (unknown word) detection, and stop words.
  12. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 2861) [ClassicSimilarity], result of:
          0.09033441 = score(doc=2861,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 2861, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
      0.25 = coord(1/4)
    
    Abstract
    Today's conventional search engines hardly do provide the essential content relevant to the user's search query. This is because the context and semantics of the request made by the user is not analyzed to the full extent. So here the need for a semantic web search arises. SWS is upcoming in the area of web search which combines Natural Language Processing and Artificial Intelligence. The objective of the work done here is to design, develop and implement a semantic search engine- SIEU(Semantic Information Extraction in University Domain) confined to the university domain. SIEU uses ontology as a knowledge base for the information retrieval process. It is not just a mere keyword search. It is one layer above what Google or any other search engines retrieve by analyzing just the keywords. Here the query is analyzed both syntactically and semantically. The developed system retrieves the web results more relevant to the user query through keyword expansion. The results obtained here will be accurate enough to satisfy the request made by the user. The level of accuracy will be enhanced since the query is analyzed semantically. The system will be of great use to the developers and researchers who work on web. The Google results are re-ranked and optimized for providing the relevant links. For ranking an algorithm has been applied which fetches more apt results for the user query.
  13. Bakar, Z.A.; Sembok, T.M.T.; Yusoff, M.: ¬An evaluation of retrieval effectiveness using spelling-correction and string-similarity matching methods on Malay texts (2000) 0.02
    0.02255545 = product of:
      0.0902218 = sum of:
        0.0902218 = product of:
          0.1804436 = sum of:
            0.1804436 = weight(_text_:programming in 4804) [ClassicSimilarity], result of:
              0.1804436 = score(doc=4804,freq=4.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.6145522 = fieldWeight in 4804, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4804)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    This article evaluates the effectiveness of spelling-correction and string-similarity matching methods in retrieving similar words in a Maly dictionary associated with a set of query words. The spelling-correction techniques used are SPEEDCOP, Soundex, Davidson, Phonic, and Hartlib. 2 dynamic-programming methods that measure longest common subsequence and edit-cost-distance are used. Several search combinations od query and doctionary words are performed in the experiments, the best being one that stems both query and dictionary words using an existing Malay stemming algorithm. the retrieval effectivness (E) and retrieved and relevant (R&R) mean measure are calculated from weighted combination of recall and precision values. Results from these experiments are then compared with available diagram, a string-similarity method. The best R&R and E results are given by using diagram. Editcost-distances produce the best E results, and both dynamic-programming methods rank second in finding R&R mean measures
  14. Perovsek, M.; Kranjca, J.; Erjaveca, T.; Cestnika, B.; Lavraca, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016) 0.02
    0.02255545 = product of:
      0.0902218 = sum of:
        0.0902218 = product of:
          0.1804436 = sum of:
            0.1804436 = weight(_text_:programming in 2697) [ClassicSimilarity], result of:
              0.1804436 = score(doc=2697,freq=4.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.6145522 = fieldWeight in 2697, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2697)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Science of computer programming. In Press, 2016
  15. Brenner, E.H.: Beyond Boolean : new approaches in information retrieval; the quest for intuitive online search systems past, present & future (1995) 0.02
    0.022356624 = product of:
      0.089426495 = sum of:
        0.089426495 = weight(_text_:engines in 2547) [ClassicSimilarity], result of:
          0.089426495 = score(doc=2547,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39294976 = fieldWeight in 2547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
      0.25 = coord(1/4)
    
    Content
    (1) The Boolean world; (2) The Non-Boolean picture; (3) The commercial search engines: Personal Librarian, CLARIT, ConQuest, DR-LINK, InQuizit, InTEXT, TOPIC, WIN, TARGET, FREESTYLE, InfoSeek; (4) Wiedergabe von 8 Aufsätzen aus 'Monitor'
  16. Nait-Baha, L.; Jackiewicz, A.; Djioua, B.; Laublet, P.: Query reformulation for information retrieval on the Web using the point of view methodology : preliminary results (2001) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 249) [ClassicSimilarity], result of:
          0.07665128 = score(doc=249,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 249, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=249)
      0.25 = coord(1/4)
    
    Abstract
    The work we are presenting is devoted to the information collected on the WWW. By the term collected we mean the whole process of retrieving, extracting and presenting results to the user. This research is part of the RAP (Research, Analyze, Propose) project in which we propose to combine two methods: (i) query reformulation using linguistic markers according to a given point of view; and (ii) text semantic analysis by means of contextual exploration results (Descles, 1991). The general project architecture describing the interactions between the users, the RAP system and the WWW search engines is presented in Nait-Baha et al. (1998). We will focus this paper on showing how we use linguistic markers to reformulate the queries according to a given point of view
  17. Ahmed, F.; Nürnberger, A.: Evaluation of n-gram conflation approaches for Arabic text retrieval (2009) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 2941) [ClassicSimilarity], result of:
          0.07665128 = score(doc=2941,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 2941, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=2941)
      0.25 = coord(1/4)
    
    Abstract
    In this paper we present a language-independent approach for conflation that does not depend on predefined rules or prior knowledge of the target language. The proposed unsupervised method is based on an enhancement of the pure n-gram model that can group related words based on various string-similarity measures, while restricting the search to specific locations of the target word by taking into account the order of n-grams. We show that the method is effective to achieve high score similarities for all word-form variations and reduces the ambiguity, i.e., obtains a higher precision and recall, compared to pure n-gram-based approaches for English, Portuguese, and Arabic. The proposed method is especially suited for conflation approaches in Arabic, since Arabic is a highly inflectional language. Therefore, we present in addition an adaptive user interface for Arabic text retrieval called araSearch. araSearch serves as a metasearch interface to existing search engines. The system is able to extend a query using the proposed conflation approach such that additional results for relevant subwords can be found automatically.
  18. Hess, M.: ¬An incrementally extensible document retrieval system based on linguistic and logical principles (1992) 0.02
    0.015949111 = product of:
      0.063796446 = sum of:
        0.063796446 = product of:
          0.12759289 = sum of:
            0.12759289 = weight(_text_:programming in 2413) [ClassicSimilarity], result of:
              0.12759289 = score(doc=2413,freq=2.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.43455404 = fieldWeight in 2413, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2413)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Most natural language based document retrieval systems use the syntax structures of constituent phrases of documents as index terms. Many of these systems also attempt to reduce the syntactic variability of natural language by some normalisation procedure applied to these syntax structures. However, the retrieval performance of such systems remains fairly disappointing. Some systems therefore use a meaning representation language to index and retrieve documents. In this paper, a system is presented that uses Horn Clause Logic as meaning representation language, employs advanced techniques from Natural Language Processing to achieve incremental extensibility, and uses methods from Logic Programming to achieve robustness in the face of insufficient data. An Incrementally Extensible Document Retrieval System Based on Linguistic and Logical Principles.
  19. Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.01
    0.012775214 = product of:
      0.051100858 = sum of:
        0.051100858 = weight(_text_:engines in 1264) [ClassicSimilarity], result of:
          0.051100858 = score(doc=1264,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.22454272 = fieldWeight in 1264, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=1264)
      0.25 = coord(1/4)
    
    Abstract
    Over the past decade, the volume of information available digitally over the Internet has grown enormously. Technical developments in the area of search, such as Google's Page Rank algorithm, have proved so good at serving relevant results that Internet search has become integrated into daily human activity. One can endlessly explore topics of interest simply by querying and reading through the resulting links. Yet, although search engines are well known for providing relevant results based on users' queries, users do not always receive the results they are looking for. Google's Director of Research describes clickstream evidence of frustrated users repeatedly reformulating queries and searching through page after page of results. Given the general quality of search engine results, one must consider the possibility that the frustrated user's query is not effective; that is, it does not describe the essence of the user's interest. Indeed, extensive research into human search behavior has found that humans are not very effective at formulating good search queries that describe what they are interested in. Ideally, the user should simply point to a portion of text that sparked the user's interest, and a system should automatically formulate a search query that captures the essence of the text. In this paper, we describe an implemented system that provides this capability. We first describe how our work differs from existing work in automatic query formulation, and propose a new method for improved quantification of the relevance of candidate search terms drawn from input text using phrase-level analysis. We then propose an implementable method designed to provide relevant queries based on a user's text input. We demonstrate the quality of our results and performance of our system through experimental studies. Our results demonstrate that our system produces relevant search terms with roughly two-thirds precision and recall compared to search terms selected by experts, and that typical users find significantly more relevant results (31% more relevant) more quickly (64% faster) using our system than self-formulated search queries. Further, we show that our implementation can scale to request loads of up to 10 requests per second within current online responsiveness expectations (<2-second response times at the highest loads tested).
  20. Warner, A.J.: Natural language processing (1987) 0.01
    0.012137249 = product of:
      0.048548996 = sum of:
        0.048548996 = product of:
          0.09709799 = sum of:
            0.09709799 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.09709799 = score(doc=337,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Annual review of information science and technology. 22(1987), S.79-108

Years

Languages

  • e 52
  • d 17
  • el 1
  • m 1
  • More… Less…

Types

  • a 53
  • m 9
  • el 8
  • s 6
  • p 2
  • x 2
  • d 1
  • More… Less…