Search (4 results, page 1 of 1)

  • × theme_ss:"Computerlinguistik"
  • × theme_ss:"Retrievalalgorithmen"
  1. Jones, K.: Linguistic searching versus relevance ranking : DR-LINK and TARGET (1999) 0.03
    0.028268103 = product of:
      0.16960861 = sum of:
        0.16960861 = weight(_text_:ranking in 6423) [ClassicSimilarity], result of:
          0.16960861 = score(doc=6423,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.8366664 = fieldWeight in 6423, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.109375 = fieldNorm(doc=6423)
      0.16666667 = coord(1/6)
    
  2. Brenner, E.H.: Beyond Boolean : new approaches in information retrieval; the quest for intuitive online search systems past, present & future (1995) 0.01
    0.014134051 = product of:
      0.084804304 = sum of:
        0.084804304 = weight(_text_:ranking in 2547) [ClassicSimilarity], result of:
          0.084804304 = score(doc=2547,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.4183332 = fieldWeight in 2547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
      0.16666667 = coord(1/6)
    
    Abstract
    The challenge of effectively bringing specific, relevant information from the global sea of data to our fingertips, has become an increasingly difficult one. Discusses how the online information industry, founded on Boolean search systems, may be evolving to take advantage of other methods, such as 'term weighting', 'relevance ranking' and 'query by example'
  3. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.01
    0.0121149 = product of:
      0.0726894 = sum of:
        0.0726894 = weight(_text_:ranking in 3455) [ClassicSimilarity], result of:
          0.0726894 = score(doc=3455,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.35857132 = fieldWeight in 3455, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
      0.16666667 = coord(1/6)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 an the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
  4. Hoenkamp, E.; Bruza, P.D.; Song, D.; Huang, Q.: ¬An effective approach to verbose queries using a limited dependencies language model (2009) 0.01
    0.008076601 = product of:
      0.0484596 = sum of:
        0.0484596 = weight(_text_:ranking in 2122) [ClassicSimilarity], result of:
          0.0484596 = score(doc=2122,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.23904754 = fieldWeight in 2122, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03125 = fieldNorm(doc=2122)
      0.16666667 = coord(1/6)
    
    Abstract
    Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies to more useful statistics. This is done in three steps. The term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation then just the document's initial distribution. A secondary contribution is to investigate the practical application of this representation in case the queries become increasingly verbose. In the experiments (based on Lemur's search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par or better than more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.