Search (2 results, page 1 of 1)

  • author_ss:"Fan, W."
  • theme_ss:"Suchmaschinen"
  1. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.10
    0.09615815 = product of:
      0.14423722 = sum of:
        0.06973162 = weight(_text_:search in 3455) [ClassicSimilarity], result of:
          0.06973162 = score(doc=3455,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.39907667 = fieldWeight in 3455, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
        0.074505605 = product of:
          0.14901121 = sum of:
            0.14901121 = weight(_text_:engines in 3455) [ClassicSimilarity], result of:
              0.14901121 = score(doc=3455,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.58337915 = fieldWeight in 3455, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3455)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
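    A note on the relevance score above: it is Lucene's ClassicSimilarity (TF-IDF) explanation. As a minimal Python sketch, assuming the standard ClassicSimilarity composition (tf = sqrt(freq); queryWeight = idf * queryNorm; fieldWeight = tf * idf * fieldNorm; term score = queryWeight * fieldWeight, scaled by the coord factors), the 0.10 shown for this record can be reproduced from the numbers in the breakdown. The function and variable names are illustrative only, not part of the search system.

      import math

      def classic_sim_term(freq, idf, query_norm, field_norm):
          """One term's contribution under Lucene ClassicSimilarity."""
          tf = math.sqrt(freq)                  # 2.4494898 for freq=6.0
          query_weight = idf * query_norm       # idf * queryNorm
          field_weight = tf * idf * field_norm  # tf * idf * fieldNorm
          return query_weight * field_weight

      # Values copied from the explanation above (doc 3455).
      search_w  = classic_sim_term(6.0, 3.475677, 0.05027291, 0.046875)
      engines_w = classic_sim_term(6.0, 5.080822, 0.05027291, 0.046875)

      # "engines" matches 1 of 2 nested clauses -> coord(1/2);
      # the overall query matches 2 of 3 clauses -> coord(2/3).
      score = (search_w + engines_w * 0.5) * (2.0 / 3.0)
      print(round(score, 8))  # ~0.09615815, i.e. the 0.10 displayed for this record

    The same arithmetic, with freq=14.0 for "search", freq=4.0 for "engines", and fieldNorm=0.0390625, reproduces the 0.09 shown for the second record below.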
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question-type features and achieves a total reciprocal document rank of .20 on the TREC-8 corpus. Our techniques have been implemented as a Web-accessible system called NSIR.
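    The abstract's headline figure is a total reciprocal document rank (TRDR) of .20. As a rough Python sketch of such a metric, under the assumption that TRDR is the per-question sum of reciprocal ranks of the items judged correct, averaged over all questions (the precise definition is given in the article, not in this record), and with purely hypothetical ranks:

      def trdr(per_question_correct_ranks):
          """Mean over questions of the sum of 1/rank for each correct item."""
          per_q = [sum(1.0 / r for r in ranks) for ranks in per_question_correct_ranks]
          return sum(per_q) / len(per_q) if per_q else 0.0

      # Hypothetical example: three questions with correct items at these 1-based ranks.
      print(trdr([[1, 4], [3], []]))  # (1.25 + 0.333... + 0) / 3 = 0.527...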
  2. Radev, D.R.; Libner, K.; Fan, W.: Getting answers to natural language questions on the Web (2002) 0.09
    0.09297244 = product of:
      0.13945866 = sum of:
        0.08876401 = weight(_text_:search in 5204) [ClassicSimilarity], result of:
          0.08876401 = score(doc=5204,freq=14.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.5079997 = fieldWeight in 5204, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5204)
        0.05069464 = product of:
          0.10138928 = sum of:
            0.10138928 = weight(_text_:engines in 5204) [ClassicSimilarity], result of:
              0.10138928 = score(doc=5204,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39693922 = fieldWeight in 5204, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5204)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Seven hundred natural language questions from TREC-8 and TREC-9 were sent by Radev, Libner, and Fan to each of nine web search engines. The top 40 sites returned by each engine were stored and evaluated for whether they produced correct answers. Each question per engine was scored as the sum of the reciprocal ranks of identified correct answers. The large number of zero scores produced a positive skew that violated the normality assumption for ANOVA, so values were transformed to zero for no hit and one for one or more hits. The non-zero values were then square-root transformed to remove the remaining positive skew. Interactions were observed between search engine and answer type (name, place, date, et cetera), search engine and the number of proper nouns in the query, search engine and the need for time limitation, and search engine and total query words. All effects were significant. The shortest queries had the highest mean scores. The presence of one or more proper nouns provided a significant advantage, as did queries without a time dependency. Place, name, person, and text description had mean scores between .85 and .9, with date at .81 and number at .59. There were significant differences in score by search engine. The engines found at least one correct answer in between 75.45% and 87.7% of the cases, and Google and Northern Light were just short of a 90% hit rate. No evidence indicated that a particular engine was better at answering any particular sort of question.
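    As a minimal Python sketch of the scoring described above (not the authors' actual code): each engine-question pair is scored by summing 1/rank over the correct answers found in the top 40 results. The compressed description of the transformation is read here as "zero scores stay zero, non-zero scores are square-root transformed before the ANOVA", which is an assumption.

      import math

      def anova_value(correct_ranks, cutoff=40):
          """Sum of reciprocal ranks of correct answers within the top `cutoff`
          results; 0 stays 0, non-zero scores are square-root transformed
          to reduce the positive skew."""
          score = sum(1.0 / r for r in correct_ranks if r <= cutoff)
          return 0.0 if score == 0 else math.sqrt(score)

      # Hypothetical example: one engine finds correct answers at ranks 2 and 17,
      # another finds none in the top 40.
      print(anova_value([2, 17]))  # ~0.75
      print(anova_value([]))       # 0.0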