Search (2 results, page 1 of 1)

  • × author_ss:"Nie, J.-Y."
  • × year_i:[2000 TO 2010}
  1. Song, R.; Luo, Z.; Nie, J.-Y.; Yu, Y.; Hon, H.-W.: Identification of ambiguous queries in web search (2009) 0.08
    0.07516497 = product of:
      0.112747446 = sum of:
        0.06973162 = weight(_text_:search in 2441) [ClassicSimilarity], result of:
          0.06973162 = score(doc=2441,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.39907667 = fieldWeight in 2441, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=2441)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 2441) [ClassicSimilarity], result of:
              0.08603165 = score(doc=2441,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 2441, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2441)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    It is widely believed that many queries submitted to search engines are inherently ambiguous (e.g., java and apple). However, few studies have tried to classify queries based on ambiguity and to answer "what the proportion of ambiguous queries is". This paper deals with these issues. First, we clarify the definition of ambiguous queries by constructing the taxonomy of queries from being ambiguous to specific. Second, we ask human annotators to manually classify queries. From manually labeled results, we observe that query ambiguity is to some extent predictable. Third, we propose a supervised learning approach to automatically identify ambiguous queries. Experimental results show that we can correctly identify 87% of labeled queries with the approach. Finally, by using our approach, we estimate that about 16% of queries in a real search log are ambiguous.
  2. Bai, J.; Nie, J.-Y.: Adapting information retrieval to query contexts (2008) 0.01
    0.011183213 = product of:
      0.03354964 = sum of:
        0.03354964 = weight(_text_:search in 2446) [ClassicSimilarity], result of:
          0.03354964 = score(doc=2446,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.19200584 = fieldWeight in 2446, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2446)
      0.33333334 = coord(1/3)
    
    Abstract
    In current IR approaches documents are retrieved only according to the terms specified in the query. The same answers are returned for the same query whatever the user and the search goal are. In reality, many other contextual factors strongly influence document's relevance and they should be taken into account in IR operations. This paper proposes a method, based on language modeling, to integrate several contextual factors so that document ranking will be adapted to the specific query contexts. We will consider three contextual factors in this paper: the topic domain of the query, the characteristics of the document collection, as well as context words within the query. Each contextual factor is used to generate a new query language model to specify some aspect of the information need. All these query models are then combined together to produce a more complete model for the underlying information need. Our experiments on TREC collections show that each contextual factor can positively influence the IR effectiveness and the combined model results in the highest effectiveness. This study shows that it is both beneficial and feasible to integrate more contextual factors in the current IR practice.