Document (#32918)

Author
Abdelali, A.
Cowie, J.
Soliman, H.S.
Title
Improving query precision using semantic expansion
Source
Information processing and management. 43(2007) no.3, pp. 705-716
Year
2007
Abstract
Query Expansion (QE) is one of the most important mechanisms in the information retrieval field. A typical short Internet query will go through a process of refinement to improve its retrieval power. Most of the existing QE techniques suffer from retrieval performance degradation due to the imprecise choice of the query's additive terms in the QE process. In this paper, we introduce a novel automated QE mechanism. The new expansion process is guided by the semantic relations between the original query and the expanding words, in the context of the utilized corpus. Experimental results of our "controlled" query expansion, using the Arabic TREC-10 data, show a significant enhancement of recall and precision over existing mechanisms in the field.
Footnote
Contribution to: Special issue on Heterogeneous and Distributed IR
Theme
Retrieval algorithms

Similar documents (content)

  1. He, B.; Ounis, I.: Combining fields for query expansion and adaptive query expansion (2007) 0.32
    0.3188723 = sum of:
      0.3188723 = product of:
        0.99647593 = sum of:
          0.085085504 = weight(abstract_txt:mechanism in 926) [ClassicSimilarity], result of:
            0.085085504 = score(doc=926,freq=3.0), product of:
              0.124424174 = queryWeight, product of:
                1.0358326 = boost
                6.31699 = idf(docFreq=216, maxDocs=44218)
                0.01901538 = queryNorm
              0.6838342 = fieldWeight in 926, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.31699 = idf(docFreq=216, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
          0.058603782 = weight(abstract_txt:trec in 926) [ClassicSimilarity], result of:
            0.058603782 = score(doc=926,freq=1.0), product of:
              0.13995612 = queryWeight, product of:
                1.0985837 = boost
                6.699675 = idf(docFreq=147, maxDocs=44218)
                0.01901538 = queryNorm
              0.4187297 = fieldWeight in 926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.699675 = idf(docFreq=147, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
          0.016188122 = weight(abstract_txt:using in 926) [ClassicSimilarity], result of:
            0.016188122 = score(doc=926,freq=1.0), product of:
              0.07479096 = queryWeight, product of:
                1.1357344 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.01901538 = queryNorm
              0.21644491 = fieldWeight in 926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
          0.03532744 = weight(abstract_txt:field in 926) [ClassicSimilarity], result of:
            0.03532744 = score(doc=926,freq=1.0), product of:
              0.12583251 = queryWeight, product of:
                1.4731557 = boost
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.01901538 = queryNorm
              0.28074968 = fieldWeight in 926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
          0.03469903 = weight(abstract_txt:retrieval in 926) [ClassicSimilarity], result of:
            0.03469903 = score(doc=926,freq=2.0), product of:
              0.11296661 = queryWeight, product of:
                1.709515 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01901538 = queryNorm
              0.3071618 = fieldWeight in 926, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
          0.054965615 = weight(abstract_txt:process in 926) [ClassicSimilarity], result of:
            0.054965615 = score(doc=926,freq=2.0), product of:
              0.15350856 = queryWeight, product of:
                1.992802 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.01901538 = queryNorm
              0.3580622 = fieldWeight in 926, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
          0.2769469 = weight(abstract_txt:query in 926) [ClassicSimilarity], result of:
            0.2769469 = score(doc=926,freq=7.0), product of:
              0.3523139 = queryWeight, product of:
                3.8975089 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01901538 = queryNorm
              0.78607994 = fieldWeight in 926, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
          0.4346595 = weight(abstract_txt:expansion in 926) [ClassicSimilarity], result of:
            0.4346595 = score(doc=926,freq=6.0), product of:
              0.46499023 = queryWeight, product of:
                4.0048766 = boost
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.01901538 = queryNorm
              0.9347713 = fieldWeight in 926, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.0625 = fieldNorm(doc=926)
        0.32 = coord(8/25)
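
The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, not the search engine's own code: the function names `classic_idf` and `classic_term_score` are hypothetical, and the formulas are read off the tree itself (queryWeight = boost × idf × queryNorm, fieldWeight = tf × idf × fieldNorm with tf = sqrt(freq), and idf = 1 + ln(maxDocs / (docFreq + 1)), which matches the printed idf values). The engine works in single precision, so results agree only to about six or seven digits.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as it appears in the explain output: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(boost, doc_freq, max_docs, query_norm, freq, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the "mechanism" entry for doc 926 above.
mechanism = classic_term_score(
    boost=1.0358326, doc_freq=216, max_docs=44218,
    query_norm=0.01901538, freq=3.0, field_norm=0.0625,
)
print(mechanism)  # should be close to the listed 0.085085504
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "trec" or "mechanism" contribute through idf squared, which is why a single occurrence of a rare term can outweigh several occurrences of a common one such as "using".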
    
  2. Xu, B.; Lin, H.; Lin, Y.: Assessment of learning to rank methods for query expansion (2016) 0.25
    0.24877886 = sum of:
      0.24877886 = product of:
        0.777434 = sum of:
          0.044774003 = weight(abstract_txt:introduce in 2929) [ClassicSimilarity], result of:
            0.044774003 = score(doc=2929,freq=1.0), product of:
              0.11696576 = queryWeight, product of:
                1.0043073 = boost
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.01901538 = queryNorm
              0.3827958 = fieldWeight in 2929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
          0.058603782 = weight(abstract_txt:trec in 2929) [ClassicSimilarity], result of:
            0.058603782 = score(doc=2929,freq=1.0), product of:
              0.13995612 = queryWeight, product of:
                1.0985837 = boost
                6.699675 = idf(docFreq=147, maxDocs=44218)
                0.01901538 = queryNorm
              0.4187297 = fieldWeight in 2929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.699675 = idf(docFreq=147, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
          0.016188122 = weight(abstract_txt:using in 2929) [ClassicSimilarity], result of:
            0.016188122 = score(doc=2929,freq=1.0), product of:
              0.07479096 = queryWeight, product of:
                1.1357344 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.01901538 = queryNorm
              0.21644491 = fieldWeight in 2929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
          0.08195804 = weight(abstract_txt:refinement in 2929) [ClassicSimilarity], result of:
            0.08195804 = score(doc=2929,freq=1.0), product of:
              0.17502597 = queryWeight, product of:
                1.2285377 = boost
                7.4921947 = idf(docFreq=66, maxDocs=44218)
                0.01901538 = queryNorm
              0.46826217 = fieldWeight in 2929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4921947 = idf(docFreq=66, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
          0.023905825 = weight(abstract_txt:most in 2929) [ClassicSimilarity], result of:
            0.023905825 = score(doc=2929,freq=1.0), product of:
              0.09698859 = queryWeight, product of:
                1.2933395 = boost
                3.943693 = idf(docFreq=2328, maxDocs=44218)
                0.01901538 = queryNorm
              0.24648081 = fieldWeight in 2929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.943693 = idf(docFreq=2328, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
          0.04907184 = weight(abstract_txt:retrieval in 2929) [ClassicSimilarity], result of:
            0.04907184 = score(doc=2929,freq=4.0), product of:
              0.11296661 = queryWeight, product of:
                1.709515 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01901538 = queryNorm
              0.43439242 = fieldWeight in 2929, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
          0.14803433 = weight(abstract_txt:query in 2929) [ClassicSimilarity], result of:
            0.14803433 = score(doc=2929,freq=2.0), product of:
              0.3523139 = queryWeight, product of:
                3.8975089 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01901538 = queryNorm
              0.4201774 = fieldWeight in 2929, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
          0.354898 = weight(abstract_txt:expansion in 2929) [ClassicSimilarity], result of:
            0.354898 = score(doc=2929,freq=4.0), product of:
              0.46499023 = queryWeight, product of:
                4.0048766 = boost
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.01901538 = queryNorm
              0.76323754 = fieldWeight in 2929, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.0625 = fieldNorm(doc=2929)
        0.32 = coord(8/25)
    
  3. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.25
    0.24699016 = sum of:
      0.24699016 = product of:
        1.0291257 = sum of:
          0.044200394 = weight(abstract_txt:corpus in 1338) [ClassicSimilarity], result of:
            0.044200394 = score(doc=1338,freq=1.0), product of:
              0.115964636 = queryWeight, product of:
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.01901538 = queryNorm
              0.3811541 = fieldWeight in 1338, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0625 = fieldNorm(doc=1338)
          0.11405463 = weight(abstract_txt:imprecise in 1338) [ClassicSimilarity], result of:
            0.11405463 = score(doc=1338,freq=1.0), product of:
              0.21816416 = queryWeight, product of:
                1.3716046 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.01901538 = queryNorm
              0.5227927 = fieldWeight in 1338, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.0625 = fieldNorm(doc=1338)
          0.054863986 = weight(abstract_txt:retrieval in 1338) [ClassicSimilarity], result of:
            0.054863986 = score(doc=1338,freq=5.0), product of:
              0.11296661 = queryWeight, product of:
                1.709515 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01901538 = queryNorm
              0.4856655 = fieldWeight in 1338, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=1338)
          0.06731886 = weight(abstract_txt:process in 1338) [ClassicSimilarity], result of:
            0.06731886 = score(doc=1338,freq=3.0), product of:
              0.15350856 = queryWeight, product of:
                1.992802 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.01901538 = queryNorm
              0.43853486 = fieldWeight in 1338, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.0625 = fieldNorm(doc=1338)
          0.31402826 = weight(abstract_txt:query in 1338) [ClassicSimilarity], result of:
            0.31402826 = score(doc=1338,freq=9.0), product of:
              0.3523139 = queryWeight, product of:
                3.8975089 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01901538 = queryNorm
              0.89133084 = fieldWeight in 1338, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=1338)
          0.4346595 = weight(abstract_txt:expansion in 1338) [ClassicSimilarity], result of:
            0.4346595 = score(doc=1338,freq=6.0), product of:
              0.46499023 = queryWeight, product of:
                4.0048766 = boost
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.01901538 = queryNorm
              0.9347713 = fieldWeight in 1338, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.0625 = fieldNorm(doc=1338)
        0.24 = coord(6/25)
    
  4. Efthimiadis, E.N.: Query expansion (1996) 0.24
    0.24142879 = sum of:
      0.24142879 = product of:
        1.50893 = sum of:
          0.04907184 = weight(abstract_txt:retrieval in 4847) [ClassicSimilarity], result of:
            0.04907184 = score(doc=4847,freq=1.0), product of:
              0.11296661 = queryWeight, product of:
                1.709515 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01901538 = queryNorm
              0.43439242 = fieldWeight in 4847, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.125 = fieldNorm(doc=4847)
          0.07773312 = weight(abstract_txt:process in 4847) [ClassicSimilarity], result of:
            0.07773312 = score(doc=4847,freq=1.0), product of:
              0.15350856 = queryWeight, product of:
                1.992802 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.01901538 = queryNorm
              0.50637645 = fieldWeight in 4847, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.125 = fieldNorm(doc=4847)
          0.51280606 = weight(abstract_txt:query in 4847) [ClassicSimilarity], result of:
            0.51280606 = score(doc=4847,freq=6.0), product of:
              0.3523139 = queryWeight, product of:
                3.8975089 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01901538 = queryNorm
              1.4555373 = fieldWeight in 4847, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.125 = fieldNorm(doc=4847)
          0.869319 = weight(abstract_txt:expansion in 4847) [ClassicSimilarity], result of:
            0.869319 = score(doc=4847,freq=6.0), product of:
              0.46499023 = queryWeight, product of:
                4.0048766 = boost
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.01901538 = queryNorm
              1.8695426 = fieldWeight in 4847, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.125 = fieldNorm(doc=4847)
        0.16 = coord(4/25)
    
  5. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.24
    0.23759025 = sum of:
      0.23759025 = product of:
        0.8485366 = sum of:
          0.044774003 = weight(abstract_txt:introduce in 1343) [ClassicSimilarity], result of:
            0.044774003 = score(doc=1343,freq=1.0), product of:
              0.11696576 = queryWeight, product of:
                1.0043073 = boost
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.01901538 = queryNorm
              0.3827958 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.0625 = fieldNorm(doc=1343)
          0.016188122 = weight(abstract_txt:using in 1343) [ClassicSimilarity], result of:
            0.016188122 = score(doc=1343,freq=1.0), product of:
              0.07479096 = queryWeight, product of:
                1.1357344 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.01901538 = queryNorm
              0.21644491 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=1343)
          0.03921611 = weight(abstract_txt:existing in 1343) [ClassicSimilarity], result of:
            0.03921611 = score(doc=1343,freq=1.0), product of:
              0.13490492 = queryWeight, product of:
                1.525338 = boost
                4.6511106 = idf(docFreq=1147, maxDocs=44218)
                0.01901538 = queryNorm
              0.29069442 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6511106 = idf(docFreq=1147, maxDocs=44218)
                0.0625 = fieldNorm(doc=1343)
          0.02453592 = weight(abstract_txt:retrieval in 1343) [ClassicSimilarity], result of:
            0.02453592 = score(doc=1343,freq=1.0), product of:
              0.11296661 = queryWeight, product of:
                1.709515 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01901538 = queryNorm
              0.21719621 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=1343)
          0.092971586 = weight(abstract_txt:precision in 1343) [ClassicSimilarity], result of:
            0.092971586 = score(doc=1343,freq=2.0), product of:
              0.19037428 = queryWeight, product of:
                1.8119924 = boost
                5.5251865 = idf(docFreq=478, maxDocs=44218)
                0.01901538 = queryNorm
              0.4883621 = fieldWeight in 1343, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5251865 = idf(docFreq=478, maxDocs=44218)
                0.0625 = fieldNorm(doc=1343)
          0.23406284 = weight(abstract_txt:query in 1343) [ClassicSimilarity], result of:
            0.23406284 = score(doc=1343,freq=5.0), product of:
              0.3523139 = queryWeight, product of:
                3.8975089 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01901538 = queryNorm
              0.6643588 = fieldWeight in 1343, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=1343)
          0.39678803 = weight(abstract_txt:expansion in 1343) [ClassicSimilarity], result of:
            0.39678803 = score(doc=1343,freq=5.0), product of:
              0.46499023 = queryWeight, product of:
                4.0048766 = boost
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.01901538 = queryNorm
              0.85332555 = fieldWeight in 1343, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1059003 = idf(docFreq=267, maxDocs=44218)
                0.0625 = fieldNorm(doc=1343)
        0.28 = coord(7/25)
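
The final score of each similar document is its sum of term scores multiplied by the coordination factor coord(matched/25). As a rough illustration (a sketch using only the sums and coord values listed above; the names `results` and `final_score` are hypothetical), the ranking can be recomputed like this:

```python
# (doc id, sum of matching term scores, number of matching query terms),
# all copied from the five explain trees above.
results = [
    (926, 0.99647593, 8),
    (2929, 0.777434, 8),
    (1338, 1.0291257, 6),
    (4847, 1.50893, 4),
    (1343, 0.8485366, 7),
]
QUERY_TERMS = 25  # denominator of coord(n/25)


def final_score(term_sum: float, matched: int, total: int = QUERY_TERMS) -> float:
    """Final score = sum of term scores * coord(matched/total)."""
    return term_sum * matched / total


# Sort by final score, highest first, as in the similar-documents list.
ranked = sorted(results, key=lambda r: final_score(r[1], r[2]), reverse=True)
for doc_id, term_sum, matched in ranked:
    print(doc_id, final_score(term_sum, matched))
```

Note the effect of coord: doc 4847 has the largest raw sum (1.50893, helped by its fieldNorm of 0.125), but it matches only 4 of the 25 query terms and therefore ranks fourth.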