Document (#32919)

Author
Abdelali, A.
Cowie, J.
Soliman, H.S.
Title
Improving query precision using semantic expansion
Source
Information processing and management. 43(2007) no.3, pp.705-716
Year
2007
Abstract
Query Expansion (QE) is one of the most important mechanisms in the information retrieval field. A typical short Internet query will go through a process of refinement to improve its retrieval power. Most existing QE techniques suffer from degraded retrieval performance because the terms added to the query during the QE process are chosen imprecisely. In this paper, we introduce a novel automated QE mechanism. The new expansion process is guided by the semantic relations between the original query and the expanding words, in the context of the utilized corpus. Experimental results of our "controlled" query expansion, using the Arabic TREC-10 data, show a significant enhancement of recall and precision over existing mechanisms in the field.
Footnote
Contribution to: Special issue on Heterogeneous and Distributed IR
Theme
Retrieval algorithms

Similar documents (content)

  1. He, B.; Ounis, I.: Combining fields for query expansion and adaptive query expansion (2007) 0.32
    0.31897944 = sum of:
      0.31897944 = product of:
        0.9968108 = sum of:
          0.08741995 = weight(abstract_txt:mechanism in 2927) [ClassicSimilarity], result of:
            0.08741995 = score(doc=2927,freq=3.0), product of:
              0.1265816 = queryWeight, product of:
                1.0401446 = boost
                6.379687 = idf(docFreq=196, maxDocs=42740)
                0.019075569 = queryNorm
              0.6906213 = fieldWeight in 2927, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.379687 = idf(docFreq=196, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
          0.058100726 = weight(abstract_txt:trec in 2927) [ClassicSimilarity], result of:
            0.058100726 = score(doc=2927,freq=1.0), product of:
              0.13903527 = queryWeight, product of:
                1.0901115 = boost
                6.6861567 = idf(docFreq=144, maxDocs=42740)
                0.019075569 = queryNorm
              0.4178848 = fieldWeight in 2927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6861567 = idf(docFreq=144, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
          0.016376829 = weight(abstract_txt:using in 2927) [ClassicSimilarity], result of:
            0.016376829 = score(doc=2927,freq=1.0), product of:
              0.075306736 = queryWeight, product of:
                1.1345936 = boost
                3.4794931 = idf(docFreq=3580, maxDocs=42740)
                0.019075569 = queryNorm
              0.21746832 = fieldWeight in 2927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4794931 = idf(docFreq=3580, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
          0.036015138 = weight(abstract_txt:field in 2927) [ClassicSimilarity], result of:
            0.036015138 = score(doc=2927,freq=1.0), product of:
              0.12735148 = queryWeight, product of:
                1.4754531 = boost
                4.5248175 = idf(docFreq=1258, maxDocs=42740)
                0.019075569 = queryNorm
              0.2828011 = fieldWeight in 2927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5248175 = idf(docFreq=1258, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
          0.03430229 = weight(abstract_txt:retrieval in 2927) [ClassicSimilarity], result of:
            0.03430229 = score(doc=2927,freq=2.0), product of:
              0.112008184 = queryWeight, product of:
                1.6947043 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.019075569 = queryNorm
              0.30624807 = fieldWeight in 2927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
          0.055828556 = weight(abstract_txt:process in 2927) [ClassicSimilarity], result of:
            0.055828556 = score(doc=2927,freq=2.0), product of:
              0.15497868 = queryWeight, product of:
                1.9934485 = boost
                4.07558 = idf(docFreq=1972, maxDocs=42740)
                0.019075569 = queryNorm
              0.36023378 = fieldWeight in 2927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.07558 = idf(docFreq=1972, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
          0.2726988 = weight(abstract_txt:query in 2927) [ClassicSimilarity], result of:
            0.2726988 = score(doc=2927,freq=7.0), product of:
              0.34840423 = queryWeight, product of:
                3.8586478 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.019075569 = queryNorm
              0.78270805 = fieldWeight in 2927, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
          0.43606856 = weight(abstract_txt:expansion in 2927) [ClassicSimilarity], result of:
            0.43606856 = score(doc=2927,freq=6.0), product of:
              0.46559685 = queryWeight, product of:
                3.9897294 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.019075569 = queryNorm
              0.9365797 = fieldWeight in 2927, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.0625 = fieldNorm(doc=2927)
        0.32 = coord(8/25)
    
  2. Xu, B.; Lin, H.; Lin, Y.: Assessment of learning to rank methods for query expansion (2016) 0.25
    0.24845473 = sum of:
      0.24845473 = product of:
        0.77642107 = sum of:
          0.045836907 = weight(abstract_txt:introduce in 4930) [ClassicSimilarity], result of:
            0.045836907 = score(doc=4930,freq=1.0), product of:
              0.11870822 = queryWeight, product of:
                1.0072768 = boost
                6.1780934 = idf(docFreq=240, maxDocs=42740)
                0.019075569 = queryNorm
              0.38613084 = fieldWeight in 4930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1780934 = idf(docFreq=240, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
          0.058100726 = weight(abstract_txt:trec in 4930) [ClassicSimilarity], result of:
            0.058100726 = score(doc=4930,freq=1.0), product of:
              0.13903527 = queryWeight, product of:
                1.0901115 = boost
                6.6861567 = idf(docFreq=144, maxDocs=42740)
                0.019075569 = queryNorm
              0.4178848 = fieldWeight in 4930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6861567 = idf(docFreq=144, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
          0.016376829 = weight(abstract_txt:using in 4930) [ClassicSimilarity], result of:
            0.016376829 = score(doc=4930,freq=1.0), product of:
              0.075306736 = queryWeight, product of:
                1.1345936 = boost
                3.4794931 = idf(docFreq=3580, maxDocs=42740)
                0.019075569 = queryNorm
              0.21746832 = fieldWeight in 4930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4794931 = idf(docFreq=3580, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
          0.0816276 = weight(abstract_txt:refinement in 4930) [ClassicSimilarity], result of:
            0.0816276 = score(doc=4930,freq=1.0), product of:
              0.17440622 = queryWeight, product of:
                1.2209262 = boost
                7.4885035 = idf(docFreq=64, maxDocs=42740)
                0.019075569 = queryNorm
              0.46803147 = fieldWeight in 4930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4885035 = idf(docFreq=64, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
          0.024156054 = weight(abstract_txt:most in 4930) [ClassicSimilarity], result of:
            0.024156054 = score(doc=4930,freq=1.0), product of:
              0.09758085 = queryWeight, product of:
                1.2915337 = boost
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.019075569 = queryNorm
              0.24754913 = fieldWeight in 4930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.960786 = idf(docFreq=2212, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
          0.048510764 = weight(abstract_txt:retrieval in 4930) [ClassicSimilarity], result of:
            0.048510764 = score(doc=4930,freq=4.0), product of:
              0.112008184 = queryWeight, product of:
                1.6947043 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.019075569 = queryNorm
              0.43310016 = fieldWeight in 4930, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
          0.14576365 = weight(abstract_txt:query in 4930) [ClassicSimilarity], result of:
            0.14576365 = score(doc=4930,freq=2.0), product of:
              0.34840423 = queryWeight, product of:
                3.8586478 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.019075569 = queryNorm
              0.41837507 = fieldWeight in 4930, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
          0.3560485 = weight(abstract_txt:expansion in 4930) [ClassicSimilarity], result of:
            0.3560485 = score(doc=4930,freq=4.0), product of:
              0.46559685 = queryWeight, product of:
                3.9897294 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.019075569 = queryNorm
              0.7647141 = fieldWeight in 4930, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.0625 = fieldNorm(doc=4930)
        0.32 = coord(8/25)
    
  3. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.25
    0.24602982 = sum of:
      0.24602982 = product of:
        1.0251243 = sum of:
          0.044850655 = weight(abstract_txt:corpus in 3339) [ClassicSimilarity], result of:
            0.044850655 = score(doc=3339,freq=1.0), product of:
              0.11699927 = queryWeight, product of:
                6.1334615 = idf(docFreq=251, maxDocs=42740)
                0.019075569 = queryNorm
              0.38334134 = fieldWeight in 3339, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1334615 = idf(docFreq=251, maxDocs=42740)
                0.0625 = fieldNorm(doc=3339)
          0.11238132 = weight(abstract_txt:imprecise in 3339) [ClassicSimilarity], result of:
            0.11238132 = score(doc=3339,freq=1.0), product of:
              0.21584071 = queryWeight, product of:
                1.3582356 = boost
                8.330686 = idf(docFreq=27, maxDocs=42740)
                0.019075569 = queryNorm
              0.52066785 = fieldWeight in 3339, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.330686 = idf(docFreq=27, maxDocs=42740)
                0.0625 = fieldNorm(doc=3339)
          0.054236684 = weight(abstract_txt:retrieval in 3339) [ClassicSimilarity], result of:
            0.054236684 = score(doc=3339,freq=5.0), product of:
              0.112008184 = queryWeight, product of:
                1.6947043 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.019075569 = queryNorm
              0.4842207 = fieldWeight in 3339, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=3339)
          0.06837574 = weight(abstract_txt:process in 3339) [ClassicSimilarity], result of:
            0.06837574 = score(doc=3339,freq=3.0), product of:
              0.15497868 = queryWeight, product of:
                1.9934485 = boost
                4.07558 = idf(docFreq=1972, maxDocs=42740)
                0.019075569 = queryNorm
              0.44119447 = fieldWeight in 3339, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.07558 = idf(docFreq=1972, maxDocs=42740)
                0.0625 = fieldNorm(doc=3339)
          0.30921137 = weight(abstract_txt:query in 3339) [ClassicSimilarity], result of:
            0.30921137 = score(doc=3339,freq=9.0), product of:
              0.34840423 = queryWeight, product of:
                3.8586478 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.019075569 = queryNorm
              0.88750756 = fieldWeight in 3339, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.0625 = fieldNorm(doc=3339)
          0.43606856 = weight(abstract_txt:expansion in 3339) [ClassicSimilarity], result of:
            0.43606856 = score(doc=3339,freq=6.0), product of:
              0.46559685 = queryWeight, product of:
                3.9897294 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.019075569 = queryNorm
              0.9365797 = fieldWeight in 3339, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.0625 = fieldNorm(doc=3339)
        0.24 = coord(6/25)
    
  4. Efthimiadis, E.N.: Query expansion (1996) 0.24
    0.24072662 = sum of:
      0.24072662 = product of:
        1.5045414 = sum of:
          0.048510764 = weight(abstract_txt:retrieval in 4916) [ClassicSimilarity], result of:
            0.048510764 = score(doc=4916,freq=1.0), product of:
              0.112008184 = queryWeight, product of:
                1.6947043 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.019075569 = queryNorm
              0.43310016 = fieldWeight in 4916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.125 = fieldNorm(doc=4916)
          0.078953505 = weight(abstract_txt:process in 4916) [ClassicSimilarity], result of:
            0.078953505 = score(doc=4916,freq=1.0), product of:
              0.15497868 = queryWeight, product of:
                1.9934485 = boost
                4.07558 = idf(docFreq=1972, maxDocs=42740)
                0.019075569 = queryNorm
              0.5094475 = fieldWeight in 4916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.07558 = idf(docFreq=1972, maxDocs=42740)
                0.125 = fieldNorm(doc=4916)
          0.5049401 = weight(abstract_txt:query in 4916) [ClassicSimilarity], result of:
            0.5049401 = score(doc=4916,freq=6.0), product of:
              0.34840423 = queryWeight, product of:
                3.8586478 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.019075569 = queryNorm
              1.4492939 = fieldWeight in 4916, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.125 = fieldNorm(doc=4916)
          0.8721371 = weight(abstract_txt:expansion in 4916) [ClassicSimilarity], result of:
            0.8721371 = score(doc=4916,freq=6.0), product of:
              0.46559685 = queryWeight, product of:
                3.9897294 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.019075569 = queryNorm
              1.8731594 = fieldWeight in 4916, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.125 = fieldNorm(doc=4916)
        0.16 = coord(4/25)
    
  5. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.24
    0.2373478 = sum of:
      0.2373478 = product of:
        0.8476707 = sum of:
          0.045836907 = weight(abstract_txt:introduce in 3344) [ClassicSimilarity], result of:
            0.045836907 = score(doc=3344,freq=1.0), product of:
              0.11870822 = queryWeight, product of:
                1.0072768 = boost
                6.1780934 = idf(docFreq=240, maxDocs=42740)
                0.019075569 = queryNorm
              0.38613084 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1780934 = idf(docFreq=240, maxDocs=42740)
                0.0625 = fieldNorm(doc=3344)
          0.016376829 = weight(abstract_txt:using in 3344) [ClassicSimilarity], result of:
            0.016376829 = score(doc=3344,freq=1.0), product of:
              0.075306736 = queryWeight, product of:
                1.1345936 = boost
                3.4794931 = idf(docFreq=3580, maxDocs=42740)
                0.019075569 = queryNorm
              0.21746832 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4794931 = idf(docFreq=3580, maxDocs=42740)
                0.0625 = fieldNorm(doc=3344)
          0.040354535 = weight(abstract_txt:existing in 3344) [ClassicSimilarity], result of:
            0.040354535 = score(doc=3344,freq=1.0), product of:
              0.1373859 = queryWeight, product of:
                1.5324789 = boost
                4.6997004 = idf(docFreq=1056, maxDocs=42740)
                0.019075569 = queryNorm
              0.29373127 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6997004 = idf(docFreq=1056, maxDocs=42740)
                0.0625 = fieldNorm(doc=3344)
          0.024255382 = weight(abstract_txt:retrieval in 3344) [ClassicSimilarity], result of:
            0.024255382 = score(doc=3344,freq=1.0), product of:
              0.112008184 = queryWeight, product of:
                1.6947043 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.019075569 = queryNorm
              0.21655008 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=3344)
          0.092300124 = weight(abstract_txt:precision in 3344) [ClassicSimilarity], result of:
            0.092300124 = score(doc=3344,freq=2.0), product of:
              0.18929484 = queryWeight, product of:
                1.7988411 = boost
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.019075569 = queryNorm
              0.48759976 = fieldWeight in 3344, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5165615 = idf(docFreq=466, maxDocs=42740)
                0.0625 = fieldNorm(doc=3344)
          0.23047256 = weight(abstract_txt:query in 3344) [ClassicSimilarity], result of:
            0.23047256 = score(doc=3344,freq=5.0), product of:
              0.34840423 = queryWeight, product of:
                3.8586478 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.019075569 = queryNorm
              0.6615091 = fieldWeight in 3344, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.0625 = fieldNorm(doc=3344)
          0.39807433 = weight(abstract_txt:expansion in 3344) [ClassicSimilarity], result of:
            0.39807433 = score(doc=3344,freq=5.0), product of:
              0.46559685 = queryWeight, product of:
                3.9897294 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.019075569 = queryNorm
              0.8549764 = fieldWeight in 3344, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.0625 = fieldNorm(doc=3344)
        0.28 = coord(7/25)
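
The score breakdowns above follow the classic Lucene TF-IDF pattern: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms / query terms). The following is a minimal sketch that recomputes the score of entry 1 (doc 2927) from the raw values in its listing. The function and variable names are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed because they reproduce the tf and idf values printed above; fieldNorm and queryNorm are taken from the listing as given rather than derived.

```python
import math

# Constants copied from the listing above.
MAX_DOCS = 42740
QUERY_NORM = 0.019075569


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, field_norm, matched, query_terms):
    """Sum of term scores scaled by coord = matched / query_terms."""
    coord = matched / query_terms
    return coord * sum(term_score(b, f, df, field_norm) for b, f, df in terms)


# (boost, termFreq, docFreq) for the eight matching terms of doc 2927.
TERMS_2927 = [
    (1.0401446, 3, 196),   # mechanism
    (1.0901115, 1, 144),   # trec
    (1.1345936, 1, 3580),  # using
    (1.4754531, 1, 1258),  # field
    (1.6947043, 2, 3633),  # retrieval
    (1.9934485, 2, 1972),  # process
    (3.8586478, 7, 1021),  # query
    (3.9897294, 6, 255),   # expansion
]

score_2927 = document_score(TERMS_2927, field_norm=0.0625, matched=8, query_terms=25)
print(f"{score_2927:.6f}")
```

The result should agree with the 0.31897944 reported for entry 1 up to rounding; small differences are expected because the listing was presumably computed in single precision. The same functions apply to the other entries once their boost, termFreq, docFreq, fieldNorm and coord values are substituted.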