Document (#39878)

Author
Selvaretnam, B.
Belkhatir, M.
Title
¬A linguistically driven framework for query expansion via grammatical constituent highlighting and role-based concept weighting
Source
Information processing and management. 52(2016) no.2, pp. 174-192
Year
2016
Abstract
In this paper, we propose a linguistically motivated query expansion framework that recognizes and encodes the significant query constituents characterizing query intent in order to improve retrieval performance. Concepts-of-Interest are recognized as the core concepts that represent the gist of the search goal, whilst the remaining query constituents, which serve to specify the search goal and complete the query structure, are classified as descriptive, relational or structural. Acknowledging the need to form semantically associated base pairs for extracting related potential expansion concepts, we propose an algorithm that capitalizes on syntactic dependencies to capture relationships between adjacent and non-adjacent query concepts. Lastly, we present a robust weighting scheme that emphasizes query constituents according to their linguistic role within the expanded query. Experiments on the TREC ad hoc test collections show that the proposed framework improves retrieval effectiveness, measured as increased mean average precision.
Content
Cf.: doi:10.1016/j.ipm.2015.04.002.
Theme
Semantic context in indexing and retrieval

Similar documents (content)

  1. Mestrovic, A.; Cali, A.: ¬An ontology-based approach to information retrieval (2017) 0.16
    0.15626809 = sum of:
      0.15626809 = product of:
        0.6511171 = sum of:
          0.015090846 = weight(abstract_txt:based in 5490) [ClassicSimilarity], result of:
            0.015090846 = score(doc=5490,freq=1.0), product of:
              0.060196903 = queryWeight, product of:
                1.2354895 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.015183981 = queryNorm
              0.2506914 = fieldWeight in 5490, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.078125 = fieldNorm(doc=5490)
          0.06353988 = weight(abstract_txt:framework in 5490) [ClassicSimilarity], result of:
            0.06353988 = score(doc=5490,freq=2.0), product of:
              0.12458124 = queryWeight, product of:
                1.7773719 = boost
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.015183981 = queryNorm
              0.51002765 = fieldWeight in 5490, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.078125 = fieldNorm(doc=5490)
          0.10358145 = weight(abstract_txt:weighting in 5490) [ClassicSimilarity], result of:
            0.10358145 = score(doc=5490,freq=1.0), product of:
              0.18992813 = queryWeight, product of:
                1.7918473 = boost
                6.980759 = idf(docFreq=107, maxDocs=42740)
                0.015183981 = queryNorm
              0.54537183 = fieldWeight in 5490, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.980759 = idf(docFreq=107, maxDocs=42740)
                0.078125 = fieldNorm(doc=5490)
          0.10113565 = weight(abstract_txt:concepts in 5490) [ClassicSimilarity], result of:
            0.10113565 = score(doc=5490,freq=3.0), product of:
              0.16329533 = queryWeight, product of:
                2.3496773 = boost
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.015183981 = queryNorm
              0.61934197 = fieldWeight in 5490, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.078125 = fieldNorm(doc=5490)
          0.13943486 = weight(abstract_txt:expansion in 5490) [ClassicSimilarity], result of:
            0.13943486 = score(doc=5490,freq=1.0), product of:
              0.29173747 = queryWeight, product of:
                3.140635 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.015183981 = queryNorm
              0.47794634 = fieldWeight in 5490, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.078125 = fieldNorm(doc=5490)
          0.22833441 = weight(abstract_txt:query in 5490) [ClassicSimilarity], result of:
            0.22833441 = score(doc=5490,freq=2.0), product of:
              0.4366119 = queryWeight, product of:
                6.0749 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.015183981 = queryNorm
              0.5229688 = fieldWeight in 5490, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.078125 = fieldNorm(doc=5490)
        0.24 = coord(6/25)
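
Each `weight(...)` node in the tree above is the product of a query-side weight and a field-side weight. The following is a minimal sketch of that arithmetic, assuming the structure is exactly what the listing prints (queryWeight = boost × idf × queryNorm, fieldWeight = tf × idf × fieldNorm, with tf equal to the square root of the term frequency, as in the line `1.4142135 = tf(freq=2.0)`). The function name `term_score` is hypothetical and not part of any library; the input values are copied from the "framework" node of doc 5490.

```python
import math


def term_score(boost, idf, query_norm, freq, field_norm):
    """Recompute one weight(...) node from the numbers printed in the listing.

    Hypothetical helper: it mirrors the printed tree, not a library API.
    """
    # "queryWeight, product of": boost * idf * queryNorm
    query_weight = boost * idf * query_norm
    # "fieldWeight in <doc>, product of": tf * idf * fieldNorm, with tf = sqrt(freq)
    field_weight = math.sqrt(freq) * idf * field_norm
    # "score(doc=..., freq=...), product of": queryWeight * fieldWeight
    return query_weight * field_weight


# Values taken from the abstract_txt:framework node of doc 5490.
framework_score = term_score(
    boost=1.7773719,
    idf=4.6162434,
    query_norm=0.015183981,
    freq=2.0,
    field_norm=0.078125,
)
print(f"{framework_score:.8f}")
```

The result should agree with the printed 0.06353988 up to single-precision rounding, since the listing reports 32-bit float values.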
    
  2. Järvelin, K.; Kristensen, J.; Niemi, T.; Sormunen, E.; Keskustalo, H.: ¬A deductive data model for query expansion (1996) 0.15
    0.1535198 = sum of:
      0.1535198 = product of:
        0.7675989 = sum of:
          0.031365734 = weight(abstract_txt:based in 4231) [ClassicSimilarity], result of:
            0.031365734 = score(doc=4231,freq=3.0), product of:
              0.060196903 = queryWeight, product of:
                1.2354895 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.015183981 = queryNorm
              0.5210523 = fieldWeight in 4231, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.09375 = fieldNorm(doc=4231)
          0.07332971 = weight(abstract_txt:linguistic in 4231) [ClassicSimilarity], result of:
            0.07332971 = score(doc=4231,freq=1.0), product of:
              0.13359815 = queryWeight, product of:
                1.5028187 = boost
                5.8547482 = idf(docFreq=332, maxDocs=42740)
                0.015183981 = queryNorm
              0.54888266 = fieldWeight in 4231, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8547482 = idf(docFreq=332, maxDocs=42740)
                0.09375 = fieldNorm(doc=4231)
          0.0990923 = weight(abstract_txt:concepts in 4231) [ClassicSimilarity], result of:
            0.0990923 = score(doc=4231,freq=2.0), product of:
              0.16329533 = queryWeight, product of:
                2.3496773 = boost
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.015183981 = queryNorm
              0.60682875 = fieldWeight in 4231, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.09375 = fieldNorm(doc=4231)
          0.28980988 = weight(abstract_txt:expansion in 4231) [ClassicSimilarity], result of:
            0.28980988 = score(doc=4231,freq=3.0), product of:
              0.29173747 = queryWeight, product of:
                3.140635 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.015183981 = queryNorm
              0.99339277 = fieldWeight in 4231, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.09375 = fieldNorm(doc=4231)
          0.2740013 = weight(abstract_txt:query in 4231) [ClassicSimilarity], result of:
            0.2740013 = score(doc=4231,freq=2.0), product of:
              0.4366119 = queryWeight, product of:
                6.0749 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.015183981 = queryNorm
              0.62756264 = fieldWeight in 4231, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.09375 = fieldNorm(doc=4231)
        0.2 = coord(5/25)
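
The document-level score shown in each heading is the sum of the matching term scores multiplied by the coordination factor. The sketch below checks this for doc 4231, assuming that `coord(5/25)` means 5 of the 25 query terms matched; the function name `document_score` is hypothetical, and the term scores are copied from the listing.

```python
def document_score(term_scores, query_term_count):
    """Combine per-term scores as the listing does: sum(scores) * coord.

    Hypothetical helper; coord is matched terms / total query terms.
    """
    coord = len(term_scores) / query_term_count
    return sum(term_scores) * coord


# Per-term scores of doc 4231: based, linguistic, concepts, expansion, query.
doc_4231_terms = [0.031365734, 0.07332971, 0.0990923, 0.28980988, 0.2740013]
doc_4231_score = document_score(doc_4231_terms, query_term_count=25)
print(f"{doc_4231_score:.7f}")
```

Because of the coordination factor, a document that matches fewer query terms is scaled down even when its individual term scores are high, which is why doc 4231 (sum 0.7675989) ranks below doc 5490 (sum 0.6511171, but 6 matching terms).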
    
  3. Gray, A.J.G.; Gray, N.; Hall, C.W.; Ounis, I.: Finding the right term : retrieving and exploring semantic concepts in astronomical vocabularies (2010) 0.13
    0.13498285 = sum of:
      0.13498285 = product of:
        0.5624286 = sum of:
          0.024348633 = weight(abstract_txt:proposed in 1236) [ClassicSimilarity], result of:
            0.024348633 = score(doc=1236,freq=1.0), product of:
              0.08394427 = queryWeight, product of:
                1.1912472 = boost
                4.640914 = idf(docFreq=1120, maxDocs=42740)
                0.015183981 = queryNorm
              0.29005712 = fieldWeight in 1236, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640914 = idf(docFreq=1120, maxDocs=42740)
                0.0625 = fieldNorm(doc=1236)
          0.012072678 = weight(abstract_txt:based in 1236) [ClassicSimilarity], result of:
            0.012072678 = score(doc=1236,freq=1.0), product of:
              0.060196903 = queryWeight, product of:
                1.2354895 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.015183981 = queryNorm
              0.20055313 = fieldWeight in 1236, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=1236)
          0.16573031 = weight(abstract_txt:weighting in 1236) [ClassicSimilarity], result of:
            0.16573031 = score(doc=1236,freq=4.0), product of:
              0.18992813 = queryWeight, product of:
                1.7918473 = boost
                6.980759 = idf(docFreq=107, maxDocs=42740)
                0.015183981 = queryNorm
              0.8725949 = fieldWeight in 1236, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.980759 = idf(docFreq=107, maxDocs=42740)
                0.0625 = fieldNorm(doc=1236)
          0.066061534 = weight(abstract_txt:concepts in 1236) [ClassicSimilarity], result of:
            0.066061534 = score(doc=1236,freq=2.0), product of:
              0.16329533 = queryWeight, product of:
                2.3496773 = boost
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.015183981 = queryNorm
              0.4045525 = fieldWeight in 1236, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.576989 = idf(docFreq=1194, maxDocs=42740)
                0.0625 = fieldNorm(doc=1236)
          0.11154788 = weight(abstract_txt:expansion in 1236) [ClassicSimilarity], result of:
            0.11154788 = score(doc=1236,freq=1.0), product of:
              0.29173747 = queryWeight, product of:
                3.140635 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.015183981 = queryNorm
              0.38235706 = fieldWeight in 1236, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.0625 = fieldNorm(doc=1236)
          0.18266754 = weight(abstract_txt:query in 1236) [ClassicSimilarity], result of:
            0.18266754 = score(doc=1236,freq=2.0), product of:
              0.4366119 = queryWeight, product of:
                6.0749 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.015183981 = queryNorm
              0.41837507 = fieldWeight in 1236, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.0625 = fieldNorm(doc=1236)
        0.24 = coord(6/25)
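
The `idf` and `tf` leaves of the tree can also be recomputed from the counts printed next to them. The sketch below assumes idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq); these formulas are inferred from the fact that they reproduce the values in this listing, and other Lucene versions may use a slightly different idf. The function names are hypothetical.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    """Assumed tf formula: square root of the raw term frequency."""
    return math.sqrt(freq)


# "weighting" in doc 1236: idf(docFreq=107, maxDocs=42740), tf(freq=4.0)
weighting_idf = classic_idf(107, 42740)
weighting_tf = classic_tf(4.0)
print(f"{weighting_idf:.6f} {weighting_tf:.1f}")
```

Under these assumptions, a rare term such as "weighting" (107 of 42740 documents) receives an idf more than twice that of a common term such as "based" (4693 documents), which is why it contributes far more to the score at the same frequency.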
    
  4. Na, S.-H.; Kang, I.-S.; Lee, J.-H.: Parsimonious translation models for information retrieval (2007) 0.13
    0.13200983 = sum of:
      0.13200983 = product of:
        0.55004096 = sum of:
          0.057614118 = weight(abstract_txt:experimentation in 2899) [ClassicSimilarity], result of:
            0.057614118 = score(doc=2899,freq=1.0), product of:
              0.11830886 = queryWeight, product of:
                7.7916894 = idf(docFreq=47, maxDocs=42740)
                0.015183981 = queryNorm
              0.4869806 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7916894 = idf(docFreq=47, maxDocs=42740)
                0.0625 = fieldNorm(doc=2899)
          0.096865624 = weight(abstract_txt:encodes in 2899) [ClassicSimilarity], result of:
            0.096865624 = score(doc=2899,freq=1.0), product of:
              0.16728017 = queryWeight, product of:
                1.1890869 = boost
                9.264996 = idf(docFreq=10, maxDocs=42740)
                0.015183981 = queryNorm
              0.5790622 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.264996 = idf(docFreq=10, maxDocs=42740)
                0.0625 = fieldNorm(doc=2899)
          0.024348633 = weight(abstract_txt:proposed in 2899) [ClassicSimilarity], result of:
            0.024348633 = score(doc=2899,freq=1.0), product of:
              0.08394427 = queryWeight, product of:
                1.1912472 = boost
                4.640914 = idf(docFreq=1120, maxDocs=42740)
                0.015183981 = queryNorm
              0.29005712 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640914 = idf(docFreq=1120, maxDocs=42740)
                0.0625 = fieldNorm(doc=2899)
          0.035943583 = weight(abstract_txt:framework in 2899) [ClassicSimilarity], result of:
            0.035943583 = score(doc=2899,freq=1.0), product of:
              0.12458124 = queryWeight, product of:
                1.7773719 = boost
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.015183981 = queryNorm
              0.2885152 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.0625 = fieldNorm(doc=2899)
          0.11154788 = weight(abstract_txt:expansion in 2899) [ClassicSimilarity], result of:
            0.11154788 = score(doc=2899,freq=1.0), product of:
              0.29173747 = queryWeight, product of:
                3.140635 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.015183981 = queryNorm
              0.38235706 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.0625 = fieldNorm(doc=2899)
          0.22372112 = weight(abstract_txt:query in 2899) [ClassicSimilarity], result of:
            0.22372112 = score(doc=2899,freq=3.0), product of:
              0.4366119 = queryWeight, product of:
                6.0749 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.015183981 = queryNorm
              0.5124027 = fieldWeight in 2899, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.0625 = fieldNorm(doc=2899)
        0.24 = coord(6/25)
    
  5. Hancock-Beaulieu, M.; Fieldhouse, M.; Do, T.: ¬An evaluation of interactive query expansion in an online library catalogue with a graphical user interface (1995) 0.12
    0.12284773 = sum of:
      0.12284773 = product of:
        0.7677983 = sum of:
          0.018109016 = weight(abstract_txt:based in 1735) [ClassicSimilarity], result of:
            0.018109016 = score(doc=1735,freq=1.0), product of:
              0.060196903 = queryWeight, product of:
                1.2354895 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.015183981 = queryNorm
              0.3008297 = fieldWeight in 1735, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.09375 = fieldNorm(doc=1735)
          0.12429774 = weight(abstract_txt:weighting in 1735) [ClassicSimilarity], result of:
            0.12429774 = score(doc=1735,freq=1.0), product of:
              0.18992813 = queryWeight, product of:
                1.7918473 = boost
                6.980759 = idf(docFreq=107, maxDocs=42740)
                0.015183981 = queryNorm
              0.6544462 = fieldWeight in 1735, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.980759 = idf(docFreq=107, maxDocs=42740)
                0.09375 = fieldNorm(doc=1735)
          0.28980988 = weight(abstract_txt:expansion in 1735) [ClassicSimilarity], result of:
            0.28980988 = score(doc=1735,freq=3.0), product of:
              0.29173747 = queryWeight, product of:
                3.140635 = boost
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.015183981 = queryNorm
              0.99339277 = fieldWeight in 1735, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.117713 = idf(docFreq=255, maxDocs=42740)
                0.09375 = fieldNorm(doc=1735)
          0.33558166 = weight(abstract_txt:query in 1735) [ClassicSimilarity], result of:
            0.33558166 = score(doc=1735,freq=3.0), product of:
              0.4366119 = queryWeight, product of:
                6.0749 = boost
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.015183981 = queryNorm
              0.76860404 = fieldWeight in 1735, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7333736 = idf(docFreq=1021, maxDocs=42740)
                0.09375 = fieldNorm(doc=1735)
        0.16 = coord(4/25)