Document (#37098)

Author
Lassalle, E.
Title
Semantic models in information retrieval
Source
Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
Imprint
Hershey, PA : IGI Publishing
Year
2012
Pages
pp. 138-173
Abstract
Robertson and Spärck Jones pioneered experimental probabilistic models (the Binary Independence Model), combining a typology that generalizes the Boolean model, frequency counting to calculate elementary weightings, and the combination of these weightings into a global probabilistic estimate. However, this model did not consider dependencies between indexing terms. An extension to mixture models (e.g., using a 2-Poisson law) made it possible to take these dependencies into account from a macroscopic point of view (BM25), as well as shallow linguistic processing of co-references. New approaches (language models, for example "bag of words" models, probabilistic dependencies between requests and documents, and consequently Bayesian inference using a conjugate Dirichlet prior) furnished new solutions for document structuring (categorization) and for index smoothing. At present, the main issues in these probabilistic models have been addressed from a formal point of view only, so linguistic properties are neglected in the indexing language. The authors examine how linguistic and semantic modeling can be integrated into indexing languages, and they set up a hybrid model that makes it possible to deal with different information retrieval problems in a unified way.
Footnote
Cf.: http://www.igi-global.com/book/next-generation-search-engines/64424.
Theme
Semantic Web
Knowledge representation

Similar documents (content)

  1. Lhadj, L.S.; Boughanem, M.; Amrouche, K.: Enhancing information retrieval through concept-based language modeling and semantic smoothing (2016) 0.26
    0.261273 = sum of:
      0.261273 = product of:
        0.72575825 = sum of:
          0.019547084 = weight(abstract_txt:documents in 3221) [ClassicSimilarity], result of:
            0.019547084 = score(doc=3221,freq=1.0), product of:
              0.07588702 = queryWeight, product of:
                1.0051492 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018319027 = queryNorm
              0.2575814 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.03537679 = weight(abstract_txt:language in 3221) [ClassicSimilarity], result of:
            0.03537679 = score(doc=3221,freq=3.0), product of:
              0.078142025 = queryWeight, product of:
                1.0199741 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018319027 = queryNorm
              0.45272425 = fieldWeight in 3221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.043323208 = weight(abstract_txt:semantic in 3221) [ClassicSimilarity], result of:
            0.043323208 = score(doc=3221,freq=3.0), product of:
              0.08944433 = queryWeight, product of:
                1.0912474 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.018319027 = queryNorm
              0.48435947 = fieldWeight in 3221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.11226804 = weight(abstract_txt:smoothing in 3221) [ClassicSimilarity], result of:
            0.11226804 = score(doc=3221,freq=1.0), product of:
              0.19317025 = queryWeight, product of:
                1.133971 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.018319027 = queryNorm
              0.581187 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.03521549 = weight(abstract_txt:view in 3221) [ClassicSimilarity], result of:
            0.03521549 = score(doc=3221,freq=1.0), product of:
              0.112357475 = queryWeight, product of:
                1.2230601 = boost
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.018319027 = queryNorm
              0.31342366 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.039368823 = weight(abstract_txt:point in 3221) [ClassicSimilarity], result of:
            0.039368823 = score(doc=3221,freq=1.0), product of:
              0.12102667 = queryWeight, product of:
                1.2693675 = boost
                5.2046475 = idf(docFreq=659, maxDocs=44218)
                0.018319027 = queryNorm
              0.32529047 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2046475 = idf(docFreq=659, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.070750065 = weight(abstract_txt:model in 3221) [ClassicSimilarity], result of:
            0.070750065 = score(doc=3221,freq=4.0), product of:
              0.14198878 = queryWeight, product of:
                1.9444144 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.018319027 = queryNorm
              0.49827924 = fieldWeight in 3221, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.20236596 = weight(abstract_txt:dependencies in 3221) [ClassicSimilarity], result of:
            0.20236596 = score(doc=3221,freq=1.0), product of:
              0.41263703 = queryWeight, product of:
                2.8706255 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.018319027 = queryNorm
              0.49042124 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.16754277 = weight(abstract_txt:models in 3221) [ClassicSimilarity], result of:
            0.16754277 = score(doc=3221,freq=4.0), product of:
              0.28876886 = queryWeight, product of:
                3.3961167 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.018319027 = queryNorm
              0.5801968 = fieldWeight in 3221, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
        0.36 = coord(9/25)
    
  2. Vilares, J.; Alonso, M.A.; Vilares, M.: Extraction of complex index terms in non-English IR : a shallow parsing based approach (2008) 0.20
    0.203284 = sum of:
      0.203284 = product of:
        0.72601426 = sum of:
          0.027643753 = weight(abstract_txt:documents in 2107) [ClassicSimilarity], result of:
            0.027643753 = score(doc=2107,freq=2.0), product of:
              0.07588702 = queryWeight, product of:
                1.0051492 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018319027 = queryNorm
              0.36427513 = fieldWeight in 2107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2107)
          0.08068999 = weight(abstract_txt:shallow in 2107) [ClassicSimilarity], result of:
            0.08068999 = score(doc=2107,freq=1.0), product of:
              0.15499437 = queryWeight, product of:
                1.0157568 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.018319027 = queryNorm
              0.5205995 = fieldWeight in 2107, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.0625 = fieldNorm(doc=2107)
          0.028885027 = weight(abstract_txt:language in 2107) [ClassicSimilarity], result of:
            0.028885027 = score(doc=2107,freq=2.0), product of:
              0.078142025 = queryWeight, product of:
                1.0199741 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018319027 = queryNorm
              0.3696478 = fieldWeight in 2107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=2107)
          0.027631348 = weight(abstract_txt:possible in 2107) [ClassicSimilarity], result of:
            0.027631348 = score(doc=2107,freq=1.0), product of:
              0.095583044 = queryWeight, product of:
                1.1280731 = boost
                4.6253138 = idf(docFreq=1177, maxDocs=44218)
                0.018319027 = queryNorm
              0.2890821 = fieldWeight in 2107, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6253138 = idf(docFreq=1177, maxDocs=44218)
                0.0625 = fieldNorm(doc=2107)
          0.039368823 = weight(abstract_txt:point in 2107) [ClassicSimilarity], result of:
            0.039368823 = score(doc=2107,freq=1.0), product of:
              0.12102667 = queryWeight, product of:
                1.2693675 = boost
                5.2046475 = idf(docFreq=659, maxDocs=44218)
                0.018319027 = queryNorm
              0.32529047 = fieldWeight in 2107, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2046475 = idf(docFreq=659, maxDocs=44218)
                0.0625 = fieldNorm(doc=2107)
          0.11706335 = weight(abstract_txt:linguistic in 2107) [ClassicSimilarity], result of:
            0.11706335 = score(doc=2107,freq=2.0), product of:
              0.22737736 = queryWeight, product of:
                2.1309147 = boost
                5.8247695 = idf(docFreq=354, maxDocs=44218)
                0.018319027 = queryNorm
              0.51484174 = fieldWeight in 2107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8247695 = idf(docFreq=354, maxDocs=44218)
                0.0625 = fieldNorm(doc=2107)
          0.40473193 = weight(abstract_txt:dependencies in 2107) [ClassicSimilarity], result of:
            0.40473193 = score(doc=2107,freq=4.0), product of:
              0.41263703 = queryWeight, product of:
                2.8706255 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.018319027 = queryNorm
              0.9808425 = fieldWeight in 2107, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=2107)
        0.28 = coord(7/25)
    
  3. Bodoff, D.; Wong, S.P.-S.: Documents and queries as random variables : history and implications (2006) 0.19
    0.18651536 = sum of:
      0.18651536 = product of:
        0.77714735 = sum of:
          0.042320676 = weight(abstract_txt:documents in 193) [ClassicSimilarity], result of:
            0.042320676 = score(doc=193,freq=3.0), product of:
              0.07588702 = queryWeight, product of:
                1.0051492 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018319027 = queryNorm
              0.5576801 = fieldWeight in 193, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=193)
          0.036106285 = weight(abstract_txt:language in 193) [ClassicSimilarity], result of:
            0.036106285 = score(doc=193,freq=2.0), product of:
              0.078142025 = queryWeight, product of:
                1.0199741 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018319027 = queryNorm
              0.46205974 = fieldWeight in 193, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=193)
          0.044019364 = weight(abstract_txt:view in 193) [ClassicSimilarity], result of:
            0.044019364 = score(doc=193,freq=1.0), product of:
              0.112357475 = queryWeight, product of:
                1.2230601 = boost
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.018319027 = queryNorm
              0.39177957 = fieldWeight in 193, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.078125 = fieldNorm(doc=193)
          0.04421879 = weight(abstract_txt:model in 193) [ClassicSimilarity], result of:
            0.04421879 = score(doc=193,freq=1.0), product of:
              0.14198878 = queryWeight, product of:
                1.9444144 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.018319027 = queryNorm
              0.31142452 = fieldWeight in 193, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.078125 = fieldNorm(doc=193)
          0.3763341 = weight(abstract_txt:probabilistic in 193) [ClassicSimilarity], result of:
            0.3763341 = score(doc=193,freq=3.0), product of:
              0.41038495 = queryWeight, product of:
                3.305655 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.018319027 = queryNorm
              0.91702706 = fieldWeight in 193, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.078125 = fieldNorm(doc=193)
          0.23414813 = weight(abstract_txt:models in 193) [ClassicSimilarity], result of:
            0.23414813 = score(doc=193,freq=5.0), product of:
              0.28876886 = queryWeight, product of:
                3.3961167 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.018319027 = queryNorm
              0.81084967 = fieldWeight in 193, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=193)
        0.24 = coord(6/25)
    
  4. Lee, C.; Lee, G.G.: Probabilistic information retrieval model for a dependence structured indexing system (2005) 0.19
    0.18568183 = sum of:
      0.18568183 = product of:
        0.7736743 = sum of:
          0.024433857 = weight(abstract_txt:documents in 1004) [ClassicSimilarity], result of:
            0.024433857 = score(doc=1004,freq=1.0), product of:
              0.07588702 = queryWeight, product of:
                1.0051492 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018319027 = queryNorm
              0.32197678 = fieldWeight in 1004, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.11919851 = weight(abstract_txt:poisson in 1004) [ClassicSimilarity], result of:
            0.11919851 = score(doc=1004,freq=1.0), product of:
              0.17325138 = queryWeight, product of:
                1.0739156 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.018319027 = queryNorm
              0.688009 = fieldWeight in 1004, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.04308479 = weight(abstract_txt:indexing in 1004) [ClassicSimilarity], result of:
            0.04308479 = score(doc=1004,freq=1.0), product of:
              0.12679026 = queryWeight, product of:
                1.5912389 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.018319027 = queryNorm
              0.3398115 = fieldWeight in 1004, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.06253481 = weight(abstract_txt:model in 1004) [ClassicSimilarity], result of:
            0.06253481 = score(doc=1004,freq=2.0), product of:
              0.14198878 = queryWeight, product of:
                1.9444144 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.018319027 = queryNorm
              0.44042078 = fieldWeight in 1004, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.3763341 = weight(abstract_txt:probabilistic in 1004) [ClassicSimilarity], result of:
            0.3763341 = score(doc=1004,freq=3.0), product of:
              0.41038495 = queryWeight, product of:
                3.305655 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.018319027 = queryNorm
              0.91702706 = fieldWeight in 1004, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.14808829 = weight(abstract_txt:models in 1004) [ClassicSimilarity], result of:
            0.14808829 = score(doc=1004,freq=2.0), product of:
              0.28876886 = queryWeight, product of:
                3.3961167 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.018319027 = queryNorm
              0.5128264 = fieldWeight in 1004, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
        0.24 = coord(6/25)
    
  5. Cho, B.-H.; Lee, C.; Lee, G.G.: Exploring term dependences in probabilistic information retrieval model (2003) 0.16
    0.16489486 = sum of:
      0.16489486 = product of:
        0.68706197 = sum of:
          0.024433857 = weight(abstract_txt:documents in 1077) [ClassicSimilarity], result of:
            0.024433857 = score(doc=1077,freq=1.0), product of:
              0.07588702 = queryWeight, product of:
                1.0051492 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018319027 = queryNorm
              0.32197678 = fieldWeight in 1077, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=1077)
          0.025530998 = weight(abstract_txt:language in 1077) [ClassicSimilarity], result of:
            0.025530998 = score(doc=1077,freq=1.0), product of:
              0.078142025 = queryWeight, product of:
                1.0199741 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018319027 = queryNorm
              0.32672557 = fieldWeight in 1077, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=1077)
          0.11919851 = weight(abstract_txt:poisson in 1077) [ClassicSimilarity], result of:
            0.11919851 = score(doc=1077,freq=1.0), product of:
              0.17325138 = queryWeight, product of:
                1.0739156 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.018319027 = queryNorm
              0.688009 = fieldWeight in 1077, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=1077)
          0.06253481 = weight(abstract_txt:model in 1077) [ClassicSimilarity], result of:
            0.06253481 = score(doc=1077,freq=2.0), product of:
              0.14198878 = queryWeight, product of:
                1.9444144 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.018319027 = queryNorm
              0.44042078 = fieldWeight in 1077, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.078125 = fieldNorm(doc=1077)
          0.3072755 = weight(abstract_txt:probabilistic in 1077) [ClassicSimilarity], result of:
            0.3072755 = score(doc=1077,freq=2.0), product of:
              0.41038495 = queryWeight, product of:
                3.305655 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.018319027 = queryNorm
              0.74874943 = fieldWeight in 1077, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.078125 = fieldNorm(doc=1077)
          0.14808829 = weight(abstract_txt:models in 1077) [ClassicSimilarity], result of:
            0.14808829 = score(doc=1077,freq=2.0), product of:
              0.28876886 = queryWeight, product of:
                3.3961167 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.018319027 = queryNorm
              0.5128264 = fieldWeight in 1077, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=1077)
        0.24 = coord(6/25)
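
The score breakdowns above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal sketch, not the search engine's own code: the function names are hypothetical, and the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are inferred from the numbers in the listing, which they reproduce. The values for queryNorm, boost, and fieldNorm are copied from the listing rather than derived. The engine appears to work in single precision, so the results here agree only to about five decimal places.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the listing:
    # idf(docFreq=1949, maxDocs=44218) is about 4.1213
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight, mirroring one "weight(...)" subtree
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm          # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * i * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

MAX_DOCS = 44218
QUERY_NORM = 0.018319027  # copied from the listing

# (term, freq, docFreq, boost) for entry 1 (doc 3221, fieldNorm 0.0625),
# copied from the listing
TERMS_3221 = [
    ("documents", 1.0, 1949, 1.0051492),
    ("language", 3.0, 1834, 1.0199741),
    ("semantic", 3.0, 1369, 1.0912474),
    ("smoothing", 1.0, 10, 1.133971),
    ("view", 1.0, 797, 1.2230601),
    ("point", 1.0, 659, 1.2693675),
    ("model", 4.0, 2231, 1.9444144),
    ("dependencies", 1.0, 46, 2.8706255),
    ("models", 4.0, 1158, 3.3961167),
]

def document_score(terms, field_norm, matched, total_query_terms):
    # Sum of the per-term scores, scaled by coord = matched / total query terms
    summed = sum(
        term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, field_norm)
        for _, freq, doc_freq, boost in terms
    )
    return summed * (matched / total_query_terms)

score_3221 = document_score(TERMS_3221, 0.0625, 9, 25)
print(round(score_3221, 4))
```

Two things follow from this arithmetic. Rare terms such as "smoothing" (docFreq=10) contribute far more per occurrence than common ones such as "documents", because idf enters the product twice. The coord factor then penalizes documents that match only a few of the 25 query terms, which is why entry 1, with 9 matches, outranks entries with larger raw sums.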