Document (#19151)

Author
Melucci, M.
Title
Passage retrieval : a probabilistic technique
Source
Information processing and management. 34(1998) no.1, pp.43-68
Year
1998
Abstract
This paper presents a probabilistic technique to retrieve passages from texts that are large or have heterogeneous semantic content. The proposed technique is independent of any supporting auxiliary data, such as text structure, topic organization, or pre-defined text segments. A Bayesian framework implements the probabilistic technique. We carried out experiments to compare the probabilistic technique to one based on a text segmentation algorithm. In particular, the probabilistic technique is at least as effective as the one based on text segmentation at retrieving small passages. Results show that passage size affects passage retrieval performance. Results also suggest that text organization and query generality may have an impact on the difference in effectiveness between the two techniques.
Theme
Full-text retrieval (Volltextretrieval)

Similar documents (author)

  1. Melucci, M.: Making digital libraries effective : automatic generation of links for similarity search across hyper-textbooks (2004) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:melucci in 2226) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 2226, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=2226)
    
  2. Melucci, M.: Contextual search : a computational framework (2012) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:melucci in 4913) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 4913, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=4913)
    
  3. Agosti, M.; Melucci, M.: Information retrieval techniques for the automatic construction of hypertext (2000) 4.65
    4.649496 = sum of:
      4.649496 = weight(author_txt:melucci in 4671) [ClassicSimilarity], result of:
        4.649496 = fieldWeight in 4671, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.5 = fieldNorm(doc=4671)
    
  4. Melucci, M.; Orio, N.: Combining melody processing and information retrieval techniques : methodology, evaluation, and system implementation (2004) 4.65
    4.649496 = sum of:
      4.649496 = weight(author_txt:melucci in 3087) [ClassicSimilarity], result of:
        4.649496 = fieldWeight in 3087, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.5 = fieldNorm(doc=3087)
    
  5. Melucci, M.; Orio, N.: Design, implementation, and evaluation of a methodology for automatic stemmer generation (2007) 4.65
    4.649496 = sum of:
      4.649496 = weight(author_txt:melucci in 268) [ClassicSimilarity], result of:
        4.649496 = fieldWeight in 268, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.5 = fieldNorm(doc=268)
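
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are hypothetical and are not part of Lucene's API. The fieldNorm values (0.625, 0.5) are taken from the listing as given rather than recomputed.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain trees above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Author term "melucci": freq=1, docFreq=10, maxDocs=44218.
single_author = field_weight(1.0, 10, 44218, 0.625)  # entries 1 and 2
two_authors = field_weight(1.0, 10, 44218, 0.5)      # entries 3 to 5
print(f"{single_author:.5f} {two_authors:.5f}")
```

The only factor that differs between the entries is fieldNorm, which is smaller for the records with two authors; that accounts for the gap between the 5.81 and 4.65 scores.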
    

Similar documents (content)

  1. Otterbacher, J.; Erkan, G.; Radev, D.R.: Biased LexRank : passage retrieval using random walks with question-based priors (2009) 0.35
    0.35455456 = sum of:
      0.35455456 = product of:
        1.2662663 = sum of:
          0.014028057 = weight(abstract_txt:based in 2450) [ClassicSimilarity], result of:
            0.014028057 = score(doc=2450,freq=1.0), product of:
              0.046937265 = queryWeight, product of:
                1.1135648 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013221898 = queryNorm
              0.29886824 = fieldWeight in 2450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.09375 = fieldNorm(doc=2450)
          0.025698263 = weight(abstract_txt:retrieval in 2450) [ClassicSimilarity], result of:
            0.025698263 = score(doc=2450,freq=2.0), product of:
              0.055775736 = queryWeight, product of:
                1.2138898 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013221898 = queryNorm
              0.4607427 = fieldWeight in 2450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.09375 = fieldNorm(doc=2450)
          0.018285902 = weight(abstract_txt:results in 2450) [ClassicSimilarity], result of:
            0.018285902 = score(doc=2450,freq=1.0), product of:
              0.05600976 = queryWeight, product of:
                1.2164338 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.013221898 = queryNorm
              0.32647708 = fieldWeight in 2450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.09375 = fieldNorm(doc=2450)
          0.09446925 = weight(abstract_txt:retrieve in 2450) [ClassicSimilarity], result of:
            0.09446925 = score(doc=2450,freq=1.0), product of:
              0.16738367 = queryWeight, product of:
                2.1028736 = boost
                6.0201335 = idf(docFreq=291, maxDocs=44218)
                0.013221898 = queryNorm
              0.5643875 = fieldWeight in 2450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0201335 = idf(docFreq=291, maxDocs=44218)
                0.09375 = fieldNorm(doc=2450)
          0.49437854 = weight(abstract_txt:passages in 2450) [ClassicSimilarity], result of:
            0.49437854 = score(doc=2450,freq=4.0), product of:
              0.317838 = queryWeight, product of:
                2.8977408 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.013221898 = queryNorm
              1.5554419 = fieldWeight in 2450, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.09375 = fieldNorm(doc=2450)
          0.10123195 = weight(abstract_txt:text in 2450) [ClassicSimilarity], result of:
            0.10123195 = score(doc=2450,freq=2.0), product of:
              0.18881413 = queryWeight, product of:
                3.5313754 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013221898 = queryNorm
              0.53614604 = fieldWeight in 2450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=2450)
          0.5181744 = weight(abstract_txt:passage in 2450) [ClassicSimilarity], result of:
            0.5181744 = score(doc=2450,freq=2.0), product of:
              0.47299567 = queryWeight, product of:
                4.329431 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.013221898 = queryNorm
              1.0955162 = fieldWeight in 2450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.09375 = fieldNorm(doc=2450)
        0.28 = coord(7/25)
    
  2. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.29
    0.29092962 = sum of:
      0.29092962 = product of:
        1.454648 = sum of:
          0.008183033 = weight(abstract_txt:based in 2765) [ClassicSimilarity], result of:
            0.008183033 = score(doc=2765,freq=1.0), product of:
              0.046937265 = queryWeight, product of:
                1.1135648 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013221898 = queryNorm
              0.1743398 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.023702307 = weight(abstract_txt:retrieval in 2765) [ClassicSimilarity], result of:
            0.023702307 = score(doc=2765,freq=5.0), product of:
              0.055775736 = queryWeight, product of:
                1.2138898 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013221898 = queryNorm
              0.4249573 = fieldWeight in 2765, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.5395236 = weight(abstract_txt:passages in 2765) [ClassicSimilarity], result of:
            0.5395236 = score(doc=2765,freq=14.0), product of:
              0.317838 = queryWeight, product of:
                2.8977408 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.013221898 = queryNorm
              1.6974797 = fieldWeight in 2765, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.0835121 = weight(abstract_txt:text in 2765) [ClassicSimilarity], result of:
            0.0835121 = score(doc=2765,freq=4.0), product of:
              0.18881413 = queryWeight, product of:
                3.5313754 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013221898 = queryNorm
              0.4422979 = fieldWeight in 2765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.799727 = weight(abstract_txt:passage in 2765) [ClassicSimilarity], result of:
            0.799727 = score(doc=2765,freq=14.0), product of:
              0.47299567 = queryWeight, product of:
                4.329431 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.013221898 = queryNorm
              1.6907703 = fieldWeight in 2765, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
        0.2 = coord(5/25)
    
  3. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.23
    0.22865348 = sum of:
      0.22865348 = product of:
        0.95272285 = sum of:
          0.013225778 = weight(abstract_txt:based in 1107) [ClassicSimilarity], result of:
            0.013225778 = score(doc=1107,freq=2.0), product of:
              0.046937265 = queryWeight, product of:
                1.1135648 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013221898 = queryNorm
              0.28177565 = fieldWeight in 1107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=1107)
          0.012114278 = weight(abstract_txt:retrieval in 1107) [ClassicSimilarity], result of:
            0.012114278 = score(doc=1107,freq=1.0), product of:
              0.055775736 = queryWeight, product of:
                1.2138898 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013221898 = queryNorm
              0.21719621 = fieldWeight in 1107, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=1107)
          0.23305227 = weight(abstract_txt:passages in 1107) [ClassicSimilarity], result of:
            0.23305227 = score(doc=1107,freq=2.0), product of:
              0.317838 = queryWeight, product of:
                2.8977408 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.013221898 = queryNorm
              0.7332423 = fieldWeight in 1107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0625 = fieldNorm(doc=1107)
          0.13497594 = weight(abstract_txt:text in 1107) [ClassicSimilarity], result of:
            0.13497594 = score(doc=1107,freq=8.0), product of:
              0.18881413 = queryWeight, product of:
                3.5313754 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013221898 = queryNorm
              0.7148614 = fieldWeight in 1107, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=1107)
          0.3454496 = weight(abstract_txt:passage in 1107) [ClassicSimilarity], result of:
            0.3454496 = score(doc=1107,freq=2.0), product of:
              0.47299567 = queryWeight, product of:
                4.329431 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.013221898 = queryNorm
              0.7303441 = fieldWeight in 1107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0625 = fieldNorm(doc=1107)
          0.21390496 = weight(abstract_txt:technique in 1107) [ClassicSimilarity], result of:
            0.21390496 = score(doc=1107,freq=2.0), product of:
              0.4329369 = queryWeight, product of:
                5.857733 = boost
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.013221898 = queryNorm
              0.49407884 = fieldWeight in 1107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.0625 = fieldNorm(doc=1107)
        0.24 = coord(6/25)
    
  4. Lioma, C.; Ounis, I.: A syntactically-based query reformulation technique for information retrieval (2008) 0.21
    0.21409963 = sum of:
      0.21409963 = product of:
        0.66906136 = sum of:
          0.011572556 = weight(abstract_txt:based in 2031) [ClassicSimilarity], result of:
            0.011572556 = score(doc=2031,freq=2.0), product of:
              0.046937265 = queryWeight, product of:
                1.1135648 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013221898 = queryNorm
              0.24655369 = fieldWeight in 2031, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
          0.029981308 = weight(abstract_txt:retrieval in 2031) [ClassicSimilarity], result of:
            0.029981308 = score(doc=2031,freq=8.0), product of:
              0.055775736 = queryWeight, product of:
                1.2138898 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013221898 = queryNorm
              0.53753316 = fieldWeight in 2031, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
          0.015085099 = weight(abstract_txt:results in 2031) [ClassicSimilarity], result of:
            0.015085099 = score(doc=2031,freq=2.0), product of:
              0.05600976 = queryWeight, product of:
                1.2164338 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.013221898 = queryNorm
              0.26932985 = fieldWeight in 2031, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
          0.028569855 = weight(abstract_txt:effective in 2031) [ClassicSimilarity], result of:
            0.028569855 = score(doc=2031,freq=1.0), product of:
              0.10802234 = queryWeight, product of:
                1.6893258 = boost
                4.8362236 = idf(docFreq=953, maxDocs=44218)
                0.013221898 = queryNorm
              0.26448098 = fieldWeight in 2031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8362236 = idf(docFreq=953, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
          0.07316515 = weight(abstract_txt:size in 2031) [ClassicSimilarity], result of:
            0.07316515 = score(doc=2031,freq=2.0), product of:
              0.160485 = queryWeight, product of:
                2.059083 = boost
                5.8947687 = idf(docFreq=330, maxDocs=44218)
                0.013221898 = queryNorm
              0.45590025 = fieldWeight in 2031, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8947687 = idf(docFreq=330, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
          0.04175605 = weight(abstract_txt:text in 2031) [ClassicSimilarity], result of:
            0.04175605 = score(doc=2031,freq=1.0), product of:
              0.18881413 = queryWeight, product of:
                3.5313754 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013221898 = queryNorm
              0.22114895 = fieldWeight in 2031, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
          0.20423742 = weight(abstract_txt:probabilistic in 2031) [ClassicSimilarity], result of:
            0.20423742 = score(doc=2031,freq=3.0), product of:
              0.31816697 = queryWeight, product of:
                3.5508294 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.013221898 = queryNorm
              0.64191896 = fieldWeight in 2031, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
          0.26469392 = weight(abstract_txt:technique in 2031) [ClassicSimilarity], result of:
            0.26469392 = score(doc=2031,freq=4.0), product of:
              0.4329369 = queryWeight, product of:
                5.857733 = boost
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.013221898 = queryNorm
              0.6113914 = fieldWeight in 2031, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2031)
        0.32 = coord(8/25)
    
  5. Yang, C.C.; Li, K.W.: A heuristic method based on a statistical approach for Chinese text segmentation (2005) 0.19
    0.18832035 = sum of:
      0.18832035 = product of:
        0.78466815 = sum of:
          0.013225778 = weight(abstract_txt:based in 4580) [ClassicSimilarity], result of:
            0.013225778 = score(doc=4580,freq=2.0), product of:
              0.046937265 = queryWeight, product of:
                1.1135648 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013221898 = queryNorm
              0.28177565 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.012114278 = weight(abstract_txt:retrieval in 4580) [ClassicSimilarity], result of:
            0.012114278 = score(doc=4580,freq=1.0), product of:
              0.055775736 = queryWeight, product of:
                1.2138898 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013221898 = queryNorm
              0.21719621 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.049047433 = weight(abstract_txt:affects in 4580) [ClassicSimilarity], result of:
            0.049047433 = score(doc=4580,freq=1.0), product of:
              0.112455614 = queryWeight, product of:
                1.2187992 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.013221898 = queryNorm
              0.43614927 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.43276858 = weight(abstract_txt:segmentation in 4580) [ClassicSimilarity], result of:
            0.43276858 = score(doc=4580,freq=9.0), product of:
              0.29085058 = queryWeight, product of:
                2.7719896 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.013221898 = queryNorm
              1.4879413 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.12625842 = weight(abstract_txt:text in 4580) [ClassicSimilarity], result of:
            0.12625842 = score(doc=4580,freq=7.0), product of:
              0.18881413 = queryWeight, product of:
                3.5313754 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013221898 = queryNorm
              0.6686916 = fieldWeight in 4580, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.15125366 = weight(abstract_txt:technique in 4580) [ClassicSimilarity], result of:
            0.15125366 = score(doc=4580,freq=1.0), product of:
              0.4329369 = queryWeight, product of:
                5.857733 = boost
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.013221898 = queryNorm
              0.34936652 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5898643 = idf(docFreq=448, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
        0.24 = coord(6/25)
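
The content scores combine the per-term factors shown in each tree. The following is a minimal sketch of that arithmetic for entry 5 (Yang and Li), assuming the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The boost, queryNorm, fieldNorm, and coord values are copied from the listing, and the function and variable names are hypothetical.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.013221898  # copied from the listing
FIELD_NORM = 0.0625       # fieldNorm(doc=4580), copied from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


# (term, freq in doc 4580, docFreq, boost), all copied from entry 5.
matched_terms = [
    ("based", 2.0, 4958, 1.1135648),
    ("retrieval", 1.0, 3720, 1.2138898),
    ("affects", 1.0, 111, 1.2187992),
    ("segmentation", 9.0, 42, 2.7719896),
    ("text", 7.0, 2106, 3.5313754),
    ("technique", 1.0, 448, 5.857733),
]

term_sum = sum(term_score(freq, df, boost) for _, freq, df, boost in matched_terms)
coord = len(matched_terms) / 25  # 6 of the 25 query terms matched
total = term_sum * coord
print(f"sum={term_sum:.6f} coord={coord:.2f} score={total:.6f}")
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "segmentation" dominate the sum, while the coord factor scales the result down for documents that match only a few of the 25 query terms.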