Document (#32915)

Author
Lin, J.
Title
User simulations for evaluating answers to question series
Source
Information processing and management. 43(2007) no.3, pp. 717-729
Year
2007
Abstract
Recently, question series have become one focus of research in question answering. These series are comprised of individual factoid, list, and "other" questions organized around a central topic, and represent abstractions of user-system dialogs. Existing evaluation methodologies have yet to catch up with this richer task model, as they fail to take into account contextual dependencies and different user behaviors. This paper presents a novel simulation-based methodology for evaluating answers to question series that addresses some of these shortcomings. Using this methodology, we examine two different behavior models: a "QA-styled" user and an "IR-styled" user. Results suggest that an off-the-shelf document retrieval system is competitive with state-of-the-art QA systems in this task. Advantages and limitations of evaluations based on user simulations are also discussed.
Footnote
Article in: Special issue on Heterogeneous and Distributed IR

Similar documents (content)

  1. Lin, J.; Katz, B.: Building a reusable test collection for question answering (2006) 0.26
    0.2564976 = sum of:
      0.2564976 = product of:
        0.7124933 = sum of:
          0.13356422 = weight(abstract_txt:answering in 5045) [ClassicSimilarity], result of:
            0.13356422 = score(doc=5045,freq=4.0), product of:
              0.12809983 = queryWeight, product of:
                1.0251179 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.018726353 = queryNorm
              1.0426573 = fieldWeight in 5045, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.024379367 = weight(abstract_txt:system in 5045) [ClassicSimilarity], result of:
            0.024379367 = score(doc=5045,freq=2.0), product of:
              0.06543198 = queryWeight, product of:
                1.0361189 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.018726353 = queryNorm
              0.372591 = fieldWeight in 5045, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.022137359 = weight(abstract_txt:different in 5045) [ClassicSimilarity], result of:
            0.022137359 = score(doc=5045,freq=1.0), product of:
              0.077304065 = queryWeight, product of:
                1.1262004 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.018726353 = queryNorm
              0.28636733 = fieldWeight in 5045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.09287163 = weight(abstract_txt:shortcomings in 5045) [ClassicSimilarity], result of:
            0.09287163 = score(doc=5045,freq=1.0), product of:
              0.15959913 = queryWeight, product of:
                1.1442338 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.018726353 = queryNorm
              0.5819056 = fieldWeight in 5045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.04308336 = weight(abstract_txt:methodology in 5045) [ClassicSimilarity], result of:
            0.04308336 = score(doc=5045,freq=1.0), product of:
              0.120501235 = queryWeight, product of:
                1.4060808 = boost
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.018726353 = queryNorm
              0.3575346 = fieldWeight in 5045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.02187749 = weight(abstract_txt:this in 5045) [ClassicSimilarity], result of:
            0.02187749 = score(doc=5045,freq=3.0), product of:
              0.06700179 = queryWeight, product of:
                1.4827664 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018726353 = queryNorm
              0.32652098 = fieldWeight in 5045, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.05324941 = weight(abstract_txt:task in 5045) [ClassicSimilarity], result of:
            0.05324941 = score(doc=5045,freq=1.0), product of:
              0.13878046 = queryWeight, product of:
                1.5089633 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.018726353 = queryNorm
              0.3836953 = fieldWeight in 5045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.25393283 = weight(abstract_txt:question in 5045) [ClassicSimilarity], result of:
            0.25393283 = score(doc=5045,freq=4.0), product of:
              0.31207168 = queryWeight, product of:
                3.2000496 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.018726353 = queryNorm
              0.8137003 = fieldWeight in 5045, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
          0.067397594 = weight(abstract_txt:user in 5045) [ClassicSimilarity], result of:
            0.067397594 = score(doc=5045,freq=1.0), product of:
              0.23420085 = queryWeight, product of:
                3.3952315 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018726353 = queryNorm
              0.2877769 = fieldWeight in 5045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.078125 = fieldNorm(doc=5045)
        0.36 = coord(9/25)
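
The explanation tree for doc 5045 can be recomputed from its leaf values. The following is a minimal sketch, not the search engine's own code: it assumes the structure the tree itself shows, namely queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm with tf = sqrt(termFreq), and a final multiplication by coord. The names `term_score`, `document_score` and `TERMS` are hypothetical; the numbers are copied from the listing above.

```python
import math

QUERY_NORM = 0.018726353  # queryNorm, identical for every term in the listing
FIELD_NORM = 0.078125     # fieldNorm(doc=5045)
COORD = 9 / 25            # coord(9/25): 9 of 25 query terms matched

# (term, boost, idf, termFreq) as printed in the explanation for doc 5045
TERMS = [
    ("answering",    1.0251179, 6.6730065, 4.0),
    ("system",       1.0361189, 3.3723085, 2.0),
    ("different",    1.1262004, 3.6655018, 1.0),
    ("shortcomings", 1.1442338, 7.448392,  1.0),
    ("methodology",  1.4060808, 4.5764427, 1.0),
    ("this",         1.4827664, 2.4130175, 3.0),
    ("task",         1.5089633, 4.9112997, 1.0),
    ("question",     3.2000496, 5.207682,  4.0),
    ("user",         3.3952315, 3.6835442, 1.0),
]


def term_score(boost, idf, freq, query_norm=QUERY_NORM, field_norm=FIELD_NORM):
    """Score of one term: queryWeight * fieldWeight, as in the tree."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf = sqrt(termFreq)
    return query_weight * field_weight


def document_score(terms, coord=COORD):
    """Sum of the term scores, scaled by the coordination factor."""
    return coord * sum(term_score(b, i, f) for _, b, i, f in terms)


if __name__ == "__main__":
    for term, boost, idf, freq in TERMS:
        print(f"{term:13s} {term_score(boost, idf, freq):.6f}")
    print(f"{'total':13s} {document_score(TERMS):.6f}")
```

Because the engine works in single precision and Python in double precision, the recomputed values should agree with the listed ones only up to rounding in the last digits.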
    
  2. Abacha, A.B.; Zweigenbaum, P.: MEANS: A medical question-answering system combining NLP techniques and semantic Web technologies (2015) 0.17
    0.16777772 = sum of:
      0.16777772 = product of:
        0.59920615 = sum of:
          0.09444417 = weight(abstract_txt:answering in 2677) [ClassicSimilarity], result of:
            0.09444417 = score(doc=2677,freq=2.0), product of:
              0.12809983 = queryWeight, product of:
                1.0251179 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.018726353 = queryNorm
              0.73727006 = fieldWeight in 2677, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.078125 = fieldNorm(doc=2677)
          0.029858502 = weight(abstract_txt:system in 2677) [ClassicSimilarity], result of:
            0.029858502 = score(doc=2677,freq=3.0), product of:
              0.06543198 = queryWeight, product of:
                1.0361189 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.018726353 = queryNorm
              0.45632887 = fieldWeight in 2677, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.078125 = fieldNorm(doc=2677)
          0.017862897 = weight(abstract_txt:this in 2677) [ClassicSimilarity], result of:
            0.017862897 = score(doc=2677,freq=2.0), product of:
              0.06700179 = queryWeight, product of:
                1.4827664 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018726353 = queryNorm
              0.2666033 = fieldWeight in 2677, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=2677)
          0.05324941 = weight(abstract_txt:task in 2677) [ClassicSimilarity], result of:
            0.05324941 = score(doc=2677,freq=1.0), product of:
              0.13878046 = queryWeight, product of:
                1.5089633 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.018726353 = queryNorm
              0.3836953 = fieldWeight in 2677, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.078125 = fieldNorm(doc=2677)
          0.15683594 = weight(abstract_txt:answers in 2677) [ClassicSimilarity], result of:
            0.15683594 = score(doc=2677,freq=2.0), product of:
              0.22632831 = queryWeight, product of:
                1.92701 = boost
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.018726353 = queryNorm
              0.6929577 = fieldWeight in 2677, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.078125 = fieldNorm(doc=2677)
          0.17955764 = weight(abstract_txt:question in 2677) [ClassicSimilarity], result of:
            0.17955764 = score(doc=2677,freq=2.0), product of:
              0.31207168 = queryWeight, product of:
                3.2000496 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.018726353 = queryNorm
              0.57537305 = fieldWeight in 2677, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.078125 = fieldNorm(doc=2677)
          0.067397594 = weight(abstract_txt:user in 2677) [ClassicSimilarity], result of:
            0.067397594 = score(doc=2677,freq=1.0), product of:
              0.23420085 = queryWeight, product of:
                3.3952315 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018726353 = queryNorm
              0.2877769 = fieldWeight in 2677, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.078125 = fieldNorm(doc=2677)
        0.28 = coord(7/25)
    
  3. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.16
    0.16032183 = sum of:
      0.16032183 = product of:
        0.57257795 = sum of:
          0.09444417 = weight(abstract_txt:answering in 3455) [ClassicSimilarity], result of:
            0.09444417 = score(doc=3455,freq=2.0), product of:
              0.12809983 = queryWeight, product of:
                1.0251179 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.018726353 = queryNorm
              0.73727006 = fieldWeight in 3455, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.078125 = fieldNorm(doc=3455)
          0.017238814 = weight(abstract_txt:system in 3455) [ClassicSimilarity], result of:
            0.017238814 = score(doc=3455,freq=1.0), product of:
              0.06543198 = queryWeight, product of:
                1.0361189 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.018726353 = queryNorm
              0.2634616 = fieldWeight in 3455, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.078125 = fieldNorm(doc=3455)
          0.022137359 = weight(abstract_txt:different in 3455) [ClassicSimilarity], result of:
            0.022137359 = score(doc=3455,freq=1.0), product of:
              0.077304065 = queryWeight, product of:
                1.1262004 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.018726353 = queryNorm
              0.28636733 = fieldWeight in 3455, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.078125 = fieldNorm(doc=3455)
          0.012630976 = weight(abstract_txt:this in 3455) [ClassicSimilarity], result of:
            0.012630976 = score(doc=3455,freq=1.0), product of:
              0.06700179 = queryWeight, product of:
                1.4827664 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018726353 = queryNorm
              0.18851699 = fieldWeight in 3455, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=3455)
          0.11089977 = weight(abstract_txt:answers in 3455) [ClassicSimilarity], result of:
            0.11089977 = score(doc=3455,freq=1.0), product of:
              0.22632831 = queryWeight, product of:
                1.92701 = boost
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.018726353 = queryNorm
              0.48999512 = fieldWeight in 3455, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.078125 = fieldNorm(doc=3455)
          0.21991228 = weight(abstract_txt:question in 3455) [ClassicSimilarity], result of:
            0.21991228 = score(doc=3455,freq=3.0), product of:
              0.31207168 = queryWeight, product of:
                3.2000496 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.018726353 = queryNorm
              0.70468515 = fieldWeight in 3455, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.078125 = fieldNorm(doc=3455)
          0.09531459 = weight(abstract_txt:user in 3455) [ClassicSimilarity], result of:
            0.09531459 = score(doc=3455,freq=2.0), product of:
              0.23420085 = queryWeight, product of:
                3.3952315 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018726353 = queryNorm
              0.40697798 = fieldWeight in 3455, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.078125 = fieldNorm(doc=3455)
        0.28 = coord(7/25)
    
  4. Budzik, J.; Hammond, K.: Q&A: a system for the capture, organization and reuse of expertise (1999) 0.15
    0.15301749 = sum of:
      0.15301749 = product of:
        0.546491 = sum of:
          0.06611092 = weight(abstract_txt:answering in 6668) [ClassicSimilarity], result of:
            0.06611092 = score(doc=6668,freq=2.0), product of:
              0.12809983 = queryWeight, product of:
                1.0251179 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.018726353 = queryNorm
              0.516089 = fieldWeight in 6668, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6668)
          0.017065557 = weight(abstract_txt:system in 6668) [ClassicSimilarity], result of:
            0.017065557 = score(doc=6668,freq=2.0), product of:
              0.06543198 = queryWeight, product of:
                1.0361189 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.018726353 = queryNorm
              0.26081368 = fieldWeight in 6668, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6668)
          0.015314244 = weight(abstract_txt:this in 6668) [ClassicSimilarity], result of:
            0.015314244 = score(doc=6668,freq=3.0), product of:
              0.06700179 = queryWeight, product of:
                1.4827664 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018726353 = queryNorm
              0.2285647 = fieldWeight in 6668, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6668)
          0.037274584 = weight(abstract_txt:task in 6668) [ClassicSimilarity], result of:
            0.037274584 = score(doc=6668,freq=1.0), product of:
              0.13878046 = queryWeight, product of:
                1.5089633 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.018726353 = queryNorm
              0.2685867 = fieldWeight in 6668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6668)
          0.077629834 = weight(abstract_txt:answers in 6668) [ClassicSimilarity], result of:
            0.077629834 = score(doc=6668,freq=1.0), product of:
              0.22632831 = queryWeight, product of:
                1.92701 = boost
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.018726353 = queryNorm
              0.34299657 = fieldWeight in 6668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6668)
          0.25138068 = weight(abstract_txt:question in 6668) [ClassicSimilarity], result of:
            0.25138068 = score(doc=6668,freq=8.0), product of:
              0.31207168 = queryWeight, product of:
                3.2000496 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.018726353 = queryNorm
              0.8055222 = fieldWeight in 6668, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6668)
          0.08171523 = weight(abstract_txt:user in 6668) [ClassicSimilarity], result of:
            0.08171523 = score(doc=6668,freq=3.0), product of:
              0.23420085 = queryWeight, product of:
                3.3952315 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018726353 = queryNorm
              0.34891093 = fieldWeight in 6668, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6668)
        0.28 = coord(7/25)
    
  5. Le, L.T.; Shah, C.: Retrieving people : identifying potential answerers in Community Question-Answering (2018) 0.14
    0.13812552 = sum of:
      0.13812552 = product of:
        0.575523 = sum of:
          0.05342569 = weight(abstract_txt:answering in 4467) [ClassicSimilarity], result of:
            0.05342569 = score(doc=4467,freq=1.0), product of:
              0.12809983 = queryWeight, product of:
                1.0251179 = boost
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.018726353 = queryNorm
              0.4170629 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6730065 = idf(docFreq=151, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.05390591 = weight(abstract_txt:evaluations in 4467) [ClassicSimilarity], result of:
            0.05390591 = score(doc=4467,freq=1.0), product of:
              0.12886631 = queryWeight, product of:
                1.0281802 = boost
                6.6929407 = idf(docFreq=148, maxDocs=44218)
                0.018726353 = queryNorm
              0.4183088 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6929407 = idf(docFreq=148, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.010104781 = weight(abstract_txt:this in 4467) [ClassicSimilarity], result of:
            0.010104781 = score(doc=4467,freq=1.0), product of:
              0.06700179 = queryWeight, product of:
                1.4827664 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018726353 = queryNorm
              0.1508136 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.088719815 = weight(abstract_txt:answers in 4467) [ClassicSimilarity], result of:
            0.088719815 = score(doc=4467,freq=1.0), product of:
              0.22632831 = queryWeight, product of:
                1.92701 = boost
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.018726353 = queryNorm
              0.3919961 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2719374 = idf(docFreq=226, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.24880236 = weight(abstract_txt:question in 4467) [ClassicSimilarity], result of:
            0.24880236 = score(doc=4467,freq=6.0), product of:
              0.31207168 = queryWeight, product of:
                3.2000496 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.018726353 = queryNorm
              0.7972603 = fieldWeight in 4467, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.120564476 = weight(abstract_txt:user in 4467) [ClassicSimilarity], result of:
            0.120564476 = score(doc=4467,freq=5.0), product of:
              0.23420085 = queryWeight, product of:
                3.3952315 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018726353 = queryNorm
              0.51479095 = fieldWeight in 4467, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
        0.24 = coord(6/25)
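
The idf and tf values in the trees above are not independent inputs; they appear to follow from docFreq, maxDocs and termFreq. The sketch below assumes the classic Lucene TF-IDF definitions, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(termFreq). This is an assumption checked against the numbers in this listing, not a statement about the exact engine version in use. The function names `classic_idf` and `classic_tf` are hypothetical.

```python
import math

MAX_DOCS = 44218  # maxDocs as printed in every idf(...) line


def classic_idf(doc_freq, max_docs=MAX_DOCS):
    """Assumed formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    """Assumed formula: square root of the raw term frequency."""
    return math.sqrt(freq)


# docFreq -> idf as listed for "answering", "system", "question", "shortcomings"
LISTED_IDF = {151: 6.6730065, 4123: 3.3723085, 657: 5.207682, 69: 7.448392}

# termFreq -> tf as listed in the trees
LISTED_TF = {2.0: 1.4142135, 3.0: 1.7320508, 5.0: 2.236068, 8.0: 2.828427}

if __name__ == "__main__":
    for doc_freq, listed in LISTED_IDF.items():
        print(f"docFreq={doc_freq:5d} computed={classic_idf(doc_freq):.6f} listed={listed}")
    for freq, listed in LISTED_TF.items():
        print(f"freq={freq:3.0f} computed={classic_tf(freq):.6f} listed={listed}")
```

Under these assumptions a rare term such as "shortcomings" (docFreq=69) gets a much higher idf than a common one such as "this" (docFreq=10762), which is why a single match on a rare term can outweigh several matches on a common term.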