Document (#36470)

Author
Cao, Y.
Duan, H.
Lin, C.-L.
Yu, Y.
Title
Re-ranking question search results by clustering questions
Source
Journal of the American Society for Information Science and Technology. 62(2011) no.6, pp. 1177-1187
Year
2011
Abstract
In this article, we address the problem of question clustering and study its use for re-ranking question search results. Question clustering organizes question search results into meaningful, condensed groups. Specifically, we propose modeling each question with a data structure consisting of a question topic and a question focus, and then clustering questions on the basis of that structure. Experimental results show that our approach to question clustering improves question search performance significantly more than an approach that does not use the topic-focus structure.

Similar documents (content)

  1. Spink, A.; Ozmutlu, H.C.: Characteristics of question format web queries : an exploratory study (2002) 0.32
    0.32004598 = sum of:
      0.32004598 = product of:
        1.1430213 = sum of:
          0.01284762 = weight(abstract_txt:data in 3910) [ClassicSimilarity], result of:
            0.01284762 = score(doc=3910,freq=1.0), product of:
              0.061612856 = queryWeight, product of:
                1.2960042 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014249302 = queryNorm
              0.20852174 = fieldWeight in 3910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=3910)
          0.044878725 = weight(abstract_txt:topic in 3910) [ClassicSimilarity], result of:
            0.044878725 = score(doc=3910,freq=1.0), product of:
              0.14184582 = queryWeight, product of:
                1.9664323 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.014249302 = queryNorm
              0.31639087 = fieldWeight in 3910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0625 = fieldNorm(doc=3910)
          0.060740225 = weight(abstract_txt:structure in 3910) [ClassicSimilarity], result of:
            0.060740225 = score(doc=3910,freq=2.0), product of:
              0.1576864 = queryWeight, product of:
                2.5392969 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.014249302 = queryNorm
              0.38519636 = fieldWeight in 3910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=3910)
          0.02922018 = weight(abstract_txt:results in 3910) [ClassicSimilarity], result of:
            0.02922018 = score(doc=3910,freq=1.0), product of:
              0.13425222 = queryWeight, product of:
                2.7054935 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.014249302 = queryNorm
              0.21765138 = fieldWeight in 3910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=3910)
          0.082959116 = weight(abstract_txt:search in 3910) [ClassicSimilarity], result of:
            0.082959116 = score(doc=3910,freq=6.0), product of:
              0.14813529 = queryWeight, product of:
                2.8419406 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.014249302 = queryNorm
              0.56002265 = fieldWeight in 3910, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=3910)
          0.060844734 = weight(abstract_txt:questions in 3910) [ClassicSimilarity], result of:
            0.060844734 = score(doc=3910,freq=1.0), product of:
              0.19890024 = queryWeight, product of:
                2.8518982 = boost
                4.8944926 = idf(docFreq=899, maxDocs=44218)
                0.014249302 = queryNorm
              0.3059058 = fieldWeight in 3910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8944926 = idf(docFreq=899, maxDocs=44218)
                0.0625 = fieldNorm(doc=3910)
          0.85153073 = weight(abstract_txt:question in 3910) [ClassicSimilarity], result of:
            0.85153073 = score(doc=3910,freq=15.0), product of:
              0.6755075 = queryWeight, product of:
                9.103158 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.014249302 = queryNorm
              1.2605792 = fieldWeight in 3910, product of:
                3.8729835 = tf(freq=15.0), with freq of:
                  15.0 = termFreq=15.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.0625 = fieldNorm(doc=3910)
        0.28 = coord(7/25)
    
  2. Chen, L.-C.: Next generation search engine for the result clustering technology (2012) 0.25
    0.24556147 = sum of:
      0.24556147 = product of:
        0.8770052 = sum of:
          0.034007315 = weight(abstract_txt:experimental in 105) [ClassicSimilarity], result of:
            0.034007315 = score(doc=105,freq=1.0), product of:
              0.080640726 = queryWeight, product of:
                1.0484143 = boost
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.014249302 = queryNorm
              0.4217139 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.078125 = fieldNorm(doc=105)
          0.036061868 = weight(abstract_txt:significantly in 105) [ClassicSimilarity], result of:
            0.036061868 = score(doc=105,freq=1.0), product of:
              0.08385681 = queryWeight, product of:
                1.0691162 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.014249302 = queryNorm
              0.430041 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.078125 = fieldNorm(doc=105)
          0.075761035 = weight(abstract_txt:organize in 105) [ClassicSimilarity], result of:
            0.075761035 = score(doc=105,freq=2.0), product of:
              0.109175906 = queryWeight, product of:
                1.2198857 = boost
                6.280787 = idf(docFreq=224, maxDocs=44218)
                0.014249302 = queryNorm
              0.69393545 = fieldWeight in 105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.280787 = idf(docFreq=224, maxDocs=44218)
                0.078125 = fieldNorm(doc=105)
          0.07592528 = weight(abstract_txt:structure in 105) [ClassicSimilarity], result of:
            0.07592528 = score(doc=105,freq=2.0), product of:
              0.1576864 = queryWeight, product of:
                2.5392969 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.014249302 = queryNorm
              0.48149544 = fieldWeight in 105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.078125 = fieldNorm(doc=105)
          0.073050454 = weight(abstract_txt:results in 105) [ClassicSimilarity], result of:
            0.073050454 = score(doc=105,freq=4.0), product of:
              0.13425222 = queryWeight, product of:
                2.7054935 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.014249302 = queryNorm
              0.5441285 = fieldWeight in 105, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=105)
          0.07332619 = weight(abstract_txt:search in 105) [ClassicSimilarity], result of:
            0.07332619 = score(doc=105,freq=3.0), product of:
              0.14813529 = queryWeight, product of:
                2.8419406 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.014249302 = queryNorm
              0.49499476 = fieldWeight in 105, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=105)
          0.50887305 = weight(abstract_txt:clustering in 105) [ClassicSimilarity], result of:
            0.50887305 = score(doc=105,freq=6.0), product of:
              0.42777503 = queryWeight, product of:
                4.829403 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.014249302 = queryNorm
              1.189581 = fieldWeight in 105, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.078125 = fieldNorm(doc=105)
        0.28 = coord(7/25)
    
  3. Le, L.T.; Shah, C.: Retrieving people : identifying potential answerers in Community Question-Answering (2018) 0.22
    0.2199341 = sum of:
      0.2199341 = product of:
        0.78547895 = sum of:
          0.02720585 = weight(abstract_txt:experimental in 4467) [ClassicSimilarity], result of:
            0.02720585 = score(doc=4467,freq=1.0), product of:
              0.080640726 = queryWeight, product of:
                1.0484143 = boost
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.014249302 = queryNorm
              0.3373711 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.025703557 = weight(abstract_txt:approach in 4467) [ClassicSimilarity], result of:
            0.025703557 = score(doc=4467,freq=2.0), product of:
              0.07764409 = queryWeight, product of:
                1.4548725 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.014249302 = queryNorm
              0.33104333 = fieldWeight in 4467, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.044878725 = weight(abstract_txt:topic in 4467) [ClassicSimilarity], result of:
            0.044878725 = score(doc=4467,freq=1.0), product of:
              0.14184582 = queryWeight, product of:
                1.9664323 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.014249302 = queryNorm
              0.31639087 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.02922018 = weight(abstract_txt:results in 4467) [ClassicSimilarity], result of:
            0.02922018 = score(doc=4467,freq=1.0), product of:
              0.13425222 = queryWeight, product of:
                2.7054935 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.014249302 = queryNorm
              0.21765138 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.033867918 = weight(abstract_txt:search in 4467) [ClassicSimilarity], result of:
            0.033867918 = score(doc=4467,freq=1.0), product of:
              0.14813529 = queryWeight, product of:
                2.8419406 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.014249302 = queryNorm
              0.22862828 = fieldWeight in 4467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.08604745 = weight(abstract_txt:questions in 4467) [ClassicSimilarity], result of:
            0.08604745 = score(doc=4467,freq=2.0), product of:
              0.19890024 = queryWeight, product of:
                2.8518982 = boost
                4.8944926 = idf(docFreq=899, maxDocs=44218)
                0.014249302 = queryNorm
              0.4326161 = fieldWeight in 4467, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8944926 = idf(docFreq=899, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
          0.53855526 = weight(abstract_txt:question in 4467) [ClassicSimilarity], result of:
            0.53855526 = score(doc=4467,freq=6.0), product of:
              0.6755075 = queryWeight, product of:
                9.103158 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.014249302 = queryNorm
              0.7972603 = fieldWeight in 4467, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.0625 = fieldNorm(doc=4467)
        0.28 = coord(7/25)
    
  4. Luo, Z.; Yu, Y.; Osborne, M.; Wang, T.: Structuring tweets for improving Twitter search (2015) 0.21
    0.21404703 = sum of:
      0.21404703 = product of:
        0.53511757 = sum of:
          0.02720585 = weight(abstract_txt:experimental in 2335) [ClassicSimilarity], result of:
            0.02720585 = score(doc=2335,freq=1.0), product of:
              0.080640726 = queryWeight, product of:
                1.0484143 = boost
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.014249302 = queryNorm
              0.3373711 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.028849494 = weight(abstract_txt:significantly in 2335) [ClassicSimilarity], result of:
            0.028849494 = score(doc=2335,freq=1.0), product of:
              0.08385681 = queryWeight, product of:
                1.0691162 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.014249302 = queryNorm
              0.3440328 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.037611354 = weight(abstract_txt:modeling in 2335) [ClassicSimilarity], result of:
            0.037611354 = score(doc=2335,freq=1.0), product of:
              0.100074984 = queryWeight, product of:
                1.1679345 = boost
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.014249302 = queryNorm
              0.37583172 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.052494798 = weight(abstract_txt:improves in 2335) [ClassicSimilarity], result of:
            0.052494798 = score(doc=2335,freq=1.0), product of:
              0.12498476 = queryWeight, product of:
                1.3052217 = boost
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.014249302 = queryNorm
              0.42000958 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.01817516 = weight(abstract_txt:approach in 2335) [ClassicSimilarity], result of:
            0.01817516 = score(doc=2335,freq=1.0), product of:
              0.07764409 = queryWeight, product of:
                1.4548725 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.014249302 = queryNorm
              0.234083 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.044878725 = weight(abstract_txt:topic in 2335) [ClassicSimilarity], result of:
            0.044878725 = score(doc=2335,freq=1.0), product of:
              0.14184582 = queryWeight, product of:
                1.9664323 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.014249302 = queryNorm
              0.31639087 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.042949826 = weight(abstract_txt:structure in 2335) [ClassicSimilarity], result of:
            0.042949826 = score(doc=2335,freq=1.0), product of:
              0.1576864 = queryWeight, product of:
                2.5392969 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.014249302 = queryNorm
              0.27237496 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.02922018 = weight(abstract_txt:results in 2335) [ClassicSimilarity], result of:
            0.02922018 = score(doc=2335,freq=1.0), product of:
              0.13425222 = queryWeight, product of:
                2.7054935 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.014249302 = queryNorm
              0.21765138 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.033867918 = weight(abstract_txt:search in 2335) [ClassicSimilarity], result of:
            0.033867918 = score(doc=2335,freq=1.0), product of:
              0.14813529 = queryWeight, product of:
                2.8419406 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.014249302 = queryNorm
              0.22862828 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
          0.21986426 = weight(abstract_txt:question in 2335) [ClassicSimilarity], result of:
            0.21986426 = score(doc=2335,freq=1.0), product of:
              0.6755075 = queryWeight, product of:
                9.103158 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.014249302 = queryNorm
              0.32548013 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.0625 = fieldNorm(doc=2335)
        0.4 = coord(10/25)
    
  5. Bae, K.; Ko, Y.: Improving question retrieval in community question answering service using dependency relations and question classification (2019) 0.21
    0.21092309 = sum of:
      0.21092309 = product of:
        0.75329673 = sum of:
          0.02360823 = weight(abstract_txt:propose in 5412) [ClassicSimilarity], result of:
            0.02360823 = score(doc=5412,freq=1.0), product of:
              0.07336493 = queryWeight, product of:
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.014249302 = queryNorm
              0.32179177 = fieldWeight in 5412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0625 = fieldNorm(doc=5412)
          0.02720585 = weight(abstract_txt:experimental in 5412) [ClassicSimilarity], result of:
            0.02720585 = score(doc=5412,freq=1.0), product of:
              0.080640726 = queryWeight, product of:
                1.0484143 = boost
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.014249302 = queryNorm
              0.3373711 = fieldWeight in 5412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.0625 = fieldNorm(doc=5412)
          0.028849494 = weight(abstract_txt:significantly in 5412) [ClassicSimilarity], result of:
            0.028849494 = score(doc=5412,freq=1.0), product of:
              0.08385681 = queryWeight, product of:
                1.0691162 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.014249302 = queryNorm
              0.3440328 = fieldWeight in 5412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.0625 = fieldNorm(doc=5412)
          0.01817516 = weight(abstract_txt:approach in 5412) [ClassicSimilarity], result of:
            0.01817516 = score(doc=5412,freq=1.0), product of:
              0.07764409 = queryWeight, product of:
                1.4548725 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.014249302 = queryNorm
              0.234083 = fieldWeight in 5412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=5412)
          0.05844036 = weight(abstract_txt:results in 5412) [ClassicSimilarity], result of:
            0.05844036 = score(doc=5412,freq=4.0), product of:
              0.13425222 = queryWeight, product of:
                2.7054935 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.014249302 = queryNorm
              0.43530276 = fieldWeight in 5412, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=5412)
          0.10538617 = weight(abstract_txt:questions in 5412) [ClassicSimilarity], result of:
            0.10538617 = score(doc=5412,freq=3.0), product of:
              0.19890024 = queryWeight, product of:
                2.8518982 = boost
                4.8944926 = idf(docFreq=899, maxDocs=44218)
                0.014249302 = queryNorm
              0.52984434 = fieldWeight in 5412, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.8944926 = idf(docFreq=899, maxDocs=44218)
                0.0625 = fieldNorm(doc=5412)
          0.49163145 = weight(abstract_txt:question in 5412) [ClassicSimilarity], result of:
            0.49163145 = score(doc=5412,freq=5.0), product of:
              0.6755075 = queryWeight, product of:
                9.103158 = boost
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.014249302 = queryNorm
              0.7277957 = fieldWeight in 5412, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.207682 = idf(docFreq=657, maxDocs=44218)
                0.0625 = fieldNorm(doc=5412)
        0.28 = coord(7/25)
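
The score trees above all follow the same arithmetic, and a short sketch may make it easier to check. The Python below recomputes the 0.32 score of the first similar document (doc=3910) from the inputs printed in its tree. The function names and the `DOC_3910_TERMS` table are illustrative, not part of the catalog system. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed because they reproduce the printed tf and idf values. The boost and queryNorm values are copied from the listing, not derived.

```python
import math

MAX_DOCS = 44218          # maxDocs, as printed in every idf line
QUERY_NORM = 0.014249302  # queryNorm, copied from the listing
FIELD_NORM = 0.0625       # fieldNorm(doc=3910)
QUERY_TERMS = 25          # denominator of coord(7/25)

# (term, freq in doc 3910, docFreq, boost), copied from the first tree
DOC_3910_TERMS = [
    ("data", 1, 4274, 1.2960042),
    ("topic", 1, 760, 1.9664323),
    ("structure", 2, 1538, 2.5392969),
    ("results", 1, 3693, 2.7054935),
    ("search", 6, 3098, 2.8419406),
    ("questions", 1, 899, 2.8518982),
    ("question", 15, 657, 9.103158),
]

def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed form: it matches the idf values shown in the trees.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, field_norm=FIELD_NORM):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM          # "queryWeight"
    field_weight = math.sqrt(freq) * term_idf * field_norm  # "fieldWeight"
    return query_weight * field_weight

def document_score(terms, query_terms=QUERY_TERMS):
    # Sum the matching term scores, then scale by coord = matched / total.
    total = sum(term_score(f, df, b) for _, f, df, b in terms)
    coord = len(terms) / query_terms
    return total * coord

if __name__ == "__main__":
    for name, f, df, b in DOC_3910_TERMS:
        print(f"{name:10s} {term_score(f, df, b):.8f}")
    print(f"document   {document_score(DOC_3910_TERMS):.8f}")
```

The listing prints single-precision values while Python computes in double precision, so the results should agree with the tree only up to small rounding differences. The sketch also shows why document 4 ranks below document 3 despite matching more terms (coord 10/25 against 7/25): its single occurrence of the heavily boosted term "question" contributes far less than the six occurrences in document 3.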