Document (#40964)

Author
Toraman, C.
Can, F.
Title
Discovering story chains : a framework based on zigzagged search and news actors
Source
Journal of the Association for Information Science and Technology. 68(2017) no.12, pp.2795-2808
Year
2017
Abstract
A story chain is a set of related news articles that reveal how different events are connected. This study presents a framework for discovering story chains, given an input document, in a text collection. The framework has 3 complementary parts that i) scan the collection, ii) measure the similarity between chain-member candidates and the chain, and iii) measure similarity among news articles. For scanning, we apply a novel text-mining method that uses a zigzagged search that reinvestigates past documents based on the updated chain. We also utilize social networks of news actors to reveal connections among news articles. We conduct 2 user studies in terms of 4 effectiveness measures: relevance, coverage, coherence, and ability to disclose relations. The first user study compares several versions of the framework, by varying parameters, to set a guideline for use. The second compares the framework with 3 baselines. The results show that our method provides statistically significant improvement in effectiveness in 61% of pairwise comparisons, with medium or large effect size; in the remainder, none of the baselines significantly outperforms our method.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23885/full.
Field
Communication studies

Similar documents (content)

  1. Zhao, X.; Jin, P.; Yue, L.: Discovering topic time from web news (2015) 0.19
    0.19339141 = sum of:
      0.19339141 = product of:
        0.6906836 = sum of:
          0.016798098 = weight(abstract_txt:text in 2673) [ClassicSimilarity], result of:
            0.016798098 = score(doc=2673,freq=1.0), product of:
              0.06646351 = queryWeight, product of:
                1.0130795 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016223438 = queryNorm
              0.25274166 = fieldWeight in 2673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.033664003 = weight(abstract_txt:effectiveness in 2673) [ClassicSimilarity], result of:
            0.033664003 = score(doc=2673,freq=1.0), product of:
              0.105646156 = queryWeight, product of:
                1.2772584 = boost
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.016223438 = queryNorm
              0.31864864 = fieldWeight in 2673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.04083391 = weight(abstract_txt:measure in 2673) [ClassicSimilarity], result of:
            0.04083391 = score(doc=2673,freq=1.0), product of:
              0.12015924 = queryWeight, product of:
                1.3621675 = boost
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.016223438 = queryNorm
              0.33983162 = fieldWeight in 2673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.011947612 = weight(abstract_txt:that in 2673) [ClassicSimilarity], result of:
            0.011947612 = score(doc=2673,freq=2.0), product of:
              0.05704715 = queryWeight, product of:
                1.4840171 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.016223438 = queryNorm
              0.20943399 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.14760077 = weight(abstract_txt:discovering in 2673) [ClassicSimilarity], result of:
            0.14760077 = score(doc=2673,freq=2.0), product of:
              0.22462545 = queryWeight, product of:
                1.8624362 = boost
                7.4342074 = idf(docFreq=70, maxDocs=44218)
                0.016223438 = queryNorm
              0.6570973 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4342074 = idf(docFreq=70, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.08464812 = weight(abstract_txt:framework in 2673) [ClassicSimilarity], result of:
            0.08464812 = score(doc=2673,freq=2.0), product of:
              0.21043828 = queryWeight, product of:
                2.8502588 = boost
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.016223438 = queryNorm
              0.40224677 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.35519108 = weight(abstract_txt:news in 2673) [ClassicSimilarity], result of:
            0.35519108 = score(doc=2673,freq=7.0), product of:
              0.36057746 = queryWeight, product of:
                3.7309654 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.016223438 = queryNorm
              0.9850618 = fieldWeight in 2673, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
        0.28 = coord(7/25)
    
  2. Bounhas, I.; Elayeb, B.; Evrard, F.; Slimani, Y.: Toward a computer study of the reliability of Arabic stories (2010) 0.15
    0.15444136 = sum of:
      0.15444136 = product of:
        0.7722068 = sum of:
          0.04083391 = weight(abstract_txt:measure in 3709) [ClassicSimilarity], result of:
            0.04083391 = score(doc=3709,freq=1.0), product of:
              0.12015924 = queryWeight, product of:
                1.3621675 = boost
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.016223438 = queryNorm
              0.33983162 = fieldWeight in 3709, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.0625 = fieldNorm(doc=3709)
          0.011947612 = weight(abstract_txt:that in 3709) [ClassicSimilarity], result of:
            0.011947612 = score(doc=3709,freq=2.0), product of:
              0.05704715 = queryWeight, product of:
                1.4840171 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.016223438 = queryNorm
              0.20943399 = fieldWeight in 3709, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=3709)
          0.26810953 = weight(abstract_txt:chains in 3709) [ClassicSimilarity], result of:
            0.26810953 = score(doc=3709,freq=3.0), product of:
              0.29213098 = queryWeight, product of:
                2.1239326 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.016223438 = queryNorm
              0.91777164 = fieldWeight in 3709, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.0625 = fieldNorm(doc=3709)
          0.2583091 = weight(abstract_txt:story in 3709) [ClassicSimilarity], result of:
            0.2583091 = score(doc=3709,freq=3.0), product of:
              0.3262068 = queryWeight, product of:
                2.7488058 = boost
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.016223438 = queryNorm
              0.7918569 = fieldWeight in 3709, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.0625 = fieldNorm(doc=3709)
          0.19300666 = weight(abstract_txt:chain in 3709) [ClassicSimilarity], result of:
            0.19300666 = score(doc=3709,freq=1.0), product of:
              0.42638448 = queryWeight, product of:
                3.6288383 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.016223438 = queryNorm
              0.45265874 = fieldWeight in 3709, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=3709)
        0.2 = coord(5/25)
    
  3. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.14
    0.14119066 = sum of:
      0.14119066 = product of:
        0.7059533 = sum of:
          0.011947612 = weight(abstract_txt:that in 657) [ClassicSimilarity], result of:
            0.011947612 = score(doc=657,freq=2.0), product of:
              0.05704715 = queryWeight, product of:
                1.4840171 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.016223438 = queryNorm
              0.20943399 = fieldWeight in 657, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.034743488 = weight(abstract_txt:method in 657) [ClassicSimilarity], result of:
            0.034743488 = score(doc=657,freq=1.0), product of:
              0.12350634 = queryWeight, product of:
                1.6913838 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.016223438 = queryNorm
              0.28130937 = fieldWeight in 657, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.11025068 = weight(abstract_txt:articles in 657) [ClassicSimilarity], result of:
            0.11025068 = score(doc=657,freq=7.0), product of:
              0.13942109 = queryWeight, product of:
                1.7970567 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.016223438 = queryNorm
              0.79077476 = fieldWeight in 657, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.16929623 = weight(abstract_txt:framework in 657) [ClassicSimilarity], result of:
            0.16929623 = score(doc=657,freq=8.0), product of:
              0.21043828 = queryWeight, product of:
                2.8502588 = boost
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.016223438 = queryNorm
              0.80449355 = fieldWeight in 657, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.37971526 = weight(abstract_txt:news in 657) [ClassicSimilarity], result of:
            0.37971526 = score(doc=657,freq=8.0), product of:
              0.36057746 = queryWeight, product of:
                3.7309654 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.016223438 = queryNorm
              1.0530754 = fieldWeight in 657, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
        0.2 = coord(5/25)
    
  4. Xianghao, G.; Yixin, Z.; Li, Y.: A new method of news test understanding and abstracting based on speech acts theory (1998) 0.13
    0.13306078 = sum of:
      0.13306078 = product of:
        0.8316299 = sum of:
          0.014784417 = weight(abstract_txt:that in 3532) [ClassicSimilarity], result of:
            0.014784417 = score(doc=3532,freq=1.0), product of:
              0.05704715 = queryWeight, product of:
                1.4840171 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.016223438 = queryNorm
              0.25916135 = fieldWeight in 3532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.109375 = fieldNorm(doc=3532)
          0.08598575 = weight(abstract_txt:method in 3532) [ClassicSimilarity], result of:
            0.08598575 = score(doc=3532,freq=2.0), product of:
              0.12350634 = queryWeight, product of:
                1.6913838 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.016223438 = queryNorm
              0.69620514 = fieldWeight in 3532, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.109375 = fieldNorm(doc=3532)
          0.26098597 = weight(abstract_txt:story in 3532) [ClassicSimilarity], result of:
            0.26098597 = score(doc=3532,freq=1.0), product of:
              0.3262068 = queryWeight, product of:
                2.7488058 = boost
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.016223438 = queryNorm
              0.8000629 = fieldWeight in 3532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.109375 = fieldNorm(doc=3532)
          0.4698737 = weight(abstract_txt:news in 3532) [ClassicSimilarity], result of:
            0.4698737 = score(doc=3532,freq=4.0), product of:
              0.36057746 = queryWeight, product of:
                3.7309654 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.016223438 = queryNorm
              1.3031144 = fieldWeight in 3532, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.109375 = fieldNorm(doc=3532)
        0.16 = coord(4/25)
    
  5. Lehmann, J.; Castillo, C.; Lalmas, M.; Baeza-Yates, R.: Story-focused reading in online news and its potential for user engagement (2017) 0.13
    0.13013843 = sum of:
      0.13013843 = product of:
        0.8133652 = sum of:
          0.020693872 = weight(abstract_txt:that in 3529) [ClassicSimilarity], result of:
            0.020693872 = score(doc=3529,freq=6.0), product of:
              0.05704715 = queryWeight, product of:
                1.4840171 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.016223438 = queryNorm
              0.36275032 = fieldWeight in 3529, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=3529)
          0.07217602 = weight(abstract_txt:articles in 3529) [ClassicSimilarity], result of:
            0.07217602 = score(doc=3529,freq=3.0), product of:
              0.13942109 = queryWeight, product of:
                1.7970567 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.016223438 = queryNorm
              0.5176836 = fieldWeight in 3529, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.0625 = fieldNorm(doc=3529)
          0.36530426 = weight(abstract_txt:story in 3529) [ClassicSimilarity], result of:
            0.36530426 = score(doc=3529,freq=6.0), product of:
              0.3262068 = queryWeight, product of:
                2.7488058 = boost
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.016223438 = queryNorm
              1.1198548 = fieldWeight in 3529, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.0625 = fieldNorm(doc=3529)
          0.35519108 = weight(abstract_txt:news in 3529) [ClassicSimilarity], result of:
            0.35519108 = score(doc=3529,freq=7.0), product of:
              0.36057746 = queryWeight, product of:
                3.7309654 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.016223438 = queryNorm
              0.9850618 = fieldWeight in 3529, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.0625 = fieldNorm(doc=3529)
        0.16 = coord(4/25)
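
The score trees above all follow the same arithmetic, so it may help to see one of them recomputed. The following is a minimal sketch in Python, not the catalog's own code: it assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is the form that reproduces the idf values printed in the listing, and it takes boost, queryNorm, and fieldNorm directly from entry 1 (doc 2673). The function and variable names are hypothetical.

```python
import math

# Constants copied from the explanation tree of entry 1 (doc 2673).
MAX_DOCS = 44218
QUERY_NORM = 0.016223438
FIELD_NORM = 0.0625

# (term, freq, docFreq, boost) for each matching query term in doc 2673.
TERMS_2673 = [
    ("text", 1.0, 2106, 1.0130795),
    ("effectiveness", 1.0, 733, 1.2772584),
    ("measure", 1.0, 522, 1.3621675),
    ("that", 2.0, 11241, 1.4840171),
    ("discovering", 2.0, 70, 1.8624362),
    ("framework", 2.0, 1268, 2.8502588),
    ("news", 7.0, 310, 3.7309654),
]


def idf(doc_freq, max_docs):
    """Inverse document frequency in the form assumed above."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """score = queryWeight * fieldWeight for a single term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, matched, total_clauses, field_norm):
    """Sum of the term scores, scaled by coord(matched/total_clauses)."""
    total = sum(term_score(f, df, b, field_norm) for _, f, df, b in terms)
    return total * (matched / total_clauses)


score_2673 = document_score(TERMS_2673, 7, 25, FIELD_NORM)
# Should agree with the 0.19339141 shown for entry 1, up to
# floating-point rounding (the listing uses 32-bit floats).
print(score_2673)
```

The coord factor explains why entry 1 ranks first although its raw sum (0.69) is lower than that of entries 2 to 5: it matches 7 of the 25 query terms, while the others match only 4 or 5.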