Document (#31117)

Author
Thelwall, M.
Prabowo, R.
Fairclough, R.
Title
Are raw RSS feeds suitable for broad issue scanning? : a science concern case study
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.12, pp.1644-1654
Year
2006
Abstract
Broad issue scanning is the task of identifying important public debates arising in a given broad issue; Really Simple Syndication (RSS) feeds are a natural information source for investigating broad issues. RSS, as originally conceived, is a method for publishing timely and concise information on the Internet, for example, about the main stories in a news site or the latest postings in a blog. RSS feeds are potentially a nonintrusive source of high-quality data about public opinion: Monitoring a large number may allow quantitative methods to extract information relevant to a given need. In this article we describe an RSS feed-based coword frequency method to identify bursts of discussion relevant to a given broad issue. A case study of public science concerns is used to demonstrate the method and assess the suitability of raw RSS feeds for broad issue scanning (i.e., without data cleansing). However, an attempt to identify genuine science concern debates from the corpus by investigating the top 1,000 "burst" words found only two genuine debates. The low success rate was mainly caused by a few pathological feeds that dominated the results and obscured any significant debates. The results point to the need to develop effective data cleansing procedures for RSS feeds, particularly if there is not a large quantity of discussion about the broad issue, and a range of potential techniques is suggested. Finally, the analysis confirmed that the time series information generated by real-time monitoring of RSS feeds could usefully illustrate the evolution of new debates relevant to a broad issue.
Object
RSS

Similar documents (author)

  1. Thelwall, M.; Thelwall, S.: A thematic analysis of highly retweeted early COVID-19 tweets : consensus, information, dissent and lockdown life (2020) 4.90
    4.897565 = sum of:
      4.897565 = weight(author_txt:thelwall in 178) [ClassicSimilarity], result of:
        4.897565 = fieldWeight in 178, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.926203 = idf(docFreq=117, maxDocs=44218)
          0.5 = fieldNorm(doc=178)
    
  2. Thelwall, M.: Extracting macroscopic information from Web links (2001) 4.33
    4.3288765 = sum of:
      4.3288765 = weight(author_txt:thelwall in 6851) [ClassicSimilarity], result of:
        4.3288765 = fieldWeight in 6851, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.926203 = idf(docFreq=117, maxDocs=44218)
          0.625 = fieldNorm(doc=6851)
    
  3. Thelwall, M.: Conceptualizing documentation on the Web : an evaluation of different heuristic-based models for counting links between university Web sites (2002) 4.33
    4.3288765 = sum of:
      4.3288765 = weight(author_txt:thelwall in 978) [ClassicSimilarity], result of:
        4.3288765 = fieldWeight in 978, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.926203 = idf(docFreq=117, maxDocs=44218)
          0.625 = fieldNorm(doc=978)
    
  4. Thelwall, M.: Text characteristics of English language university Web sites (2005) 4.33
    4.3288765 = sum of:
      4.3288765 = weight(author_txt:thelwall in 3463) [ClassicSimilarity], result of:
        4.3288765 = fieldWeight in 3463, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.926203 = idf(docFreq=117, maxDocs=44218)
          0.625 = fieldNorm(doc=3463)
    
  5. Thelwall, M.: Bibliometrics to webometrics (2009) 4.33
    4.3288765 = sum of:
      4.3288765 = weight(author_txt:thelwall in 4239) [ClassicSimilarity], result of:
        4.3288765 = fieldWeight in 4239, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.926203 = idf(docFreq=117, maxDocs=44218)
          0.625 = fieldNorm(doc=4239)
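
The author-similarity scores above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are hypothetical, and the fieldNorm values are copied from the listing rather than recomputed. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math

def classic_tf(freq):
    # ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as shown in the explain trees.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Entry 1: "thelwall" occurs twice in the author field, fieldNorm 0.5.
w_two_authors = field_weight(2.0, 117, 44218, 0.5)
# Entries 2 to 5: "thelwall" occurs once, fieldNorm 0.625.
w_one_author = field_weight(1.0, 117, 44218, 0.625)

print(round(w_two_authors, 4), round(w_one_author, 4))
```

Under these assumptions the two values agree with the listed 4.897565 and 4.3288765 to about four decimal places, which is why entry 1 outranks the single-author entries despite its lower fieldNorm.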
    

Similar documents (content)

  1. Farooq, U.; Ganoe, C.H.; Carroll, J.M.; Councill, I.G.; Giles, C.L.: Design and evaluation of awareness mechanisms in CiteSeer (2008) 0.16
    0.16022553 = sum of:
      0.16022553 = product of:
        0.8011276 = sum of:
          0.00639489 = weight(abstract_txt:information in 2051) [ClassicSimilarity], result of:
            0.00639489 = score(doc=2051,freq=2.0), product of:
              0.029884975 = queryWeight, product of:
                1.0071968 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0122561315 = queryNorm
              0.21398345 = fieldWeight in 2051, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2051)
          0.01375605 = weight(abstract_txt:science in 2051) [ClassicSimilarity], result of:
            0.01375605 = score(doc=2051,freq=1.0), product of:
              0.05700642 = queryWeight, product of:
                1.204704 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0122561315 = queryNorm
              0.24130704 = fieldWeight in 2051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0625 = fieldNorm(doc=2051)
          0.046264064 = weight(abstract_txt:investigating in 2051) [ClassicSimilarity], result of:
            0.046264064 = score(doc=2051,freq=1.0), product of:
              0.111787535 = queryWeight, product of:
                1.3774301 = boost
                6.6217136 = idf(docFreq=159, maxDocs=44218)
                0.0122561315 = queryNorm
              0.4138571 = fieldWeight in 2051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6217136 = idf(docFreq=159, maxDocs=44218)
                0.0625 = fieldNorm(doc=2051)
          0.023808302 = weight(abstract_txt:relevant in 2051) [ClassicSimilarity], result of:
            0.023808302 = score(doc=2051,freq=1.0), product of:
              0.08217636 = queryWeight, product of:
                1.4464117 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0122561315 = queryNorm
              0.28972206 = fieldWeight in 2051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0625 = fieldNorm(doc=2051)
          0.7109043 = weight(abstract_txt:feeds in 2051) [ClassicSimilarity], result of:
            0.7109043 = score(doc=2051,freq=4.0), product of:
              0.6608572 = queryWeight, product of:
                6.2655716 = boost
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.0122561315 = queryNorm
              1.0757306 = fieldWeight in 2051, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.0625 = fieldNorm(doc=2051)
        0.2 = coord(5/25)
    
  2. Thelwall, M.; Prabowo, R.: Identifying and characterizing public science-related fears from RSS feeds (2007) 0.12
    0.12456273 = sum of:
      0.12456273 = product of:
        0.62281364 = sum of:
          0.04211913 = weight(abstract_txt:science in 137) [ClassicSimilarity], result of:
            0.04211913 = score(doc=137,freq=6.0), product of:
              0.05700642 = queryWeight, product of:
                1.204704 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0122561315 = queryNorm
              0.7388489 = fieldWeight in 137, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.078125 = fieldNorm(doc=137)
          0.050652962 = weight(abstract_txt:concern in 137) [ClassicSimilarity], result of:
            0.050652962 = score(doc=137,freq=1.0), product of:
              0.10233576 = queryWeight, product of:
                1.3179125 = boost
                6.335595 = idf(docFreq=212, maxDocs=44218)
                0.0122561315 = queryNorm
              0.49496835 = fieldWeight in 137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.335595 = idf(docFreq=212, maxDocs=44218)
                0.078125 = fieldNorm(doc=137)
          0.027242463 = weight(abstract_txt:method in 137) [ClassicSimilarity], result of:
            0.027242463 = score(doc=137,freq=1.0), product of:
              0.07747332 = queryWeight, product of:
                1.4044122 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0122561315 = queryNorm
              0.3516367 = fieldWeight in 137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.078125 = fieldNorm(doc=137)
          0.058483925 = weight(abstract_txt:public in 137) [ClassicSimilarity], result of:
            0.058483925 = score(doc=137,freq=4.0), product of:
              0.08121924 = queryWeight, product of:
                1.4379638 = boost
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.0122561315 = queryNorm
              0.7200748 = fieldWeight in 137, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.078125 = fieldNorm(doc=137)
          0.44431517 = weight(abstract_txt:feeds in 137) [ClassicSimilarity], result of:
            0.44431517 = score(doc=137,freq=1.0), product of:
              0.6608572 = queryWeight, product of:
                6.2655716 = boost
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.0122561315 = queryNorm
              0.6723316 = fieldWeight in 137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.078125 = fieldNorm(doc=137)
        0.2 = coord(5/25)
    
  3. Cornelius, I.: Theorizing information for information science (2002) 0.11
    0.108438134 = sum of:
      0.108438134 = product of:
        0.33886918 = sum of:
          0.016479254 = weight(abstract_txt:information in 4244) [ClassicSimilarity], result of:
            0.016479254 = score(doc=4244,freq=34.0), product of:
              0.029884975 = queryWeight, product of:
                1.0071968 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0122561315 = queryNorm
              0.5514227 = fieldWeight in 4244, product of:
                5.8309517 = tf(freq=34.0), with freq of:
                  34.0 = termFreq=34.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
          0.009609005 = weight(abstract_txt:data in 4244) [ClassicSimilarity], result of:
            0.009609005 = score(doc=4244,freq=3.0), product of:
              0.042568315 = queryWeight, product of:
                1.0410264 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0122561315 = queryNorm
              0.2257314 = fieldWeight in 4244, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
          0.013208607 = weight(abstract_txt:source in 4244) [ClassicSimilarity], result of:
            0.013208607 = score(doc=4244,freq=1.0), product of:
              0.06630539 = queryWeight, product of:
                1.0608337 = boost
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.0122561315 = queryNorm
              0.19920865 = fieldWeight in 4244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
          0.013558496 = weight(abstract_txt:discussion in 4244) [ClassicSimilarity], result of:
            0.013558496 = score(doc=4244,freq=1.0), product of:
              0.06747121 = queryWeight, product of:
                1.0701191 = boost
                5.144379 = idf(docFreq=700, maxDocs=44218)
                0.0122561315 = queryNorm
              0.2009523 = fieldWeight in 4244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.144379 = idf(docFreq=700, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
          0.025792593 = weight(abstract_txt:science in 4244) [ClassicSimilarity], result of:
            0.025792593 = score(doc=4244,freq=9.0), product of:
              0.05700642 = queryWeight, product of:
                1.204704 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0122561315 = queryNorm
              0.4524507 = fieldWeight in 4244, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
          0.012737173 = weight(abstract_txt:about in 4244) [ClassicSimilarity], result of:
            0.012737173 = score(doc=4244,freq=2.0), product of:
              0.058800355 = queryWeight, product of:
                1.2235126 = boost
                3.9211915 = idf(docFreq=2381, maxDocs=44218)
                0.0122561315 = queryNorm
              0.21661727 = fieldWeight in 4244, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9211915 = idf(docFreq=2381, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
          0.025326481 = weight(abstract_txt:concern in 4244) [ClassicSimilarity], result of:
            0.025326481 = score(doc=4244,freq=1.0), product of:
              0.10233576 = queryWeight, product of:
                1.3179125 = boost
                6.335595 = idf(docFreq=212, maxDocs=44218)
                0.0122561315 = queryNorm
              0.24748418 = fieldWeight in 4244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.335595 = idf(docFreq=212, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
          0.22215758 = weight(abstract_txt:feeds in 4244) [ClassicSimilarity], result of:
            0.22215758 = score(doc=4244,freq=1.0), product of:
              0.6608572 = queryWeight, product of:
                6.2655716 = boost
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.0122561315 = queryNorm
              0.3361658 = fieldWeight in 4244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4244)
        0.32 = coord(8/25)
    
  4. Otterbacher, J.; Radev, D.: Exploring fact-focused relevance and novelty detection (2008) 0.09
    0.09117679 = sum of:
      0.09117679 = product of:
        0.3256314 = sum of:
          0.007832109 = weight(abstract_txt:information in 2210) [ClassicSimilarity], result of:
            0.007832109 = score(doc=2210,freq=3.0), product of:
              0.029884975 = queryWeight, product of:
                1.0071968 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0122561315 = queryNorm
              0.26207513 = fieldWeight in 2210, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.019195305 = weight(abstract_txt:identify in 2210) [ClassicSimilarity], result of:
            0.019195305 = score(doc=2210,freq=1.0), product of:
              0.062186226 = queryWeight, product of:
                1.0273536 = boost
                4.9387927 = idf(docFreq=860, maxDocs=44218)
                0.0122561315 = queryNorm
              0.30867454 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9387927 = idf(docFreq=860, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.04052237 = weight(abstract_txt:concern in 2210) [ClassicSimilarity], result of:
            0.04052237 = score(doc=2210,freq=1.0), product of:
              0.10233576 = queryWeight, product of:
                1.3179125 = boost
                6.335595 = idf(docFreq=212, maxDocs=44218)
                0.0122561315 = queryNorm
              0.3959747 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.335595 = idf(docFreq=212, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.033670023 = weight(abstract_txt:relevant in 2210) [ClassicSimilarity], result of:
            0.033670023 = score(doc=2210,freq=2.0), product of:
              0.08217636 = queryWeight, product of:
                1.4464117 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0122561315 = queryNorm
              0.40972885 = fieldWeight in 2210, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.02483294 = weight(abstract_txt:given in 2210) [ClassicSimilarity], result of:
            0.02483294 = score(doc=2210,freq=1.0), product of:
              0.08451751 = queryWeight, product of:
                1.4668708 = boost
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.0122561315 = queryNorm
              0.29382005 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.07663006 = weight(abstract_txt:issue in 2210) [ClassicSimilarity], result of:
            0.07663006 = score(doc=2210,freq=1.0), product of:
              0.23760357 = queryWeight, product of:
                3.7569325 = boost
                5.160196 = idf(docFreq=689, maxDocs=44218)
                0.0122561315 = queryNorm
              0.32251224 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.160196 = idf(docFreq=689, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.12294862 = weight(abstract_txt:broad in 2210) [ClassicSimilarity], result of:
            0.12294862 = score(doc=2210,freq=1.0), product of:
              0.3404604 = queryWeight, product of:
                4.807687 = boost
                5.777993 = idf(docFreq=371, maxDocs=44218)
                0.0122561315 = queryNorm
              0.36112458 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.777993 = idf(docFreq=371, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
        0.28 = coord(7/25)
    
  5. Nomoto, T.: Discriminative sentence compression with conditional random fields (2007) 0.09
    0.0907703 = sum of:
      0.0907703 = product of:
        0.5673144 = sum of:
          0.007993612 = weight(abstract_txt:information in 945) [ClassicSimilarity], result of:
            0.007993612 = score(doc=945,freq=2.0), product of:
              0.029884975 = queryWeight, product of:
                1.0071968 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0122561315 = queryNorm
              0.2674793 = fieldWeight in 945, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=945)
          0.01921801 = weight(abstract_txt:data in 945) [ClassicSimilarity], result of:
            0.01921801 = score(doc=945,freq=3.0), product of:
              0.042568315 = queryWeight, product of:
                1.0410264 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0122561315 = queryNorm
              0.4514628 = fieldWeight in 945, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.078125 = fieldNorm(doc=945)
          0.09578758 = weight(abstract_txt:issue in 945) [ClassicSimilarity], result of:
            0.09578758 = score(doc=945,freq=1.0), product of:
              0.23760357 = queryWeight, product of:
                3.7569325 = boost
                5.160196 = idf(docFreq=689, maxDocs=44218)
                0.0122561315 = queryNorm
              0.4031403 = fieldWeight in 945, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.160196 = idf(docFreq=689, maxDocs=44218)
                0.078125 = fieldNorm(doc=945)
          0.44431517 = weight(abstract_txt:feeds in 945) [ClassicSimilarity], result of:
            0.44431517 = score(doc=945,freq=1.0), product of:
              0.6608572 = queryWeight, product of:
                6.2655716 = boost
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.0122561315 = queryNorm
              0.6723316 = fieldWeight in 945, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.078125 = fieldNorm(doc=945)
        0.16 = coord(4/25)
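
The content-similarity scores combine a per-term queryWeight and fieldWeight, sum them, and scale by the coord factor. The sketch below recomputes the score of the first content match (doc 2051) from the numbers in its explain tree. It assumes the standard ClassicSimilarity formulas; the helper name and tuple layout are hypothetical, and the boost, queryNorm, and fieldNorm values are copied from the listing.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.0122561315  # copied from the listing
FIELD_NORM = 0.0625        # fieldNorm(doc=2051), copied from the listing

def term_score(freq, doc_freq, boost):
    # idf = 1 + ln(maxDocs / (docFreq + 1))
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1.0))
    query_weight = boost * idf * QUERY_NORM          # queryWeight
    fld_weight = math.sqrt(freq) * idf * FIELD_NORM  # fieldWeight
    return query_weight * fld_weight

# (term, termFreq, docFreq, boost) for the five matching query terms.
matched_terms = [
    ("information", 2.0, 10677, 1.0071968),
    ("science", 1.0, 2529, 1.204704),
    ("investigating", 1.0, 159, 1.3774301),
    ("relevant", 1.0, 1165, 1.4464117),
    ("feeds", 4.0, 21, 6.2655716),
]

raw_sum = sum(term_score(f, df, b) for _, f, df, b in matched_terms)
coord = 5 / 25  # 5 of the 25 query terms matched
total = raw_sum * coord

print(round(raw_sum, 4), round(total, 4))
```

In this breakdown the rare, heavily boosted term "feeds" contributes most of the raw sum, so a document can rank first on that single term even when few other query terms match.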