Document (#22857)

Author
Watters, C.
Wang, H.
Title
Rating new documents for similarity
Source
Journal of the American Society for Information Science. 51(2000) no.9, pp.793-804
Year
2000
Abstract
Electronic news has long held the promise of personalized and dynamic delivery of current event news items, particularly for Web users. Although electronic versions of print news are now widely available, the personalization of that delivery has not yet been accomplished. In this paper, we present a methodology of associating news documents based on the extraction of feature phrases, where feature phrases identify dates, locations, people and organizations. A news representation is created from these feature phrases to define news objects that can then be compared and ranked to find related news items. Unlike traditional information retrieval, we are much more interested in precision than recall. That is, the user would like to see one or more specifically related articles, rather than all somewhat related articles. The algorithm is designed to work interactively with the user, using regular web browsers as the interface.
Theme
Internet
Form
Newspapers

Similar documents (author)

  1. Watters, C.: Extending the multimedia class hierarchy for hypermedia applications (1996) 2.48
    2.479113 = sum of:
      2.479113 = product of:
        4.958226 = sum of:
          4.958226 = weight(author_txt:watters in 605) [ClassicSimilarity], result of:
            4.958226 = score(doc=605,freq=1.0), product of:
              0.88255703 = queryWeight, product of:
                1.3700223 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07166575 = queryNorm
              5.6180234 = fieldWeight in 605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.625 = fieldNorm(doc=605)
        0.5 = coord(1/2)
    
  2. Watters, C.: Information retrieval and the virtual document (1999) 2.48
    2.479113 = sum of:
      2.479113 = product of:
        4.958226 = sum of:
          4.958226 = weight(author_txt:watters in 4319) [ClassicSimilarity], result of:
            4.958226 = score(doc=4319,freq=1.0), product of:
              0.88255703 = queryWeight, product of:
                1.3700223 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07166575 = queryNorm
              5.6180234 = fieldWeight in 4319, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.625 = fieldNorm(doc=4319)
        0.5 = coord(1/2)
    
  3. Watters, C.; Shepherd, M.A.: Shifting the information paradigm from data-centered to user-centered (1994) 1.98
    1.9832904 = sum of:
      1.9832904 = product of:
        3.9665809 = sum of:
          3.9665809 = weight(author_txt:watters in 7290) [ClassicSimilarity], result of:
            3.9665809 = score(doc=7290,freq=1.0), product of:
              0.88255703 = queryWeight, product of:
                1.3700223 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07166575 = queryNorm
              4.4944186 = fieldWeight in 7290, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.5 = fieldNorm(doc=7290)
        0.5 = coord(1/2)
    
  4. Carrick, C.; Watters, C.: Automatic association of news items (1997) 1.98
    1.9832904 = sum of:
      1.9832904 = product of:
        3.9665809 = sum of:
          3.9665809 = weight(author_txt:watters in 1549) [ClassicSimilarity], result of:
            3.9665809 = score(doc=1549,freq=1.0), product of:
              0.88255703 = queryWeight, product of:
                1.3700223 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07166575 = queryNorm
              4.4944186 = fieldWeight in 1549, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.5 = fieldNorm(doc=1549)
        0.5 = coord(1/2)
    
  5. Watters, C.; Amoudi, A.: Geosearcher : location-based ranking of search engine results (2003) 1.98
    1.9832904 = sum of:
      1.9832904 = product of:
        3.9665809 = sum of:
          3.9665809 = weight(author_txt:watters in 5152) [ClassicSimilarity], result of:
            3.9665809 = score(doc=5152,freq=1.0), product of:
              0.88255703 = queryWeight, product of:
                1.3700223 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.07166575 = queryNorm
              4.4944186 = fieldWeight in 5152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.5 = fieldNorm(doc=5152)
        0.5 = coord(1/2)
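
The score trees above follow the layout of Lucene's ClassicSimilarity (TF-IDF) explain output: a queryWeight and a fieldWeight are multiplied, and the result is scaled by coord. As a minimal sketch, the listed score of the first hit can be recombined from its leaf values. The function and parameter names are illustrative assumptions, not part of Lucene's API.

```python
import math

def classic_term_score(freq, idf, boost, query_norm, field_norm, coord=1.0):
    """Recombine the leaf values of one explain tree (names are illustrative)."""
    tf = math.sqrt(freq)                      # tf(freq) is the square root of termFreq
    query_weight = boost * idf * query_norm   # "queryWeight, product of"
    field_weight = tf * idf * field_norm      # "fieldWeight in <doc>, product of"
    return query_weight * field_weight * coord

# Leaf values copied from the explain tree for doc 605 (author_txt:watters).
score_605 = classic_term_score(
    freq=1.0, idf=8.988837, boost=1.3700223,
    query_norm=0.07166575, field_norm=0.625, coord=0.5,
)
print(round(score_605, 2))  # 2.48, the score listed for hit 1
```

Hits 3 to 5 differ from hits 1 and 2 only in fieldNorm (0.5 instead of 0.625), which accounts for their lower score of 1.98.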
    

Similar documents (content)

  1. Sela, M.; Lavie, T.; Inbar, O.; Oppenheim, I.; Meyer, J.: Personalizing news content : an experimental study (2015) 0.33
    0.32931554 = sum of:
      0.32931554 = product of:
        1.176127 = sum of:
          0.009386444 = weight(abstract_txt:that in 1604) [ClassicSimilarity], result of:
            0.009386444 = score(doc=1604,freq=2.0), product of:
              0.04481815 = queryWeight, product of:
                1.0146863 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.018641034 = queryNorm
              0.20943399 = fieldWeight in 1604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=1604)
          0.15257162 = weight(abstract_txt:personalized in 1604) [ClassicSimilarity], result of:
            0.15257162 = score(doc=1604,freq=6.0), product of:
              0.13825734 = queryWeight, product of:
                1.0289359 = boost
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.018641034 = queryNorm
              1.1035336 = fieldWeight in 1604, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.0625 = fieldNorm(doc=1604)
          0.033248045 = weight(abstract_txt:user in 1604) [ClassicSimilarity], result of:
            0.033248045 = score(doc=1604,freq=4.0), product of:
              0.072208814 = queryWeight, product of:
                1.051609 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018641034 = queryNorm
              0.46044302 = fieldWeight in 1604, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.0625 = fieldNorm(doc=1604)
          0.0784624 = weight(abstract_txt:personalization in 1604) [ClassicSimilarity], result of:
            0.0784624 = score(doc=1604,freq=1.0), product of:
              0.16126144 = queryWeight, product of:
                1.1112441 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.018641034 = queryNorm
              0.48655403 = fieldWeight in 1604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.0625 = fieldNorm(doc=1604)
          0.12913488 = weight(abstract_txt:items in 1604) [ClassicSimilarity], result of:
            0.12913488 = score(doc=1604,freq=5.0), product of:
              0.1656298 = queryWeight, product of:
                1.5926797 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.018641034 = queryNorm
              0.7796596 = fieldWeight in 1604, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.0625 = fieldNorm(doc=1604)
          0.07370801 = weight(abstract_txt:delivery in 1604) [ClassicSimilarity], result of:
            0.07370801 = score(doc=1604,freq=1.0), product of:
              0.19488388 = queryWeight, product of:
                1.7276158 = boost
                6.0514402 = idf(docFreq=282, maxDocs=44218)
                0.018641034 = queryNorm
              0.37821501 = fieldWeight in 1604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0514402 = idf(docFreq=282, maxDocs=44218)
                0.0625 = fieldNorm(doc=1604)
          0.69961554 = weight(abstract_txt:news in 1604) [ClassicSimilarity], result of:
            0.69961554 = score(doc=1604,freq=11.0), product of:
              0.56656355 = queryWeight, product of:
                5.102043 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.018641034 = queryNorm
              1.2348404 = fieldWeight in 1604, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.0625 = fieldNorm(doc=1604)
        0.28 = coord(7/25)
    
  2. Watters, C.R.; Shepherd, M.A.; Burkowski, F.J.: Electronic news delivery project (1998) 0.19
    0.19288053 = sum of:
      0.19288053 = product of:
        0.9644026 = sum of:
          0.0066372184 = weight(abstract_txt:that in 444) [ClassicSimilarity], result of:
            0.0066372184 = score(doc=444,freq=1.0), product of:
              0.04481815 = queryWeight, product of:
                1.0146863 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.018641034 = queryNorm
              0.1480922 = fieldWeight in 444, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=444)
          0.0622871 = weight(abstract_txt:personalized in 444) [ClassicSimilarity], result of:
            0.0622871 = score(doc=444,freq=1.0), product of:
              0.13825734 = queryWeight, product of:
                1.0289359 = boost
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.018641034 = queryNorm
              0.4505157 = fieldWeight in 444, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.0625 = fieldNorm(doc=444)
          0.12766601 = weight(abstract_txt:delivery in 444) [ClassicSimilarity], result of:
            0.12766601 = score(doc=444,freq=3.0), product of:
              0.19488388 = queryWeight, product of:
                1.7276158 = boost
                6.0514402 = idf(docFreq=282, maxDocs=44218)
                0.018641034 = queryNorm
              0.6550876 = fieldWeight in 444, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.0514402 = idf(docFreq=282, maxDocs=44218)
                0.0625 = fieldNorm(doc=444)
          0.037087698 = weight(abstract_txt:related in 444) [ClassicSimilarity], result of:
            0.037087698 = score(doc=444,freq=1.0), product of:
              0.14112906 = queryWeight, product of:
                1.8005826 = boost
                4.2046843 = idf(docFreq=1793, maxDocs=44218)
                0.018641034 = queryNorm
              0.26279277 = fieldWeight in 444, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2046843 = idf(docFreq=1793, maxDocs=44218)
                0.0625 = fieldNorm(doc=444)
          0.7307246 = weight(abstract_txt:news in 444) [ClassicSimilarity], result of:
            0.7307246 = score(doc=444,freq=12.0), product of:
              0.56656355 = queryWeight, product of:
                5.102043 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.018641034 = queryNorm
              1.2897487 = fieldWeight in 444, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.0625 = fieldNorm(doc=444)
        0.2 = coord(5/25)
    
  3. Shapira, B.; Shoval, P.; Tractinsky, N.; Meyer, J.: ePaper : a personalized mobile newspaper (2009) 0.18
    0.18399191 = sum of:
      0.18399191 = product of:
        0.91995955 = sum of:
          0.07785888 = weight(abstract_txt:personalized in 3168) [ClassicSimilarity], result of:
            0.07785888 = score(doc=3168,freq=1.0), product of:
              0.13825734 = queryWeight, product of:
                1.0289359 = boost
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.018641034 = queryNorm
              0.5631446 = fieldWeight in 3168, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.078125 = fieldNorm(doc=3168)
          0.029387398 = weight(abstract_txt:user in 3168) [ClassicSimilarity], result of:
            0.029387398 = score(doc=3168,freq=2.0), product of:
              0.072208814 = queryWeight, product of:
                1.051609 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018641034 = queryNorm
              0.40697798 = fieldWeight in 3168, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.078125 = fieldNorm(doc=3168)
          0.098078005 = weight(abstract_txt:personalization in 3168) [ClassicSimilarity], result of:
            0.098078005 = score(doc=3168,freq=1.0), product of:
              0.16126144 = queryWeight, product of:
                1.1112441 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.018641034 = queryNorm
              0.60819256 = fieldWeight in 3168, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.078125 = fieldNorm(doc=3168)
          0.12503432 = weight(abstract_txt:items in 3168) [ClassicSimilarity], result of:
            0.12503432 = score(doc=3168,freq=3.0), product of:
              0.1656298 = queryWeight, product of:
                1.5926797 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.018641034 = queryNorm
              0.75490224 = fieldWeight in 3168, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.078125 = fieldNorm(doc=3168)
          0.5896009 = weight(abstract_txt:news in 3168) [ClassicSimilarity], result of:
            0.5896009 = score(doc=3168,freq=5.0), product of:
              0.56656355 = queryWeight, product of:
                5.102043 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.018641034 = queryNorm
              1.0406616 = fieldWeight in 3168, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.078125 = fieldNorm(doc=3168)
        0.2 = coord(5/25)
    
  4. Carrick, C.; Watters, C.: Automatic association of news items (1997) 0.18
    0.18299778 = sum of:
      0.18299778 = product of:
        0.9149889 = sum of:
          0.10107836 = weight(abstract_txt:event in 1549) [ClassicSimilarity], result of:
            0.10107836 = score(doc=1549,freq=2.0), product of:
              0.1305905 = queryWeight, product of:
                7.0055394 = idf(docFreq=108, maxDocs=44218)
                0.018641034 = queryNorm
              0.77401006 = fieldWeight in 1549, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.0055394 = idf(docFreq=108, maxDocs=44218)
                0.078125 = fieldNorm(doc=1549)
          0.014369999 = weight(abstract_txt:that in 1549) [ClassicSimilarity], result of:
            0.014369999 = score(doc=1549,freq=3.0), product of:
              0.04481815 = queryWeight, product of:
                1.0146863 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.018641034 = queryNorm
              0.320629 = fieldWeight in 1549, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=1549)
          0.14437717 = weight(abstract_txt:items in 1549) [ClassicSimilarity], result of:
            0.14437717 = score(doc=1549,freq=4.0), product of:
              0.1656298 = queryWeight, product of:
                1.5926797 = boost
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.018641034 = queryNorm
              0.871686 = fieldWeight in 1549, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.57879 = idf(docFreq=453, maxDocs=44218)
                0.078125 = fieldNorm(doc=1549)
          0.065562405 = weight(abstract_txt:related in 1549) [ClassicSimilarity], result of:
            0.065562405 = score(doc=1549,freq=2.0), product of:
              0.14112906 = queryWeight, product of:
                1.8005826 = boost
                4.2046843 = idf(docFreq=1793, maxDocs=44218)
                0.018641034 = queryNorm
              0.46455637 = fieldWeight in 1549, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2046843 = idf(docFreq=1793, maxDocs=44218)
                0.078125 = fieldNorm(doc=1549)
          0.5896009 = weight(abstract_txt:news in 1549) [ClassicSimilarity], result of:
            0.5896009 = score(doc=1549,freq=5.0), product of:
              0.56656355 = queryWeight, product of:
                5.102043 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.018641034 = queryNorm
              1.0406616 = fieldWeight in 1549, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.078125 = fieldNorm(doc=1549)
        0.2 = coord(5/25)
    
  5. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.18
    0.1754103 = sum of:
      0.1754103 = product of:
        0.8770515 = sum of:
          0.15128022 = weight(abstract_txt:event in 657) [ClassicSimilarity], result of:
            0.15128022 = score(doc=657,freq=7.0), product of:
              0.1305905 = queryWeight, product of:
                7.0055394 = idf(docFreq=108, maxDocs=44218)
                0.018641034 = queryNorm
              1.1584321 = fieldWeight in 657, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.0055394 = idf(docFreq=108, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.009386444 = weight(abstract_txt:that in 657) [ClassicSimilarity], result of:
            0.009386444 = score(doc=657,freq=2.0), product of:
              0.04481815 = queryWeight, product of:
                1.0146863 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.018641034 = queryNorm
              0.20943399 = fieldWeight in 657, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.023509918 = weight(abstract_txt:user in 657) [ClassicSimilarity], result of:
            0.023509918 = score(doc=657,freq=2.0), product of:
              0.072208814 = queryWeight, product of:
                1.051609 = boost
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.018641034 = queryNorm
              0.32558239 = fieldWeight in 657, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6835442 = idf(docFreq=3020, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.0962407 = weight(abstract_txt:articles in 657) [ClassicSimilarity], result of:
            0.0962407 = score(doc=657,freq=7.0), product of:
              0.12170431 = queryWeight, product of:
                1.36525 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.018641034 = queryNorm
              0.79077476 = fieldWeight in 657, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
          0.59663415 = weight(abstract_txt:news in 657) [ClassicSimilarity], result of:
            0.59663415 = score(doc=657,freq=8.0), product of:
              0.56656355 = queryWeight, product of:
                5.102043 = boost
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.018641034 = queryNorm
              1.0530754 = fieldWeight in 657, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.957094 = idf(docFreq=310, maxDocs=44218)
                0.0625 = fieldNorm(doc=657)
        0.2 = coord(5/25)
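
For the content-based hits, several term weights are summed and the sum is scaled by coord, the share of query terms that matched. The sketch below recomputes the score of hit 2 (doc 444) from its raw inputs. It assumes the ClassicSimilarity formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The queryNorm value is copied from the output rather than recomputed, because it depends on query terms that are not all listed here. Helper names are hypothetical.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.018641034  # copied from the explain output, not recomputed

def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_weight(freq, doc_freq, boost, field_norm):
    """queryWeight * fieldWeight for a single matching term."""
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight

# (term, freq, docFreq, boost) for doc 444; fieldNorm is 0.0625 for every term.
terms_444 = [
    ("that", 1.0, 11241, 1.0146863),
    ("personalized", 1.0, 88, 1.0289359),
    ("delivery", 3.0, 282, 1.7276158),
    ("related", 1.0, 1793, 1.8005826),
    ("news", 12.0, 310, 5.102043),
]

coord = len(terms_444) / 25  # 5 of the 25 query terms matched
score_444 = coord * sum(term_weight(f, df, b, 0.0625) for _, f, df, b in terms_444)
print(round(score_444, 2))  # 0.19, the score listed for hit 2
```

The term "news" dominates the sum because it combines the largest boost with the highest term frequency.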