Document (#34061)

Author
Santini, M.
Title
Zero, single, or multi? : genre of web pages through the users' perspective
Source
Information processing and management. 44(2008) no.2, pp.702-737
Year
2008
Abstract
The goal of the study presented in this article is to investigate to what extent the classification of a web page by a single genre matches the users' perspective. The extent of agreement on a single genre label for a web page can help understand whether there is a need for a different classification scheme that overrides the single-genre labelling. My hypothesis is that a single genre label does not account for the users' perspective. In order to test this hypothesis, I submitted a restricted number of web pages (25 web pages) to a large number of web users (135 subjects) asking them to assign only a single genre label to each of the web pages. Users could choose from a list of 21 genre labels, or select one of the two 'escape' options, i.e. 'Add a label' and 'I don't know'. The rationale was to observe the level of agreement on a single genre label per web page, and draw some conclusions about the appropriateness of limiting the assignment to only a single label when doing genre classification of web pages. Results show that users largely disagree on the label to be assigned to a web page.
Theme
Social tagging

Similar documents (content)

  1. Rosso, M.A.: User-based identification of Web genres (2008) 0.20
    0.1957922 = sum of:
      0.1957922 = product of:
        0.978961 = sum of:
          0.021140741 = weight(abstract_txt:classification in 3864) [ClassicSimilarity], result of:
            0.021140741 = score(doc=3864,freq=2.0), product of:
              0.06837296 = queryWeight, product of:
                1.6389042 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.010435133 = queryNorm
              0.3091974 = fieldWeight in 3864, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3864)
          0.03689558 = weight(abstract_txt:users in 3864) [ClassicSimilarity], result of:
            0.03689558 = score(doc=3864,freq=3.0), product of:
              0.10908542 = queryWeight, product of:
                2.9275866 = boost
                3.570746 = idf(docFreq=3307, maxDocs=43254)
                0.010435133 = queryNorm
              0.3382265 = fieldWeight in 3864, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.570746 = idf(docFreq=3307, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3864)
          0.09502917 = weight(abstract_txt:page in 3864) [ClassicSimilarity], result of:
            0.09502917 = score(doc=3864,freq=2.0), product of:
              0.2049691 = queryWeight, product of:
                3.2766116 = boost
                5.9946723 = idf(docFreq=292, maxDocs=43254)
                0.010435133 = queryNorm
              0.4636268 = fieldWeight in 3864, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9946723 = idf(docFreq=292, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3864)
          0.1526934 = weight(abstract_txt:pages in 3864) [ClassicSimilarity], result of:
            0.1526934 = score(doc=3864,freq=5.0), product of:
              0.22317933 = queryWeight, product of:
                3.8226342 = boost
                5.5949116 = idf(docFreq=436, maxDocs=43254)
                0.010435133 = queryNorm
              0.6841736 = fieldWeight in 3864, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.5949116 = idf(docFreq=436, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3864)
          0.6732021 = weight(abstract_txt:genre in 3864) [ClassicSimilarity], result of:
            0.6732021 = score(doc=3864,freq=10.0), product of:
              0.5793641 = queryWeight, product of:
                8.263191 = boost
                6.719018 = idf(docFreq=141, maxDocs=43254)
                0.010435133 = queryNorm
              1.1619672 = fieldWeight in 3864, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                6.719018 = idf(docFreq=141, maxDocs=43254)
                0.0546875 = fieldNorm(doc=3864)
        0.2 = coord(5/25)
    
  2. Tsai, R.T.-H.; Chiu, B.; Wu, C.-E.: Visual webpage block importance prediction using conditional random fields (2011) 0.14
    0.14448121 = sum of:
      0.14448121 = product of:
        0.60200506 = sum of:
          0.050194718 = weight(abstract_txt:labels in 1389) [ClassicSimilarity], result of:
            0.050194718 = score(doc=1389,freq=2.0), product of:
              0.07718647 = queryWeight, product of:
                1.0053594 = boost
                7.357357 = idf(docFreq=74, maxDocs=43254)
                0.010435133 = queryNorm
              0.6503046 = fieldWeight in 1389, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.357357 = idf(docFreq=74, maxDocs=43254)
                0.0625 = fieldNorm(doc=1389)
          0.019294664 = weight(abstract_txt:only in 1389) [ClassicSimilarity], result of:
            0.019294664 = score(doc=1389,freq=2.0), product of:
              0.051412728 = queryWeight, product of:
                1.1603823 = boost
                4.245918 = idf(docFreq=1683, maxDocs=43254)
                0.010435133 = queryNorm
              0.37528965 = fieldWeight in 1389, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.245918 = idf(docFreq=1683, maxDocs=43254)
                0.0625 = fieldNorm(doc=1389)
          0.07679516 = weight(abstract_txt:page in 1389) [ClassicSimilarity], result of:
            0.07679516 = score(doc=1389,freq=1.0), product of:
              0.2049691 = queryWeight, product of:
                3.2766116 = boost
                5.9946723 = idf(docFreq=292, maxDocs=43254)
                0.010435133 = queryNorm
              0.37466702 = fieldWeight in 1389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9946723 = idf(docFreq=292, maxDocs=43254)
                0.0625 = fieldNorm(doc=1389)
          0.11036775 = weight(abstract_txt:pages in 1389) [ClassicSimilarity], result of:
            0.11036775 = score(doc=1389,freq=2.0), product of:
              0.22317933 = queryWeight, product of:
                3.8226342 = boost
                5.5949116 = idf(docFreq=436, maxDocs=43254)
                0.010435133 = queryNorm
              0.494525 = fieldWeight in 1389, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5949116 = idf(docFreq=436, maxDocs=43254)
                0.0625 = fieldNorm(doc=1389)
          0.10212843 = weight(abstract_txt:single in 1389) [ClassicSimilarity], result of:
            0.10212843 = score(doc=1389,freq=1.0), product of:
              0.31230116 = queryWeight, product of:
                5.7198224 = boost
                5.232305 = idf(docFreq=627, maxDocs=43254)
                0.010435133 = queryNorm
              0.32701907 = fieldWeight in 1389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.232305 = idf(docFreq=627, maxDocs=43254)
                0.0625 = fieldNorm(doc=1389)
          0.24322434 = weight(abstract_txt:label in 1389) [ClassicSimilarity], result of:
            0.24322434 = score(doc=1389,freq=1.0), product of:
              0.5327006 = queryWeight, product of:
                6.9878144 = boost
                7.305397 = idf(docFreq=78, maxDocs=43254)
                0.010435133 = queryNorm
              0.4565873 = fieldWeight in 1389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.305397 = idf(docFreq=78, maxDocs=43254)
                0.0625 = fieldNorm(doc=1389)
        0.24 = coord(6/25)
    
  3. Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.10
    0.104887225 = sum of:
      0.104887225 = product of:
        0.5244361 = sum of:
          0.031056399 = weight(abstract_txt:labels in 96) [ClassicSimilarity], result of:
            0.031056399 = score(doc=96,freq=1.0), product of:
              0.07718647 = queryWeight, product of:
                1.0053594 = boost
                7.357357 = idf(docFreq=74, maxDocs=43254)
                0.010435133 = queryNorm
              0.40235546 = fieldWeight in 96, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.357357 = idf(docFreq=74, maxDocs=43254)
                0.0546875 = fieldNorm(doc=96)
          0.010952878 = weight(abstract_txt:number in 96) [ClassicSimilarity], result of:
            0.010952878 = score(doc=96,freq=1.0), product of:
              0.048544046 = queryWeight, product of:
                1.1275446 = boost
                4.1257625 = idf(docFreq=1898, maxDocs=43254)
                0.010435133 = queryNorm
              0.22562763 = fieldWeight in 96, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1257625 = idf(docFreq=1898, maxDocs=43254)
                0.0546875 = fieldNorm(doc=96)
          0.011937965 = weight(abstract_txt:only in 96) [ClassicSimilarity], result of:
            0.011937965 = score(doc=96,freq=1.0), product of:
              0.051412728 = queryWeight, product of:
                1.1603823 = boost
                4.245918 = idf(docFreq=1683, maxDocs=43254)
                0.010435133 = queryNorm
              0.23219863 = fieldWeight in 96, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.245918 = idf(docFreq=1683, maxDocs=43254)
                0.0546875 = fieldNorm(doc=96)
          0.04484629 = weight(abstract_txt:classification in 96) [ClassicSimilarity], result of:
            0.04484629 = score(doc=96,freq=9.0), product of:
              0.06837296 = queryWeight, product of:
                1.6389042 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.010435133 = queryNorm
              0.6559068 = fieldWeight in 96, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0546875 = fieldNorm(doc=96)
          0.42564258 = weight(abstract_txt:label in 96) [ClassicSimilarity], result of:
            0.42564258 = score(doc=96,freq=4.0), product of:
              0.5327006 = queryWeight, product of:
                6.9878144 = boost
                7.305397 = idf(docFreq=78, maxDocs=43254)
                0.010435133 = queryNorm
              0.7990278 = fieldWeight in 96, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.305397 = idf(docFreq=78, maxDocs=43254)
                0.0546875 = fieldNorm(doc=96)
        0.2 = coord(5/25)
    
  4. Hajibayova, L.; Jacob, E.K.: User-generated genre tags through the lens of genre theories (2014) 0.10
    0.102572806 = sum of:
      0.102572806 = product of:
        0.8547734 = sum of:
          0.025821526 = weight(abstract_txt:users in 2915) [ClassicSimilarity], result of:
            0.025821526 = score(doc=2915,freq=2.0), product of:
              0.10908542 = queryWeight, product of:
                2.9275866 = boost
                3.570746 = idf(docFreq=3307, maxDocs=43254)
                0.010435133 = queryNorm
              0.23670924 = fieldWeight in 2915, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.570746 = idf(docFreq=3307, maxDocs=43254)
                0.046875 = fieldNorm(doc=2915)
          0.07659632 = weight(abstract_txt:single in 2915) [ClassicSimilarity], result of:
            0.07659632 = score(doc=2915,freq=1.0), product of:
              0.31230116 = queryWeight, product of:
                5.7198224 = boost
                5.232305 = idf(docFreq=627, maxDocs=43254)
                0.010435133 = queryNorm
              0.24526429 = fieldWeight in 2915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.232305 = idf(docFreq=627, maxDocs=43254)
                0.046875 = fieldNorm(doc=2915)
          0.7523556 = weight(abstract_txt:genre in 2915) [ClassicSimilarity], result of:
            0.7523556 = score(doc=2915,freq=17.0), product of:
              0.5793641 = queryWeight, product of:
                8.263191 = boost
                6.719018 = idf(docFreq=141, maxDocs=43254)
                0.010435133 = queryNorm
              1.2985885 = fieldWeight in 2915, product of:
                4.1231055 = tf(freq=17.0), with freq of:
                  17.0 = termFreq=17.0
                6.719018 = idf(docFreq=141, maxDocs=43254)
                0.046875 = fieldNorm(doc=2915)
        0.12 = coord(3/25)
    
  5. Ringlstetter, C.; Stubbe, A.: Practical aspects of automatic genre classification (2008) 0.10
    0.10108522 = sum of:
      0.10108522 = product of:
        0.63178265 = sum of:
          0.019294664 = weight(abstract_txt:only in 3955) [ClassicSimilarity], result of:
            0.019294664 = score(doc=3955,freq=2.0), product of:
              0.051412728 = queryWeight, product of:
                1.1603823 = boost
                4.245918 = idf(docFreq=1683, maxDocs=43254)
                0.010435133 = queryNorm
              0.37528965 = fieldWeight in 3955, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.245918 = idf(docFreq=1683, maxDocs=43254)
                0.0625 = fieldNorm(doc=3955)
          0.029590875 = weight(abstract_txt:classification in 3955) [ClassicSimilarity], result of:
            0.029590875 = score(doc=3955,freq=3.0), product of:
              0.06837296 = queryWeight, product of:
                1.6389042 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.010435133 = queryNorm
              0.43278623 = fieldWeight in 3955, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0625 = fieldNorm(doc=3955)
          0.038867667 = weight(abstract_txt:perspective in 3955) [ClassicSimilarity], result of:
            0.038867667 = score(doc=3955,freq=1.0), product of:
              0.11827108 = queryWeight, product of:
                2.155513 = boost
                5.258113 = idf(docFreq=611, maxDocs=43254)
                0.010435133 = queryNorm
              0.32863206 = fieldWeight in 3955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.258113 = idf(docFreq=611, maxDocs=43254)
                0.0625 = fieldNorm(doc=3955)
          0.5440295 = weight(abstract_txt:genre in 3955) [ClassicSimilarity], result of:
            0.5440295 = score(doc=3955,freq=5.0), product of:
              0.5793641 = queryWeight, product of:
                8.263191 = boost
                6.719018 = idf(docFreq=141, maxDocs=43254)
                0.010435133 = queryNorm
              0.93901134 = fieldWeight in 3955, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.719018 = idf(docFreq=141, maxDocs=43254)
                0.0625 = fieldNorm(doc=3955)
        0.16 = coord(4/25)
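
How to read the score breakdowns above: each per-term score is queryWeight (boost * idf * queryNorm) multiplied by fieldWeight (tf * idf * fieldNorm), with tf = sqrt(termFreq). The term scores are summed and scaled by coord (matched query terms / total query terms). The idf values listed are consistent with 1 + ln(maxDocs / (docFreq + 1)). The following is a minimal sketch that recomputes the score of the first hit (doc 3864) from the numbers printed in its breakdown. The function and variable names are illustrative assumptions, not part of the retrieval system, and the inputs are copied from the listing rather than read from an index.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency, as implied by the listed values:
    # 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw term count
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Constants copied from the breakdown for doc 3864
MAX_DOCS = 43254
QUERY_NORM = 0.010435133
FIELD_NORM = 0.0546875

# (term, termFreq, docFreq, boost) copied from the breakdown for doc 3864
TERMS = [
    ("classification", 2.0, 2157, 1.6389042),
    ("users", 3.0, 3307, 2.9275866),
    ("page", 2.0, 292, 3.2766116),
    ("pages", 5.0, 436, 3.8226342),
    ("genre", 10.0, 141, 8.263191),
]

term_scores = {
    term: term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for term, freq, doc_freq, boost in TERMS
}

# coord(5/25): 5 of the 25 query terms matched this document
coord = 5 / 25
total = sum(term_scores.values()) * coord

for term, value in term_scores.items():
    print(f"{term}: {value:.6f}")
print(f"total: {total:.6f}")
```

The printed total should agree with the listed 0.1957922 up to small rounding differences, since the listing was presumably computed in single precision while this sketch uses Python double-precision floats. The same function applies to the other four hits once their termFreq, fieldNorm, and coord values are substituted.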