Document (#34061)

Author
Santini, M.
Title
Zero, single, or multi? : genre of web pages through the users' perspective
Source
Information processing and management. 44(2008) no.2, pp.702-737
Year
2008
Abstract
The goal of the study presented in this article is to investigate to what extent the classification of a web page by a single genre matches the users' perspective. The extent of agreement on a single genre label for a web page can help understand whether there is a need for a different classification scheme that overrides the single-genre labelling. My hypothesis is that a single genre label does not account for the users' perspective. In order to test this hypothesis, I submitted a restricted number of web pages (25 web pages) to a large number of web users (135 subjects) asking them to assign only a single genre label to each of the web pages. Users could choose from a list of 21 genre labels, or select one of the two 'escape' options, i.e. 'Add a label' and 'I don't know'. The rationale was to observe the level of agreement on a single genre label per web page, and draw some conclusions about the appropriateness of limiting the assignment to only a single label when doing genre classification of web pages. Results show that users largely disagree on the label to be assigned to a web page.
Theme
Social tagging

Similar documents (content)

  1. Rosso, M.A.: User-based identification of Web genres (2008) 0.19
    0.19489239 = sum of:
      0.19489239 = product of:
        0.9744619 = sum of:
          0.021134192 = weight(abstract_txt:classification in 3864) [ClassicSimilarity], result of:
            0.021134192 = score(doc=3864,freq=2.0), product of:
              0.06831683 = queryWeight, product of:
                1.63953 = boost
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.01041726 = queryNorm
              0.30935556 = fieldWeight in 3864, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0546875 = fieldNorm(doc=3864)
          0.036919538 = weight(abstract_txt:users in 3864) [ClassicSimilarity], result of:
            0.036919538 = score(doc=3864,freq=3.0), product of:
              0.10906558 = queryWeight, product of:
                2.9296403 = boost
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.01041726 = queryNorm
              0.3385077 = fieldWeight in 3864, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.0546875 = fieldNorm(doc=3864)
          0.094287775 = weight(abstract_txt:page in 3864) [ClassicSimilarity], result of:
            0.094287775 = score(doc=3864,freq=2.0), product of:
              0.20377634 = queryWeight, product of:
                3.2696536 = boost
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.01041726 = queryNorm
              0.46270224 = fieldWeight in 3864, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.0546875 = fieldNorm(doc=3864)
          0.15162377 = weight(abstract_txt:pages in 3864) [ClassicSimilarity], result of:
            0.15162377 = score(doc=3864,freq=5.0), product of:
              0.22199936 = queryWeight, product of:
                3.8155375 = boost
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.01041726 = queryNorm
              0.6829919 = fieldWeight in 3864, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.0546875 = fieldNorm(doc=3864)
          0.67049664 = weight(abstract_txt:genre in 3864) [ClassicSimilarity], result of:
            0.67049664 = score(doc=3864,freq=10.0), product of:
              0.5774558 = queryWeight, product of:
                8.25611 = boost
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.01041726 = queryNorm
              1.161122 = fieldWeight in 3864, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.0546875 = fieldNorm(doc=3864)
        0.2 = coord(5/25)
    
  2. Tsai, R.T.-H.; Chiu, B.; Wu, C.-E.: Visual webpage block importance prediction using conditional random fields (2011) 0.15
    0.14503759 = sum of:
      0.14503759 = product of:
        0.6043233 = sum of:
          0.05127657 = weight(abstract_txt:labels in 1925) [ClassicSimilarity], result of:
            0.05127657 = score(doc=1925,freq=2.0), product of:
              0.07824349 = queryWeight, product of:
                1.0130222 = boost
                7.4143953 = idf(docFreq=69, maxDocs=42740)
                0.01041726 = queryNorm
              0.65534616 = fieldWeight in 1925, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4143953 = idf(docFreq=69, maxDocs=42740)
                0.0625 = fieldNorm(doc=1925)
          0.019432366 = weight(abstract_txt:only in 1925) [ClassicSimilarity], result of:
            0.019432366 = score(doc=1925,freq=2.0), product of:
              0.05162531 = queryWeight, product of:
                1.1637005 = boost
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.01041726 = queryNorm
              0.3764116 = fieldWeight in 1925, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.0625 = fieldNorm(doc=1925)
          0.07619602 = weight(abstract_txt:page in 1925) [ClassicSimilarity], result of:
            0.07619602 = score(doc=1925,freq=1.0), product of:
              0.20377634 = queryWeight, product of:
                3.2696536 = boost
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.01041726 = queryNorm
              0.37391987 = fieldWeight in 1925, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.982718 = idf(docFreq=292, maxDocs=42740)
                0.0625 = fieldNorm(doc=1925)
          0.10959462 = weight(abstract_txt:pages in 1925) [ClassicSimilarity], result of:
            0.10959462 = score(doc=1925,freq=2.0), product of:
              0.22199936 = queryWeight, product of:
                3.8155375 = boost
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.01041726 = queryNorm
              0.49367088 = fieldWeight in 1925, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5852485 = idf(docFreq=435, maxDocs=42740)
                0.0625 = fieldNorm(doc=1925)
          0.10236982 = weight(abstract_txt:single in 1925) [ClassicSimilarity], result of:
            0.10236982 = score(doc=1925,freq=1.0), product of:
              0.31260088 = queryWeight, product of:
                5.7271023 = boost
                5.2396436 = idf(docFreq=615, maxDocs=42740)
                0.01041726 = queryNorm
              0.32747772 = fieldWeight in 1925, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2396436 = idf(docFreq=615, maxDocs=42740)
                0.0625 = fieldNorm(doc=1925)
          0.24545395 = weight(abstract_txt:label in 1925) [ClassicSimilarity], result of:
            0.24545395 = score(doc=1925,freq=1.0), product of:
              0.5356218 = queryWeight, product of:
                7.012502 = boost
                7.332157 = idf(docFreq=75, maxDocs=42740)
                0.01041726 = queryNorm
              0.45825982 = fieldWeight in 1925, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.332157 = idf(docFreq=75, maxDocs=42740)
                0.0625 = fieldNorm(doc=1925)
        0.24 = coord(6/25)
    
  3. Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.11
    0.10581205 = sum of:
      0.10581205 = product of:
        0.52906024 = sum of:
          0.031725757 = weight(abstract_txt:labels in 96) [ClassicSimilarity], result of:
            0.031725757 = score(doc=96,freq=1.0), product of:
              0.07824349 = queryWeight, product of:
                1.0130222 = boost
                7.4143953 = idf(docFreq=69, maxDocs=42740)
                0.01041726 = queryNorm
              0.40547475 = fieldWeight in 96, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4143953 = idf(docFreq=69, maxDocs=42740)
                0.0546875 = fieldNorm(doc=96)
          0.0109345345 = weight(abstract_txt:number in 96) [ClassicSimilarity], result of:
            0.0109345345 = score(doc=96,freq=1.0), product of:
              0.048460037 = queryWeight, product of:
                1.1274614 = boost
                4.1259933 = idf(docFreq=1875, maxDocs=42740)
                0.01041726 = queryNorm
              0.22564025 = fieldWeight in 96, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1259933 = idf(docFreq=1875, maxDocs=42740)
                0.0546875 = fieldNorm(doc=96)
          0.012023163 = weight(abstract_txt:only in 96) [ClassicSimilarity], result of:
            0.012023163 = score(doc=96,freq=1.0), product of:
              0.05162531 = queryWeight, product of:
                1.1637005 = boost
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.01041726 = queryNorm
              0.2328928 = fieldWeight in 96, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.0546875 = fieldNorm(doc=96)
          0.044832394 = weight(abstract_txt:classification in 96) [ClassicSimilarity], result of:
            0.044832394 = score(doc=96,freq=9.0), product of:
              0.06831683 = queryWeight, product of:
                1.63953 = boost
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.01041726 = queryNorm
              0.65624225 = fieldWeight in 96, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0546875 = fieldNorm(doc=96)
          0.42954442 = weight(abstract_txt:label in 96) [ClassicSimilarity], result of:
            0.42954442 = score(doc=96,freq=4.0), product of:
              0.5356218 = queryWeight, product of:
                7.012502 = boost
                7.332157 = idf(docFreq=75, maxDocs=42740)
                0.01041726 = queryNorm
              0.8019547 = fieldWeight in 96, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.332157 = idf(docFreq=75, maxDocs=42740)
                0.0546875 = fieldNorm(doc=96)
        0.2 = coord(5/25)
    
  4. Hajibayova, L.; Jacob, E.K.: User-generated genre tags through the lens of genre theories (2014) 0.10
    0.10223371 = sum of:
      0.10223371 = product of:
        0.8519476 = sum of:
          0.025838295 = weight(abstract_txt:users in 3451) [ClassicSimilarity], result of:
            0.025838295 = score(doc=3451,freq=2.0), product of:
              0.10906558 = queryWeight, product of:
                2.9296403 = boost
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.01041726 = queryNorm
              0.23690605 = fieldWeight in 3451, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.046875 = fieldNorm(doc=3451)
          0.07677737 = weight(abstract_txt:single in 3451) [ClassicSimilarity], result of:
            0.07677737 = score(doc=3451,freq=1.0), product of:
              0.31260088 = queryWeight, product of:
                5.7271023 = boost
                5.2396436 = idf(docFreq=615, maxDocs=42740)
                0.01041726 = queryNorm
              0.2456083 = fieldWeight in 3451, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2396436 = idf(docFreq=615, maxDocs=42740)
                0.046875 = fieldNorm(doc=3451)
          0.74933195 = weight(abstract_txt:genre in 3451) [ClassicSimilarity], result of:
            0.74933195 = score(doc=3451,freq=17.0), product of:
              0.5774558 = queryWeight, product of:
                8.25611 = boost
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.01041726 = queryNorm
              1.2976438 = fieldWeight in 3451, product of:
                4.1231055 = tf(freq=17.0), with freq of:
                  17.0 = termFreq=17.0
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.046875 = fieldNorm(doc=3451)
        0.12 = coord(3/25)
    
  5. Ringlstetter, C.; Stubbe, A.: Practical aspects of automatic genre classification (2008) 0.10
    0.10080812 = sum of:
      0.10080812 = product of:
        0.6300508 = sum of:
          0.019432366 = weight(abstract_txt:only in 3955) [ClassicSimilarity], result of:
            0.019432366 = score(doc=3955,freq=2.0), product of:
              0.05162531 = queryWeight, product of:
                1.1637005 = boost
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.01041726 = queryNorm
              0.3764116 = fieldWeight in 3955, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.258611 = idf(docFreq=1642, maxDocs=42740)
                0.0625 = fieldNorm(doc=3955)
          0.029581707 = weight(abstract_txt:classification in 3955) [ClassicSimilarity], result of:
            0.029581707 = score(doc=3955,freq=3.0), product of:
              0.06831683 = queryWeight, product of:
                1.63953 = boost
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.01041726 = queryNorm
              0.4330076 = fieldWeight in 3955, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0625 = fieldNorm(doc=3955)
          0.039193593 = weight(abstract_txt:perspective in 3955) [ClassicSimilarity], result of:
            0.039193593 = score(doc=3955,freq=1.0), product of:
              0.11885826 = queryWeight, product of:
                2.16257 = boost
                5.276011 = idf(docFreq=593, maxDocs=42740)
                0.01041726 = queryNorm
              0.3297507 = fieldWeight in 3955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.276011 = idf(docFreq=593, maxDocs=42740)
                0.0625 = fieldNorm(doc=3955)
          0.5418431 = weight(abstract_txt:genre in 3955) [ClassicSimilarity], result of:
            0.5418431 = score(doc=3955,freq=5.0), product of:
              0.5774558 = queryWeight, product of:
                8.25611 = boost
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.01041726 = queryNorm
              0.93832827 = fieldWeight in 3955, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.7141304 = idf(docFreq=140, maxDocs=42740)
                0.0625 = fieldNorm(doc=3955)
        0.16 = coord(4/25)
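
The score trees above all follow the same pattern: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is multiplied by coord (matching terms / query terms). As a minimal sketch, the Python below recomputes the score of entry 1 (doc 3864) from the freq, docFreq, boost, queryNorm and fieldNorm values printed in its tree. It assumes the usual ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names are illustrative and not part of Lucene, and the results should agree with the listing only up to floating-point rounding.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in each "weight(...)" node
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    query_weight = boost * idf * query_norm   # "queryWeight, product of"
    field_weight = tf * idf * field_norm      # "fieldWeight in <doc>, product of"
    return query_weight * field_weight

# Values copied from the explanation tree of entry 1 (doc 3864).
MAX_DOCS = 42740
QUERY_NORM = 0.01041726
FIELD_NORM = 0.0546875
QUERY_TERMS = 25  # denominator of coord(5/25)

# (term, freq, docFreq, boost)
MATCHED = [
    ("classification", 2.0, 2127, 1.63953),
    ("users", 3.0, 3258, 2.9296403),
    ("page", 2.0, 292, 3.2696536),
    ("pages", 5.0, 435, 3.8155375),
    ("genre", 10.0, 140, 8.25611),
]

term_scores = {
    term: term_score(freq, df, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for term, freq, df, boost in MATCHED
}
coord = len(MATCHED) / QUERY_TERMS
doc_score = sum(term_scores.values()) * coord

for term, value in term_scores.items():
    print(f"{term:15s} {value:.9f}")
print(f"coord = {coord}, document score = {doc_score:.8f}")
```

The same functions can be applied to the other four entries by substituting their freq, docFreq, boost, fieldNorm and coord values. Because "genre" carries both the highest boost and a high idf, it dominates the score wherever it matches, which is visible in entries 1, 4 and 5.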