Document (#34014)

Author
Cosh, K.J.
Burns, R.
Daniel, T.
Title
Content clouds : classifying content in Web 2.0
Source
Library review. 57(2008) no.9, S.722-729
Year
2008
Abstract
Purpose - With increasing amounts of user-generated content being produced electronically in the form of wikis, blogs, forums, etc., the purpose of this paper is to investigate a new approach to classifying ad hoc content. Design/methodology/approach - The approach applies natural language processing (NLP) tools to automatically extract the content of some text, visualizing the results in a content cloud. Findings - Content clouds share the visual simplicity of a tag cloud, but display the details of an article at a different level of abstraction, providing a complementary classification. Research limitations/implications - Provides the general approach to creating a content cloud. In the future, the process can be refined and enhanced by further evaluation of results. Further work is also required to better identify closely related articles. Practical implications - Being able to automatically classify the content generated by web users will enable others to find more appropriate content. Originality/value - The approach is original. Other researchers have produced a cloud simply by using skip lists to filter unwanted words; this paper's approach improves on this by applying appropriate NLP techniques.
Theme
Automatic classification (Automatisches Klassifizieren)
Object
Tag cloud
Word cloud

Similar documents (author)

  1. Burns, B.A.F.: Alternatives for library catalogues : tools for catalogue planning (1981) 2.55
    2.5531936 = sum of:
      2.5531936 = product of:
        5.106387 = sum of:
          5.106387 = weight(author_txt:burns in 5392) [ClassicSimilarity], result of:
            5.106387 = score(doc=5392,freq=1.0), product of:
              0.82484746 = queryWeight, product of:
                1.2078865 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.06894257 = queryNorm
              6.190705 = fieldWeight in 5392, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.625 = fieldNorm(doc=5392)
        0.5 = coord(1/2)
    
  2. Bullard, J.; Burns, C.S.; VanScoy, A.: Warrant as a means to study classification system design (2017) 1.53
    1.531916 = sum of:
      1.531916 = product of:
        3.063832 = sum of:
          3.063832 = weight(author_txt:burns in 3360) [ClassicSimilarity], result of:
            3.063832 = score(doc=3360,freq=1.0), product of:
              0.82484746 = queryWeight, product of:
                1.2078865 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.06894257 = queryNorm
              3.7144227 = fieldWeight in 3360, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.375 = fieldNorm(doc=3360)
        0.5 = coord(1/2)
    
  3. Bossaller, J.; Burns, C.S.; VanScoy, A.: Re-conceiving time in reference and information services work : a qualitative secondary analysis (2017) 1.53
    1.531916 = sum of:
      1.531916 = product of:
        3.063832 = sum of:
          3.063832 = weight(author_txt:burns in 3363) [ClassicSimilarity], result of:
            3.063832 = score(doc=3363,freq=1.0), product of:
              0.82484746 = queryWeight, product of:
                1.2078865 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.06894257 = queryNorm
              3.7144227 = fieldWeight in 3363, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.375 = fieldNorm(doc=3363)
        0.5 = coord(1/2)
    
  4. Daniel, F.: Elektronische Informationsdienste in der StadtBibliothek Köln (1995) 1.45
    1.4487898 = sum of:
      1.4487898 = product of:
        2.8975797 = sum of:
          2.8975797 = weight(author_txt:daniel in 2696) [ClassicSimilarity], result of:
            2.8975797 = score(doc=2696,freq=1.0), product of:
              0.56535524 = queryWeight, product of:
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.06894257 = queryNorm
              5.125237 = fieldWeight in 2696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.625 = fieldNorm(doc=2696)
        0.5 = coord(1/2)
    
  5. Daniel, F.: Präsentationssoftware 'infoThek' für elektronische Informationsmedien (1996) 1.45
    1.4487898 = sum of:
      1.4487898 = product of:
        2.8975797 = sum of:
          2.8975797 = weight(author_txt:daniel in 3315) [ClassicSimilarity], result of:
            2.8975797 = score(doc=3315,freq=1.0), product of:
              0.56535524 = queryWeight, product of:
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.06894257 = queryNorm
              5.125237 = fieldWeight in 3315, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.625 = fieldNorm(doc=3315)
        0.5 = coord(1/2)
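
The score trees above follow the TF-IDF arithmetic of Lucene's ClassicSimilarity. As a minimal sketch, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the trees, the first author hit (doc 5392) can be recomputed as follows. The function names are illustrative only and are not part of any Lucene API; boost, queryNorm and fieldNorm are copied from the tree rather than derived.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term frequency factor, assumed form: sqrt(freq)."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """score = queryWeight * fieldWeight, as laid out in the explain tree."""
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm       # boost * idf * queryNorm
    field_weight = tf(freq) * i * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the tree for author_txt:burns in doc 5392.
burns_weight = term_score(
    freq=1.0, doc_freq=5, max_docs=44218,
    boost=1.2078865, query_norm=0.06894257, field_norm=0.625,
)

# coord(1/2): one of the two author query terms matched in this document.
doc_score = burns_weight * (1 / 2)
print(round(burns_weight, 4), round(doc_score, 4))
```

Calling the same function with fieldNorm 0.375 instead of 0.625 gives the lower score listed for docs 3360 and 3363, so the difference between hit 1 and hits 2 and 3 comes from the field norm alone.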
    

Similar documents (content)

  1. Leginus, M.; Zhai, C.X.; Dolog, P.: Personalized generation of word clouds from tweets (2016) 0.24
    0.23676434 = sum of:
      0.23676434 = product of:
        1.1838217 = sum of:
          0.03509097 = weight(abstract_txt:further in 2886) [ClassicSimilarity], result of:
            0.03509097 = score(doc=2886,freq=1.0), product of:
              0.09615305 = queryWeight, product of:
                1.2836821 = boost
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.01603479 = queryNorm
              0.36494914 = fieldWeight in 2886, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.078125 = fieldNorm(doc=2886)
          0.057933047 = weight(abstract_txt:generated in 2886) [ClassicSimilarity], result of:
            0.057933047 = score(doc=2886,freq=1.0), product of:
              0.13431269 = queryWeight, product of:
                1.5171708 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.01603479 = queryNorm
              0.43132967 = fieldWeight in 2886, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.078125 = fieldNorm(doc=2886)
          0.4943365 = weight(abstract_txt:clouds in 2886) [ClassicSimilarity], result of:
            0.4943365 = score(doc=2886,freq=3.0), product of:
              0.38887274 = queryWeight, product of:
                2.5815449 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.01603479 = queryNorm
              1.2712038 = fieldWeight in 2886, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.078125 = fieldNorm(doc=2886)
          0.054257434 = weight(abstract_txt:approach in 2886) [ClassicSimilarity], result of:
            0.054257434 = score(doc=2886,freq=1.0), product of:
              0.18542974 = queryWeight, product of:
                3.0876372 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01603479 = queryNorm
              0.29260373 = fieldWeight in 2886, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=2886)
          0.5422038 = weight(abstract_txt:cloud in 2886) [ClassicSimilarity], result of:
            0.5422038 = score(doc=2886,freq=3.0), product of:
              0.5210876 = queryWeight, product of:
                4.226164 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.01603479 = queryNorm
              1.0405233 = fieldWeight in 2886, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.078125 = fieldNorm(doc=2886)
        0.2 = coord(5/25)
    
  2. Huang, C.; Fu, T.; Chen, H.: Text-based video content classification for online video-sharing sites (2010) 0.10
    0.10005975 = sum of:
      0.10005975 = product of:
        0.41691563 = sum of:
          0.046918076 = weight(abstract_txt:blogs in 3452) [ClassicSimilarity], result of:
            0.046918076 = score(doc=3452,freq=1.0), product of:
              0.117485486 = queryWeight, product of:
                1.0033514 = boost
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.01603479 = queryNorm
              0.3993521 = fieldWeight in 3452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3452)
          0.05821089 = weight(abstract_txt:forums in 3452) [ClassicSimilarity], result of:
            0.05821089 = score(doc=3452,freq=1.0), product of:
              0.13565223 = queryWeight, product of:
                1.0781381 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.01603479 = queryNorm
              0.42911857 = fieldWeight in 3452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3452)
          0.024563681 = weight(abstract_txt:further in 3452) [ClassicSimilarity], result of:
            0.024563681 = score(doc=3452,freq=1.0), product of:
              0.09615305 = queryWeight, product of:
                1.2836821 = boost
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.01603479 = queryNorm
              0.2554644 = fieldWeight in 3452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3452)
          0.08110627 = weight(abstract_txt:generated in 3452) [ClassicSimilarity], result of:
            0.08110627 = score(doc=3452,freq=4.0), product of:
              0.13431269 = queryWeight, product of:
                1.5171708 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.01603479 = queryNorm
              0.6038616 = fieldWeight in 3452, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3452)
          0.053712122 = weight(abstract_txt:approach in 3452) [ClassicSimilarity], result of:
            0.053712122 = score(doc=3452,freq=2.0), product of:
              0.18542974 = queryWeight, product of:
                3.0876372 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01603479 = queryNorm
              0.28966293 = fieldWeight in 3452, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3452)
          0.15240458 = weight(abstract_txt:content in 3452) [ClassicSimilarity], result of:
            0.15240458 = score(doc=3452,freq=3.0), product of:
              0.38493052 = queryWeight, product of:
                5.743176 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.01603479 = queryNorm
              0.3959275 = fieldWeight in 3452, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3452)
        0.24 = coord(6/25)
    
  3. Hartel, J.; Savolainen, R.: Pictorial metaphors for information (2016) 0.10
    0.09594088 = sum of:
      0.09594088 = product of:
        0.3997537 = sum of:
          0.030303182 = weight(abstract_txt:purpose in 3163) [ClassicSimilarity], result of:
            0.030303182 = score(doc=3163,freq=2.0), product of:
              0.08778418 = queryWeight, product of:
                1.2265466 = boost
                4.463432 = idf(docFreq=1384, maxDocs=44218)
                0.01603479 = queryNorm
              0.34520096 = fieldWeight in 3163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.463432 = idf(docFreq=1384, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3163)
          0.030675698 = weight(abstract_txt:implications in 3163) [ClassicSimilarity], result of:
            0.030675698 = score(doc=3163,freq=2.0), product of:
              0.08850213 = queryWeight, product of:
                1.2315521 = boost
                4.481647 = idf(docFreq=1359, maxDocs=44218)
                0.01603479 = queryNorm
              0.3466097 = fieldWeight in 3163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.481647 = idf(docFreq=1359, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3163)
          0.040553134 = weight(abstract_txt:generated in 3163) [ClassicSimilarity], result of:
            0.040553134 = score(doc=3163,freq=1.0), product of:
              0.13431269 = queryWeight, product of:
                1.5171708 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.01603479 = queryNorm
              0.3019308 = fieldWeight in 3163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3163)
          0.041112415 = weight(abstract_txt:produced in 3163) [ClassicSimilarity], result of:
            0.041112415 = score(doc=3163,freq=1.0), product of:
              0.13554476 = queryWeight, product of:
                1.5241135 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.01603479 = queryNorm
              0.30331245 = fieldWeight in 3163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3163)
          0.037980206 = weight(abstract_txt:approach in 3163) [ClassicSimilarity], result of:
            0.037980206 = score(doc=3163,freq=1.0), product of:
              0.18542974 = queryWeight, product of:
                3.0876372 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01603479 = queryNorm
              0.20482263 = fieldWeight in 3163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3163)
          0.21912906 = weight(abstract_txt:cloud in 3163) [ClassicSimilarity], result of:
            0.21912906 = score(doc=3163,freq=1.0), product of:
              0.5210876 = queryWeight, product of:
                4.226164 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.01603479 = queryNorm
              0.4205225 = fieldWeight in 3163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3163)
        0.24 = coord(6/25)
    
  4. Williamson, A.: Strategies for managing digital content formats (2005) 0.10
    0.0956276 = sum of:
      0.0956276 = product of:
        0.47813797 = sum of:
          0.030610837 = weight(abstract_txt:purpose in 4745) [ClassicSimilarity], result of:
            0.030610837 = score(doc=4745,freq=1.0), product of:
              0.08778418 = queryWeight, product of:
                1.2265466 = boost
                4.463432 = idf(docFreq=1384, maxDocs=44218)
                0.01603479 = queryNorm
              0.34870562 = fieldWeight in 4745, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.463432 = idf(docFreq=1384, maxDocs=44218)
                0.078125 = fieldNorm(doc=4745)
          0.030987134 = weight(abstract_txt:implications in 4745) [ClassicSimilarity], result of:
            0.030987134 = score(doc=4745,freq=1.0), product of:
              0.08850213 = queryWeight, product of:
                1.2315521 = boost
                4.481647 = idf(docFreq=1359, maxDocs=44218)
                0.01603479 = queryNorm
              0.35012868 = fieldWeight in 4745, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.481647 = idf(docFreq=1359, maxDocs=44218)
                0.078125 = fieldNorm(doc=4745)
          0.058732018 = weight(abstract_txt:produced in 4745) [ClassicSimilarity], result of:
            0.058732018 = score(doc=4745,freq=1.0), product of:
              0.13554476 = queryWeight, product of:
                1.5241135 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.01603479 = queryNorm
              0.43330348 = fieldWeight in 4745, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.078125 = fieldNorm(doc=4745)
          0.0767316 = weight(abstract_txt:approach in 4745) [ClassicSimilarity], result of:
            0.0767316 = score(doc=4745,freq=2.0), product of:
              0.18542974 = queryWeight, product of:
                3.0876372 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01603479 = queryNorm
              0.41380417 = fieldWeight in 4745, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=4745)
          0.28107637 = weight(abstract_txt:content in 4745) [ClassicSimilarity], result of:
            0.28107637 = score(doc=4745,freq=5.0), product of:
              0.38493052 = queryWeight, product of:
                5.743176 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.01603479 = queryNorm
              0.7302003 = fieldWeight in 4745, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.078125 = fieldNorm(doc=4745)
        0.2 = coord(5/25)
    
  5. Pu, H.-T.; Chuang, S.-L.; Yang, C.: Subject categorization of query terms for exploring Web users' search interests (2002) 0.09
    0.091681756 = sum of:
      0.091681756 = product of:
        0.38200733 = sum of:
          0.028072778 = weight(abstract_txt:further in 587) [ClassicSimilarity], result of:
            0.028072778 = score(doc=587,freq=1.0), product of:
              0.09615305 = queryWeight, product of:
                1.2836821 = boost
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.01603479 = queryNorm
              0.29195932 = fieldWeight in 587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.671349 = idf(docFreq=1124, maxDocs=44218)
                0.0625 = fieldNorm(doc=587)
          0.04146987 = weight(abstract_txt:appropriate in 587) [ClassicSimilarity], result of:
            0.04146987 = score(doc=587,freq=1.0), product of:
              0.124717645 = queryWeight, product of:
                1.4619749 = boost
                5.3201604 = idf(docFreq=587, maxDocs=44218)
                0.01603479 = queryNorm
              0.33251002 = fieldWeight in 587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3201604 = idf(docFreq=587, maxDocs=44218)
                0.0625 = fieldNorm(doc=587)
          0.046242017 = weight(abstract_txt:automatically in 587) [ClassicSimilarity], result of:
            0.046242017 = score(doc=587,freq=1.0), product of:
              0.13411087 = queryWeight, product of:
                1.5160306 = boost
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.01603479 = queryNorm
              0.3448044 = fieldWeight in 587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.0625 = fieldNorm(doc=587)
          0.07884985 = weight(abstract_txt:classifying in 587) [ClassicSimilarity], result of:
            0.07884985 = score(doc=587,freq=1.0), product of:
              0.19141386 = queryWeight, product of:
                1.8111843 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.01603479 = queryNorm
              0.41193387 = fieldWeight in 587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.0625 = fieldNorm(doc=587)
          0.0868119 = weight(abstract_txt:approach in 587) [ClassicSimilarity], result of:
            0.0868119 = score(doc=587,freq=4.0), product of:
              0.18542974 = queryWeight, product of:
                3.0876372 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01603479 = queryNorm
              0.468166 = fieldWeight in 587, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=587)
          0.10056094 = weight(abstract_txt:content in 587) [ClassicSimilarity], result of:
            0.10056094 = score(doc=587,freq=1.0), product of:
              0.38493052 = queryWeight, product of:
                5.743176 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.01603479 = queryNorm
              0.2612444 = fieldWeight in 587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.0625 = fieldNorm(doc=587)
        0.24 = coord(6/25)
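
For the content-based hits, the document score is the sum of the per-term weights scaled by coord, the fraction of query terms that matched. The sketch below recomputes the first content hit (doc 2886) under the same assumed formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The constant and function names are hypothetical; all numeric inputs are copied from the tree above.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.01603479
FIELD_NORM = 0.078125   # fieldNorm(doc=2886)
QUERY_TERMS = 25        # denominator of coord(5/25)

# (term, freq in doc 2886, docFreq, boost), copied from the explain tree
MATCHED = [
    ("further", 1.0, 1124, 1.2836821),
    ("generated", 1.0, 480, 1.5171708),
    ("clouds", 3.0, 9, 2.5815449),
    ("approach", 1.0, 2839, 3.0876372),
    ("cloud", 3.0, 54, 4.226164),
]


def term_weight(freq: float, doc_freq: int, boost: float) -> float:
    """queryWeight * fieldWeight for one matched abstract term."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


weights = {term: term_weight(f, df, b) for term, f, df, b in MATCHED}

# coord rewards documents that match a larger share of the query terms.
coord = len(MATCHED) / QUERY_TERMS
doc_score = coord * sum(weights.values())

for term, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {w:.4f}")
print(f"score = {doc_score:.4f}")
```

Sorting the weights shows that the rare terms "cloud" and "clouds" account for most of the score, while the common terms "approach" and "further" contribute little despite matching.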