Document (#30275)

Author
Yoon, Y.
Lee, C.
Lee, G.G.
Title
An effective procedure for constructing a hierarchical text classification system
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.3, pp. 431-442
Year
2006
Abstract
In text categorization tasks, classification over some class hierarchies gives better results than classification without a hierarchy. Because large document collections are now commonly divided into several subgroups within a hierarchy, a hierarchical classification method is an appropriate choice. However, there is no systematic method for building a hierarchical classification system that performs well on large collections of practical data. In this article, we introduce a new evaluation scheme for internal node classifiers, which can be used effectively to develop a hierarchical classification system. We also show that our method for constructing the hierarchical classification system is very effective, especially for constructing classifiers applied to a hierarchy tree with many levels.
Theme
Automatic classification

Similar documents (author)

  1. Yoon, L.L.: The performance of cited references as an approach to information retrieval (1994) 5.60
    5.604247 = sum of:
      5.604247 = weight(author_txt:yoon in 219) [ClassicSimilarity], result of:
        5.604247 = fieldWeight in 219, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.966795 = idf(docFreq=14, maxDocs=43254)
          0.625 = fieldNorm(doc=219)
    
  2. Yoon, J.W.: Utilizing quantitative users' reactions to represent affective meanings of an image (2010) 5.60
    5.604247 = sum of:
      5.604247 = weight(author_txt:yoon in 49) [ClassicSimilarity], result of:
        5.604247 = fieldWeight in 49, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.966795 = idf(docFreq=14, maxDocs=43254)
          0.625 = fieldNorm(doc=49)
    
  3. Yoon, J.W.: Towards a user-oriented thesaurus for non-domain-specific image collections (2009) 5.60
    5.604247 = sum of:
      5.604247 = weight(author_txt:yoon in 686) [ClassicSimilarity], result of:
        5.604247 = fieldWeight in 686, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.966795 = idf(docFreq=14, maxDocs=43254)
          0.625 = fieldNorm(doc=686)
    
  4. Yoon, K.: Conceptual syntagmatic associations in user tagging (2012) 5.60
    5.604247 = sum of:
      5.604247 = weight(author_txt:yoon in 1705) [ClassicSimilarity], result of:
        5.604247 = fieldWeight in 1705, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.966795 = idf(docFreq=14, maxDocs=43254)
          0.625 = fieldNorm(doc=1705)
    
  5. Yoon, A.: Data reusers' trust development (2017) 5.60
    5.604247 = sum of:
      5.604247 = weight(author_txt:yoon in 4997) [ClassicSimilarity], result of:
        5.604247 = fieldWeight in 4997, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.966795 = idf(docFreq=14, maxDocs=43254)
          0.625 = fieldNorm(doc=4997)
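
The author scores above all follow the same arithmetic: fieldWeight = tf * idf * fieldNorm. The sketch below reproduces that calculation under stated assumptions. The function names are hypothetical, and the idf formula (1 + ln(maxDocs / (docFreq + 1))) is the variant that matches the values in this listing; other Lucene versions may compute it slightly differently.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf variant; it reproduces idf(docFreq=14, maxDocs=43254) above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the listing: author_txt:yoon, freq=1, docFreq=14, fieldNorm=0.625.
idf = classic_idf(14, 43254)
score = field_weight(freq=1.0, doc_freq=14, max_docs=43254, field_norm=0.625)
print(f"idf={idf:.6f} score={score:.6f}")
```

Because all five hits share freq=1.0, the same idf, and the same fieldNorm, they receive an identical score; small differences in the last digit against the listing would come from single-precision arithmetic in the search engine.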
    

Similar documents (content)

  1. Gauch, S.; Chandramouli, A.; Ranganathan, S.: Training a hierarchical classifier using inter document relationships (2009) 0.28
    0.2803727 = sum of:
      0.2803727 = product of:
        1.1682197 = sum of:
          0.0395286 = weight(abstract_txt:text in 4698) [ClassicSimilarity], result of:
            0.0395286 = score(doc=4698,freq=2.0), product of:
              0.0883445 = queryWeight, product of:
                1.3763185 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.01585016 = queryNorm
              0.44743705 = fieldWeight in 4698, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=4698)
          0.05304864 = weight(abstract_txt:large in 4698) [ClassicSimilarity], result of:
            0.05304864 = score(doc=4698,freq=2.0), product of:
              0.1074867 = queryWeight, product of:
                1.5181216 = boost
                4.466985 = idf(docFreq=1349, maxDocs=43254)
                0.01585016 = queryNorm
              0.49353677 = fieldWeight in 4698, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.466985 = idf(docFreq=1349, maxDocs=43254)
                0.078125 = fieldNorm(doc=4698)
          0.31544667 = weight(abstract_txt:classifiers in 4698) [ClassicSimilarity], result of:
            0.31544667 = score(doc=4698,freq=3.0), product of:
              0.30819488 = queryWeight, product of:
                2.5706437 = boost
                7.563971 = idf(docFreq=60, maxDocs=43254)
                0.01585016 = queryNorm
              1.0235299 = fieldWeight in 4698, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.563971 = idf(docFreq=60, maxDocs=43254)
                0.078125 = fieldNorm(doc=4698)
          0.25374937 = weight(abstract_txt:hierarchy in 4698) [ClassicSimilarity], result of:
            0.25374937 = score(doc=4698,freq=2.0), product of:
              0.34930566 = queryWeight, product of:
                3.3517964 = boost
                6.5749784 = idf(docFreq=163, maxDocs=43254)
                0.01585016 = queryNorm
              0.72643936 = fieldWeight in 4698, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5749784 = idf(docFreq=163, maxDocs=43254)
                0.078125 = fieldNorm(doc=4698)
          0.16134866 = weight(abstract_txt:classification in 4698) [ClassicSimilarity], result of:
            0.16134866 = score(doc=4698,freq=4.0), product of:
              0.25829294 = queryWeight, product of:
                4.076112 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.01585016 = queryNorm
              0.6246731 = fieldWeight in 4698, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=4698)
          0.3450977 = weight(abstract_txt:hierarchical in 4698) [ClassicSimilarity], result of:
            0.3450977 = score(doc=4698,freq=3.0), product of:
              0.44410208 = queryWeight, product of:
                4.87911 = boost
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.01585016 = queryNorm
              0.7770684 = fieldWeight in 4698, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.078125 = fieldNorm(doc=4698)
        0.24 = coord(6/25)
    
  2. Sun, A.; Lim, E.-P.; Ng, W.-K.: Performance measurement framework for hierarchical text classification (2003) 0.24
    0.2392589 = sum of:
      0.2392589 = product of:
        0.9969121 = sum of:
          0.053835686 = weight(abstract_txt:tree in 3809) [ClassicSimilarity], result of:
            0.053835686 = score(doc=3809,freq=1.0), product of:
              0.12595789 = queryWeight, product of:
                1.1620555 = boost
                6.838563 = idf(docFreq=125, maxDocs=43254)
                0.01585016 = queryNorm
              0.4274102 = fieldWeight in 3809, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.838563 = idf(docFreq=125, maxDocs=43254)
                0.0625 = fieldNorm(doc=3809)
          0.022360755 = weight(abstract_txt:text in 3809) [ClassicSimilarity], result of:
            0.022360755 = score(doc=3809,freq=1.0), product of:
              0.0883445 = queryWeight, product of:
                1.3763185 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.01585016 = queryNorm
              0.25310862 = fieldWeight in 3809, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0625 = fieldNorm(doc=3809)
          0.046401646 = weight(abstract_txt:method in 3809) [ClassicSimilarity], result of:
            0.046401646 = score(doc=3809,freq=1.0), product of:
              0.16452853 = queryWeight, product of:
                2.300358 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.01585016 = queryNorm
              0.28202796 = fieldWeight in 3809, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.0625 = fieldNorm(doc=3809)
          0.3257919 = weight(abstract_txt:classifiers in 3809) [ClassicSimilarity], result of:
            0.3257919 = score(doc=3809,freq=5.0), product of:
              0.30819488 = queryWeight, product of:
                2.5706437 = boost
                7.563971 = idf(docFreq=60, maxDocs=43254)
                0.01585016 = queryNorm
              1.0570971 = fieldWeight in 3809, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.563971 = idf(docFreq=60, maxDocs=43254)
                0.0625 = fieldNorm(doc=3809)
          0.15808874 = weight(abstract_txt:classification in 3809) [ClassicSimilarity], result of:
            0.15808874 = score(doc=3809,freq=6.0), product of:
              0.25829294 = queryWeight, product of:
                4.076112 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.01585016 = queryNorm
              0.61205214 = fieldWeight in 3809, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0625 = fieldNorm(doc=3809)
          0.39043346 = weight(abstract_txt:hierarchical in 3809) [ClassicSimilarity], result of:
            0.39043346 = score(doc=3809,freq=6.0), product of:
              0.44410208 = queryWeight, product of:
                4.87911 = boost
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.01585016 = queryNorm
              0.87915254 = fieldWeight in 3809, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.0625 = fieldNorm(doc=3809)
        0.24 = coord(6/25)
    
  3. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.24
    0.23812729 = sum of:
      0.23812729 = product of:
        0.9921971 = sum of:
          0.078088164 = weight(abstract_txt:performs in 4761) [ClassicSimilarity], result of:
            0.078088164 = score(doc=4761,freq=1.0), product of:
              0.13908982 = queryWeight, product of:
                1.2211299 = boost
                7.1862087 = idf(docFreq=88, maxDocs=43254)
                0.01585016 = queryNorm
              0.5614226 = fieldWeight in 4761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1862087 = idf(docFreq=88, maxDocs=43254)
                0.078125 = fieldNorm(doc=4761)
          0.06846555 = weight(abstract_txt:text in 4761) [ClassicSimilarity], result of:
            0.06846555 = score(doc=4761,freq=6.0), product of:
              0.0883445 = queryWeight, product of:
                1.3763185 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.01585016 = queryNorm
              0.77498376 = fieldWeight in 4761, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=4761)
          0.032060884 = weight(abstract_txt:system in 4761) [ClassicSimilarity], result of:
            0.032060884 = score(doc=4761,freq=1.0), product of:
              0.121966965 = queryWeight, product of:
                2.2869954 = boost
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.01585016 = queryNorm
              0.2628653 = fieldWeight in 4761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.078125 = fieldNorm(doc=4761)
          0.25374937 = weight(abstract_txt:hierarchy in 4761) [ClassicSimilarity], result of:
            0.25374937 = score(doc=4761,freq=2.0), product of:
              0.34930566 = queryWeight, product of:
                3.3517964 = boost
                6.5749784 = idf(docFreq=163, maxDocs=43254)
                0.01585016 = queryNorm
              0.72643936 = fieldWeight in 4761, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5749784 = idf(docFreq=163, maxDocs=43254)
                0.078125 = fieldNorm(doc=4761)
          0.16134866 = weight(abstract_txt:classification in 4761) [ClassicSimilarity], result of:
            0.16134866 = score(doc=4761,freq=4.0), product of:
              0.25829294 = queryWeight, product of:
                4.076112 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.01585016 = queryNorm
              0.6246731 = fieldWeight in 4761, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=4761)
          0.39848447 = weight(abstract_txt:hierarchical in 4761) [ClassicSimilarity], result of:
            0.39848447 = score(doc=4761,freq=4.0), product of:
              0.44410208 = queryWeight, product of:
                4.87911 = boost
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.01585016 = queryNorm
              0.8972812 = fieldWeight in 4761, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.078125 = fieldNorm(doc=4761)
        0.24 = coord(6/25)
    
  4. Li, T.; Zhu, S.; Ogihara, M.: Hierarchical document classification using automatically generated hierarchy (2007) 0.23
    0.23341098 = sum of:
      0.23341098 = product of:
        0.83361065 = sum of:
          0.10659502 = weight(abstract_txt:categorization in 1262) [ClassicSimilarity], result of:
            0.10659502 = score(doc=1262,freq=3.0), product of:
              0.11867413 = queryWeight, product of:
                1.1279562 = boost
                6.6378922 = idf(docFreq=153, maxDocs=43254)
                0.01585016 = queryNorm
              0.8982161 = fieldWeight in 1262, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.6378922 = idf(docFreq=153, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.13278547 = weight(abstract_txt:hierarchies in 1262) [ClassicSimilarity], result of:
            0.13278547 = score(doc=1262,freq=3.0), product of:
              0.13739318 = queryWeight, product of:
                1.2136593 = boost
                7.1422453 = idf(docFreq=92, maxDocs=43254)
                0.01585016 = queryNorm
              0.9664634 = fieldWeight in 1262, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.1422453 = idf(docFreq=92, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.0395286 = weight(abstract_txt:text in 1262) [ClassicSimilarity], result of:
            0.0395286 = score(doc=1262,freq=2.0), product of:
              0.0883445 = queryWeight, product of:
                1.3763185 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.01585016 = queryNorm
              0.44743705 = fieldWeight in 1262, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.03751106 = weight(abstract_txt:large in 1262) [ClassicSimilarity], result of:
            0.03751106 = score(doc=1262,freq=1.0), product of:
              0.1074867 = queryWeight, product of:
                1.5181216 = boost
                4.466985 = idf(docFreq=1349, maxDocs=43254)
                0.01585016 = queryNorm
              0.34898323 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.466985 = idf(docFreq=1349, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.05800206 = weight(abstract_txt:method in 1262) [ClassicSimilarity], result of:
            0.05800206 = score(doc=1262,freq=1.0), product of:
              0.16452853 = queryWeight, product of:
                2.300358 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.01585016 = queryNorm
              0.35253495 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.114090726 = weight(abstract_txt:classification in 1262) [ClassicSimilarity], result of:
            0.114090726 = score(doc=1262,freq=2.0), product of:
              0.25829294 = queryWeight, product of:
                4.076112 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.01585016 = queryNorm
              0.4417106 = fieldWeight in 1262, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
          0.3450977 = weight(abstract_txt:hierarchical in 1262) [ClassicSimilarity], result of:
            0.3450977 = score(doc=1262,freq=3.0), product of:
              0.44410208 = queryWeight, product of:
                4.87911 = boost
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.01585016 = queryNorm
              0.7770684 = fieldWeight in 1262, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.078125 = fieldNorm(doc=1262)
        0.28 = coord(7/25)
    
  5. Pons-Porrata, A.; Berlanga-Llavori, R.; Ruiz-Shulcloper, J.: Topic discovery based on text mining techniques (2007) 0.19
    0.19102597 = sum of:
      0.19102597 = product of:
        0.6822356 = sum of:
          0.042884428 = weight(abstract_txt:build in 2917) [ClassicSimilarity], result of:
            0.042884428 = score(doc=2917,freq=1.0), product of:
              0.09327637 = queryWeight, product of:
                5.884885 = idf(docFreq=326, maxDocs=43254)
                0.01585016 = queryNorm
              0.4597566 = fieldWeight in 2917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.884885 = idf(docFreq=326, maxDocs=43254)
                0.078125 = fieldNorm(doc=2917)
          0.07666373 = weight(abstract_txt:hierarchies in 2917) [ClassicSimilarity], result of:
            0.07666373 = score(doc=2917,freq=1.0), product of:
              0.13739318 = queryWeight, product of:
                1.2136593 = boost
                7.1422453 = idf(docFreq=92, maxDocs=43254)
                0.01585016 = queryNorm
              0.5579879 = fieldWeight in 2917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1422453 = idf(docFreq=92, maxDocs=43254)
                0.078125 = fieldNorm(doc=2917)
          0.045340937 = weight(abstract_txt:system in 2917) [ClassicSimilarity], result of:
            0.045340937 = score(doc=2917,freq=2.0), product of:
              0.121966965 = queryWeight, product of:
                2.2869954 = boost
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.01585016 = queryNorm
              0.37174767 = fieldWeight in 2917, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.364676 = idf(docFreq=4064, maxDocs=43254)
                0.078125 = fieldNorm(doc=2917)
          0.05800206 = weight(abstract_txt:method in 2917) [ClassicSimilarity], result of:
            0.05800206 = score(doc=2917,freq=1.0), product of:
              0.16452853 = queryWeight, product of:
                2.300358 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.01585016 = queryNorm
              0.35253495 = fieldWeight in 2917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.078125 = fieldNorm(doc=2917)
          0.1794279 = weight(abstract_txt:hierarchy in 2917) [ClassicSimilarity], result of:
            0.1794279 = score(doc=2917,freq=1.0), product of:
              0.34930566 = queryWeight, product of:
                3.3517964 = boost
                6.5749784 = idf(docFreq=163, maxDocs=43254)
                0.01585016 = queryNorm
              0.5136702 = fieldWeight in 2917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5749784 = idf(docFreq=163, maxDocs=43254)
                0.078125 = fieldNorm(doc=2917)
          0.08067433 = weight(abstract_txt:classification in 2917) [ClassicSimilarity], result of:
            0.08067433 = score(doc=2917,freq=1.0), product of:
              0.25829294 = queryWeight, product of:
                4.076112 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.01585016 = queryNorm
              0.31233656 = fieldWeight in 2917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=2917)
          0.19924223 = weight(abstract_txt:hierarchical in 2917) [ClassicSimilarity], result of:
            0.19924223 = score(doc=2917,freq=1.0), product of:
              0.44410208 = queryWeight, product of:
                4.87911 = boost
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.01585016 = queryNorm
              0.4486406 = fieldWeight in 2917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7426 = idf(docFreq=376, maxDocs=43254)
                0.078125 = fieldNorm(doc=2917)
        0.28 = coord(7/25)
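
The content scores above combine several per-term scores: each term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by coord (matched query terms / total query terms). The following is a minimal sketch of that arithmetic for the first content hit (doc 4698), using the boost, docFreq, and freq values from the listing. The helper names are hypothetical, and the idf formula is the one that matches the listed values rather than a confirmed reading of the engine's source.

```python
import math

MAX_DOCS = 43254
QUERY_NORM = 0.01585016   # queryNorm from the listing
FIELD_NORM = 0.078125     # fieldNorm(doc=4698) from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed idf variant that reproduces the listed idf values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float) -> float:
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM            # queryWeight
    field_weight = math.sqrt(freq) * idf * FIELD_NORM  # fieldWeight
    return query_weight * field_weight


# (term, boost, docFreq, termFreq) for doc 4698, copied from the listing.
terms = [
    ("text", 1.3763185, 2048, 2.0),
    ("large", 1.5181216, 1349, 2.0),
    ("classifiers", 2.5706437, 60, 3.0),
    ("hierarchy", 3.3517964, 163, 2.0),
    ("classification", 4.076112, 2157, 4.0),
    ("hierarchical", 4.87911, 376, 3.0),
]

term_sum = sum(term_score(b, df, f) for _, b, df, f in terms)
coord = 6 / 25  # 6 of the 25 query terms matched
total = term_sum * coord
print(f"sum={term_sum:.6f} total={total:.6f}")  # should be close to the listed 0.2803727
```

The coord factor explains why the fourth hit (7 matching terms, coord 0.28) can rank close to hits with a larger raw term sum but only 6 matching terms.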