Document (#35253)

Author
Xu, Y.
Bernard, A.
Title
Knowledge organization through statistical computation : a new approach
Source
Knowledge organization. 36(2009) no.4, pp. 227-239
Year
2009
Abstract
Knowledge organization (KO) is an interdisciplinary issue that includes problems in knowledge classification, such as how to classify newly emerged knowledge. Given the great complexity and ambiguity of knowledge, classifying it by logical reasoning is sometimes inefficient. This paper proposes a statistical approach to knowledge organization in order to resolve the problems in classifying complex and massive bodies of knowledge. By integrating the classification process into a mathematical model, a knowledge classifier based on maximum entropy theory is constructed, and the experimental results show that the classification results obtained from the classifier are reliable. The proposed approach is formal and does not depend on specific contexts, so it could easily be adapted to knowledge classification in other domains within KO.
Content
See: http://www.ergon-verlag.de/isko_ko/downloads/ko3620094e.pdf
Theme
Automatisches Klassifizieren

Similar documents (content)

  1. Mengle, S.S.R.; Goharian, N.: Ambiguity measure feature-selection algorithm (2009) 0.24
    0.24321578 = sum of:
      0.24321578 = product of:
        0.76004934 = sum of:
          0.017417662 = weight(abstract_txt:results in 2804) [ClassicSimilarity], result of:
            0.017417662 = score(doc=2804,freq=1.0), product of:
              0.08002551 = queryWeight, product of:
                1.0662487 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.021552049 = queryNorm
              0.21765138 = fieldWeight in 2804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
          0.075489804 = weight(abstract_txt:ambiguity in 2804) [ClassicSimilarity], result of:
            0.075489804 = score(doc=2804,freq=1.0), product of:
              0.16884339 = queryWeight, product of:
                1.0951442 = boost
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.021552049 = queryNorm
              0.44709954 = fieldWeight in 2804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
          0.032733317 = weight(abstract_txt:problems in 2804) [ClassicSimilarity], result of:
            0.032733317 = score(doc=2804,freq=1.0), product of:
              0.12186956 = queryWeight, product of:
                1.315806 = boost
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.021552049 = queryNorm
              0.26859307 = fieldWeight in 2804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
          0.07036427 = weight(abstract_txt:statistical in 2804) [ClassicSimilarity], result of:
            0.07036427 = score(doc=2804,freq=1.0), product of:
              0.20298782 = queryWeight, product of:
                1.6981626 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.021552049 = queryNorm
              0.3466428 = fieldWeight in 2804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
          0.045964383 = weight(abstract_txt:approach in 2804) [ClassicSimilarity], result of:
            0.045964383 = score(doc=2804,freq=2.0), product of:
              0.13884702 = queryWeight, product of:
                1.7201178 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.021552049 = queryNorm
              0.33104333 = fieldWeight in 2804, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
          0.11525157 = weight(abstract_txt:classify in 2804) [ClassicSimilarity], result of:
            0.11525157 = score(doc=2804,freq=1.0), product of:
              0.28205454 = queryWeight, product of:
                2.0017545 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.021552049 = queryNorm
              0.4086145 = fieldWeight in 2804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
          0.35035098 = weight(abstract_txt:classifier in 2804) [ClassicSimilarity], result of:
            0.35035098 = score(doc=2804,freq=5.0), product of:
              0.34613654 = queryWeight, product of:
                2.2175224 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.021552049 = queryNorm
              1.0121757 = fieldWeight in 2804, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
          0.052477337 = weight(abstract_txt:classification in 2804) [ClassicSimilarity], result of:
            0.052477337 = score(doc=2804,freq=1.0), product of:
              0.21032605 = queryWeight, product of:
                2.4445887 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021552049 = queryNorm
              0.2495047 = fieldWeight in 2804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2804)
        0.32 = coord(8/25)
    
  2. Lan, K.C.; Ho, K.S.; Luk, R.W.P.; Leong, H.V.: Dialogue act recognition using maximum entropy (2008) 0.17
    0.17363921 = sum of:
      0.17363921 = product of:
        0.72349674 = sum of:
          0.017192872 = weight(abstract_txt:paper in 1717) [ClassicSimilarity], result of:
            0.017192872 = score(doc=1717,freq=1.0), product of:
              0.07933549 = queryWeight, product of:
                1.0616418 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.021552049 = queryNorm
              0.216711 = fieldWeight in 1717, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.078340866 = weight(abstract_txt:maximum in 1717) [ClassicSimilarity], result of:
            0.078340866 = score(doc=1717,freq=1.0), product of:
              0.17306827 = queryWeight, product of:
                1.1087612 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.021552049 = queryNorm
              0.45265874 = fieldWeight in 1717, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.10397449 = weight(abstract_txt:entropy in 1717) [ClassicSimilarity], result of:
            0.10397449 = score(doc=1717,freq=1.0), product of:
              0.2090145 = queryWeight, product of:
                1.2184774 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.021552049 = queryNorm
              0.4974511 = fieldWeight in 1717, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.056294642 = weight(abstract_txt:approach in 1717) [ClassicSimilarity], result of:
            0.056294642 = score(doc=1717,freq=3.0), product of:
              0.13884702 = queryWeight, product of:
                1.7201178 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.021552049 = queryNorm
              0.40544364 = fieldWeight in 1717, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.35035098 = weight(abstract_txt:classifier in 1717) [ClassicSimilarity], result of:
            0.35035098 = score(doc=1717,freq=5.0), product of:
              0.34613654 = queryWeight, product of:
                2.2175224 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.021552049 = queryNorm
              1.0121757 = fieldWeight in 1717, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
          0.1173429 = weight(abstract_txt:classification in 1717) [ClassicSimilarity], result of:
            0.1173429 = score(doc=1717,freq=5.0), product of:
              0.21032605 = queryWeight, product of:
                2.4445887 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021552049 = queryNorm
              0.5579095 = fieldWeight in 1717, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=1717)
        0.24 = coord(6/25)
    
  3. Brychcín, T.; Konopík, M.: HPS: High precision stemmer (2015) 0.17
    0.17297575 = sum of:
      0.17297575 = product of:
        0.61777055 = sum of:
          0.06258255 = weight(abstract_txt:reliable in 2686) [ClassicSimilarity], result of:
            0.06258255 = score(doc=2686,freq=1.0), product of:
              0.14900267 = queryWeight, product of:
                1.0287889 = boost
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.021552049 = queryNorm
              0.42000958 = fieldWeight in 2686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.024632296 = weight(abstract_txt:results in 2686) [ClassicSimilarity], result of:
            0.024632296 = score(doc=2686,freq=2.0), product of:
              0.08002551 = queryWeight, product of:
                1.0662487 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.021552049 = queryNorm
              0.30780554 = fieldWeight in 2686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.078340866 = weight(abstract_txt:maximum in 2686) [ClassicSimilarity], result of:
            0.078340866 = score(doc=2686,freq=1.0), product of:
              0.17306827 = queryWeight, product of:
                1.1087612 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.021552049 = queryNorm
              0.45265874 = fieldWeight in 2686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.10397449 = weight(abstract_txt:entropy in 2686) [ClassicSimilarity], result of:
            0.10397449 = score(doc=2686,freq=1.0), product of:
              0.2090145 = queryWeight, product of:
                1.2184774 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.021552049 = queryNorm
              0.4974511 = fieldWeight in 2686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.07036427 = weight(abstract_txt:statistical in 2686) [ClassicSimilarity], result of:
            0.07036427 = score(doc=2686,freq=1.0), product of:
              0.20298782 = queryWeight, product of:
                1.6981626 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.021552049 = queryNorm
              0.3466428 = fieldWeight in 2686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.056294642 = weight(abstract_txt:approach in 2686) [ClassicSimilarity], result of:
            0.056294642 = score(doc=2686,freq=3.0), product of:
              0.13884702 = queryWeight, product of:
                1.7201178 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.021552049 = queryNorm
              0.40544364 = fieldWeight in 2686, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.22158143 = weight(abstract_txt:classifier in 2686) [ClassicSimilarity], result of:
            0.22158143 = score(doc=2686,freq=2.0), product of:
              0.34613654 = queryWeight, product of:
                2.2175224 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.021552049 = queryNorm
              0.64015615 = fieldWeight in 2686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
        0.28 = coord(7/25)
    
  4. Sánchez, D.; Batet, M.; Valls, A.; Gibert, K.: Ontology-driven web-based semantic similarity (2010) 0.17
    0.17236187 = sum of:
      0.17236187 = product of:
        0.53863084 = sum of:
          0.05475973 = weight(abstract_txt:reliable in 335) [ClassicSimilarity], result of:
            0.05475973 = score(doc=335,freq=1.0), product of:
              0.14900267 = queryWeight, product of:
                1.0287889 = boost
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.021552049 = queryNorm
              0.36750838 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
          0.015043763 = weight(abstract_txt:paper in 335) [ClassicSimilarity], result of:
            0.015043763 = score(doc=335,freq=1.0), product of:
              0.07933549 = queryWeight, product of:
                1.0616418 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.021552049 = queryNorm
              0.18962212 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
          0.015240455 = weight(abstract_txt:results in 335) [ClassicSimilarity], result of:
            0.015240455 = score(doc=335,freq=1.0), product of:
              0.08002551 = queryWeight, product of:
                1.0662487 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.021552049 = queryNorm
              0.19044496 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
          0.09341387 = weight(abstract_txt:ambiguity in 335) [ClassicSimilarity], result of:
            0.09341387 = score(doc=335,freq=2.0), product of:
              0.16884339 = queryWeight, product of:
                1.0951442 = boost
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.021552049 = queryNorm
              0.55325747 = fieldWeight in 335, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1535926 = idf(docFreq=93, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
          0.082628995 = weight(abstract_txt:computation in 335) [ClassicSimilarity], result of:
            0.082628995 = score(doc=335,freq=1.0), product of:
              0.19602351 = queryWeight, product of:
                1.1800036 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.021552049 = queryNorm
              0.42152596 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
          0.028641654 = weight(abstract_txt:problems in 335) [ClassicSimilarity], result of:
            0.028641654 = score(doc=335,freq=1.0), product of:
              0.12186956 = queryWeight, product of:
                1.315806 = boost
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.021552049 = queryNorm
              0.23501894 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
          0.08707133 = weight(abstract_txt:statistical in 335) [ClassicSimilarity], result of:
            0.08707133 = score(doc=335,freq=2.0), product of:
              0.20298782 = queryWeight, product of:
                1.6981626 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.021552049 = queryNorm
              0.42894855 = fieldWeight in 335, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
          0.16183105 = weight(abstract_txt:knowledge in 335) [ClassicSimilarity], result of:
            0.16183105 = score(doc=335,freq=4.0), product of:
              0.41646105 = queryWeight, product of:
                5.438967 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.021552049 = queryNorm
              0.38858628 = fieldWeight in 335, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.0546875 = fieldNorm(doc=335)
        0.32 = coord(8/25)
    
  5. Prabowo, R.; Jackson, M.; Burden, P.; Knoell, H.-D.: Ontology-based automatic classification for the Web pages : design, implementation and evaluation (2002) 0.15
    0.15446325 = sum of:
      0.15446325 = product of:
        0.6435969 = sum of:
          0.073801994 = weight(abstract_txt:classifying in 3383) [ClassicSimilarity], result of:
            0.073801994 = score(doc=3383,freq=1.0), product of:
              0.14332786 = queryWeight, product of:
                1.0090079 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.021552049 = queryNorm
              0.5149173 = fieldWeight in 3383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.078125 = fieldNorm(doc=3383)
          0.021491092 = weight(abstract_txt:paper in 3383) [ClassicSimilarity], result of:
            0.021491092 = score(doc=3383,freq=1.0), product of:
              0.07933549 = queryWeight, product of:
                1.0616418 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.021552049 = queryNorm
              0.27088875 = fieldWeight in 3383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.078125 = fieldNorm(doc=3383)
          0.02177208 = weight(abstract_txt:results in 3383) [ClassicSimilarity], result of:
            0.02177208 = score(doc=3383,freq=1.0), product of:
              0.08002551 = queryWeight, product of:
                1.0662487 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.021552049 = queryNorm
              0.27206424 = fieldWeight in 3383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=3383)
          0.040627155 = weight(abstract_txt:approach in 3383) [ClassicSimilarity], result of:
            0.040627155 = score(doc=3383,freq=1.0), product of:
              0.13884702 = queryWeight, product of:
                1.7201178 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.021552049 = queryNorm
              0.29260373 = fieldWeight in 3383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=3383)
          0.33922592 = weight(abstract_txt:classifier in 3383) [ClassicSimilarity], result of:
            0.33922592 = score(doc=3383,freq=3.0), product of:
              0.34613654 = queryWeight, product of:
                2.2175224 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.021552049 = queryNorm
              0.98003495 = fieldWeight in 3383, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.078125 = fieldNorm(doc=3383)
          0.14667863 = weight(abstract_txt:classification in 3383) [ClassicSimilarity], result of:
            0.14667863 = score(doc=3383,freq=5.0), product of:
              0.21032605 = queryWeight, product of:
                2.4445887 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.021552049 = queryNorm
              0.69738686 = fieldWeight in 3383, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=3383)
        0.24 = coord(6/25)
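
The score trees above follow the pattern of Lucene's ClassicSimilarity (TF-IDF) explanation output. As a minimal sketch of how the listed numbers fit together, the Python below recomputes one term score and one document score from the values shown. The function names (`idf`, `term_score`, `document_score`) are hypothetical and chosen for illustration; they are not Lucene's API. The formulas are the ones implied by the listing: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and the summed term scores are multiplied by the coord factor (matched query terms / total query terms).

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as implied by the listing:
    1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               boost: float, query_norm: float, field_norm: float) -> float:
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * term_idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total_terms: int) -> float:
    """Sum of the term scores scaled by coord(matched/total_terms)."""
    return sum(term_scores) * (matched / total_terms)


# Values copied from the entry for doc 2804 (first similar document).
results_score = term_score(freq=1.0, doc_freq=3693, max_docs=44218,
                           boost=1.0662487, query_norm=0.021552049,
                           field_norm=0.0625)

doc_2804_terms = [0.017417662, 0.075489804, 0.032733317, 0.07036427,
                  0.045964383, 0.11525157, 0.35035098, 0.052477337]
doc_2804_total = document_score(doc_2804_terms, matched=8, total_terms=25)

print(results_score, doc_2804_total)
```

Under these assumptions the two printed values should agree with the listed 0.017417662 and 0.24321578 to within rounding. Small differences in the last digits are expected, since the listing appears to use single-precision floats while Python uses double precision.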