Document (#35254)

Author
Xu, Y.
Bernard, A.
Title
Knowledge organization through statistical computation : a new approach
Source
Knowledge organization. 36(2009) no.4, p.227-239
Year
2009
Abstract
Knowledge organization (KO) is an interdisciplinary area that includes problems of knowledge classification, such as how to classify newly emerged knowledge. Given the complexity and ambiguity of knowledge, classifying it by logical reasoning is sometimes inefficient. This paper proposes a statistical approach to knowledge organization in order to resolve the problems of classifying complex knowledge in large volumes. By integrating the classification process into a mathematical model, a knowledge classifier based on maximum entropy theory is constructed, and the experimental results show that the classification results obtained from the classifier are reliable. The proposed approach is formal and does not depend on specific contexts, so it could easily be adapted to knowledge classification in other domains within KO.
Content
See: http://www.ergon-verlag.de/isko_ko/downloads/ko3620094e.pdf.
Theme
Automatisches Klassifizieren

Similar documents (content)

  1. Mengle, S.S.R.; Goharian, N.: Ambiguity measure feature-selection algorithm (2009) 0.24
    0.24387994 = sum of:
      0.24387994 = product of:
        0.76212484 = sum of:
          0.017584132 = weight(abstract_txt:results in 4805) [ClassicSimilarity], result of:
            0.017584132 = score(doc=4805,freq=1.0), product of:
              0.0803684 = queryWeight, product of:
                1.0697432 = boost
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.021461012 = queryNorm
              0.2187941 = fieldWeight in 4805, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
          0.07605427 = weight(abstract_txt:ambiguity in 4805) [ClassicSimilarity], result of:
            0.07605427 = score(doc=4805,freq=1.0), product of:
              0.16933383 = queryWeight, product of:
                1.0979782 = boost
                7.1862087 = idf(docFreq=88, maxDocs=43254)
                0.021461012 = queryNorm
              0.44913805 = fieldWeight in 4805, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1862087 = idf(docFreq=88, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
          0.03250774 = weight(abstract_txt:problems in 4805) [ClassicSimilarity], result of:
            0.03250774 = score(doc=4805,freq=1.0), product of:
              0.12105866 = queryWeight, product of:
                1.3129095 = boost
                4.296461 = idf(docFreq=1600, maxDocs=43254)
                0.021461012 = queryNorm
              0.26852882 = fieldWeight in 4805, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.296461 = idf(docFreq=1600, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
          0.06991101 = weight(abstract_txt:statistical in 4805) [ClassicSimilarity], result of:
            0.06991101 = score(doc=4805,freq=1.0), product of:
              0.20169808 = queryWeight, product of:
                1.6946801 = boost
                5.545795 = idf(docFreq=458, maxDocs=43254)
                0.021461012 = queryNorm
              0.3466122 = fieldWeight in 4805, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.545795 = idf(docFreq=458, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
          0.046066575 = weight(abstract_txt:approach in 4805) [ClassicSimilarity], result of:
            0.046066575 = score(doc=4805,freq=2.0), product of:
              0.13876578 = queryWeight, product of:
                1.7215661 = boost
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.021461012 = queryNorm
              0.33197358 = fieldWeight in 4805, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
          0.11748717 = weight(abstract_txt:classify in 4805) [ClassicSimilarity], result of:
            0.11748717 = score(doc=4805,freq=1.0), product of:
              0.28510073 = queryWeight, product of:
                2.014819 = boost
                6.5934405 = idf(docFreq=160, maxDocs=43254)
                0.021461012 = queryNorm
              0.41209003 = fieldWeight in 4805, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5934405 = idf(docFreq=160, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
          0.35013193 = weight(abstract_txt:classifier in 4805) [ClassicSimilarity], result of:
            0.35013193 = score(doc=4805,freq=5.0), product of:
              0.3452782 = queryWeight, product of:
                2.2172847 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.021461012 = queryNorm
              1.0140574 = fieldWeight in 4805, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
          0.052382052 = weight(abstract_txt:classification in 4805) [ClassicSimilarity], result of:
            0.052382052 = score(doc=4805,freq=1.0), product of:
              0.20963785 = queryWeight, product of:
                2.4433558 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021461012 = queryNorm
              0.24986924 = fieldWeight in 4805, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0625 = fieldNorm(doc=4805)
        0.32 = coord(8/25)
    
  2. Lan, K.C.; Ho, K.S.; Luk, R.W.P.; Leong, H.V.: Dialogue act recognition using maximum entropy (2008) 0.17
    0.17319983 = sum of:
      0.17319983 = product of:
        0.721666 = sum of:
          0.017424082 = weight(abstract_txt:paper in 3718) [ClassicSimilarity], result of:
            0.017424082 = score(doc=3718,freq=1.0), product of:
              0.079879984 = queryWeight, product of:
                1.0664877 = boost
                3.4900522 = idf(docFreq=3585, maxDocs=43254)
                0.021461012 = queryNorm
              0.21812826 = fieldWeight in 3718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4900522 = idf(docFreq=3585, maxDocs=43254)
                0.0625 = fieldNorm(doc=3718)
          0.07714815 = weight(abstract_txt:maximum in 3718) [ClassicSimilarity], result of:
            0.07714815 = score(doc=3718,freq=1.0), product of:
              0.17095365 = queryWeight, product of:
                1.1032172 = boost
                7.2204976 = idf(docFreq=85, maxDocs=43254)
                0.021461012 = queryNorm
              0.4512811 = fieldWeight in 3718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2204976 = idf(docFreq=85, maxDocs=43254)
                0.0625 = fieldNorm(doc=3718)
          0.10341217 = weight(abstract_txt:entropy in 3718) [ClassicSimilarity], result of:
            0.10341217 = score(doc=3718,freq=1.0), product of:
              0.20783043 = queryWeight, product of:
                1.2164 = boost
                7.9612727 = idf(docFreq=40, maxDocs=43254)
                0.021461012 = queryNorm
              0.49757954 = fieldWeight in 3718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9612727 = idf(docFreq=40, maxDocs=43254)
                0.0625 = fieldNorm(doc=3718)
          0.0564198 = weight(abstract_txt:approach in 3718) [ClassicSimilarity], result of:
            0.0564198 = score(doc=3718,freq=3.0), product of:
              0.13876578 = queryWeight, product of:
                1.7215661 = boost
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.021461012 = queryNorm
              0.40658295 = fieldWeight in 3718, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.0625 = fieldNorm(doc=3718)
          0.35013193 = weight(abstract_txt:classifier in 3718) [ClassicSimilarity], result of:
            0.35013193 = score(doc=3718,freq=5.0), product of:
              0.3452782 = queryWeight, product of:
                2.2172847 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.021461012 = queryNorm
              1.0140574 = fieldWeight in 3718, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.0625 = fieldNorm(doc=3718)
          0.11712983 = weight(abstract_txt:classification in 3718) [ClassicSimilarity], result of:
            0.11712983 = score(doc=3718,freq=5.0), product of:
              0.20963785 = queryWeight, product of:
                2.4433558 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021461012 = queryNorm
              0.55872464 = fieldWeight in 3718, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0625 = fieldNorm(doc=3718)
        0.24 = coord(6/25)
    
  3. Sánchez, D.; Batet, M.; Valls, A.; Gibert, K.: Ontology-driven web-based semantic similarity (2010) 0.17
    0.17302373 = sum of:
      0.17302373 = product of:
        0.5406992 = sum of:
          0.05509073 = weight(abstract_txt:reliable in 1800) [ClassicSimilarity], result of:
            0.05509073 = score(doc=1800,freq=1.0), product of:
              0.14929377 = queryWeight, product of:
                1.0309621 = boost
                6.7475915 = idf(docFreq=137, maxDocs=43254)
                0.021461012 = queryNorm
              0.3690089 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7475915 = idf(docFreq=137, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
          0.015246073 = weight(abstract_txt:paper in 1800) [ClassicSimilarity], result of:
            0.015246073 = score(doc=1800,freq=1.0), product of:
              0.079879984 = queryWeight, product of:
                1.0664877 = boost
                3.4900522 = idf(docFreq=3585, maxDocs=43254)
                0.021461012 = queryNorm
              0.19086224 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4900522 = idf(docFreq=3585, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
          0.015386116 = weight(abstract_txt:results in 1800) [ClassicSimilarity], result of:
            0.015386116 = score(doc=1800,freq=1.0), product of:
              0.0803684 = queryWeight, product of:
                1.0697432 = boost
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.021461012 = queryNorm
              0.19144484 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
          0.094112344 = weight(abstract_txt:ambiguity in 1800) [ClassicSimilarity], result of:
            0.094112344 = score(doc=1800,freq=2.0), product of:
              0.16933383 = queryWeight, product of:
                1.0979782 = boost
                7.1862087 = idf(docFreq=88, maxDocs=43254)
                0.021461012 = queryNorm
              0.55577993 = fieldWeight in 1800, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1862087 = idf(docFreq=88, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
          0.08141601 = weight(abstract_txt:computation in 1800) [ClassicSimilarity], result of:
            0.08141601 = score(doc=1800,freq=1.0), product of:
              0.1936998 = queryWeight, product of:
                1.17432 = boost
                7.685861 = idf(docFreq=53, maxDocs=43254)
                0.021461012 = queryNorm
              0.42032054 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.685861 = idf(docFreq=53, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
          0.028444272 = weight(abstract_txt:problems in 1800) [ClassicSimilarity], result of:
            0.028444272 = score(doc=1800,freq=1.0), product of:
              0.12105866 = queryWeight, product of:
                1.3129095 = boost
                4.296461 = idf(docFreq=1600, maxDocs=43254)
                0.021461012 = queryNorm
              0.23496272 = fieldWeight in 1800, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.296461 = idf(docFreq=1600, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
          0.086510465 = weight(abstract_txt:statistical in 1800) [ClassicSimilarity], result of:
            0.086510465 = score(doc=1800,freq=2.0), product of:
              0.20169808 = queryWeight, product of:
                1.6946801 = boost
                5.545795 = idf(docFreq=458, maxDocs=43254)
                0.021461012 = queryNorm
              0.4289107 = fieldWeight in 1800, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.545795 = idf(docFreq=458, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
          0.1644932 = weight(abstract_txt:knowledge in 1800) [ClassicSimilarity], result of:
            0.1644932 = score(doc=1800,freq=4.0), product of:
              0.42014706 = queryWeight, product of:
                5.469184 = boost
                3.5795512 = idf(docFreq=3278, maxDocs=43254)
                0.021461012 = queryNorm
              0.3915134 = fieldWeight in 1800, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5795512 = idf(docFreq=3278, maxDocs=43254)
                0.0546875 = fieldNorm(doc=1800)
        0.32 = coord(8/25)
    
  4. Brychcín, T.; Konopík, M.: HPS: High precision stemmer (2015) 0.17
    0.17252551 = sum of:
      0.17252551 = product of:
        0.61616254 = sum of:
          0.06296083 = weight(abstract_txt:reliable in 4151) [ClassicSimilarity], result of:
            0.06296083 = score(doc=4151,freq=1.0), product of:
              0.14929377 = queryWeight, product of:
                1.0309621 = boost
                6.7475915 = idf(docFreq=137, maxDocs=43254)
                0.021461012 = queryNorm
              0.42172447 = fieldWeight in 4151, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7475915 = idf(docFreq=137, maxDocs=43254)
                0.0625 = fieldNorm(doc=4151)
          0.024867719 = weight(abstract_txt:results in 4151) [ClassicSimilarity], result of:
            0.024867719 = score(doc=4151,freq=2.0), product of:
              0.0803684 = queryWeight, product of:
                1.0697432 = boost
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.021461012 = queryNorm
              0.3094216 = fieldWeight in 4151, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.0625 = fieldNorm(doc=4151)
          0.07714815 = weight(abstract_txt:maximum in 4151) [ClassicSimilarity], result of:
            0.07714815 = score(doc=4151,freq=1.0), product of:
              0.17095365 = queryWeight, product of:
                1.1032172 = boost
                7.2204976 = idf(docFreq=85, maxDocs=43254)
                0.021461012 = queryNorm
              0.4512811 = fieldWeight in 4151, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2204976 = idf(docFreq=85, maxDocs=43254)
                0.0625 = fieldNorm(doc=4151)
          0.10341217 = weight(abstract_txt:entropy in 4151) [ClassicSimilarity], result of:
            0.10341217 = score(doc=4151,freq=1.0), product of:
              0.20783043 = queryWeight, product of:
                1.2164 = boost
                7.9612727 = idf(docFreq=40, maxDocs=43254)
                0.021461012 = queryNorm
              0.49757954 = fieldWeight in 4151, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9612727 = idf(docFreq=40, maxDocs=43254)
                0.0625 = fieldNorm(doc=4151)
          0.06991101 = weight(abstract_txt:statistical in 4151) [ClassicSimilarity], result of:
            0.06991101 = score(doc=4151,freq=1.0), product of:
              0.20169808 = queryWeight, product of:
                1.6946801 = boost
                5.545795 = idf(docFreq=458, maxDocs=43254)
                0.021461012 = queryNorm
              0.3466122 = fieldWeight in 4151, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.545795 = idf(docFreq=458, maxDocs=43254)
                0.0625 = fieldNorm(doc=4151)
          0.0564198 = weight(abstract_txt:approach in 4151) [ClassicSimilarity], result of:
            0.0564198 = score(doc=4151,freq=3.0), product of:
              0.13876578 = queryWeight, product of:
                1.7215661 = boost
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.021461012 = queryNorm
              0.40658295 = fieldWeight in 4151, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.0625 = fieldNorm(doc=4151)
          0.22144286 = weight(abstract_txt:classifier in 4151) [ClassicSimilarity], result of:
            0.22144286 = score(doc=4151,freq=2.0), product of:
              0.3452782 = queryWeight, product of:
                2.2172847 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.021461012 = queryNorm
              0.6413462 = fieldWeight in 4151, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.0625 = fieldNorm(doc=4151)
        0.28 = coord(7/25)
    
  5. Prabowo, R.; Jackson, M.; Burden, P.; Knoell, H.-D.: Ontology-based automatic classification for the Web pages : design, implementation and evaluation (2002) 0.15
    0.1546025 = sum of:
      0.1546025 = product of:
        0.6441771 = sum of:
          0.07427326 = weight(abstract_txt:classifying in 5384) [ClassicSimilarity], result of:
            0.07427326 = score(doc=5384,freq=1.0), product of:
              0.14364031 = queryWeight, product of:
                1.0112535 = boost
                6.6185994 = idf(docFreq=156, maxDocs=43254)
                0.021461012 = queryNorm
              0.5170781 = fieldWeight in 5384, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6185994 = idf(docFreq=156, maxDocs=43254)
                0.078125 = fieldNorm(doc=5384)
          0.021780102 = weight(abstract_txt:paper in 5384) [ClassicSimilarity], result of:
            0.021780102 = score(doc=5384,freq=1.0), product of:
              0.079879984 = queryWeight, product of:
                1.0664877 = boost
                3.4900522 = idf(docFreq=3585, maxDocs=43254)
                0.021461012 = queryNorm
              0.27266032 = fieldWeight in 5384, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4900522 = idf(docFreq=3585, maxDocs=43254)
                0.078125 = fieldNorm(doc=5384)
          0.021980165 = weight(abstract_txt:results in 5384) [ClassicSimilarity], result of:
            0.021980165 = score(doc=5384,freq=1.0), product of:
              0.0803684 = queryWeight, product of:
                1.0697432 = boost
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.021461012 = queryNorm
              0.27349263 = fieldWeight in 5384, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5007057 = idf(docFreq=3547, maxDocs=43254)
                0.078125 = fieldNorm(doc=5384)
          0.040717486 = weight(abstract_txt:approach in 5384) [ClassicSimilarity], result of:
            0.040717486 = score(doc=5384,freq=1.0), product of:
              0.13876578 = queryWeight, product of:
                1.7215661 = boost
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.021461012 = queryNorm
              0.29342598 = fieldWeight in 5384, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7558525 = idf(docFreq=2748, maxDocs=43254)
                0.078125 = fieldNorm(doc=5384)
          0.3390138 = weight(abstract_txt:classifier in 5384) [ClassicSimilarity], result of:
            0.3390138 = score(doc=5384,freq=3.0), product of:
              0.3452782 = queryWeight, product of:
                2.2172847 = boost
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.021461012 = queryNorm
              0.9818569 = fieldWeight in 5384, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.2560043 = idf(docFreq=82, maxDocs=43254)
                0.078125 = fieldNorm(doc=5384)
          0.1464123 = weight(abstract_txt:classification in 5384) [ClassicSimilarity], result of:
            0.1464123 = score(doc=5384,freq=5.0), product of:
              0.20963785 = queryWeight, product of:
                2.4433558 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.021461012 = queryNorm
              0.6984058 = fieldWeight in 5384, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.078125 = fieldNorm(doc=5384)
        0.24 = coord(6/25)
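
The score trees above follow the classic TF-IDF pattern: each term score is queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm), and the per-document sum is scaled by coord (matched terms / query terms). The following is a minimal Python sketch of that arithmetic, not the search engine's own code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures printed above; the function names and the DOC_4805_TERMS table are hypothetical, with values copied from the first tree.

```python
import math

# Constants read directly from the explain output above.
MAX_DOCS = 43254
QUERY_NORM = 0.021461012


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assumed to be 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm,
               query_norm=QUERY_NORM, max_docs=MAX_DOCS):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    # tf is assumed to be the square root of the raw term frequency.
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, query_term_count):
    """Sum of the term scores, scaled by coord = matched / query terms."""
    total = sum(term_score(freq, doc_freq, boost, field_norm)
                for freq, doc_freq, boost in terms)
    coord = len(terms) / query_term_count
    return total * coord


# (freq, docFreq, boost) for the eight terms matched in document 4805.
DOC_4805_TERMS = [
    (1.0, 3547, 1.0697432),  # results
    (1.0, 88, 1.0979782),    # ambiguity
    (1.0, 1600, 1.3129095),  # problems
    (1.0, 458, 1.6946801),   # statistical
    (2.0, 2748, 1.7215661),  # approach
    (1.0, 160, 2.014819),    # classify
    (5.0, 82, 2.2172847),    # classifier
    (1.0, 2157, 2.4433558),  # classification
]

if __name__ == "__main__":
    # fieldNorm 0.0625 and 25 query terms are taken from the first tree.
    print(document_score(DOC_4805_TERMS, field_norm=0.0625, query_term_count=25))
```

Run as a script, this should print a value close to the 0.24387994 listed for document 4805. Small differences in the last digits are expected, since the sketch uses double precision while the listed figures appear to be single-precision values.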