Document (#30654)

Author
Haravu, L.J.
Neelameghan, A.
Title
Text mining and data mining in knowledge organization and discovery : the making of knowledge-based products
Source
Cataloging and classification quarterly. 37(2003) nos.1/2, pp.96-114
Year
2003
Abstract
Discusses the importance of knowledge organization in the context of the information overload caused by the vast quantities of data and information accessible on an organization's internal and external networks. Defines the characteristics of a knowledge-based product. Elaborates on the techniques and applications of text mining in developing knowledge products. Presents two approaches, as case studies, to the making of knowledge products: (1) the steps and processes in planning, designing and developing a composite multilingual multimedia CD product, with potential international, inter-cultural end users in view, and (2) the application of natural language processing software in text mining. With text mining software, it is possible to link concept terms from a processed text to a related thesaurus, glossary, schedules of a classification scheme, and facet-structured subject representations. Concludes that the products of text mining and data mining could be made more useful if the features of a faceted scheme for subject classification were incorporated into text mining techniques and products.
Content
Contribution to a special issue "Knowledge organization and classification in international information retrieval"
Footnote
See also: http://catalogingandclassificationquarterly.com/
Theme
Data Mining

Similar documents (author)

  1. Neelameghan, A.: Interdisciplinary research and classification problems : a case study (1974) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:neelameghan in 1817) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 1817, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=1817)
    
  2. Neelameghan, A.: Classification, theory of (1971) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:neelameghan in 1988) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 1988, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=1988)
    
  3. Neelameghan, A.: Design of scheme for classification (1969) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:neelameghan in 1990) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 1990, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=1990)
    
  4. Neelameghan, A.: Use of computer for the synthesis of class number : a case study with a freely faceted version of Colon Classification (1968) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:neelameghan in 1991) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 1991, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=1991)
    
  5. Neelameghan, A.: Application of Ranganathan's general theory of knowledge classification in designing specialized databases (1992) 5.11
    5.106579 = sum of:
      5.106579 = weight(author_txt:neelameghan in 2963) [ClassicSimilarity], result of:
        5.106579 = fieldWeight in 2963, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.1705265 = idf(docFreq=33, maxDocs=44218)
          0.625 = fieldNorm(doc=2963)
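
The author-match trees above all reduce to one product: fieldWeight = tf * idf * fieldNorm. The printed numbers are consistent with Lucene's classic TF-IDF definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The following is a minimal sketch that reproduces the 5.106579 figure under that assumption. The function names are hypothetical and not part of Lucene, and fieldNorm is taken as given from the output because it is a stored, lossily encoded value.

```python
import math

def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency as it appears in the explain output.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, the product shown in each tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the author_txt:neelameghan entries above.
author_score = field_weight(freq=1.0, doc_freq=33, max_docs=44218, field_norm=0.625)
print(f"{author_score:.4f}")
```

Because every author match has freq=1.0 and the same fieldNorm, all five entries receive the same score, which is why the list shows 5.11 throughout.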
    

Similar documents (content)

  1. Kantardzic, M.: Data mining : concepts, models, methods, and algorithms (2003) 0.31
    0.31400424 = sum of:
      0.31400424 = product of:
        0.9812633 = sum of:
          0.053970926 = weight(abstract_txt:processed in 2291) [ClassicSimilarity], result of:
            0.053970926 = score(doc=2291,freq=1.0), product of:
              0.119421564 = queryWeight, product of:
                1.134119 = boost
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.014562202 = queryNorm
              0.4519362 = fieldWeight in 2291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.033372793 = weight(abstract_txt:software in 2291) [ClassicSimilarity], result of:
            0.033372793 = score(doc=2291,freq=2.0), product of:
              0.086677134 = queryWeight, product of:
                1.366421 = boost
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.014562202 = queryNorm
              0.3850242 = fieldWeight in 2291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.03752882 = weight(abstract_txt:techniques in 2291) [ClassicSimilarity], result of:
            0.03752882 = score(doc=2291,freq=2.0), product of:
              0.09373162 = queryWeight, product of:
                1.4209386 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.014562202 = queryNorm
              0.40038592 = fieldWeight in 2291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.05734235 = weight(abstract_txt:data in 2291) [ClassicSimilarity], result of:
            0.05734235 = score(doc=2291,freq=13.0), product of:
              0.076269776 = queryWeight, product of:
                1.5698354 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014562202 = queryNorm
              0.7518358 = fieldWeight in 2291, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.035789836 = weight(abstract_txt:making in 2291) [ClassicSimilarity], result of:
            0.035789836 = score(doc=2291,freq=1.0), product of:
              0.11441755 = queryWeight, product of:
                1.5699238 = boost
                5.0048037 = idf(docFreq=805, maxDocs=44218)
                0.014562202 = queryNorm
              0.31280023 = fieldWeight in 2291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0048037 = idf(docFreq=805, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.03840856 = weight(abstract_txt:knowledge in 2291) [ClassicSimilarity], result of:
            0.03840856 = score(doc=2291,freq=1.0), product of:
              0.17297311 = queryWeight, product of:
                3.3433526 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.014562202 = queryNorm
              0.2220493 = fieldWeight in 2291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.06607804 = weight(abstract_txt:text in 2291) [ClassicSimilarity], result of:
            0.06607804 = score(doc=2291,freq=1.0), product of:
              0.26144496 = queryWeight, product of:
                4.4397283 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014562202 = queryNorm
              0.25274166 = fieldWeight in 2291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.65877193 = weight(abstract_txt:mining in 2291) [ClassicSimilarity], result of:
            0.65877193 = score(doc=2291,freq=6.0), product of:
              0.69680697 = queryWeight, product of:
                7.748515 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.014562202 = queryNorm
              0.94541526 = fieldWeight in 2291, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
        0.32 = coord(8/25)
    
  2. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.24
    0.23636308 = sum of:
      0.23636308 = product of:
        1.1818154 = sum of:
          0.03752882 = weight(abstract_txt:techniques in 354) [ClassicSimilarity], result of:
            0.03752882 = score(doc=354,freq=2.0), product of:
              0.09373162 = queryWeight, product of:
                1.4209386 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.014562202 = queryNorm
              0.40038592 = fieldWeight in 354, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
          0.04207778 = weight(abstract_txt:data in 354) [ClassicSimilarity], result of:
            0.04207778 = score(doc=354,freq=7.0), product of:
              0.076269776 = queryWeight, product of:
                1.5698354 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014562202 = queryNorm
              0.55169666 = fieldWeight in 354, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
          0.03840856 = weight(abstract_txt:knowledge in 354) [ClassicSimilarity], result of:
            0.03840856 = score(doc=354,freq=1.0), product of:
              0.17297311 = queryWeight, product of:
                3.3433526 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.014562202 = queryNorm
              0.2220493 = fieldWeight in 354, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
          0.13215607 = weight(abstract_txt:text in 354) [ClassicSimilarity], result of:
            0.13215607 = score(doc=354,freq=4.0), product of:
              0.26144496 = queryWeight, product of:
                4.4397283 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014562202 = queryNorm
              0.5054833 = fieldWeight in 354, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
          0.9316442 = weight(abstract_txt:mining in 354) [ClassicSimilarity], result of:
            0.9316442 = score(doc=354,freq=12.0), product of:
              0.69680697 = queryWeight, product of:
                7.748515 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.014562202 = queryNorm
              1.3370191 = fieldWeight in 354, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
        0.2 = coord(5/25)
    
  3. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.22
    0.22260392 = sum of:
      0.22260392 = product of:
        1.1130196 = sum of:
          0.1034493 = weight(abstract_txt:quantities in 2899) [ClassicSimilarity], result of:
            0.1034493 = score(doc=2899,freq=1.0), product of:
              0.14062646 = queryWeight, product of:
                1.230696 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.014562202 = queryNorm
              0.7356318 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.039805323 = weight(abstract_txt:techniques in 2899) [ClassicSimilarity], result of:
            0.039805323 = score(doc=2899,freq=1.0), product of:
              0.09373162 = queryWeight, product of:
                1.4209386 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.014562202 = queryNorm
              0.42467338 = fieldWeight in 2899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.04771172 = weight(abstract_txt:data in 2899) [ClassicSimilarity], result of:
            0.04771172 = score(doc=2899,freq=4.0), product of:
              0.076269776 = queryWeight, product of:
                1.5698354 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.014562202 = queryNorm
              0.62556523 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.11522567 = weight(abstract_txt:knowledge in 2899) [ClassicSimilarity], result of:
            0.11522567 = score(doc=2899,freq=4.0), product of:
              0.17297311 = queryWeight, product of:
                3.3433526 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.014562202 = queryNorm
              0.6661479 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.80682755 = weight(abstract_txt:mining in 2899) [ClassicSimilarity], result of:
            0.80682755 = score(doc=2899,freq=4.0), product of:
              0.69680697 = queryWeight, product of:
                7.748515 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.014562202 = queryNorm
              1.1578925 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
        0.2 = coord(5/25)
    
  4. Srinivasan, P.: Text mining in biomedicine : challenges and opportunities (2006) 0.21
    0.21270326 = sum of:
      0.21270326 = product of:
        1.0635163 = sum of:
          0.011562027 = weight(abstract_txt:based in 1497) [ClassicSimilarity], result of:
            0.011562027 = score(doc=1497,freq=1.0), product of:
              0.046423245 = queryWeight, product of:
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.014562202 = queryNorm
              0.24905685 = fieldWeight in 1497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.078125 = fieldNorm(doc=1497)
          0.053177927 = weight(abstract_txt:vast in 1497) [ClassicSimilarity], result of:
            0.053177927 = score(doc=1497,freq=1.0), product of:
              0.10190381 = queryWeight, product of:
                1.0476409 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.014562202 = queryNorm
              0.5218443 = fieldWeight in 1497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.078125 = fieldNorm(doc=1497)
          0.044737294 = weight(abstract_txt:making in 1497) [ClassicSimilarity], result of:
            0.044737294 = score(doc=1497,freq=1.0), product of:
              0.11441755 = queryWeight, product of:
                1.5699238 = boost
                5.0048037 = idf(docFreq=805, maxDocs=44218)
                0.014562202 = queryNorm
              0.39100027 = fieldWeight in 1497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0048037 = idf(docFreq=805, maxDocs=44218)
                0.078125 = fieldNorm(doc=1497)
          0.20232183 = weight(abstract_txt:text in 1497) [ClassicSimilarity], result of:
            0.20232183 = score(doc=1497,freq=6.0), product of:
              0.26144496 = queryWeight, product of:
                4.4397283 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014562202 = queryNorm
              0.77386016 = fieldWeight in 1497, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1497)
          0.75171715 = weight(abstract_txt:mining in 1497) [ClassicSimilarity], result of:
            0.75171715 = score(doc=1497,freq=5.0), product of:
              0.69680697 = queryWeight, product of:
                7.748515 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.014562202 = queryNorm
              1.0788026 = fieldWeight in 1497, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.078125 = fieldNorm(doc=1497)
        0.2 = coord(5/25)
    
  5. Joo, S.; Choi, I.; Choi, N.: Topic analysis of the research domain in knowledge organization : a Latent Dirichlet Allocation approach (2018) 0.21
    0.21167327 = sum of:
      0.21167327 = product of:
        0.75597596 = sum of:
          0.009249622 = weight(abstract_txt:based in 4304) [ClassicSimilarity], result of:
            0.009249622 = score(doc=4304,freq=1.0), product of:
              0.046423245 = queryWeight, product of:
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.014562202 = queryNorm
              0.19924548 = fieldWeight in 4304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=4304)
          0.018163297 = weight(abstract_txt:classification in 4304) [ClassicSimilarity], result of:
            0.018163297 = score(doc=4304,freq=1.0), product of:
              0.07279742 = queryWeight, product of:
                1.2522477 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.014562202 = queryNorm
              0.2495047 = fieldWeight in 4304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=4304)
          0.0470416 = weight(abstract_txt:scheme in 4304) [ClassicSimilarity], result of:
            0.0470416 = score(doc=4304,freq=1.0), product of:
              0.13729063 = queryWeight, product of:
                1.7197 = boost
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.014562202 = queryNorm
              0.34264246 = fieldWeight in 4304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.0625 = fieldNorm(doc=4304)
          0.09264738 = weight(abstract_txt:organization in 4304) [ClassicSimilarity], result of:
            0.09264738 = score(doc=4304,freq=6.0), product of:
              0.13589025 = queryWeight, product of:
                2.0954244 = boost
                4.4533744 = idf(docFreq=1398, maxDocs=44218)
                0.014562202 = queryNorm
              0.68178093 = fieldWeight in 4304, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.4533744 = idf(docFreq=1398, maxDocs=44218)
                0.0625 = fieldNorm(doc=4304)
          0.094081365 = weight(abstract_txt:knowledge in 4304) [ClassicSimilarity], result of:
            0.094081365 = score(doc=4304,freq=6.0), product of:
              0.17297311 = queryWeight, product of:
                3.3433526 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.014562202 = queryNorm
              0.54390746 = fieldWeight in 4304, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.0625 = fieldNorm(doc=4304)
          0.11445051 = weight(abstract_txt:text in 4304) [ClassicSimilarity], result of:
            0.11445051 = score(doc=4304,freq=3.0), product of:
              0.26144496 = queryWeight, product of:
                4.4397283 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014562202 = queryNorm
              0.4377614 = fieldWeight in 4304, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4304)
          0.38034216 = weight(abstract_txt:mining in 4304) [ClassicSimilarity], result of:
            0.38034216 = score(doc=4304,freq=2.0), product of:
              0.69680697 = queryWeight, product of:
                7.748515 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.014562202 = queryNorm
              0.54583573 = fieldWeight in 4304, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=4304)
        0.28 = coord(7/25)
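
The content-match trees above combine three steps: each matching term scores queryWeight * fieldWeight, the term scores are summed, and the sum is multiplied by coord (matching terms divided by the 25 query terms). The sketch below recomputes the "mining" term for document 2291 and then its final 0.31400424 score. It assumes the classic TF-IDF formulas that the printed values are consistent with. The helper names are hypothetical, queryNorm and fieldNorm are copied from the output rather than derived, and a term printed without a boost line (such as "based") is treated as having boost 1.0.

```python
import math

QUERY_NORM = 0.014562202  # copied from the explain output
MAX_DOCS = 44218          # copied from the explain output

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Inverse document frequency as it appears in the explain output.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq: float, doc_freq: int, field_norm: float,
               boost: float = 1.0, query_norm: float = QUERY_NORM) -> float:
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm             # query side
    field_weight = math.sqrt(freq) * term_idf * field_norm   # document side
    return query_weight * field_weight

def document_score(term_scores: list[float], total_query_terms: int) -> float:
    # coord rewards documents that match more of the query's terms.
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord

# "mining" in document 2291: freq=6, docFreq=249, boost=7.748515, fieldNorm=0.0625
mining = term_score(freq=6.0, doc_freq=249, field_norm=0.0625, boost=7.748515)

# The eight per-term scores listed for document 2291 (Kantardzic 2003).
scores_2291 = [0.053970926, 0.033372793, 0.03752882, 0.05734235,
               0.035789836, 0.03840856, 0.06607804, 0.65877193]
total_2291 = document_score(scores_2291, total_query_terms=25)
print(f"{mining:.5f} {total_2291:.5f}")
```

The coord factor explains why document 2291 ranks first even though its summed term score (0.98) is lower than that of documents 354 and 2899: it matches 8 of the 25 query terms, while they match 5.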