Document (#33073)

Author
Miyamoto, S.
Title
Information clustering based on fuzzy multisets
Source
Information processing and management. 39(2003) no.2, pp.195-213
Year
2003
Abstract
A fuzzy multiset model for information clustering is proposed, with application to information retrieval on the World Wide Web. Noting that a search engine retrieves multiple occurrences of the same subjects with possibly different degrees of relevance, we observe that fuzzy multisets provide an appropriate model of information retrieval on the WWW. Information clustering, which means both term clustering and document clustering, is considered. Three methods are proposed: hard c-means, fuzzy c-means, and an agglomerative method using cluster centers. Two distances between fuzzy multisets are defined, together with algorithms for calculating cluster centers. Theoretical properties of the clustering algorithms are studied. Illustrative examples are given to show how the algorithms work.
Theme
Automatic classification

Similar documents (content)

  1. Cathey, R.J.; Jensen, E.C.; Beitzel, S.M.; Frieder, O.; Grossman, D.: Exploiting parallelism to support scalable hierarchical clustering (2007) 0.27
    0.2674167 = sum of:
      0.2674167 = product of:
        0.9550596 = sum of:
          0.011753379 = weight(abstract_txt:retrieval in 2449) [ClassicSimilarity], result of:
            0.011753379 = score(doc=2449,freq=1.0), product of:
              0.05419582 = queryWeight, product of:
                1.258179 = boost
                3.4699 = idf(docFreq=3658, maxDocs=43254)
                0.012413848 = queryNorm
              0.21686874 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4699 = idf(docFreq=3658, maxDocs=43254)
                0.0625 = fieldNorm(doc=2449)
          0.027858514 = weight(abstract_txt:proposed in 2449) [ClassicSimilarity], result of:
            0.027858514 = score(doc=2449,freq=1.0), product of:
              0.096345015 = queryWeight, product of:
                1.6775448 = boost
                4.6264586 = idf(docFreq=1150, maxDocs=43254)
                0.012413848 = queryNorm
              0.28915367 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6264586 = idf(docFreq=1150, maxDocs=43254)
                0.0625 = fieldNorm(doc=2449)
          0.16935374 = weight(abstract_txt:agglomerative in 2449) [ClassicSimilarity], result of:
            0.16935374 = score(doc=2449,freq=2.0), product of:
              0.20216244 = queryWeight, product of:
                1.7182833 = boost
                9.47762 = idf(docFreq=8, maxDocs=43254)
                0.012413848 = queryNorm
              0.83771116 = fieldWeight in 2449, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.47762 = idf(docFreq=8, maxDocs=43254)
                0.0625 = fieldNorm(doc=2449)
          0.079088084 = weight(abstract_txt:cluster in 2449) [ClassicSimilarity], result of:
            0.079088084 = score(doc=2449,freq=1.0), product of:
              0.19316629 = queryWeight, product of:
                2.3753366 = boost
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.012413848 = queryNorm
              0.40943006 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.0625 = fieldNorm(doc=2449)
          0.05320179 = weight(abstract_txt:means in 2449) [ClassicSimilarity], result of:
            0.05320179 = score(doc=2449,freq=1.0), product of:
              0.16976124 = queryWeight, product of:
                2.7272468 = boost
                5.01427 = idf(docFreq=780, maxDocs=43254)
                0.012413848 = queryNorm
              0.31339186 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.01427 = idf(docFreq=780, maxDocs=43254)
                0.0625 = fieldNorm(doc=2449)
          0.11333136 = weight(abstract_txt:algorithms in 2449) [ClassicSimilarity], result of:
            0.11333136 = score(doc=2449,freq=2.0), product of:
              0.22307168 = queryWeight, product of:
                3.1262765 = boost
                5.747919 = idf(docFreq=374, maxDocs=43254)
                0.012413848 = queryNorm
              0.5080491 = fieldWeight in 2449, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.747919 = idf(docFreq=374, maxDocs=43254)
                0.0625 = fieldNorm(doc=2449)
          0.5004727 = weight(abstract_txt:clustering in 2449) [ClassicSimilarity], result of:
            0.5004727 = score(doc=2449,freq=6.0), product of:
              0.52452666 = queryWeight, product of:
                6.7795978 = boost
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.012413848 = queryNorm
              0.9541417 = fieldWeight in 2449, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.0625 = fieldNorm(doc=2449)
        0.28 = coord(7/25)
    
  2. Bose, I.; Chen, X.: A method for extension of generative topographic mapping for fuzzy clustering (2009) 0.27
    0.2669619 = sum of:
      0.2669619 = product of:
        1.1123413 = sum of:
          0.049247358 = weight(abstract_txt:proposed in 4712) [ClassicSimilarity], result of:
            0.049247358 = score(doc=4712,freq=2.0), product of:
              0.096345015 = queryWeight, product of:
                1.6775448 = boost
                4.6264586 = idf(docFreq=1150, maxDocs=43254)
                0.012413848 = queryNorm
              0.51115626 = fieldWeight in 4712, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6264586 = idf(docFreq=1150, maxDocs=43254)
                0.078125 = fieldNorm(doc=4712)
          0.09886011 = weight(abstract_txt:cluster in 4712) [ClassicSimilarity], result of:
            0.09886011 = score(doc=4712,freq=1.0), product of:
              0.19316629 = queryWeight, product of:
                2.3753366 = boost
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.012413848 = queryNorm
              0.5117876 = fieldWeight in 4712, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.078125 = fieldNorm(doc=4712)
          0.06650224 = weight(abstract_txt:means in 4712) [ClassicSimilarity], result of:
            0.06650224 = score(doc=4712,freq=1.0), product of:
              0.16976124 = queryWeight, product of:
                2.7272468 = boost
                5.01427 = idf(docFreq=780, maxDocs=43254)
                0.012413848 = queryNorm
              0.39173985 = fieldWeight in 4712, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.01427 = idf(docFreq=780, maxDocs=43254)
                0.078125 = fieldNorm(doc=4712)
          0.14166419 = weight(abstract_txt:algorithms in 4712) [ClassicSimilarity], result of:
            0.14166419 = score(doc=4712,freq=2.0), product of:
              0.22307168 = queryWeight, product of:
                3.1262765 = boost
                5.747919 = idf(docFreq=374, maxDocs=43254)
                0.012413848 = queryNorm
              0.6350613 = fieldWeight in 4712, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.747919 = idf(docFreq=374, maxDocs=43254)
                0.078125 = fieldNorm(doc=4712)
          0.39488232 = weight(abstract_txt:fuzzy in 4712) [ClassicSimilarity], result of:
            0.39488232 = score(doc=4712,freq=2.0), product of:
              0.5238405 = queryWeight, product of:
                6.1848483 = boost
                6.822815 = idf(docFreq=127, maxDocs=43254)
                0.012413848 = queryNorm
              0.7538217 = fieldWeight in 4712, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.822815 = idf(docFreq=127, maxDocs=43254)
                0.078125 = fieldNorm(doc=4712)
          0.36118507 = weight(abstract_txt:clustering in 4712) [ClassicSimilarity], result of:
            0.36118507 = score(doc=4712,freq=2.0), product of:
              0.52452666 = queryWeight, product of:
                6.7795978 = boost
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.012413848 = queryNorm
              0.68859243 = fieldWeight in 4712, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.078125 = fieldNorm(doc=4712)
        0.24 = coord(6/25)
    
  3. Pop, H.P.: A fuzzy classification of the chemical elements (1996) 0.18
    0.17932129 = sum of:
      0.17932129 = product of:
        1.120758 = sum of:
          0.07196364 = weight(abstract_txt:properties in 6185) [ClassicSimilarity], result of:
            0.07196364 = score(doc=6185,freq=2.0), product of:
              0.078684166 = queryWeight, product of:
                1.0719837 = boost
                5.9127936 = idf(docFreq=317, maxDocs=43254)
                0.012413848 = queryNorm
              0.9145886 = fieldWeight in 6185, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9127936 = idf(docFreq=317, maxDocs=43254)
                0.109375 = fieldNorm(doc=6185)
          0.13840415 = weight(abstract_txt:cluster in 6185) [ClassicSimilarity], result of:
            0.13840415 = score(doc=6185,freq=1.0), product of:
              0.19316629 = queryWeight, product of:
                2.3753366 = boost
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.012413848 = queryNorm
              0.7165026 = fieldWeight in 6185, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.109375 = fieldNorm(doc=6185)
          0.5528352 = weight(abstract_txt:fuzzy in 6185) [ClassicSimilarity], result of:
            0.5528352 = score(doc=6185,freq=2.0), product of:
              0.5238405 = queryWeight, product of:
                6.1848483 = boost
                6.822815 = idf(docFreq=127, maxDocs=43254)
                0.012413848 = queryNorm
              1.0553503 = fieldWeight in 6185, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.822815 = idf(docFreq=127, maxDocs=43254)
                0.109375 = fieldNorm(doc=6185)
          0.357555 = weight(abstract_txt:clustering in 6185) [ClassicSimilarity], result of:
            0.357555 = score(doc=6185,freq=1.0), product of:
              0.52452666 = queryWeight, product of:
                6.7795978 = boost
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.012413848 = queryNorm
              0.68167174 = fieldWeight in 6185, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.109375 = fieldNorm(doc=6185)
        0.16 = coord(4/25)
    
  4. Miyamoto, S.: Application of rough sets to information retrieval (1998) 0.17
    0.17499426 = sum of:
      0.17499426 = product of:
        0.8749713 = sum of:
          0.030536175 = weight(abstract_txt:retrieval in 2560) [ClassicSimilarity], result of:
            0.030536175 = score(doc=2560,freq=3.0), product of:
              0.05419582 = queryWeight, product of:
                1.258179 = boost
                3.4699 = idf(docFreq=3658, maxDocs=43254)
                0.012413848 = queryNorm
              0.5634415 = fieldWeight in 2560, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4699 = idf(docFreq=3658, maxDocs=43254)
                0.09375 = fieldNorm(doc=2560)
          0.027163418 = weight(abstract_txt:model in 2560) [ClassicSimilarity], result of:
            0.027163418 = score(doc=2560,freq=1.0), product of:
              0.07229685 = queryWeight, product of:
                1.4531794 = boost
                4.0076866 = idf(docFreq=2136, maxDocs=43254)
                0.012413848 = queryNorm
              0.37572062 = fieldWeight in 2560, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0076866 = idf(docFreq=2136, maxDocs=43254)
                0.09375 = fieldNorm(doc=2560)
          0.12580739 = weight(abstract_txt:illustrative in 2560) [ClassicSimilarity], result of:
            0.12580739 = score(doc=2560,freq=1.0), product of:
              0.15943752 = queryWeight, product of:
                1.5259482 = boost
                8.416748 = idf(docFreq=25, maxDocs=43254)
                0.012413848 = queryNorm
              0.7890701 = fieldWeight in 2560, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.416748 = idf(docFreq=25, maxDocs=43254)
                0.09375 = fieldNorm(doc=2560)
          0.021326741 = weight(abstract_txt:information in 2560) [ClassicSimilarity], result of:
            0.021326741 = score(doc=2560,freq=2.0), product of:
              0.066280045 = queryWeight, product of:
                2.199991 = boost
                2.42692 = idf(docFreq=10382, maxDocs=43254)
                0.012413848 = queryNorm
              0.32176715 = fieldWeight in 2560, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.42692 = idf(docFreq=10382, maxDocs=43254)
                0.09375 = fieldNorm(doc=2560)
          0.6701375 = weight(abstract_txt:fuzzy in 2560) [ClassicSimilarity], result of:
            0.6701375 = score(doc=2560,freq=4.0), product of:
              0.5238405 = queryWeight, product of:
                6.1848483 = boost
                6.822815 = idf(docFreq=127, maxDocs=43254)
                0.012413848 = queryNorm
              1.2792778 = fieldWeight in 2560, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.822815 = idf(docFreq=127, maxDocs=43254)
                0.09375 = fieldNorm(doc=2560)
        0.2 = coord(5/25)
    
  5. Mather, L.A.: A linear algebra measure of cluster quality (2000) 0.17
    0.1706965 = sum of:
      0.1706965 = product of:
        0.7112354 = sum of:
          0.016621789 = weight(abstract_txt:retrieval in 6768) [ClassicSimilarity], result of:
            0.016621789 = score(doc=6768,freq=2.0), product of:
              0.05419582 = queryWeight, product of:
                1.258179 = boost
                3.4699 = idf(docFreq=3658, maxDocs=43254)
                0.012413848 = queryNorm
              0.3066987 = fieldWeight in 6768, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4699 = idf(docFreq=3658, maxDocs=43254)
                0.0625 = fieldNorm(doc=6768)
          0.025609914 = weight(abstract_txt:model in 6768) [ClassicSimilarity], result of:
            0.025609914 = score(doc=6768,freq=2.0), product of:
              0.07229685 = queryWeight, product of:
                1.4531794 = boost
                4.0076866 = idf(docFreq=2136, maxDocs=43254)
                0.012413848 = queryNorm
              0.3542328 = fieldWeight in 6768, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0076866 = idf(docFreq=2136, maxDocs=43254)
                0.0625 = fieldNorm(doc=6768)
          0.010053523 = weight(abstract_txt:information in 6768) [ClassicSimilarity], result of:
            0.010053523 = score(doc=6768,freq=1.0), product of:
              0.066280045 = queryWeight, product of:
                2.199991 = boost
                2.42692 = idf(docFreq=10382, maxDocs=43254)
                0.012413848 = queryNorm
              0.1516825 = fieldWeight in 6768, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.42692 = idf(docFreq=10382, maxDocs=43254)
                0.0625 = fieldNorm(doc=6768)
          0.13698457 = weight(abstract_txt:cluster in 6768) [ClassicSimilarity], result of:
            0.13698457 = score(doc=6768,freq=3.0), product of:
              0.19316629 = queryWeight, product of:
                2.3753366 = boost
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.012413848 = queryNorm
              0.70915365 = fieldWeight in 6768, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.550881 = idf(docFreq=167, maxDocs=43254)
                0.0625 = fieldNorm(doc=6768)
          0.11333136 = weight(abstract_txt:algorithms in 6768) [ClassicSimilarity], result of:
            0.11333136 = score(doc=6768,freq=2.0), product of:
              0.22307168 = queryWeight, product of:
                3.1262765 = boost
                5.747919 = idf(docFreq=374, maxDocs=43254)
                0.012413848 = queryNorm
              0.5080491 = fieldWeight in 6768, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.747919 = idf(docFreq=374, maxDocs=43254)
                0.0625 = fieldNorm(doc=6768)
          0.40863428 = weight(abstract_txt:clustering in 6768) [ClassicSimilarity], result of:
            0.40863428 = score(doc=6768,freq=4.0), product of:
              0.52452666 = queryWeight, product of:
                6.7795978 = boost
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.012413848 = queryNorm
              0.7790534 = fieldWeight in 6768, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.232427 = idf(docFreq=230, maxDocs=43254)
                0.0625 = fieldNorm(doc=6768)
        0.24 = coord(6/25)
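
The score breakdowns above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal illustration, not the search engine's own code. The function names (`idf`, `term_score`, `document_score`) are hypothetical. The formulas are assumptions that appear consistent with the listed values: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord (matched clauses over total clauses). The boost, freq, and docFreq values are copied from entry 1 (doc 2449).

```python
import math

MAX_DOCS = 43254          # maxDocs, as listed in every idf line
QUERY_NORM = 0.012413848  # queryNorm, identical for all terms of this query
FIELD_NORM = 0.0625       # fieldNorm(doc=2449)


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency; assumed form 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total_clauses):
    """Sum of the term scores, scaled by coord = matched / total_clauses."""
    raw = sum(term_score(boost, freq, doc_freq, field_norm)
              for _term, boost, freq, doc_freq in terms)
    return raw * (matched / total_clauses)


# (term, boost, freq, docFreq) copied from entry 1 above
ENTRY_1_TERMS = [
    ("retrieval",     1.258179,  1.0, 3658),
    ("proposed",      1.6775448, 1.0, 1150),
    ("agglomerative", 1.7182833, 2.0, 8),
    ("cluster",       2.3753366, 1.0, 167),
    ("means",         2.7272468, 1.0, 780),
    ("algorithms",    3.1262765, 2.0, 374),
    ("clustering",    6.7795978, 6.0, 230),
]

entry_1_score = document_score(ENTRY_1_TERMS, FIELD_NORM,
                               matched=7, total_clauses=25)
print(f"entry 1 score: {entry_1_score:.4f}")
```

Under these assumptions the result should agree, up to floating-point rounding, with the 0.2674167 listed for entry 1. The other entries can be checked the same way by substituting their own term lists, fieldNorm, and coord values.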