Search (1 result, page 1 of 1)

  • author_ss:"Kwong, C.-P."
  • author_ss:"Li, D."
  • year_i:[2010 TO 2020}
  1. Li, D.; Kwong, C.-P.: Understanding latent semantic indexing : a topological structure analysis using Q-analysis (2010) 0.01
    0.005565266 = product of:
      0.038956862 = sum of:
        0.013536699 = weight(_text_:information in 3427) [ClassicSimilarity], result of:
          0.013536699 = score(doc=3427,freq=10.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.2602176 = fieldWeight in 3427, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3427)
        0.025420163 = weight(_text_:retrieval in 3427) [ClassicSimilarity], result of:
          0.025420163 = score(doc=3427,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 3427, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3427)
      0.14285715 = coord(2/14)
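
    The score breakdown above can be re-derived by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of any library, and queryNorm, fieldNorm, and coord are copied from the explanation rather than recomputed.

```python
import math

# Values copied from the score explanation above (not recomputed here).
QUERY_NORM = 0.029633347
FIELD_NORM = 0.046875
MAX_DOCS = 44218


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """Score of one query term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                  # term-frequency factor
    idf = classic_idf(doc_freq, MAX_DOCS)
    query_weight = idf * QUERY_NORM       # idf * queryNorm
    field_weight = tf * idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


# Hypothetical helper names; the numbers come from the two matching terms.
information = term_score(freq=10.0, doc_freq=20772)
retrieval = term_score(freq=4.0, doc_freq=5836)

coord = 2 / 14  # 2 of the 14 query clauses matched this document
total = (information + retrieval) * coord

print(f"information={information:.9f} retrieval={retrieval:.9f} total={total:.9f}")
```

    Lucene computes these values in 32-bit floats, so the last digits of a double-precision recomputation may differ slightly from the figures shown in the explanation.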
    
    Abstract
    The method of latent semantic indexing (LSI) is well known for tackling the synonymy and polysemy problems in information retrieval; however, its performance can differ greatly across datasets, and it is not yet fully understood which characteristics of a dataset contribute to this difference, or why. In this article, we propose that the mathematical structure of simplexes can be attached to a term-document matrix in the vector space model (VSM) for information retrieval. The Q-analysis devised by R.H. Atkin (1974) may then be applied to analyze the topological structure of the simplexes and their corresponding dataset. Experimental results of this analysis reveal that there is a correlation between the effectiveness of LSI and the topological structure of the dataset. By using the information obtained from the topological analysis, we develop a new method to explore the semantic information in a dataset. Experimental results show that our method can enhance the performance of VSM for datasets over which LSI is not effective.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.3, pp. 592-608