Search (2 results, page 1 of 1)

  • × author_ss:"Zhang, X."
  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Jiang, Y.; Bai, W.; Zhang, X.; Hu, J.: Wikipedia-based information content and semantic similarity computation (2017) 0.00
    0.0033881254 = product of:
      0.023716876 = sum of:
        0.008737902 = weight(_text_:information in 2877) [ClassicSimilarity], result of:
          0.008737902 = score(doc=2877,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16796975 = fieldWeight in 2877, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2877)
        0.014978974 = weight(_text_:retrieval in 2877) [ClassicSimilarity], result of:
          0.014978974 = score(doc=2877,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.16710453 = fieldWeight in 2877, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2877)
      0.14285715 = coord(2/14)
    
    Abstract
    The Information Content (IC) of a concept is a fundamental dimension in computational linguistics. It enables a better understanding of concept's semantics. In the past, several approaches to compute IC of a concept have been proposed. However, there are some limitations such as the facts of relying on corpora availability, manual tagging, or predefined ontologies and fitting non-dynamic domains in the existing methods. Wikipedia provides a very large domain-independent encyclopedic repository and semantic network for computing IC of concepts with more coverage than usual ontologies. In this paper, we propose some novel methods to IC computation of a concept to solve the shortcomings of existing approaches. The presented methods focus on the IC computation of a concept (i.e., Wikipedia category) drawn from the Wikipedia category structure. We propose several new IC-based measures to compute the semantic similarity between concepts. The evaluation, based on several widely used benchmarks and a benchmark developed in ourselves, sustains the intuitions with respect to human judgments. Overall, some methods proposed in this paper have a good human correlation and constitute some effective ways of determining IC values for concepts and semantic similarity between concepts.
    Source
    Information processing and management. 53(2017) no.1, S.248-265
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  2. Jiang, Y.; Zhang, X.; Tang, Y.; Nie, R.: Feature-based approaches to semantic similarity assessment of concepts using Wikipedia (2015) 0.00
    0.0028605436 = product of:
      0.020023804 = sum of:
        0.0050448296 = weight(_text_:information in 2682) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=2682,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 2682, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2682)
        0.014978974 = weight(_text_:retrieval in 2682) [ClassicSimilarity], result of:
          0.014978974 = score(doc=2682,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.16710453 = fieldWeight in 2682, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2682)
      0.14285715 = coord(2/14)
    
    Source
    Information processing and management. 51(2015) no.3, S.215-234
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval