Search (1 result, page 1 of 1)

Jiang, Y.; Bai, W.; Zhang, X.; Hu, J.: Wikipedia-based information content and semantic similarity computation (2017) 0.00
```
0.0024032309 = product of:
  0.0048064617 = sum of:
    0.0048064617 = product of:
      0.0096129235 = sum of:
        0.0096129235 = weight(_text_:a in 2877) [ClassicSimilarity], result of:
          0.0096129235 = score(doc=2877,freq=20.0), product of:
            0.04772363 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.041389145 = queryNorm
            0.20142901 = fieldWeight in 2877, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2877)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
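The arithmetic in the score explanation above can be retraced with a short sketch. It assumes the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and takes queryNorm, fieldNorm, and the two coord factors directly from the explanation as given inputs. The function names are hypothetical helpers for illustration, not part of any Lucene API.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as in ClassicSimilarity: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor as in ClassicSimilarity: sqrt(freq)."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Rebuild the nested products of the explanation tree (hypothetical helper)."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # queryWeight = idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    score = query_weight * field_weight
    for coord in coords:                                # each coord(1/2) halves the score
        score *= coord
    return score


# Values copied from the explanation for doc 2877; queryNorm and fieldNorm
# are treated as given inputs rather than recomputed.
score = explain_score(
    freq=20.0,
    doc_freq=37942,
    max_docs=44218,
    query_norm=0.041389145,
    field_norm=0.0390625,
    coords=(0.5, 0.5),
)
print(f"{score:.10f}")
```

The result should agree with the reported 0.0024032309 up to single-precision rounding, since Lucene computes these factors in 32-bit floats while the sketch uses Python doubles. The low idf (about 1.15) reflects that the matched term "a" occurs in 37942 of 44218 documents, which is why the record is displayed with a score of 0.00.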
Abstract

The Information Content (IC) of a concept is a fundamental dimension in computational linguistics. It enables a better understanding of a concept's semantics. Several approaches to computing the IC of a concept have been proposed in the past. However, the existing methods have limitations: they rely on corpus availability, manual tagging, or predefined ontologies, and they fit only non-dynamic domains. Wikipedia provides a very large, domain-independent encyclopedic repository and semantic network for computing the IC of concepts, with more coverage than typical ontologies. In this paper, we propose novel methods for computing the IC of a concept that address the shortcomings of existing approaches. The presented methods focus on computing the IC of a concept (i.e., a Wikipedia category) drawn from the Wikipedia category structure. We also propose several new IC-based measures for computing the semantic similarity between concepts. The evaluation, based on several widely used benchmarks and a benchmark we developed ourselves, supports these intuitions with respect to human judgments. Overall, some of the methods proposed in this paper correlate well with human judgments and constitute effective ways of determining IC values for concepts and the semantic similarity between concepts.

Type

a