Search (2 results, page 1 of 1)

Jiang, Y.; Zhang, X.; Tang, Y.; Nie, R.: Feature-based approaches to semantic similarity assessment of concepts using Wikipedia (2015) 0.00
```
0.0021034614 = product of:
  0.012620768 = sum of:
    0.012620768 = weight(_text_:in in 2682) [ClassicSimilarity], result of:
      0.012620768 = score(doc=2682,freq=16.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.21253976 = fieldWeight in 2682, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2682)
  0.16666667 = coord(1/6)
```
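The explain tree above can be recomputed by hand. The following is a minimal sketch, not the search engine's own code: it assumes the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the printed tf and idf values. The function name `classic_score` and its parameter names are hypothetical; queryNorm, fieldNorm, and coord are taken directly from the output rather than derived.

```python
import math


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    """Recompute a single-term score from the factors shown in an explain tree.

    Assumed formulas (chosen because they match the printed values):
      tf  = sqrt(freq)
      idf = 1 + ln(max_docs / (doc_freq + 1))
    """
    tf = math.sqrt(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = idf * query_norm        # "queryWeight" node
    field_weight = tf * idf * field_norm   # "fieldWeight" node
    # The final score is the term weight scaled by the coordination factor.
    return query_weight * field_weight * coord


# Factors copied from the explain output for doc 2682 (1 of 6 query terms matched).
score = classic_score(
    freq=16.0,
    doc_freq=30841,
    max_docs=44218,
    query_norm=0.043654136,
    field_norm=0.0390625,
    coord=1 / 6,
)
print(score)
```

The result should agree with the reported 0.0021034614 up to floating-point rounding, since the engine works in single precision while this sketch uses double precision.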
Abstract

Semantic similarity assessment between concepts is an important task in many language-related applications. In the past, several approaches have been proposed that assess similarity by evaluating the knowledge modeled in one or more ontologies. However, the existing measures have limitations, such as their reliance on predefined ontologies and their fit only to non-dynamic domains. Wikipedia provides a very large, domain-independent encyclopedic repository and semantic network for computing the semantic similarity of concepts, with more coverage than typical ontologies. In this paper, we propose several novel feature-based similarity assessment methods that depend entirely on Wikipedia and avoid most of the limitations and drawbacks described above. To implement feature-based similarity assessment using Wikipedia, we first present a formal representation of Wikipedia concepts. We then give a framework for feature-based similarity built on this formal representation. Lastly, we investigate several feature-based approaches to semantic similarity measurement that result from instantiations of the framework. The evaluation, based on several widely used benchmarks and a benchmark we developed ourselves, supports these intuitions with respect to human judgements. Overall, several of the methods proposed in this paper correlate well with human judgements and constitute effective ways of determining similarity between Wikipedia concepts.

Theme

Semantic context in indexing and retrieval

Jiang, Y.; Bai, W.; Zhang, X.; Hu, J.: Wikipedia-based information content and semantic similarity computation (2017) 0.00
```
0.0019676082 = product of:
  0.011805649 = sum of:
    0.011805649 = weight(_text_:in in 2877) [ClassicSimilarity], result of:
      0.011805649 = score(doc=2877,freq=14.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.19881277 = fieldWeight in 2877, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2877)
  0.16666667 = coord(1/6)
```
Abstract

The Information Content (IC) of a concept is a fundamental dimension in computational linguistics. It enables a better understanding of a concept's semantics. In the past, several approaches to computing the IC of a concept have been proposed. However, the existing methods have limitations, such as their reliance on corpus availability, manual tagging, or predefined ontologies, and their fit only to non-dynamic domains. Wikipedia provides a very large, domain-independent encyclopedic repository and semantic network for computing the IC of concepts, with more coverage than typical ontologies. In this paper, we propose several novel methods for computing the IC of a concept that address the shortcomings of existing approaches. The presented methods focus on computing the IC of a concept (i.e., a Wikipedia category) drawn from the Wikipedia category structure. We also propose several new IC-based measures for computing the semantic similarity between concepts. The evaluation, based on several widely used benchmarks and a benchmark we developed ourselves, supports these intuitions with respect to human judgments. Overall, some of the methods proposed in this paper correlate well with human judgments and constitute effective ways of determining IC values for concepts and semantic similarity between concepts.

Theme

Semantic context in indexing and retrieval
