Search (2 results, page 1 of 1)

Did you mean:
themes%3a%22Data mining%22 2

Ye, Z.; Huang, J.X.; He, B.; Lin, H.: Mining a multilingual association dictionary from Wikipedia for cross-language information retrieval (2012) 0.04
```
0.038582932 = product of:
  0.077165864 = sum of:
    0.077165864 = product of:
      0.15433173 = sum of:
        0.15433173 = weight(_text_:mining in 513) [ClassicSimilarity], result of:
          0.15433173 = score(doc=513,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.5398875 = fieldWeight in 513, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=513)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Wikipedia is characterized by its dense link structure and a large number of articles in different languages, which make it a notable Web corpus for knowledge extraction and mining, in particular for mining the multilingual associations. In this paper, motivated by a psychological theory of word meaning, we propose a graph-based approach to constructing a cross-language association dictionary (CLAD) from Wikipedia, which can be used in a variety of cross-language accessing and processing applications. In order to evaluate the quality of the mined CLAD, and to demonstrate how the mined CLAD can be used in practice, we explore two different applications of the mined CLAD to cross-language information retrieval (CLIR). First, we use the mined CLAD to conduct cross-language query expansion; and, second, we use it to filter out translation candidates with low translation probabilities. Experimental results on a variety of standard CLIR test collections show that the CLIR retrieval performance can be substantially improved with the above two applications of CLAD, which indicates that the mined CLAD is of sound quality.
Ayadi, H.; Torjmen-Khemakhem, M.; Daoud, M.; Huang, J.X.; Jemaa, M.B.: Mining correlations between medically dependent features and image retrieval models for query classification (2017) 0.04
```
0.038582932 = product of:
  0.077165864 = sum of:
    0.077165864 = product of:
      0.15433173 = sum of:
        0.15433173 = weight(_text_:mining in 3607) [ClassicSimilarity], result of:
          0.15433173 = score(doc=3607,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.5398875 = fieldWeight in 3607, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3607)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The abundance of medical resources has encouraged the development of systems that allow for efficient searches of information in large medical image data sets. State-of-the-art image retrieval models are classified into three categories: content-based (visual) models, textual models, and combined models. Content-based models use visual features to answer image queries, textual image retrieval models use word matching to answer textual queries, and combined image retrieval models, use both textual and visual features to answer queries. Nevertheless, most of previous works in this field have used the same image retrieval model independently of the query type. In this article, we define a list of generic and specific medical query features and exploit them in an association rule mining technique to discover correlations between query features and image retrieval models. Based on these rules, we propose to use an associative classifier (NaiveClass) to find the best suitable retrieval model given a new textual query. We also propose a second associative classifier (SmartClass) to select the most appropriate default class for the query. Experiments are performed on Medical ImageCLEF queries from 2008 to 2012 to evaluate the impact of the proposed query features on the classification performance. The results show that combining our proposed specific and generic query features is effective in query classification.

Theme

Data Mining

Search (2 results, page 1 of 1)

Authors

Themes