Search (3 results, page 1 of 1)

Eckert, K.; Pfeffer, M.; Stuckenschmidt, H.: Assessing thesaurus-based annotations for semantic search applications (2008) 0.04
```
0.037932523 = product of:
  0.1896626 = sum of:
    0.1896626 = weight(_text_:thesaurus in 1528) [ClassicSimilarity], result of:
      0.1896626 = score(doc=1528,freq=10.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.7991557 = fieldWeight in 1528, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1528)
  0.2 = coord(1/5)
```
Abstract

Statistical methods for automated document indexing are becoming an alternative to the manual assignment of keywords. We argue that the quality of the thesaurus used as a basis for indexing in regard to its ability to adequately cover the contents to be indexed and as a basis for the specific indexing method used is of crucial importance in automatic indexing. We present an interactive tool for thesaurus evaluation that is based on a combination of statistical measures and appropriate visualisation techniques that supports the detection of potential problems in a thesaurus. We describe the methods used and show that the tool supports the detection and correction of errors, leading to a better indexing result.

Theme

Konzeption und Anwendung des Prinzips Thesaurus

Pfeffer, M.; Eckert, K.; Stuckenschmidt, H.: Visual analysis of classification systems and library collections (2008) 0.02

0.019387359 = product of:
  0.09693679 = sum of:
    0.09693679 = weight(_text_:thesaurus in 317) [ClassicSimilarity], result of:
      0.09693679 = score(doc=317,freq=2.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.40844947 = fieldWeight in 317, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.0625 = fieldNorm(doc=317)
  0.2 = coord(1/5)

Theme: Konzeption und Anwendung des Prinzips Thesaurus

Stuckenschmidt, H.; Harmelen, F van; Waard, A. de; Scerri, T.; Bhogal, R.; Buel, J. van; Crowlesmith, I.; Fluit, C.; Kampman, A.; Broekstra, J.; Mulligen, E. van: Exploring large document repositories with RDF technology : the DOPE project (2004) 0.02
```
0.019387359 = product of:
  0.09693679 = sum of:
    0.09693679 = weight(_text_:thesaurus in 762) [ClassicSimilarity], result of:
      0.09693679 = score(doc=762,freq=8.0), product of:
        0.23732872 = queryWeight, product of:
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.051357865 = queryNorm
        0.40844947 = fieldWeight in 762, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          4.6210785 = idf(docFreq=1182, maxDocs=44218)
          0.03125 = fieldNorm(doc=762)
  0.2 = coord(1/5)
```
Abstract

This thesaurus-based search system uses automatic indexing, RDF-based querying, and concept-based visualization of results to support exploration of large online document repositories. Innovative research institutes rely on the availability of complete and accurate information about new research and development. Information providers such as Elsevier make it their business to provide the required information in a cost-effective way. The Semantic Web will likely contribute significantly to this effort because it facilitates access to an unprecedented quantity of data. The DOPE project (Drug Ontology Project for Elsevier) explores ways to provide access to multiple lifescience information sources through a single interface. With the unremitting growth of scientific information, integrating access to all this information remains an important problem, primarily because the information sources involved are so heterogeneous. Sources might use different syntactic standards (syntactic heterogeneity), organize information in different ways (structural heterogeneity), and even use different terminologies to refer to the same information (semantic heterogeneity). Integrated access hinges on the ability to address these different kinds of heterogeneity. Also, mental models and keywords for accessing data generally diverge between subject areas and communities; hence, many different ontologies have emerged. An ideal architecture must therefore support the disclosure of distributed and heterogeneous data sources through different ontologies. To serve this need, we've developed a thesaurus-based search system that uses automatic indexing, RDF-based querying, and concept-based visualization. We describe here the conversion of an existing proprietary thesaurus to an open standard format, a generic architecture for thesaurus-based information access, an innovative user interface, and results of initial user studies with the resulting DOPE system.

Search (3 results, page 1 of 1)

Authors

Themes