Search (1 result, page 1 of 1)

Stuckenschmidt, H.; Harmelen, F. van; Waard, A. de; Scerri, T.; Bhogal, R.; Buel, J. van; Crowlesmith, I.; Fluit, C.; Kampman, A.; Broekstra, J.; Mulligen, E. van: Exploring large document repositories with RDF technology : the DOPE project (2004) 0.01
```
0.0074320463 = product of:
  0.044592276 = sum of:
    0.044592276 = product of:
      0.08918455 = sum of:
        0.08918455 = weight(_text_:thesaurus in 762) [ClassicSimilarity], result of:
          0.08918455 = score(doc=762,freq=8.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.40844947 = fieldWeight in 762, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.03125 = fieldNorm(doc=762)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)
```
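The score breakdown above can be re-derived by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of any library, and queryNorm and fieldNorm are copied from the listing rather than recomputed. The result should agree with the listed values up to floating-point rounding, since Lucene computes in single precision.

```python
import math

# Illustrative helpers (hypothetical names), mirroring the explain tree above.

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    """Term frequency factor: sqrt(freq)."""
    return math.sqrt(freq)

def term_weight(freq: float, doc_freq: int, max_docs: int,
                query_norm: float, field_norm: float) -> float:
    """weight = queryWeight * fieldWeight for a single term in one document."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the listing for the term "thesaurus" in document 762.
weight = term_weight(freq=8.0, doc_freq=1182, max_docs=44218,
                     query_norm=0.04725067, field_norm=0.03125)

# The two coord factors (1/2 and 1/6) scale the term weight down
# to the final document score.
final_score = weight * (1 / 2) * (1 / 6)

print(f"term weight: {weight:.8f}")
print(f"final score: {final_score:.10f}")
```

The short display value 0.01 next to the title is this final score rounded to two decimals.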
Abstract

This thesaurus-based search system uses automatic indexing, RDF-based querying, and concept-based visualization of results to support exploration of large online document repositories. Innovative research institutes rely on the availability of complete and accurate information about new research and development. Information providers such as Elsevier make it their business to provide the required information in a cost-effective way. The Semantic Web will likely contribute significantly to this effort because it facilitates access to an unprecedented quantity of data. The DOPE project (Drug Ontology Project for Elsevier) explores ways to provide access to multiple life science information sources through a single interface. With the unremitting growth of scientific information, integrating access to all this information remains an important problem, primarily because the information sources involved are so heterogeneous. Sources might use different syntactic standards (syntactic heterogeneity), organize information in different ways (structural heterogeneity), and even use different terminologies to refer to the same information (semantic heterogeneity). Integrated access hinges on the ability to address these different kinds of heterogeneity. Also, mental models and keywords for accessing data generally diverge between subject areas and communities; hence, many different ontologies have emerged. An ideal architecture must therefore support the disclosure of distributed and heterogeneous data sources through different ontologies. To serve this need, we've developed a thesaurus-based search system that uses automatic indexing, RDF-based querying, and concept-based visualization. We describe here the conversion of an existing proprietary thesaurus to an open standard format, a generic architecture for thesaurus-based information access, an innovative user interface, and results of initial user studies with the resulting DOPE system.