Search (169 results, page 1 of 9)

  • × type_ss:"a"
  • × type_ss:"el"
  • × year_i:[2000 TO 2010}
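    The mixed brackets in the year facet above are deliberate Solr range syntax, not a typo: square brackets mark inclusive endpoints and curly braces exclusive ones, so year_i:[2000 TO 2010} matches 2000 <= year < 2010. As a hedged sketch, a filtered result set like this one could be requested from the underlying Solr core roughly as follows (host, core name, query string, and row count are placeholders, not taken from this page; debugQuery is the standard flag that produces the per-document score explanations shown with each hit):

      # Sketch only: host, core, and the query itself are assumptions.
      import requests

      params = {
          "q": "*:*",  # the actual query behind this page is not shown
          "fq": [
              'type_ss:"a"',
              'type_ss:"el"',
              "year_i:[2000 TO 2010}",  # inclusive 2000, exclusive 2010
          ],
          "rows": 20,
          "wt": "json",
          "debugQuery": "true",  # emits per-document score explanations
      }
      resp = requests.get("http://localhost:8983/solr/documents/select", params=params)
      print(resp.json()["response"]["numFound"])  # 169 for this result set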
  1. Bittner, T.; Donnelly, M.; Winter, S.: Ontology and semantic interoperability (2006) 0.14
    0.1384162 = product of:
      0.24914916 = sum of:
        0.10067343 = weight(_text_:applications in 4820) [ClassicSimilarity], result of:
          0.10067343 = score(doc=4820,freq=8.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.5836958 = fieldWeight in 4820, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=4820)
        0.012701439 = weight(_text_:of in 4820) [ClassicSimilarity], result of:
          0.012701439 = score(doc=4820,freq=8.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.20732689 = fieldWeight in 4820, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=4820)
        0.0490556 = weight(_text_:systems in 4820) [ClassicSimilarity], result of:
          0.0490556 = score(doc=4820,freq=8.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.4074492 = fieldWeight in 4820, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=4820)
        0.070794985 = weight(_text_:software in 4820) [ClassicSimilarity], result of:
          0.070794985 = score(doc=4820,freq=6.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.4555077 = fieldWeight in 4820, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=4820)
        0.015923709 = product of:
          0.031847417 = sum of:
            0.031847417 = weight(_text_:22 in 4820) [ClassicSimilarity], result of:
              0.031847417 = score(doc=4820,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.23214069 = fieldWeight in 4820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4820)
          0.5 = coord(1/2)
      0.5555556 = coord(5/9)
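    The explain tree above composes exactly as Lucene's ClassicSimilarity defines: each matching term contributes queryWeight x fieldWeight, where queryWeight = idf x queryNorm, fieldWeight = tf(freq) x idf x fieldNorm, tf(freq) = sqrt(freq), and idf = 1 + ln(maxDocs / (docFreq + 1)); the per-term contributions are summed and scaled by coord (matching clauses over total clauses). A minimal sketch re-deriving the 0.1384162 above from the quoted frequencies:

      import math

      def idf(doc_freq, max_docs):
          # ClassicSimilarity idf, matching the explain lines above
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, field_norm, query_norm=0.03917671, max_docs=44218):
          i = idf(doc_freq, max_docs)
          query_weight = i * query_norm
          field_weight = math.sqrt(freq) * i * field_norm
          return query_weight * field_weight

      contributions = [
          term_score(8.0, 1471, 0.046875),        # applications -> 0.10067343
          term_score(8.0, 25162, 0.046875),       # of           -> 0.012701439
          term_score(8.0, 5561, 0.046875),        # systems      -> 0.0490556
          term_score(6.0, 2274, 0.046875),        # software     -> 0.070794985
          0.5 * term_score(2.0, 3622, 0.046875),  # "22", scaled by coord(1/2)
      ]
      print(sum(contributions) * (5 / 9))         # 0.1384162, displayed as 0.14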
    
    Abstract
    One of the major problems facing systems for Computer Aided Design (CAD), Architecture Engineering and Construction (AEC) and Geographic Information Systems (GIS) applications today is the lack of interoperability among the various systems. When integrating software applications, substantial difficulties can arise in translating information from one application to the other. In this paper, we focus on semantic difficulties that arise in software integration. Applications may use different terminologies to describe the same domain. Even when applications use the same terminology, they often associate different semantics with the terms. This obstructs information exchange among applications. To circumvent this obstacle, we need some way of explicitly specifying the semantics for each terminology in an unambiguous fashion. Ontologies can provide such specifications. It will be the task of this paper to explain what ontologies are and how they can be used to facilitate interoperability between software systems used in computer aided design, architecture engineering and construction, and geographic information processing.
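    To make the terminology problem concrete, here is a toy sketch of the mechanism the abstract proposes (the term tables and concept names are invented for illustration): two applications bind their local vocabularies to shared ontology concepts, and translation goes through the concept rather than the surface term.

      # Illustrative only: terms and concept identifiers are made up.
      CAD_TERMS = {"slab": "onto:FloorStructure", "story": "onto:BuildingLevel"}
      GIS_TERMS = {"floor": "onto:FloorStructure", "level": "onto:BuildingLevel"}

      def translate(term, source, target):
          # Map a local term to the shared concept, then to the target's term.
          concept = source[term]
          return next(t for t, c in target.items() if c == concept)

      print(translate("slab", CAD_TERMS, GIS_TERMS))  # -> "floor"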
    Date
    3.12.2016 18:39:22
  2. Warnick, W.L.; Lederman, A.; Scott, R.L.; Spence, K.J.; Johnson, L.A.; Allen, V.S.: Searching the deep Web : directed query engine applications at the Department of Energy (2001) 0.04
    0.0378924 = product of:
      0.1136772 = sum of:
        0.07118686 = weight(_text_:applications in 1215) [ClassicSimilarity], result of:
          0.07118686 = score(doc=1215,freq=4.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.41273528 = fieldWeight in 1215, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=1215)
        0.017962547 = weight(_text_:of in 1215) [ClassicSimilarity], result of:
          0.017962547 = score(doc=1215,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2932045 = fieldWeight in 1215, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1215)
        0.0245278 = weight(_text_:systems in 1215) [ClassicSimilarity], result of:
          0.0245278 = score(doc=1215,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2037246 = fieldWeight in 1215, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=1215)
      0.33333334 = coord(3/9)
    
    Abstract
    Directed Query Engines, an emerging class of search engine specifically designed to access distributed resources on the deep web, offer the opportunity to create inexpensive digital libraries. Already, one such engine, Distributed Explorer, has been used to select and assemble high quality information resources and incorporate them into publicly available systems for the physical sciences. By nesting Directed Query Engines so that one query launches several other engines in a cascading fashion, enormous virtual collections may soon be assembled to form a comprehensive information infrastructure for the physical sciences. Once a Directed Query Engine has been configured for a set of information resources, distributed alerts tools can provide patrons with personalized, profile-based notices of recent additions to any of the selected resources. Due to the potentially enormous size and scope of Directed Query Engine applications, consideration must be given to issues surrounding the representation of large quantities of information from multiple, heterogeneous sources.
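    The cascading arrangement described above is, structurally, a simple fan-out: a query posed to one engine is forwarded to its child engines, and their results are merged into one virtual collection. A hedged sketch (the engine objects and search functions are invented; Distributed Explorer's actual interfaces are not described in the abstract):

      # Toy engines standing in for distributed deep-web resources.
      def directed_query(query, engine):
          results = list(engine["search"](query))
          for child in engine.get("children", []):
              results.extend(directed_query(query, child))  # cascade
          return results

      leaf = {"search": lambda q: [f"leaf hit for {q!r}"]}
      root = {"search": lambda q: [f"root hit for {q!r}"], "children": [leaf, leaf]}
      print(directed_query("neutron flux", root))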
  3. Hitchcock, S.; Bergmark, D.; Brody, T.; Gutteridge, C.; Carr, L.; Hall, W.; Lagoze, C.; Harnad, S.: Open citation linking : the way forward (2002) 0.03
    0.031447157 = product of:
      0.094341464 = sum of:
        0.041947264 = weight(_text_:applications in 1207) [ClassicSimilarity], result of:
          0.041947264 = score(doc=1207,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2432066 = fieldWeight in 1207, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1207)
        0.018332949 = weight(_text_:of in 1207) [ClassicSimilarity], result of:
          0.018332949 = score(doc=1207,freq=24.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2992506 = fieldWeight in 1207, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1207)
        0.034061253 = weight(_text_:software in 1207) [ClassicSimilarity], result of:
          0.034061253 = score(doc=1207,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.21915624 = fieldWeight in 1207, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1207)
      0.33333334 = coord(3/9)
    
    Abstract
    The speed of scientific communication - the rate of ideas affecting other researchers' ideas - is increasing dramatically. The factor driving this is free, unrestricted access to research papers. Measurements of user activity in mature eprint archives of research papers such as arXiv have shown, for the first time, the degree to which such services support an evolving network of texts commenting on, citing, classifying, abstracting, listing and revising other texts. The Open Citation project has built tools to measure this activity, to build new archives, and has been closely involved with the development of the infrastructure to support open access on which these new services depend. This is the story of the project, intertwined with the concurrent emergence of the Open Archives Initiative (OAI). The paper describes the broad scope of the project's work, showing how it has progressed from early demonstrators of reference linking to produce Citebase, a Web-based citation and impact-ranked search service, and how it has supported the development of the EPrints.org software for building OAI-compliant archives. The work has been underpinned by analysis and experiments on the semantics of documents (digital objects) to determine the features required for formally perfect linking - instantiated as an application programming interface (API) for reference linking - that will enable other applications to build on this work in broader digital library information environments.
  4. Hammond, T.; Hannay, T.; Lund, B.; Scott, J.: Social bookmarking tools (I) : a general review (2005) 0.03
    0.03087355 = product of:
      0.09262065 = sum of:
        0.041525673 = weight(_text_:applications in 1188) [ClassicSimilarity], result of:
          0.041525673 = score(doc=1188,freq=4.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.24076225 = fieldWeight in 1188, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1188)
        0.01737605 = weight(_text_:of in 1188) [ClassicSimilarity], result of:
          0.01737605 = score(doc=1188,freq=44.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.28363106 = fieldWeight in 1188, product of:
              6.6332498 = tf(freq=44.0), with freq of:
                44.0 = termFreq=44.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1188)
        0.033718925 = weight(_text_:software in 1188) [ClassicSimilarity], result of:
          0.033718925 = score(doc=1188,freq=4.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.21695362 = fieldWeight in 1188, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1188)
      0.33333334 = coord(3/9)
    
    Abstract
    Because, to paraphrase a pop music lyric from a certain rock and roll band of yesterday, "the Web is old, the Web is new, the Web is all, the Web is you", it seems like we might have to face up to some of these stark realities. With the introduction of new social software applications such as blogs, wikis, newsfeeds, social networks, and bookmarking tools (the subject of this paper), the claim that Shelley Powers makes in a Burningbird blog entry seems apposite: "This is the user's web now, which means it's my web and I can make the rules." Reinvention is revolution - it brings us always back to beginnings. We are here going to remind you of hyperlinks in all their glory, sell you on the idea of bookmarking hyperlinks, point you at other folks who are doing the same, and tell you why this is a good thing. Just as long as those hyperlinks (or let's call them plain old links) are managed, tagged, commented upon, and published onto the Web, they represent a user's own personal library placed on public record, which - when aggregated with other personal libraries - allows for rich, social networking opportunities. Why spill any ink (digital or not) in rewriting what someone else has already written about instead of just pointing at the original story and adding the merest of titles, descriptions and tags for future reference? More importantly, why not make these personal 'link playlists' available to oneself and to others from whatever browser or computer one happens to be using at the time? This paper reviews some current initiatives, as of early 2005, in providing public link management applications on the Web - utilities that are often referred to under the general moniker of 'social bookmarking tools'. There are a couple of things going on here: 1) server-side software aimed specifically at managing links with, crucially, a strong, social networking flavour, and 2) an unabashedly open and unstructured approach to tagging, or user classification, of those links.
    A number of such utilities are presented here, together with an emergent new class of tools that caters more to the academic communities and that stores not only user-supplied tags, but also structured citation metadata terms wherever it is possible to glean this information from service providers. This provision of rich, structured metadata means that the user is provided with an accurate third-party identification of a document, which could be used to retrieve that document, but is also free to search on user-supplied terms so that documents of interest (or rather, references to documents) can be made discoverable and aggregated with other similar descriptions either recorded by the user or by other users. Matt Biddulph in an XML.com article last year, in which he reviews one of the better known social bookmarking tools, del.icio.us, declares that the "del.icio.us-space has three major axes: users, tags, and URLs". We fully support that assessment but choose to present this deconstruction in a reverse order. This paper thus first recaps a brief history of bookmarks, then discusses the current interest in tagging, moves on to look at certain social issues, and finally considers some of the feature sets offered by the new bookmarking tools. A general review of a number of common social bookmarking tools is presented in the annex. A companion paper describes a case study in more detail: the tool that Nature Publishing Group has made available to the scientific community as an experimental entrée into this field - Connotea; our reasons for endeavouring to provide such a utility; and experiences gained and lessons learned.
  5. Van der Veer Martens, B.: Do citation systems represent theories of truth? (2001) 0.03
    0.029665582 = product of:
      0.088996746 = sum of:
        0.010584532 = weight(_text_:of in 3925) [ClassicSimilarity], result of:
          0.010584532 = score(doc=3925,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.17277241 = fieldWeight in 3925, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=3925)
        0.040879667 = weight(_text_:systems in 3925) [ClassicSimilarity], result of:
          0.040879667 = score(doc=3925,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.339541 = fieldWeight in 3925, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.078125 = fieldNorm(doc=3925)
        0.037532546 = product of:
          0.07506509 = sum of:
            0.07506509 = weight(_text_:22 in 3925) [ClassicSimilarity], result of:
              0.07506509 = score(doc=3925,freq=4.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.54716086 = fieldWeight in 3925, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3925)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Date
    22. 7.2006 15:22:28
  6. Heery, R.; Wagner, H.: A metadata registry for the Semantic Web (2002) 0.03
    0.028086103 = product of:
      0.08425831 = sum of:
        0.041525673 = weight(_text_:applications in 1210) [ClassicSimilarity], result of:
          0.041525673 = score(doc=1210,freq=4.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.24076225 = fieldWeight in 1210, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1210)
        0.018889757 = weight(_text_:of in 1210) [ClassicSimilarity], result of:
          0.018889757 = score(doc=1210,freq=52.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.30833945 = fieldWeight in 1210, product of:
              7.2111025 = tf(freq=52.0), with freq of:
                52.0 = termFreq=52.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1210)
        0.023842877 = weight(_text_:software in 1210) [ClassicSimilarity], result of:
          0.023842877 = score(doc=1210,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.15340936 = fieldWeight in 1210, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1210)
      0.33333334 = coord(3/9)
    
    Abstract
    The Semantic Web activity is a W3C project whose goal is to enable a 'cooperative' Web where machines and humans can exchange electronic content that has clear-cut, unambiguous meaning. This vision is based on the automated sharing of metadata terms across Web applications. The declaration of schemas in metadata registries advances this vision by providing a common approach for the discovery, understanding, and exchange of semantics. However, many of the issues regarding registries are not clear, and ideas vary regarding their scope and purpose. Additionally, registry issues are often difficult to describe and comprehend without a working example. This article will explore the role of metadata registries and will describe three prototypes, written by the Dublin Core Metadata Initiative. The article will outline how the prototypes are being used to demonstrate and evaluate application scope, functional requirements, and technology solutions for metadata registries. Metadata schema registries are, in effect, databases of schemas that can trace an historical line back to shared data dictionaries and the registration process encouraged by the ISO/IEC 11179 community. New impetus for the development of registries has come with the development activities surrounding creation of the Semantic Web. The motivation for establishing registries arises from domain and standardization communities, and from the knowledge management community. Examples of current registry activity include:
    * Agencies maintaining directories of data elements in a domain area in accordance with ISO/IEC 11179 (This standard specifies good practice for data element definition as well as the registration process. Example implementations are the National Health Information Knowledgebase hosted by the Australian Institute of Health and Welfare and the Environmental Data Registry hosted by the US Environmental Protection Agency.); * The xml.org directory of the Extensible Markup Language (XML) document specifications facilitating re-use of Document Type Definition (DTD), hosted by the Organization for the Advancement of Structured Information Standards (OASIS); * The MetaForm database of Dublin Core usage and mappings maintained at the State and University Library in Goettingen; * The Semantic Web Agreement Group Dictionary, a database of terms for the Semantic Web that can be referred to by humans and software agents; * LEXML, a multi-lingual and multi-jurisdictional RDF Dictionary for the legal world; * The SCHEMAS registry maintained by the European Commission funded SCHEMAS project, which indexes several metadata element sets as well as a large number of activity reports describing metadata related activities and initiatives. Metadata registries essentially provide an index of terms. Given the distributed nature of the Web, there are a number of ways this can be accomplished. For example, the registry could link to terms and definitions in schemas published by implementers and stored locally by the schema maintainer. Alternatively, the registry might harvest various metadata schemas from their maintainers. Registries provide 'added value' to users by indexing schemas relevant to a particular 'domain' or 'community of use' and by simplifying the navigation of terms by enabling multiple schemas to be accessed from one view. An important benefit of this approach is an increase in the reuse of existing terms, rather than users having to reinvent them. Merging schemas to one view leads to harmonization between applications and helps avoid duplication of effort. Additionally, the establishment of registries to index terms actively being used in local implementations facilitates the metadata standards activity by providing implementation experience transferable to the standards-making process.
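    The harvesting approach sketched in the last paragraph reduces, at its simplest, to parsing a schema published by its maintainer and indexing the terms it declares. A minimal sketch with rdflib (the file name is a placeholder for any harvested schema):

      # Sketch: index the property terms declared in a harvested RDF schema.
      from rdflib import Graph
      from rdflib.namespace import RDF, RDFS

      g = Graph()
      g.parse("dcterms.ttl")  # e.g. a Dublin Core schema fetched earlier
      for term in g.subjects(RDF.type, RDF.Property):
          print(term, g.value(term, RDFS.label))  # URI and human-readable label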
  7. Soergel, D.; Lauser, B.; Liang, A.; Fisseha, F.; Keizer, J.; Katz, S.: Reengineering thesauri for new applications : the AGROVOC example (2004) 0.03
    0.025194416 = product of:
      0.11337487 = sum of:
        0.10067343 = weight(_text_:applications in 2347) [ClassicSimilarity], result of:
          0.10067343 = score(doc=2347,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.5836958 = fieldWeight in 2347, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.09375 = fieldNorm(doc=2347)
        0.012701439 = weight(_text_:of in 2347) [ClassicSimilarity], result of:
          0.012701439 = score(doc=2347,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.20732689 = fieldWeight in 2347, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.09375 = fieldNorm(doc=2347)
      0.22222222 = coord(2/9)
    
    Source
    Journal of digital information. 4(2004) no.4, art.#257
  8. Beppler, F.D.; Fonseca, F.T.; Pacheco, R.C.S.: Hermeneus: an architecture for an ontology-enabled information retrieval (2008) 0.02
    0.021603964 = product of:
      0.06481189 = sum of:
        0.014200641 = weight(_text_:of in 3261) [ClassicSimilarity], result of:
          0.014200641 = score(doc=3261,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23179851 = fieldWeight in 3261, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3261)
        0.034687545 = weight(_text_:systems in 3261) [ClassicSimilarity], result of:
          0.034687545 = score(doc=3261,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.28811008 = fieldWeight in 3261, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=3261)
        0.015923709 = product of:
          0.031847417 = sum of:
            0.031847417 = weight(_text_:22 in 3261) [ClassicSimilarity], result of:
              0.031847417 = score(doc=3261,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.23214069 = fieldWeight in 3261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3261)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    Ontologies improve the retrieval and presentation of information in IR systems, which makes the task of finding information more effective, efficient, and interactive. In this paper we argue that ontologies also greatly improve the engineering of such systems. We created a framework that uses an ontology to drive the process of engineering an IR system. We developed a prototype that shows how a domain specialist without knowledge of the IR field can build an IR system with interactive components. The resulting system supports users not only in finding the information they need but also in extending their state of knowledge. This way, our approach to ontology-enabled information retrieval addresses both the engineering aspect described here and the usability aspect described elsewhere.
    Date
    28.11.2016 12:43:22
  9. Crane, G.; Jones, A.: Text, information, knowledge and the evolving record of humanity (2006) 0.02
    0.020951357 = product of:
      0.06285407 = sum of:
        0.015654733 = weight(_text_:of in 1182) [ClassicSimilarity], result of:
          0.015654733 = score(doc=1182,freq=70.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2555338 = fieldWeight in 1182, product of:
              8.3666 = tf(freq=70.0), with freq of:
                70.0 = termFreq=70.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1182)
        0.017701415 = weight(_text_:systems in 1182) [ClassicSimilarity], result of:
          0.017701415 = score(doc=1182,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.14702557 = fieldWeight in 1182, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1182)
        0.029497914 = weight(_text_:software in 1182) [ClassicSimilarity], result of:
          0.029497914 = score(doc=1182,freq=6.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.18979488 = fieldWeight in 1182, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1182)
      0.33333334 = coord(3/9)
    
    Abstract
    Consider a sentence such as "the current price of tea in China is 35 cents per pound." In a library with millions of books we might find many statements of the above form that we could capture today with relatively simple rules: rather than pursuing every variation of a statement, programs can wait, like predators at a water hole, for their informational prey to reappear in a standard linguistic pattern. We can make inferences from sentences such as "NAME1 born at NAME2 in DATE" that NAME1 more likely than not represents a person and NAME2 a place and then convert the statement into a proposition about a person born at a given place and time. The changing price of tea in China, pedestrian birth and death dates, or other basic statements may not be truth and beauty in the Phaedrus, but a digital library that could plot the prices of various commodities in different markets over time, plot the various lifetimes of individuals, or extract and classify many events would be very useful. Services such as the Syllabus Finder and H-Bot (which Dan Cohen describes elsewhere in this issue of D-Lib) represent examples of information extraction already in use. H-Bot, in particular, builds on our evolving ability to extract information from very large corpora such as the billions of web pages available through the Google API. Aside from identifying higher order statements, however, users also want to search and browse named entities: they want to read about "C. P. E. Bach" rather than his father "Johann Sebastian" or about "Cambridge, Maryland", without hearing about "Cambridge, Massachusetts", Cambridge in the UK or any of the other Cambridges scattered around the world. Named entity identification is a well-established area with an ongoing literature. The Natural Language Processing Research Group at the University of Sheffield has developed its open source General Architecture for Text Engineering (GATE) for years, while IBM's Unstructured Information Analysis and Search (UIMA) is "available as open source software to provide a common foundation for industry and academia." Powerful tools are thus freely available and more demanding users can draw upon published literature to develop their own systems. Major search engines such as Google and Yahoo also integrate increasingly sophisticated tools to categorize and identify places. The software resources are rich and expanding. The reference works on which these systems depend, however, are ill-suited for historical analysis. First, simple gazetteers and similar authority lists quickly grow too big for useful information extraction. They provide us with potential entities against which to match textual references, but existing electronic reference works assume that human readers can use their knowledge of geography and of the immediate context to pick the right Boston from the Bostons in the Getty Thesaurus of Geographic Names (TGN), but, with the crucial exception of geographic location, the TGN records do not provide any machine readable clues: we cannot tell which Bostons are large or small. If we are analyzing a document published in 1818, we cannot filter out those places that did not yet exist or that had different names: "Jefferson Davis" is not the name of a parish in Louisiana (tgn,2000880) or a county in Mississippi (tgn,2001118) until after the Civil War.
    Although the Alexandria Digital Library provides far richer data than the TGN (5.9 vs. 1.3 million names), its added size lowers, rather than increases, the accuracy of most geographic name identification systems for historical documents: most of the extra 4.6 million names cover low frequency entities that rarely occur in any particular corpus. The TGN is sufficiently comprehensive to provide quite enough noise: we find place names that are used over and over (there are almost one hundred Washingtons) and semantically ambiguous (e.g., is Washington a person or a place?). Comprehensive knowledge sources emphasize recall but lower precision. We need data with which to determine which "Tribune" or "John Brown" a particular passage denotes. Secondly and paradoxically, our reference works may not be comprehensive enough. Human actors come and go over time. Organizations appear and vanish. Even places can change their names or vanish. The TGN does associate the obsolete name Siam with the nation of Thailand (tgn,1000142) - but also with towns named Siam in Iowa (tgn,2035651), Tennessee (tgn,2101519), and Ohio (tgn,2662003). Prussia appears but as a general region (tgn,7016786), with no indication when or if it was a sovereign nation. And if places do point to the same object over time, that object may have very different significance over time: in the foundational works of Western historiography, Herodotus reminds us that the great cities of the past may be small today, and the small cities of today great tomorrow (Hdt. 1.5), while Thucydides stresses that we cannot estimate the past significance of a place by its appearance today (Thuc. 1.10). In other words, we need to know the population figures for the various Washingtons in 1870 if we are analyzing documents from 1870. The foundations have been laid for reference works that provide machine actionable information about entities at particular times in history. The Alexandria Digital Library Gazetteer Content Standard represents a sophisticated framework with which to create such resources: places can be associated with temporal information about their foundation (e.g., Washington, DC, founded on 16 July 1790), changes in names for the same location (e.g., Saint Petersburg to Leningrad and back again), population figures at various times and similar historically contingent data. But if we have the software and the data structures, we do not yet have substantial amounts of historical content such as plentiful digital gazetteers, encyclopedias, lexica, grammars and other reference works to illustrate many periods and, even if we do, those resources may not be in a useful form: raw OCR output of a complex lexicon or gazetteer may have so many errors and have captured so little of the underlying structure that the digital resource is useless as a knowledge base. Put another way, human beings are still much better at reading and interpreting the contents of page images than machines. While people, places, and dates are probably the most important core entities, we will find a growing set of objects that we need to identify and track across collections, and each of these categories of objects will require its own knowledge sources. The following section enumerates and briefly describes some existing categories of documents that we need to mine for knowledge. This brief survey focuses on the format of print sources (e.g., highly structured textual "database" vs. unstructured text) to illustrate some of the challenges involved in converting our published knowledge into semantically annotated, machine actionable form.
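    The "predators at a water hole" strategy comes down to surface-pattern matching over standard linguistic patterns. A toy sketch of the "NAME1 born at NAME2 in DATE" rule mentioned above (the regular expression and field names are illustrative; production systems such as GATE or UIMA use far richer grammars and gazetteer lookups):

      import re

      # Illustrative surface pattern for "NAME1 born at NAME2 in DATE" statements.
      BORN_AT = re.compile(
          r"(?P<person>[A-Z][\w.]*(?: [A-Z][\w.]*)*) born at "
          r"(?P<place>[A-Z][\w.]*(?: [A-Z][\w.]*)*) in (?P<date>\d{3,4})"
      )

      def extract_birth_propositions(text):
          # Convert matching sentences into structured propositions.
          return [
              {"type": "birth", "person": m["person"],
               "place": m["place"], "year": int(m["date"])}
              for m in BORN_AT.finditer(text)
          ]

      print(extract_birth_propositions("Johann Sebastian Bach born at Eisenach in 1685."))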
  10. Aitken, S.; Reid, S.: Evaluation of an ontology-based information retrieval tool (2000) 0.02
    0.019893078 = product of:
      0.08951885 = sum of:
        0.06711562 = weight(_text_:applications in 2862) [ClassicSimilarity], result of:
          0.06711562 = score(doc=2862,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.38913056 = fieldWeight in 2862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0625 = fieldNorm(doc=2862)
        0.022403233 = weight(_text_:of in 2862) [ClassicSimilarity], result of:
          0.022403233 = score(doc=2862,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.36569026 = fieldWeight in 2862, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=2862)
      0.22222222 = coord(2/9)
    
    Abstract
    This paper evaluates the use of an explicit domain ontology in an information retrieval tool. The evaluation compares the performance of ontology-enhanced retrieval with keyword retrieval for a fixed set of queries across several data sets. The robustness of the IR approach is assessed by comparing the performance of the tool on the original data set with that on previously unseen data.
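    The abstract does not state which effectiveness measures were used; the standard set-based precision and recall comparison for two such runs looks like this (the result lists below are invented placeholders):

      # Set-based precision/recall for comparing two retrieval runs.
      def precision_recall(retrieved, relevant):
          hits = len(set(retrieved) & set(relevant))
          return hits / len(retrieved), hits / len(relevant)

      keyword_run  = ["d1", "d3", "d7", "d9"]
      ontology_run = ["d1", "d2", "d3", "d4"]
      relevant     = ["d1", "d2", "d3"]
      print(precision_recall(keyword_run, relevant))   # (0.5, 0.667)
      print(precision_recall(ontology_run, relevant))  # (0.75, 1.0)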
    Content
    Contribution to: Workshop on the Applications of Ontologies and Problem-Solving Methods, (eds) Gómez-Pérez, A., Benjamins, V.R., Guarino, N., and Uschold, M. European Conference on Artificial Intelligence 2000, Berlin.
  11. Paskin, N.: DOI: a 2003 progress report (2003) 0.02
    0.018623676 = product of:
      0.083806545 = sum of:
        0.065657854 = weight(_text_:applications in 1203) [ClassicSimilarity], result of:
          0.065657854 = score(doc=1203,freq=10.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.38067853 = fieldWeight in 1203, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1203)
        0.018148692 = weight(_text_:of in 1203) [ClassicSimilarity], result of:
          0.018148692 = score(doc=1203,freq=48.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.29624295 = fieldWeight in 1203, product of:
              6.928203 = tf(freq=48.0), with freq of:
                48.0 = termFreq=48.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1203)
      0.22222222 = coord(2/9)
    
    Abstract
    The International DOI Foundation (IDF) recently published the third edition of its DOI Handbook, which sets the scene for DOI's expansion into much wider applications. Edition 3 is not simply an updated user guide. A great deal has happened in the underlying technologies and in the practical deployment and development of DOIs (Digital Object Identifiers) since the last edition was published a year ago. Much of the program of technical work foreseen at the inception of DOIs has now been completed. The initial simple implementation of DOI as a persistent name linked to redirection continues to grow, with approaching ten million DOIs assigned from several hundred organisations through a number of Registration Agencies in USA, Europe, and Australasia, supporting large scale business uses. Implementations of more sophisticated applications (offering associated services) have been developing well but on a smaller scale: a framework for building these has been completed as part of the latest release and promises to stimulate a new wave of growth. From its original starting point in text publishing, there has been gradual embrace by a number of communities: these include national libraries (a consortium of national libraries recently joined the IDF); government documentation (with the appointment of TSO The Stationery Office in the UK as a DOI agency and the announced intention of the EC Office of Publications to use DOIs); non-English language markets (France, Germany, Spain, Italy, Korea). However implementations in non-text sectors have been far slower to develop, though several are now under discussion. The DOI community can point to several significant achievements over the past few years: * A practical successful open implementation of naming objects, treating content as information objects, not simply packets of bits; * The IDF's role in co-sponsoring, championing, and now implementing the <indecs> framework as a semantic tool for structured metadata - an essential step for treating content as information in Semantic-Web-like applications; * A template for building advanced applications, connecting resolution and metadata technologies, and offering hooks to web services and similar applications; * The development of a policy framework that allows multiple communities autonomy; * The practical implementation of DOIs with emerging related standards such as the OpenURL framework in contextual linking.
    A number of issues remain to be solved. In the main these are no longer technical in nature, but more concerned with perception and outreach to other communities. They include: correctly positioning the DOI in the standards community as a practical implementation (based on standards, but more than standards); offering the benefits of DOI to other communities working in related identifier development whilst allowing them to remain largely autonomous; demonstrating how DOIs can complement, rather than compete with, other activities; and ensuring that a sustainable long-term infrastructure for any application (commercial and non-commercial alike) is in place. Persistent, actionable identifiers with a fully managed sustainable infrastructure are not appropriate for every activity; but they are suitable for many, and where they are used, the key to providing a successful and widely adopted system is encouraging economy of scale (and so, where possible, convergence with other related efforts), flexibility of use, and a low barrier to use. DOI is well on the way to providing this, but not yet guaranteed of success without the further effort that is now being applied.
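    The "persistent name linked to redirection" in the basic implementation described above is directly observable: the doi.org proxy answers a DOI request with an HTTP redirect to the object's current location. A small sketch (10.1000/182 is commonly cited as the DOI of the DOI Handbook itself):

      # Resolve a DOI through the doi.org proxy and inspect the redirect.
      import requests

      resp = requests.get("https://doi.org/10.1000/182", allow_redirects=False)
      print(resp.status_code)          # a 3xx redirect
      print(resp.headers["Location"])  # current location of the named object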
  12. Bradford, R.B.: Relationship discovery in large text collections using Latent Semantic Indexing (2006) 0.02
    0.01860912 = product of:
      0.055827357 = sum of:
        0.017962547 = weight(_text_:of in 1163) [ClassicSimilarity], result of:
          0.017962547 = score(doc=1163,freq=36.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2932045 = fieldWeight in 1163, product of:
              6.0 = tf(freq=36.0), with freq of:
                36.0 = termFreq=36.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
        0.027249003 = weight(_text_:software in 1163) [ClassicSimilarity], result of:
          0.027249003 = score(doc=1163,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.17532499 = fieldWeight in 1163, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
        0.010615807 = product of:
          0.021231614 = sum of:
            0.021231614 = weight(_text_:22 in 1163) [ClassicSimilarity], result of:
              0.021231614 = score(doc=1163,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.15476047 = fieldWeight in 1163, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1163)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    This paper addresses the problem of information discovery in large collections of text. For users, one of the key problems in working with such collections is determining where to focus their attention. In selecting documents for examination, users must be able to formulate reasonably precise queries. Queries that are too broad will greatly reduce the efficiency of information discovery efforts by overwhelming the users with peripheral information. In order to formulate efficient queries, a mechanism is needed to automatically alert users regarding potentially interesting information contained within the collection. This paper presents the results of an experiment designed to test one approach to generation of such alerts. The technique of latent semantic indexing (LSI) is used to identify relationships among entities of interest. Entity extraction software is used to pre-process the text of the collection so that the LSI space contains representation vectors for named entities in addition to those for individual terms. In the LSI space, the cosine of the angle between the representation vectors for two entities captures important information regarding the degree of association of those two entities. For appropriate choices of entities, determining the entity pairs with the highest mutual cosine values yields valuable information regarding the contents of the text collection. The test database used for the experiment consists of 150,000 news articles. The proposed approach for alert generation is tested using a counterterrorism analysis example. The approach is shown to have significant potential for aiding users in rapidly focusing on information of potential importance in large text collections. The approach also has value in identifying possible use of aliases.
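    The computation the experiment rests on (a truncated SVD of the term-by-document matrix, with extracted entities added as extra terms, followed by cosines between representation vectors) fits in a few lines. A toy sketch with scikit-learn (the corpus, the entity tokens, and k=2 are placeholder values; the paper's collection has 150,000 articles):

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.decomposition import TruncatedSVD
      from sklearn.metrics.pairwise import cosine_similarity

      docs = [
          "ENT_alpha met the courier in Bethesda",
          "the courier delivered funds to ENT_alpha",
          "unrelated report on commodity prices",
          "ENT_alpha traveled to Bethesda again",
      ]
      vec = CountVectorizer()
      X = vec.fit_transform(docs)               # documents x terms
      lsi = TruncatedSVD(n_components=2).fit(X)
      term_vecs = lsi.components_.T             # terms x k representation vectors

      i, j = vec.vocabulary_["ent_alpha"], vec.vocabulary_["courier"]
      print(cosine_similarity(term_vecs[[i]], term_vecs[[j]]))  # degree of association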
    Source
    Proceedings of the Fourth Workshop on Link Analysis, Counterterrorism, and Security, SIAM Data Mining Conference, Bethesda, MD, 20-22 April, 2006. [http://www.siam.org/meetings/sdm06/workproceed/Link%20Analysis/15.pdf]
  13. Beagle, D.: Visualizing keyword distribution across multidisciplinary c-space (2003) 0.02
    0.018078228 = product of:
      0.054234684 = sum of:
        0.025168357 = weight(_text_:applications in 1202) [ClassicSimilarity], result of:
          0.025168357 = score(doc=1202,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.14592396 = fieldWeight in 1202, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1202)
        0.016802425 = weight(_text_:of in 1202) [ClassicSimilarity], result of:
          0.016802425 = score(doc=1202,freq=56.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2742677 = fieldWeight in 1202, product of:
              7.483315 = tf(freq=56.0), with freq of:
                56.0 = termFreq=56.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1202)
        0.0122639 = weight(_text_:systems in 1202) [ClassicSimilarity], result of:
          0.0122639 = score(doc=1202,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.1018623 = fieldWeight in 1202, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1202)
      0.33333334 = coord(3/9)
    
    Abstract
    The concept of c-space is proposed as a visualization schema relating containers of content to cataloging surrogates and classification structures. Possible applications of keyword vector clusters within c-space could include improved retrieval rates through the use of captioning within visual hierarchies, tracings of semantic bleeding among subclasses, and access to buried knowledge within subject-neutral publication containers. The Scholastica Project is described as one example, following a tradition of research dating back to the 1980s. Preliminary focus group assessment indicates that this type of classification rendering may offer digital library searchers enriched entry strategies and an expanded range of re-entry vocabularies. Those of us who work in traditional libraries typically assume that our systems of classification, Library of Congress Classification (LCC) and Dewey Decimal Classification (DDC), are descriptive rather than prescriptive. In other words, LCC classes and subclasses approximate natural groupings of texts that reflect an underlying order of knowledge, rather than arbitrary categories prescribed by librarians to facilitate efficient shelving. Philosophical support for this assumption has traditionally been found in a number of places, from the archetypal tree of knowledge, to Aristotelian categories, to the concept of discursive formations proposed by Michel Foucault. Gary P. Radford has elegantly described an encounter with Foucault's discursive formations in the traditional library setting: "Just by looking at the titles on the spines, you can see how the books cluster together...You can identify those books that seem to form the heart of the discursive formation and those books that reside on the margins. Moving along the shelves, you see those books that tend to bleed over into other classifications and that straddle multiple discursive formations. You can physically and sensually experience...those points that feel like state borders or national boundaries, those points where one subject ends and another begins, or those magical places where one subject has morphed into another..."
    But what happens to this awareness in a digital library? Can discursive formations be represented in cyberspace, perhaps through diagrams in a visualization interface? And would such a schema be helpful to a digital library user? To approach this question, it is worth taking a moment to reconsider what Radford is looking at. First, he looks at titles to see how the books cluster. To illustrate, I scanned one hundred books on the shelves of a college library under subclass HT 101-395, defined by the LCC subclass caption as Urban groups. The City. Urban sociology. Of the first 100 titles in this sequence, fifty included the word "urban" or variants (e.g. "urbanization"). Another thirty-five used the word "city" or variants. These keywords appear to mark their titles as the heart of this discursive formation. The scattering of titles not using "urban" or "city" used related terms such as "town," "community," or in one case "skyscrapers." So we immediately see some empirical correlation between keywords and classification. But we also see a problem with the commonly used search technique of title-keyword. A student interested in urban studies will want to know about this entire subclass, and may wish to browse every title available therein. A title-keyword search on "urban" will retrieve only half of the titles, while a search on "city" will retrieve just over a third. There will be no overlap, since no titles in this sample contain both words. The only place where both words appear in a common string is in the LCC subclass caption, but captions are not typically indexed in library Online Public Access Catalogs (OPACs). In a traditional library, this problem is mitigated when the student goes to the shelf looking for any one of the books and suddenly discovers a much wider selection than the keyword search had led him to expect. But in a digital library, the issue of non-retrieval can be more problematic, as studies have indicated. Micco and Popp reported that, in a study funded partly by the U.S. Department of Education, 65 of 73 unskilled users searching for material on U.S./Soviet foreign relations found some material but never realized they had missed a large percentage of what was in the database.
  14. Hoang, H.H.; Tjoa, A.M.: The state of the art of ontology-based query systems : a comparison of existing approaches (2006) 0.02
    0.017566169 = product of:
      0.07904776 = sum of:
        0.022403233 = weight(_text_:of in 792) [ClassicSimilarity], result of:
          0.022403233 = score(doc=792,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.36569026 = fieldWeight in 792, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=792)
        0.05664453 = weight(_text_:systems in 792) [ClassicSimilarity], result of:
          0.05664453 = score(doc=792,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.4704818 = fieldWeight in 792, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=792)
      0.22222222 = coord(2/9)
    
    Abstract
    Based on an in-depth analysis of existing approaches to building ontology-based query systems, we discuss and compare the methods and approaches used in current query systems that employ ontologies or Semantic Web techniques. This paper identifies various relevant research directions in ontology-based querying research. Based on the results of our investigation we summarise the state of the art in ontology-based query/search systems and name areas for further research activities.
  15. Witten, I.H.; Bainbridge, D.; Boddie, S.J.: Greenstone : open-source digital library software (2001) 0.02
    0.01714349 = product of:
      0.0771457 = sum of:
        0.0063507194 = weight(_text_:of in 1225) [ClassicSimilarity], result of:
          0.0063507194 = score(doc=1225,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.103663445 = fieldWeight in 1225, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1225)
        0.070794985 = weight(_text_:software in 1225) [ClassicSimilarity], result of:
          0.070794985 = score(doc=1225,freq=6.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.4555077 = fieldWeight in 1225, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=1225)
      0.22222222 = coord(2/9)
    
    Abstract
    The Greenstone digital library software is an open-source system for the construction and presentation of information collections. It builds collections with effective full-text searching and metadata-based browsing facilities that are attractive and easy to use. Moreover, they are easily maintained and can be augmented and rebuilt entirely automatically. The system is extensible: software "plugins" accommodate different document and metadata types. Greenstone incorporates an interface that makes it easy for people to create their own library collections. Collections may be built and served locally from the user's own web server, or (given appropriate permissions) remotely on a shared digital library host. End users can easily build new collections styled after existing ones from material on the Web or from their local files (or both), and collections can be updated and new ones brought on-line at any time.
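    The "plugin" extensibility mentioned above is the architecturally interesting point. The sketch below shows one minimal way such a scheme can be wired up; it is written in Python to illustrate the general pattern only and is not Greenstone's actual plugin interface:

      # Minimal document-plugin registry: format-specific plugins feed a
      # common collection builder. Illustrates the pattern, not Greenstone.
      class PlainTextPlugin:
          def accepts(self, filename: str) -> bool:
              return filename.endswith(".txt")

          def extract(self, filename: str) -> dict:
              with open(filename, encoding="utf-8") as f:
                  text = f.read()
              # In this toy format the first line doubles as the title.
              return {"title": (text.splitlines() or [""])[0], "fulltext": text}

      class CollectionBuilder:
          def __init__(self, plugins):
              self.plugins = plugins
              self.records = []

          def add(self, filename: str) -> None:
              for plugin in self.plugins:
                  if plugin.accepts(filename):
                      self.records.append(plugin.extract(filename))
                      return
              raise ValueError(f"no plugin accepts {filename}")

      builder = CollectionBuilder([PlainTextPlugin()])
      # builder.add("essay.txt")  # each new format = one new plugin class

    Supporting a new document type then means adding one plugin class rather than touching the builder, which is the property the abstract highlights.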
  16. Young, J.A.; Hickey, T.B.: WikiD: an OpenURL 1.0 application (2006) 0.02
    0.017083302 = product of:
      0.07687486 = sum of:
        0.05872617 = weight(_text_:applications in 3065) [ClassicSimilarity], result of:
          0.05872617 = score(doc=3065,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34048924 = fieldWeight in 3065, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3065)
        0.018148692 = weight(_text_:of in 3065) [ClassicSimilarity], result of:
          0.018148692 = score(doc=3065,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.29624295 = fieldWeight in 3065, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3065)
      0.22222222 = coord(2/9)
    
    Abstract
    OpenURL was originally developed to enable link resolution of citation information in a distributed interoperable way. The initial standard (version 0.1) has been effectively subsumed as an application (named the San Antonio Level 1 profile) of a much more general framework called OpenURL 1.0. We used the framework to create WikiD (Wiki/Data), an application that has little to do with citation link resolvers, but is instead a set of general purpose services for managing arbitrary collections of items. The model for this application is a wiki engine generalized to manage multiple collections of XML records. This article describes WikiD and how it can serve as an example for applications that can be built on the foundation of the OpenURL framework.
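    In its KEV (key/encoded-value) representation, an OpenURL 1.0 ContextObject is just a set of namespaced query parameters, which is what makes the framework reusable beyond link resolution. A minimal sketch in Python, with a hypothetical resolver address and invented referent values, while the KEV keys follow the Z39.88-2004 conventions:

      from urllib.parse import urlencode

      # KEV ContextObject for a journal-article referent; the resolver
      # base URL, article title, and journal title are placeholders.
      params = {
          "url_ver": "Z39.88-2004",
          "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
          "rft.atitle": "An example article",
          "rft.jtitle": "Journal of examples",
          "rft.date": "2006",
      }
      openurl = "http://resolver.example.org/menu?" + urlencode(params)
      print(openurl)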
  17. Cranefield, S.: Networked knowledge representation and exchange using UML and RDF (2001) 0.02
    0.016731909 = product of:
      0.075293586 = sum of:
        0.05872617 = weight(_text_:applications in 5896) [ClassicSimilarity], result of:
          0.05872617 = score(doc=5896,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34048924 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5896)
        0.016567415 = weight(_text_:of in 5896) [ClassicSimilarity], result of:
          0.016567415 = score(doc=5896,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2704316 = fieldWeight in 5896, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5896)
      0.22222222 = coord(2/9)
    
    Abstract
    This paper proposes the use of the Unified Modeling Language (UML) as a language for modelling ontologies for Web resources and the knowledge contained within them. To provide a mechanism for serialising and processing object diagrams representing knowledge, a pair of XSLT stylesheets has been developed to map from XML Metadata Interchange (XMI) encodings of class diagrams to corresponding RDF schemas and to Java classes representing the concepts in the ontologies. The Java code includes methods for marshalling and unmarshalling object-oriented information between in-memory data structures and RDF serialisations of that information. This provides a convenient mechanism for Java applications to share knowledge on the Web.
    Source
    Journal of digital information. 1(2001) no.8
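    The XSLT-based mapping pipeline described above can be exercised with any XSLT processor. The sketch below uses Python's lxml with a deliberately tiny one-rule stylesheet that rewrites a toy XMI-like class element into an RDFS class; both the input and the stylesheet are invented stand-ins for the paper's real XMI-to-RDF-schema transform:

      from lxml import etree

      # Toy XMI-like input; real XMI is considerably richer.
      xmi = etree.XML('<Model><Class name="Person"/></Model>')

      # One-rule stylesheet: each <Class name="..."/> becomes an rdfs:Class.
      xslt = etree.XML("""
      <xsl:stylesheet version="1.0"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
        <xsl:template match="/Model">
          <rdf:RDF>
            <xsl:for-each select="Class">
              <rdfs:Class rdf:ID="{@name}"/>
            </xsl:for-each>
          </rdf:RDF>
        </xsl:template>
      </xsl:stylesheet>
      """)

      transform = etree.XSLT(xslt)
      print(etree.tostring(transform(xmi), pretty_print=True).decode())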
  18. Van de Sompel, H.; Young, J.A.; Hickey, T.B.: Using the OAI-PMH ... differently (2003) 0.02
    0.016509151 = product of:
      0.07429118 = sum of:
        0.059322387 = weight(_text_:applications in 1191) [ClassicSimilarity], result of:
          0.059322387 = score(doc=1191,freq=4.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34394607 = fieldWeight in 1191, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1191)
        0.014968789 = weight(_text_:of in 1191) [ClassicSimilarity], result of:
          0.014968789 = score(doc=1191,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.24433708 = fieldWeight in 1191, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1191)
      0.22222222 = coord(2/9)
    
    Abstract
    The Open Archives Initiative's Protocol for Metadata Harvesting (OAI-PMH) was created to facilitate discovery of distributed resources. The OAI-PMH achieves this by providing a simple, yet powerful framework for metadata harvesting. Harvesters can incrementally gather records contained in OAI-PMH repositories and use them to create services covering the content of several repositories. The OAI-PMH has been widely accepted, and until recently, it has mainly been applied to make Dublin Core metadata about scholarly objects contained in distributed repositories searchable through a single user interface. This article describes innovative applications of the OAI-PMH that we have introduced in recent projects. In these projects, OAI-PMH concepts such as resource and metadata format have been interpreted in novel ways. The result of doing so illustrates the usefulness of the OAI-PMH beyond the typical resource discovery using Dublin Core metadata. Also, through the inclusion of XSL stylesheets in protocol responses, OAI-PMH repositories have been directly overlaid with an interface that allows users to navigate the contained metadata by means of a Web browser. In addition, through the introduction of PURL partial redirects, complex OAI-PMH protocol requests have been turned into simple URIs that can more easily be published and used in downstream applications.
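    The incremental harvesting model mentioned above is simple enough to show in full. A sketch using only the Python standard library, walking ListRecords responses and following resumptionTokens; the repository base URL is hypothetical, while the verb, parameter, and namespace names come from the OAI-PMH v2.0 specification:

      import urllib.parse
      import urllib.request
      import xml.etree.ElementTree as ET

      OAI = "{http://www.openarchives.org/OAI/2.0/}"
      BASE = "http://repository.example.org/oai"  # hypothetical endpoint

      def harvest(base_url, metadata_prefix="oai_dc"):
          """Yield every <record> element, following resumptionTokens."""
          params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
          while True:
              url = base_url + "?" + urllib.parse.urlencode(params)
              with urllib.request.urlopen(url) as response:
                  tree = ET.parse(response)
              yield from tree.iter(OAI + "record")
              token = tree.find(".//" + OAI + "resumptionToken")
              if token is None or not (token.text or "").strip():
                  return
              # Subsequent requests carry only the verb and the token.
              params = {"verb": "ListRecords", "resumptionToken": token.text}

      # for record in harvest(BASE): ...  # build a service over the records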
  19. Mongin, L.; Fu, Y.Y.; Mostafa, J.: Open Archives data Service prototype and automated subject indexing using D-Lib archive content as a testbed (2003) 0.02
    0.016302198 = product of:
      0.07335989 = sum of:
        0.015556021 = weight(_text_:of in 1167) [ClassicSimilarity], result of:
          0.015556021 = score(doc=1167,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.25392252 = fieldWeight in 1167, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1167)
        0.05780387 = weight(_text_:software in 1167) [ClassicSimilarity], result of:
          0.05780387 = score(doc=1167,freq=4.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.3719205 = fieldWeight in 1167, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=1167)
      0.22222222 = coord(2/9)
    
    Abstract
    The Indiana University School of Library and Information Science opened a new research laboratory in January 2003: the Indiana University School of Library and Information Science Information Processing Laboratory (IU IP Lab). The purpose of the new laboratory is to facilitate collaboration between scientists in the department in the areas of information retrieval (IR) and information visualization (IV) research. The lab has several areas of focus, including grid and cluster computing and a standard Java-based software platform to support plug-and-play research datasets, a selection of standard IR modules, and standard IV algorithms. Future development includes software to enable researchers to contribute datasets, IR algorithms, and visualization algorithms to the standard environment. We decided early on to use OAI-PMH as a resource discovery tool because it is consistent with our mission.
  20. Paskin, N.: Identifier interoperability : a report on two recent ISO activities (2006) 0.02
    0.015976388 = product of:
      0.047929164 = sum of:
        0.020973632 = weight(_text_:applications in 1179) [ClassicSimilarity], result of:
          0.020973632 = score(doc=1179,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.1216033 = fieldWeight in 1179, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1179)
        0.016735615 = weight(_text_:of in 1179) [ClassicSimilarity], result of:
          0.016735615 = score(doc=1179,freq=80.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.27317715 = fieldWeight in 1179, product of:
              8.944272 = tf(freq=80.0), with freq of:
                80.0 = termFreq=80.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1179)
        0.010219917 = weight(_text_:systems in 1179) [ClassicSimilarity], result of:
          0.010219917 = score(doc=1179,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.08488525 = fieldWeight in 1179, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1179)
      0.33333334 = coord(3/9)
    
    Abstract
    Two significant activities within ISO, the International Organisation for Standardization, are underway, each of which has potential implications for the management of content by digital libraries and their users. Moreover these two activities are complementary and have the potential to provide tools for significantly improved identifier interoperability. This article presents a report on these: the first activity investigates the practical implications of interoperability across the family of ISO TC46/SC9 identifiers (better known as the ISBN and related identifiers); the second activity is the implementation of an ontology-based data dictionary that could provide a mechanism for this, the ISO/IEC 21000-6 standard. ISO/TC 46 is the ISO Technical Committee responsible for standards of "Information and documentation". Subcommittee 9 (SC9) of that body is responsible for "Presentation, identification and description of documents": the standards that it manages are identifiers familiar to the content and digital library communities, including the International Standard Book Number (ISBN); International Standard Serial Number (ISSN); International Standard Recording Code (ISRC); International Standard Music Number (ISMN); International Standard Audio-visual Number (ISAN) and the related Version identifier for Audio-visual Works (V-ISAN); and the International Standard Musical Work Code (ISWC). Most recently ISO has introduced the International Standard Text Code (ISTC), and is about to consider standardisation of the DOI system. The ISO identifier schemes provide numbering schemes as labels of entities of "content": many of the identifiers have as referents abstract content entities ("works" rather than a specific physical or digital form: e.g., ISAN, ISWC, ISTC). The existing schemes are numbering management schemes, not tied to any specific implementation (hence for internet "actionability", these identifiers may be incorporated into URN, URI, or DOI formats, etc.). Recently SC9 has requested that new and revised identifier schemes specify mandatory structured metadata to specify the item identified; that metadata is now becoming key to interoperability.
    There has been continuing discussion over a number of years within ISO TC46 SC9 of the need for interoperability between the various standard identifiers for which this committee is responsible. However, the nature of what that interoperability might mean - and how it might be achieved - has not been well explored. Considerable amounts of work have been done on standardising the identification schemes within each media sector, by creating standard identifiers that can be used within that sector. Equally, much work has been done on creating standard or reference metadata sets that can be used to associate key metadata descriptors with content. Much less work has been done on the impact of cross-sector working. Relatively little is understood about the effect of using one industry's identifiers in another industry, or on attempting to import metadata from one identification scheme into a system based on another. In the long term it is clear that interoperability of all these media identifiers and metadata schemes will be required. What is not clear is what initial steps are likely to deliver this soonest. Under the auspices of ISO TC46, an ad hoc group of representatives of TC46 SC9 Registration Authorities and invited experts met in London in late 2005, in a facilitated workshop funded by the registration agencies (RAs) responsible for ISAN, ISWC, ISRC and DOI, to develop definitions and use cases, with the intention of providing a framework within which a more structured exploration of the issues might be undertaken. A report of the workshop prepared by Mark Bide of Rightscom Ltd. was used as the input for a wider discussion at the ISO TC46 meeting held in Thailand in February 2006, at which ISO TC46/SC9 agreed that Registration Authorities for ISRC, ISWC, ISAN, ISBN, ISSN and ISMN and the proposed RAs for ISTC and DOI should continue working on common issues relating to interoperability of identifier systems developed within TC46/SC9; some of the use cases have been selected for further in-depth investigation, in parallel with discussions on potential solutions.
    Section 2 below is based extensively on the report of the output from that workshop, with minor editorial changes to reflect points raised in the subsequent discussion. The second activity, not yet widely appreciated as being related, is the development of a content-focussed data dictionary within MPEG. ISO/IEC JTC 1/SC29, The Moving Picture Experts Group (MPEG), is formally a joint working group of ISO and the International Electrotechnical Commission. Originally best known for compression standards for audio, MPEG now includes the MPEG-21 "Multimedia Framework", which includes several components of digital rights management technology standardisation. Some of the components are already being used in digital library activities. One component is a Rights Data Dictionary that was established as a component to support activities such as the MPEG Rights Expression Language. In April 2005, the ISO/IEC Technical Management Board appointed a Registration Authority for the MPEG 21 Rights Data Dictionary (ISO/IEC Information technology - Multimedia framework (MPEG-21) - Part 6: Rights Data Dictionary, ISO/IEC 21000-6), and an implementation of the dictionary is about to be launched. However, the Dictionary design is based on a generic interoperability framework, and it will offer extensive additional possibilities. The design of the dictionary goes back to one of the major studies of the conceptual model of interoperability, <indecs>. Section 3 below provides a brief summary of the origins and possible applications of the ISO/IEC 21000-6 Dictionary.
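    One modest, concrete facet of the interoperability discussed here is syntactic "actionability": expressing each identifier in a common URI form. The sketch below does only that much in Python; the URN form for ISBN follows RFC 3187 and the DOI proxy form is standard doi.org resolution, but the mapping table itself is illustrative, not an output of the ISO work described above:

      # Illustrative scheme-to-URI mapping. This is a syntactic convenience
      # only; it says nothing about the semantic interoperability issues
      # the TC46/SC9 workshop addressed.
      URI_FORMS = {
          "isbn": "urn:isbn:{value}",   # RFC 3187
          "issn": "urn:issn:{value}",   # RFC 3044
          "doi": "https://doi.org/{value}",
      }

      def to_uri(scheme: str, value: str) -> str:
          try:
              return URI_FORMS[scheme.lower()].format(value=value)
          except KeyError:
              raise ValueError(f"no URI form registered for {scheme!r}")

      print(to_uri("isbn", "0-395-36341-1"))  # urn:isbn:0-395-36341-1
      print(to_uri("doi", "10.1000/182"))     # https://doi.org/10.1000/182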

Languages

  • e 163
  • d 4
  • i 1