Search (8 results, page 1 of 1)

Tappenbeck, I.; Wessel, C.: CARMEN : Content Analysis, Retrieval and Metadata: Effective Net-working. Ein Halbzeitbericht (2001) 0.01
```
0.0095827235 = product of:
  0.03353953 = sum of:
    0.005074066 = product of:
      0.02537033 = sum of:
        0.02537033 = weight(_text_:retrieval in 5900) [ClassicSimilarity], result of:
          0.02537033 = score(doc=5900,freq=6.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.23154683 = fieldWeight in 5900, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=5900)
      0.2 = coord(1/5)
    0.028465465 = product of:
      0.05693093 = sum of:
        0.05693093 = weight(_text_:zugriff in 5900) [ClassicSimilarity], result of:
          0.05693093 = score(doc=5900,freq=2.0), product of:
            0.2160124 = queryWeight, product of:
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.03622214 = queryNorm
            0.26355398 = fieldWeight in 5900, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.03125 = fieldNorm(doc=5900)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```
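The explain tree above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) and reading the coord factors directly off the tree; the function name `classic_term_score` is a hypothetical helper, not part of Lucene.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one weight(...) node of a ClassicSimilarity explain tree."""
    tf = math.sqrt(freq)                               # tf(freq=...)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # idf(docFreq=..., maxDocs=...)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight
    return query_weight * field_weight


# Values copied from the explain output for doc 5900.
QUERY_NORM = 0.03622214
FIELD_NORM = 0.03125
MAX_DOCS = 44218

retrieval = classic_term_score(6.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)
zugriff = classic_term_score(2.0, 308, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Inner clauses are scaled by coord(1/5) and coord(1/2),
# and their sum by the outer coord(2/7).
total = (retrieval * (1 / 5) + zugriff * (1 / 2)) * (2 / 7)

print(f"retrieval={retrieval:.8f} zugriff={zugriff:.8f} total={total:.10f}")
```

Lucene computes these values in single precision, so the recomputed numbers should agree with the explain output only to roughly six decimal places.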
Abstract

The CARMEN project started in October 1999 as a special funding measure within Global Info, with a planned duration of 29 months. The project focuses on advancing concepts and methods of document indexing that are meant to enable access to heterogeneous, decentrally distributed information collections and their management according to common principles. In doing so, CARMEN deliberately takes a different path from most previous approaches in this area, which try to establish homogeneity and consistency in a decentralized information landscape by technical means, developing methods for physically accessing several document spaces at the same time. A purely technical parallelization of access options is not sufficient, however, because it does not solve the main problem: the differences in content, structure, and concept between the individual data collections. To compensate for these differences, CARMEN is developing solutions and refinements in three areas: (1) metadata (document description, retrieval, management, archiving); (2) methods for dealing with the remaining heterogeneity of the data collections; (3) retrieval for structured documents with metadata and heterogeneous data types. These three task areas are closely related. On the one hand, developments in the area of metadata are intended to partially restore the consistency that has been lost and to place it on a footing suited to the new media. On the other hand, methods for handling heterogeneity are intended to relate documents with differing data relevance and subject indexing to one another, complemented on the retrieval side by a search method that does justice to the different data types. Within the overall CARMEN project, these aspects are handled through a division of labor.
Eight work packages (APs) address them in coordination with one another, each with a different focus. To support coordination among the work packages, the roughly 40 project staff met on 1 and 2 February 2001 for the "CARMEN middleOfTheRoad Workshop" in Bonn. At this workshop, the content-related and technical results achieved by the individual work packages in the first half of the project period were presented in a total of 17 presentations.
Tappenbeck, I.; Wessel, C.: CARMEN : Content Analysis, Retrieval and Metadata: Effective Net-working. Bericht über den middleOfTheRoad Workshop (2001) 0.01
```
0.0095827235 = product of:
  0.03353953 = sum of:
    0.005074066 = product of:
      0.02537033 = sum of:
        0.02537033 = weight(_text_:retrieval in 5901) [ClassicSimilarity], result of:
          0.02537033 = score(doc=5901,freq=6.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.23154683 = fieldWeight in 5901, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=5901)
      0.2 = coord(1/5)
    0.028465465 = product of:
      0.05693093 = sum of:
        0.05693093 = weight(_text_:zugriff in 5901) [ClassicSimilarity], result of:
          0.05693093 = score(doc=5901,freq=2.0), product of:
            0.2160124 = queryWeight, product of:
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.03622214 = queryNorm
            0.26355398 = fieldWeight in 5901, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.03125 = fieldNorm(doc=5901)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```
Abstract

The CARMEN project started in October 1999 as a special funding measure within Global Info, with a planned duration of 29 months. The project focuses on advancing concepts and methods of document indexing that are meant to enable access to heterogeneous, decentrally distributed information collections and their management according to common principles. In doing so, CARMEN deliberately takes a different path from most previous approaches in this area, which try to establish homogeneity and consistency in a decentralized information landscape by technical means, developing methods for physically accessing several document spaces at the same time. A purely technical parallelization of access options is not sufficient, however, because it does not solve the main problem: the differences in content, structure, and concept between the individual data collections. To compensate for these differences, CARMEN is developing solutions and refinements in three areas: (1) metadata (document description, retrieval, management, archiving); (2) methods for dealing with the remaining heterogeneity of the data collections; (3) retrieval for structured documents with metadata and heterogeneous data types. These three task areas are closely related. On the one hand, developments in the area of metadata are intended to partially restore the consistency that has been lost and to place it on a footing suited to the new media. On the other hand, methods for handling heterogeneity are intended to relate documents with differing data relevance and subject indexing to one another, complemented on the retrieval side by a search method that does justice to the different data types. Within the overall CARMEN project, these aspects are handled through a division of labor.
Eight work packages (APs) address them in coordination with one another, each with a different focus. To support coordination among the work packages, the roughly 40 project staff met on 1 and 2 February 2001 for the "CARMEN middleOfTheRoad Workshop" in Bonn. At this workshop, the content-related and technical results achieved by the individual work packages in the first half of the project period were presented in a total of 17 presentations.
Hickey, T.R.: CORC : a system for gateway creation (2000) 0.00
```
0.0013752056 = product of:
  0.009626439 = sum of:
    0.009626439 = product of:
      0.048132192 = sum of:
        0.048132192 = weight(_text_:system in 4870) [ClassicSimilarity], result of:
          0.048132192 = score(doc=4870,freq=6.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.42190298 = fieldWeight in 4870, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4870)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

CORC is an OCLC project that is developing tools and systems to enable libraries to provide enhanced access to Internet resources. By adapting and extending library techniques and procedures, we are developing a self-supporting system capable of describing a large and useful subset of the Web. CORC is more a system for hosting and supporting subject gateways than a gateway itself and relies on large-scale cooperation among libraries to maintain a centralized database. By supporting emerging metadata standards such as Dublin Core and other standards such as Unicode and RDF, CORC broadens the range of libraries and librarians able to participate. Current plans are for OCLC to offer CORC as a full service in July 2000.
Lam, V.-T.: Cataloging Internet resources : Why, what, how (2000) 0.00
```
7.9397525E-4 = product of:
  0.0055578267 = sum of:
    0.0055578267 = product of:
      0.027789133 = sum of:
        0.027789133 = weight(_text_:system in 967) [ClassicSimilarity], result of:
          0.027789133 = score(doc=967,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.2435858 = fieldWeight in 967, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=967)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

Internet resources have brought great excitement but also grave concerns to the library world, especially to the cataloging community. In spite of the various problematic aspects presented by Internet resources (poor organization, lack of stability, variable quality), catalogers have decided that they are worth cataloging, in particular those meeting library selection criteria. This paper tries to trace the decade-long history of the library community's efforts in providing an effective way to catalog Internet resources. Basically, its objective is to answer the following questions: Why catalog? What to catalog? How to catalog? Some issues of cataloging electronic journals and developments of the Dublin Core Metadata system are also discussed.
Crowston, K.; Kwasnik, B.H.: Can document-genre metadata improve information access to large digital collections? (2004) 0.00
```
7.398139E-4 = product of:
  0.005178697 = sum of:
    0.005178697 = product of:
      0.025893483 = sum of:
        0.025893483 = weight(_text_:retrieval in 824) [ClassicSimilarity], result of:
          0.025893483 = score(doc=824,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.23632148 = fieldWeight in 824, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=824)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

We discuss the issues of resolving the information-retrieval problem in large digital collections through the identification and use of document genres. Explicit identification of genre seems particularly important for such collections because any search usually retrieves documents with a diversity of genres that are undifferentiated by obvious clues as to their identity. Also, because most genres are characterized by both form and purpose, identifying the genre of a document provides information as to the document's purpose and its fit to the user's situation, which can be otherwise difficult to assess. We begin by outlining the possible role of genre identification in the information-retrieval process. Our assumption is that genre identification would enhance searching, first because we know that topic alone is not enough to define an information problem and, second, because search results containing genre information would be more easily understandable. Next, we discuss how information professionals have traditionally tackled the issues of representing genre in settings where topical representation is the norm. Finally, we address the issues of studying the efficacy of identifying genre in large digital collections. Because genre is often an implicit notion, studying it in a systematic way presents many problems. We outline a research protocol that would provide guidance for identifying Web document genres, for observing how genre is used in searching and evaluating search results, and finally for representing and visualizing genres.
Aldana, J.F.; Gómez, A.C.; Moreno, N.; Nebro, A.J.; Roldán, M.M.: Metadata functionality for semantic Web integration (2003) 0.00
```
5.918511E-4 = product of:
  0.0041429573 = sum of:
    0.0041429573 = product of:
      0.020714786 = sum of:
        0.020714786 = weight(_text_:retrieval in 2731) [ClassicSimilarity], result of:
          0.020714786 = score(doc=2731,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.18905719 = fieldWeight in 2731, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=2731)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

We propose an extension of a mediator architecture. This extension is oriented to ontology-driven data integration. In our architecture ontologies are not managed by an external component or service, but are integrated in the mediation layer. This approach implies rethinking the mediator design, but at the same time provides advantages from a database perspective. Some of these advantages include the application of optimization and evaluation techniques that use and combine information from all abstraction levels (physical schema, logical schema and semantic information defined by ontology). 1. Introduction Although the Web is probably the richest information repository in human history, users cannot specify what they want from it. Two major problems that arise in current search engines (Heflin, 2001) are: a) polysemy, when the same word is used with different meanings; b) synonymy, when two different words have the same meaning. Polysemy causes irrelevant information to be retrieved. Synonymy, on the other hand, causes useful documents to be lost. The lack of a capability to understand the context of the words and the relationships among required terms explains many of the lost and false results produced by search engines. The Semantic Web will bring structure to the meaningful content of Web pages, giving semantic relationships among terms and possibly avoiding the previous problems. Various proposals have appeared for meta-data representation and communication standards, and other services and tools that may eventually merge into the global Semantic Web (Berners-Lee, 2001). Hopefully, in the next few years we will see the universal adoption of open standards for representation and sharing of meta-information. In this environment, software agents roaming from page to page can readily carry out sophisticated tasks for users (Berners-Lee, 2001).
In this context, ontologies can be seen as metadata that represent the semantics of data, providing a standard vocabulary for a knowledge domain, as DTDs and XML Schema do. If its pages were so structured, the Web could be seen as a heterogeneous collection of autonomous databases. This suggests that techniques developed in the database area could be useful. Database research mainly deals with efficient storage and retrieval and with powerful query languages.
Özel, S.A.; Altingövde, I.S.; Ulusoy, Ö.; Özsoyoglu, G.; Özsoyoglu, Z.M.: Metadata-Based Modeling of Information Resources on the Web (2004) 0.00
```
5.6712516E-4 = product of:
  0.003969876 = sum of:
    0.003969876 = product of:
      0.01984938 = sum of:
        0.01984938 = weight(_text_:system in 2093) [ClassicSimilarity], result of:
          0.01984938 = score(doc=2093,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.17398985 = fieldWeight in 2093, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2093)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

This paper deals with the problem of modeling Web information resources using expert knowledge and personalized user information for improved Web searching capabilities. We propose a "Web information space" model, which is composed of Web-based information resources (HTML/XML [Hypertext Markup Language/Extensible Markup Language] documents on the Web), expert advice repositories (domain-expert-specified metadata for information resources), and personalized information about users (captured as user profiles that indicate users' preferences about experts as well as users' knowledge about topics). Expert advice, the heart of the Web information space model, is specified using topics and relationships among topics (called metalinks), along the lines of the recently proposed topic maps. Topics and metalinks constitute metadata that describe the contents of the underlying HTML/XML Web resources. The metadata specification process is semiautomated, and it exploits XML DTDs (Document Type Definition) to allow domain-expert guided mapping of DTD elements to topics and metalinks. The expert advice is stored in an object-relational database management system (DBMS). To demonstrate the practicality and usability of the proposed Web information space model, we created a prototype expert advice repository of more than one million topics/metalinks for the DBLP (Database and Logic Programming) Bibliography data set. We also present a query interface that provides sophisticated querying facilities for DBLP Bibliography resources using the expert advice repository.
Strötgen, R.; Kokkelink, S.: Metadatenextraktion aus Internetquellen : Heterogenitätsbehandlung im Projekt CARMEN (2001) 0.00
```
5.2312744E-4 = product of:
  0.003661892 = sum of:
    0.003661892 = product of:
      0.01830946 = sum of:
        0.01830946 = weight(_text_:retrieval in 5808) [ClassicSimilarity], result of:
          0.01830946 = score(doc=5808,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.16710453 = fieldWeight in 5808, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5808)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

The special funding measure CARMEN (Content Analysis, Retrieval and Metadata: Effective Networking), part of the GLOBAL INFO program funded by the BMB+F, aims to create suitable information systems for the distributed data collections in libraries, specialized information centers, and on the Internet in today's decentralized information world. Bringing these collections together is problematic less in technical terms than in terms of content and concept. Heterogeneity arises, for example, when different data collections use different thesauri or classifications for subject indexing, when metadata are recorded differently or not at all, or when intellectually prepared sources meet Internet documents that are, as a rule, completely unindexed. The CARMEN project tackles this problem with several methods: deductive-heuristic procedures generate metadata automatically from documents; statistical-quantitative methods map the differing uses of terms in the various collections onto one another; and intellectually compiled cross-concordances provide reliable transitions from one documentation language to another. For the extraction of metadata according to Dublin Core (above all author, title, institution, abstract, keywords), heuristics are developed on the basis of typical documents (dissertations from Math-Net in PostScript format and a wide variety of HTML files from the WWW servers of German social science institutions). The probability that the metadata obtained in this way are correct and trustworthy is assigned to the individual data items in the form of weights. The heuristics are iteratively implemented in an extraction tool, tested, and improved in order to increase the reliability of the procedures.
First prototypes of such transfer modules are currently being built at the University of Osnabrück and the InformationsZentrum Sozialwissenschaften Bonn, using mathematical and social science data collections.
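To make the idea of heuristic extraction with confidence weights concrete, here is a minimal sketch for a single Dublin Core element (title) in HTML input. It is not the CARMEN extraction tool: the three heuristics, their names, and the weights 0.9, 0.7, and 0.5 are all assumptions chosen for illustration.

```python
import re

# Hypothetical heuristics as (name, pattern, confidence weight).
# Neither the rules nor the weights are taken from the CARMEN project.
TITLE_HEURISTICS = [
    ("dc_meta", re.compile(r'<meta\s+name="DC\.title"\s+content="([^"]*)"', re.I), 0.9),
    ("title_tag", re.compile(r"<title[^>]*>(.*?)</title>", re.I | re.S), 0.7),
    ("first_h1", re.compile(r"<h1[^>]*>(.*?)</h1>", re.I | re.S), 0.5),
]


def extract_title_candidates(html):
    """Return title candidates, each tagged with the weight of its heuristic."""
    candidates = []
    for name, pattern, weight in TITLE_HEURISTICS:
        match = pattern.search(html)
        if not match:
            continue
        text = re.sub(r"<[^>]+>", "", match.group(1))  # drop nested markup
        text = " ".join(text.split())                  # normalize whitespace
        if text:
            candidates.append({"value": text, "heuristic": name, "weight": weight})
    # Most trusted candidate first.
    return sorted(candidates, key=lambda c: c["weight"], reverse=True)


sample = (
    "<html><head><title>Institute of Sociology</title></head>"
    "<body><h1>Welcome</h1></body></html>"
)
for candidate in extract_title_candidates(sample):
    print(candidate)
```

Keeping every candidate together with its weight, rather than only the best one, lets a later test-and-improve cycle adjust the weights without rerunning the extraction rules.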
