Search (22 results, page 1 of 2)

Eckert, K.: Thesaurus analysis and visualization in semantic search applications (2007) 0.22

0.2197166 = product of:
  0.26365992 = sum of:
    0.018133488 = weight(_text_:und in 3222) [ClassicSimilarity], result of:
      0.018133488 = score(doc=3222,freq=4.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.17315367 = fieldWeight in 3222, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3222)
    0.061184157 = weight(_text_:anwendung in 3222) [ClassicSimilarity], result of:
      0.061184157 = score(doc=3222,freq=2.0), product of:
        0.22876309 = queryWeight, product of:
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.04725067 = queryNorm
        0.2674564 = fieldWeight in 3222, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3222)
    0.020018218 = weight(_text_:des in 3222) [ClassicSimilarity], result of:
      0.020018218 = score(doc=3222,freq=2.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.15298408 = fieldWeight in 3222, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3222)
    0.08549531 = weight(_text_:prinzips in 3222) [ClassicSimilarity], result of:
      0.08549531 = score(doc=3222,freq=2.0), product of:
        0.27041927 = queryWeight, product of:
          5.723078 = idf(docFreq=392, maxDocs=44218)
          0.04725067 = queryNorm
        0.31615835 = fieldWeight in 3222, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.723078 = idf(docFreq=392, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3222)
    0.07882876 = product of:
      0.15765752 = sum of:
        0.15765752 = weight(_text_:thesaurus in 3222) [ClassicSimilarity], result of:
          0.15765752 = score(doc=3222,freq=16.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.7220435 = fieldWeight in 3222, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3222)
      0.5 = coord(1/2)
  0.8333333 = coord(5/6)

Abstract: The use of thesaurus-based indexing is a common approach for increasing the performance of information retrieval. In this thesis, we examine the suitability of a thesaurus for a given set of information and evaluate improvements of existing thesauri to get better search results. On this area, we focus on two aspects: 1. We demonstrate an analysis of the indexing results achieved by an automatic document indexer and the involved thesaurus. 2. We propose a method for thesaurus evaluation which is based on a combination of statistical measures and appropriate visualization techniques that support the detection of potential problems in a thesaurus. In this chapter, we give an overview of the context of our work. Next, we briefly outline the basics of thesaurus-based information retrieval and describe the Collexis Engine that was used for our experiments. In Chapter 3, we describe two experiments in automatically indexing documents in the areas of medicine and economics with corresponding thesauri and compare the results to available manual annotations. Chapter 4 describes methods for assessing thesauri and visualizing the result in terms of a treemap. We depict examples of interesting observations supported by the method and show that we actually find critical problems. We conclude with a discussion of open questions and future research in Chapter 5.
Imprint: Mannheim : Fakultät für Mathematik und Informatik
Theme: Konzeption und Anwendung des Prinzips Thesaurus

Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.19

0.19344495 = product of:
  0.23213395 = sum of:
    0.03243817 = weight(_text_:und in 3081) [ClassicSimilarity], result of:
      0.03243817 = score(doc=3081,freq=20.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.3097467 = fieldWeight in 3081, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.03125 = fieldNorm(doc=3081)
    0.048947323 = weight(_text_:anwendung in 3081) [ClassicSimilarity], result of:
      0.048947323 = score(doc=3081,freq=2.0), product of:
        0.22876309 = queryWeight, product of:
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.04725067 = queryNorm
        0.21396513 = fieldWeight in 3081, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.03125 = fieldNorm(doc=3081)
    0.027738057 = weight(_text_:des in 3081) [ClassicSimilarity], result of:
      0.027738057 = score(doc=3081,freq=6.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.21198097 = fieldWeight in 3081, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.03125 = fieldNorm(doc=3081)
    0.06839625 = weight(_text_:prinzips in 3081) [ClassicSimilarity], result of:
      0.06839625 = score(doc=3081,freq=2.0), product of:
        0.27041927 = queryWeight, product of:
          5.723078 = idf(docFreq=392, maxDocs=44218)
          0.04725067 = queryNorm
        0.25292668 = fieldWeight in 3081, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.723078 = idf(docFreq=392, maxDocs=44218)
          0.03125 = fieldNorm(doc=3081)
    0.054614164 = product of:
      0.10922833 = sum of:
        0.10922833 = weight(_text_:thesaurus in 3081) [ClassicSimilarity], result of:
          0.10922833 = score(doc=3081,freq=12.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.5002464 = fieldWeight in 3081, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.03125 = fieldNorm(doc=3081)
      0.5 = coord(1/2)
  0.8333333 = coord(5/6)

Abstract: Die Arbeit analysiert die dynamische Entwicklung und den Gebrauch von Thesaurusbegriffen. Zusätzlich konzentriert sie sich auf die Faktoren, die die Zahl von Indexbegriffen pro Dokument oder Zeitschrift beeinflussen. Als Untersuchungsobjekt dienten der MeSH und die entsprechende Datenbank "MEDLINE". Die wichtigsten Konsequenzen sind: 1. Der MeSH-Thesaurus hat sich durch drei unterschiedliche Phasen jeweils logarithmisch entwickelt. Solch einen Thesaurus sollte folgenden Gleichung folgen: "T = 3.076,6 Ln (d) - 22.695 + 0,0039d" (T = Begriffe, Ln = natürlicher Logarithmus und d = Dokumente). Um solch einen Thesaurus zu konstruieren, muss man demnach etwa 1.600 Dokumente von unterschiedlichen Themen des Bereiches des Thesaurus haben. Die dynamische Entwicklung von Thesauri wie MeSH erfordert die Einführung eines neuen Begriffs pro Indexierung von 256 neuen Dokumenten. 2. Die Verteilung der Thesaurusbegriffe erbrachte drei Kategorien: starke, normale und selten verwendete Headings. Die letzte Gruppe ist in einer Testphase, während in der ersten und zweiten Kategorie die neu hinzukommenden Deskriptoren zu einem Thesauruswachstum führen. 3. Es gibt ein logarithmisches Verhältnis zwischen der Zahl von Index-Begriffen pro Aufsatz und dessen Seitenzahl für die Artikeln zwischen einer und einundzwanzig Seiten. 4. Zeitschriftenaufsätze, die in MEDLINE mit Abstracts erscheinen erhalten fast zwei Deskriptoren mehr. 5. Die Findablity der nicht-englisch sprachigen Dokumente in MEDLINE ist geringer als die englische Dokumente. 6. Aufsätze der Zeitschriften mit einem Impact Factor 0 bis fünfzehn erhalten nicht mehr Indexbegriffe als die der anderen von MEDINE erfassten Zeitschriften. 7. In einem Indexierungssystem haben unterschiedliche Zeitschriften mehr oder weniger Gewicht in ihrem Findability. Die Verteilung der Indexbegriffe pro Seite hat gezeigt, dass es bei MEDLINE drei Kategorien der Publikationen gibt. Außerdem gibt es wenige stark bevorzugten Zeitschriften."
Footnote: Dissertation, Humboldt-Universität zu Berlin - Institut für Bibliotheks- und Informationswissenschaft.
Imprint: Berlin : Humboldt-Universität zu Berlin / Institut für Bibliotheks- und Informationswissenschaft
Theme: Konzeption und Anwendung des Prinzips Thesaurus

Knitel, M.: ¬The application of linked data principles to library data : opportunities and challenges (2012) 0.07
```
0.0685232 = product of:
  0.1370464 = sum of:
    0.022208897 = weight(_text_:und in 599) [ClassicSimilarity], result of:
      0.022208897 = score(doc=599,freq=6.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.21206908 = fieldWeight in 599, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0390625 = fieldNorm(doc=599)
    0.08652747 = weight(_text_:anwendung in 599) [ClassicSimilarity], result of:
      0.08652747 = score(doc=599,freq=4.0), product of:
        0.22876309 = queryWeight, product of:
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.04725067 = queryNorm
        0.3782405 = fieldWeight in 599, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.0390625 = fieldNorm(doc=599)
    0.028310036 = weight(_text_:des in 599) [ClassicSimilarity], result of:
      0.028310036 = score(doc=599,freq=4.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.21635216 = fieldWeight in 599, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.0390625 = fieldNorm(doc=599)
  0.5 = coord(3/6)
```
Abstract

Linked Data hat sich im Laufe der letzten Jahre zu einem vorherrschenden Thema der Bibliothekswissenschaft entwickelt. Als ein Standard für Erfassung und Austausch von Daten, bestehen zahlreiche Berührungspunkte mit traditionellen bibliothekarischen Techniken. Diese Arbeit stellt in einem ersten Teil die grundlegenden Technologien dieses neuen Paradigmas vor, um sodann deren Anwendung auf bibliothekarische Daten zu untersuchen. Den zentralen Prinzipien der Linked Data Initiative folgend, werden dabei die Adressierung von Entitäten durch URIs, die Anwendung des RDF Datenmodells und die Verknüpfung von heterogenen Datenbeständen näher beleuchtet. Den dabei zu Tage tretenden Herausforderungen der Sicherstellung von qualitativ hochwertiger Information, der permanenten Adressierung von Inhalten im World Wide Web sowie Problemen der Interoperabilität von Metadatenstandards wird dabei besondere Aufmerksamkeit geschenkt. Der letzte Teil der Arbeit skizziert ein Programm, welches eine mögliche Erweiterung der Suchmaschine des österreichischen Bibliothekenverbundes darstellt. Dessen prototypische Umsetzung erlaubt eine realistische Einschätzung der derzeitigen Möglichkeiten von Linked Data und unterstreicht viele der vorher theoretisch erarbeiteten Themengebiete. Es zeigt sich, dass für den voll produktiven Einsatz von Linked Data noch viele Hürden zu überwinden sind. Insbesondere befinden sich viele Projekte derzeit noch in einem frühen Reifegrad. Andererseits sind die Möglichkeiten, die aus einem konsequenten Einsatz von RDF resultieren würden, vielversprechend. RDF qualifiziert sich somit als Kandidat für den Ersatz von auslaufenden bibliographischen Datenformaten wie MAB oder MARC.
Oberhauser, O.: Card-Image Public Access Catalogues (CIPACs) : a critical consideration of a cost-effective alternative to full retrospective catalogue conversion (2002) 0.05
```
0.047503993 = product of:
  0.095007986 = sum of:
    0.032362055 = weight(_text_:und in 1703) [ClassicSimilarity], result of:
      0.032362055 = score(doc=1703,freq=26.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.3090199 = fieldWeight in 1703, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1703)
    0.042828906 = weight(_text_:anwendung in 1703) [ClassicSimilarity], result of:
      0.042828906 = score(doc=1703,freq=2.0), product of:
        0.22876309 = queryWeight, product of:
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.04725067 = queryNorm
        0.18721949 = fieldWeight in 1703, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.8414783 = idf(docFreq=948, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1703)
    0.019817024 = weight(_text_:des in 1703) [ClassicSimilarity], result of:
      0.019817024 = score(doc=1703,freq=4.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.1514465 = fieldWeight in 1703, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1703)
  0.5 = coord(3/6)
```
Footnote

Rez. in: ABI-Technik 21(2002) H.3, S.292 (E. Pietzsch): "Otto C. Oberhauser hat mit seiner Diplomarbeit eine beeindruckende Analyse digitalisierter Zettelkataloge (CIPACs) vorgelegt. Die Arbeit wartet mit einer Fülle von Daten und Statistiken auf, wie sie bislang nicht vorgelegen haben. BibliothekarInnen, die sich mit der Digitalisierung von Katalogen tragen, finden darin eine einzigartige Vorlage zur Entscheidungsfindung. Nach einem einführenden Kapitel bringt Oberhauser zunächst einen Überblick über eine Auswahl weltweit verfügbarer CIPACs, deren Indexierungsmethode (Binäre Suche, partielle Indexierung, Suche in OCR-Daten) und stellt vergleichende Betrachtungen über geographische Verteilung, Größe, Software, Navigation und andere Eigenschaften an. Anschließend beschreibt und analysiert er Implementierungsprobleme, beginnend bei Gründen, die zur Digitalisierung führen können: Kosten, Umsetzungsdauer, Zugriffsverbesserung, Stellplatzersparnis. Er fährt fort mit technischen Aspekten wie Scannen und Qualitätskontrolle, Image Standards, OCR, manueller Nacharbeit, Servertechnologie. Dabei geht er auch auf die eher hinderlichen Eigenschaften älterer Kataloge ein sowie auf die Präsentation im Web und die Anbindung an vorhandene Opacs. Einem wichtigen Aspekt, nämlich der Beurteilung durch die wichtigste Zielgruppe, die BibliotheksbenutzerInnen, hat Oberhauser eine eigene Feldforschung gewidmet, deren Ergebnisse er im letzten Kapitel eingehend analysiert. Anhänge über die Art der Datenerhebung und Einzelbeschreibung vieler Kataloge runden die Arbeit ab. Insgesamt kann ich die Arbeit nur als die eindrucksvollste Sammlung von Daten, Statistiken und Analysen zum Thema CIPACs bezeichnen, die mir bislang begegnet ist. Auf einen schön herausgearbeiteten Einzelaspekt, nämlich die weitgehende Zersplitterung bei den eingesetzten Softwaresystemen, will ich besonders eingehen: Derzeit können wir grob zwischen Komplettlösungen (eine beauftragte Firma führt als Generalunternehmung sämtliche Aufgaben von der Digitalisierung bis zur Ablieferung der fertigen Anwendung aus) und geteilten Lösungen (die Digitalisierung wird getrennt von der Indexierung und der Softwareerstellung vergeben bzw. im eigenen Hause vorgenommen) unterscheiden. Letztere setzen ein Projektmanagement im Hause voraus. Gerade die Softwareerstellung im eigenen Haus aber kann zu Lösungen führen, die kommerziellen Angeboten keineswegs nachstehen. Schade ist nur, daß die vielfältigen Eigenentwicklungen bislang noch nicht zu Initiativen geführt haben, die, ähnlich wie bei Public Domain Software, eine "optimale", kostengünstige und weithin akzeptierte Softwarelösung zum Ziel haben. Einige kritische Anmerkungen sollen dennoch nicht unerwähnt bleiben. Beispielsweise fehlt eine Differenzierung zwischen "Reiterkarten"-Systemen, d.h. solchen mit Indexierung jeder 20. oder 50. Karte, und Systemen mit vollständiger Indexierung sämtlicher Kartenköpfe, führt doch diese weitreichende Designentscheidung zu erheblichen Kostenverschiebungen zwischen Katalogerstellung und späterer Benutzung. Auch bei den statistischen Auswertungen der Feldforschung hätte ich mir eine feinere Differenzierung nach Typ des CIPAC oder nach Bibliothek gewünscht. So haben beispielsweise mehr als die Hälfte der befragten BenutzerInnen angegeben, die Bedienung des CIPAC sei zunächst schwer verständlich oder seine Benutzung sei zeitaufwendig gewesen. Offen beibt jedoch, ob es Unterschiede zwischen den verschiedenen Realisierungstypen gibt.
Geisriegler, E.: Enriching electronic texts with semantic metadata : a use case for the historical Newspaper Collection ANNO (Austrian Newspapers Online) of the Austrian National Libraryhek (2012) 0.04
```
0.040290773 = product of:
  0.080581546 = sum of:
    0.036266975 = weight(_text_:und in 595) [ClassicSimilarity], result of:
      0.036266975 = score(doc=595,freq=16.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.34630734 = fieldWeight in 595, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0390625 = fieldNorm(doc=595)
    0.028310036 = weight(_text_:des in 595) [ClassicSimilarity], result of:
      0.028310036 = score(doc=595,freq=4.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.21635216 = fieldWeight in 595, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.0390625 = fieldNorm(doc=595)
    0.016004534 = product of:
      0.03200907 = sum of:
        0.03200907 = weight(_text_:22 in 595) [ClassicSimilarity], result of:
          0.03200907 = score(doc=595,freq=2.0), product of:
            0.16546379 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04725067 = queryNorm
            0.19345059 = fieldWeight in 595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=595)
      0.5 = coord(1/2)
  0.5 = coord(3/6)
```
Abstract

Die vorliegende Master Thesis setzt sich mit der Frage nach Möglichkeiten der Anreicherung historischer Zeitungen mit semantischen Metadaten auseinander. Sie möchte außerdem analysieren, welcher Nutzen für vor allem geisteswissenschaftlich Forschende, durch die Anreicherung mit zusätzlichen Informationsquellen entsteht. Nach der Darstellung der Entwicklung der interdisziplinären 'Digital Humanities', wurde für die digitale Sammlung historischer Zeitungen (ANNO AustriaN Newspapers Online) der Österreichischen Nationalbibliothek ein Use Case entwickelt, bei dem 'Named Entities' (Personen, Orte, Organisationen und Daten) in ausgewählten Zeitungsausgaben manuell annotiert wurden. Methodisch wurde das Kodieren mit 'TEI', einem Dokumentenformat zur Kodierung und zum Austausch von Texten durchgeführt. Zusätzlich wurden zu allen annotierten 'Named Entities' Einträge in externen Datenbanken wie Wikipedia, Wikipedia Personensuche, der ehemaligen Personennamen- und Schlagwortnormdatei (jetzt Gemeinsame Normdatei GND), VIAF und dem Bildarchiv Austria gesucht und gegebenenfalls verlinkt. Eine Beschreibung der Ergebnisse des manuellen Annotierens der Zeitungsseiten schließt diesen Teil der Arbeit ab. In einem weiteren Abschnitt werden die Ergebnisse des manuellen Annotierens mit jenen Ergebnissen, die automatisch mit dem German NER (Named Entity Recognition) generiert wurden, verglichen und in ihrer Genauigkeit analysiert. Abschließend präsentiert die Arbeit einige Best Practice-Beispiele kodierter und angereicherter Zeitungsseiten, um den zusätzlichen Nutzen durch die Auszeichnung der 'Named Entities' und durch die Verlinkung mit externen Informationsquellen für die BenützerInnen darzustellen.

Date

3. 2.2013 18:00:22
Chen, X.: Indexing consistency between online catalogues (2008) 0.04
```
0.039648257 = product of:
  0.079296514 = sum of:
    0.031408124 = weight(_text_:und in 2209) [ClassicSimilarity], result of:
      0.031408124 = score(doc=2209,freq=12.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.29991096 = fieldWeight in 2209, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2209)
    0.020018218 = weight(_text_:des in 2209) [ClassicSimilarity], result of:
      0.020018218 = score(doc=2209,freq=2.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.15298408 = fieldWeight in 2209, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2209)
    0.027870173 = product of:
      0.055740345 = sum of:
        0.055740345 = weight(_text_:thesaurus in 2209) [ClassicSimilarity], result of:
          0.055740345 = score(doc=2209,freq=2.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.2552809 = fieldWeight in 2209, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2209)
      0.5 = coord(1/2)
  0.5 = coord(3/6)
```
Abstract

In der globalen Online-Umgebung stellen viele bibliographische Dienstleistungen integrierten Zugang zu unterschiedlichen internetbasierten OPACs zur Verfügung. In solch einer Umgebung erwarten Benutzer mehr Übereinstimmungen innerhalb und zwischen den Systemen zu sehen. Zweck dieser Studie ist, die Indexierungskonsistenz zwischen Systemen zu untersuchen. Währenddessen werden einige Faktoren, die die Indexierungskonsistenz beeinflussen können, untersucht. Wichtigstes Ziel dieser Studie ist, die Gründe für die Inkonsistenzen herauszufinden, damit sinnvolle Vorschläge gemacht werden können, um die Indexierungskonsistenz zu verbessern. Eine Auswahl von 3307 Monographien wurde aus zwei chinesischen bibliographischen Katalogen gewählt. Nach Hooper's Formel war die durchschnittliche Indexierungskonsistenz für Indexterme 64,2% und für Klassennummern 61,6%. Nach Rolling's Formel war sie für Indexterme 70,7% und für Klassennummern 63,4%. Mehrere Faktoren, die die Indexierungskonsistenz beeinflussen, wurden untersucht: (1) Indexierungsbereite; (2) Indexierungsspezifizität; (3) Länge der Monographien; (4) Kategorie der Indexierungssprache; (5) Sachgebiet der Monographien; (6) Entwicklung von Disziplinen; (7) Struktur des Thesaurus oder der Klassifikation; (8) Erscheinungsjahr. Gründe für die Inkonsistenzen wurden ebenfalls analysiert. Die Analyse ergab: (1) den Indexieren mangelt es an Fachwissen, Vertrautheit mit den Indexierungssprachen und den Indexierungsregeln, so dass viele Inkonsistenzen verursacht wurden; (2) der Mangel an vereinheitlichten oder präzisen Regeln brachte ebenfalls Inkonsistenzen hervor; (3) verzögerte Überarbeitungen der Indexierungssprachen, Mangel an terminologischer Kontrolle, zu wenige Erläuterungen und "siehe auch" Referenzen, sowie die hohe semantische Freiheit bei der Auswahl von Deskriptoren oder Klassen, verursachten Inkonsistenzen.

Imprint

Berlin : Humboldt-Universität / Institut für Bibliotheks- und Informationswissenschaft

Stojanovic, N.: Ontology-based Information Retrieval : methods and tools for cooperative query answering (2005) 0.02

0.022015212 = product of:
  0.066045634 = sum of:
    0.050031062 = product of:
      0.15009318 = sum of:
        0.15009318 = weight(_text_:3a in 701) [ClassicSimilarity], result of:
          0.15009318 = score(doc=701,freq=2.0), product of:
            0.4005917 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04725067 = queryNorm
            0.3746787 = fieldWeight in 701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=701)
      0.33333334 = coord(1/3)
    0.016014574 = weight(_text_:des in 701) [ClassicSimilarity], result of:
      0.016014574 = score(doc=701,freq=2.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.12238726 = fieldWeight in 701, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.03125 = fieldNorm(doc=701)
  0.33333334 = coord(2/6)

Content: Vgl.: http%3A%2F%2Fdigbib.ubka.uni-karlsruhe.de%2Fvolltexte%2Fdocuments%2F1627&ei=tAtYUYrBNoHKtQb3l4GYBw&usg=AFQjCNHeaxKkKU3-u54LWxMNYGXaaDLCGw&sig2=8WykXWQoDKjDSdGtAakH2Q&bvm=bv.44442042,d.Yms.
Footnote: Zur Erlangung des akademischen Grades eines Doktors der Wirtschaftswissenschaften (Dr. rer. pol.) von der Fakultaet fuer Wirtschaftswissenschaften der Universitaet Fridericiana zu Karlsruhe genehmigte Dissertation.

Schmolz, H.: Anaphora resolution and text retrieval : a lnguistic analysis of hypertexts (2013) 0.02

0.021893688 = product of:
  0.06568106 = sum of:
    0.025644625 = weight(_text_:und in 1810) [ClassicSimilarity], result of:
      0.025644625 = score(doc=1810,freq=2.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.24487628 = fieldWeight in 1810, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.078125 = fieldNorm(doc=1810)
    0.040036436 = weight(_text_:des in 1810) [ClassicSimilarity], result of:
      0.040036436 = score(doc=1810,freq=2.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.30596817 = fieldWeight in 1810, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.078125 = fieldNorm(doc=1810)
  0.33333334 = coord(2/6)

Content: Trägerin des VFI-Dissertationspreises 2014: "Überzeugende gründliche linguistische und quantitative Analyse eines im Information Retrieval bisher wenig beachteten Textelementes anhand eines eigens erstellten grossen Hypertextkorpus, einschliesslich der Evaluation selbsterstellter Auflösungsregeln für die Nutzung in künftigen IR-Systemen.".

Baier Benninger, P.: Model requirements for the management of electronic records (MoReq2) : Anleitung zur Umsetzung (2011) 0.02
```
0.021577148 = product of:
  0.06473144 = sum of:
    0.04070958 = weight(_text_:und in 4343) [ClassicSimilarity], result of:
      0.04070958 = score(doc=4343,freq=14.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.38872904 = fieldWeight in 4343, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.046875 = fieldNorm(doc=4343)
    0.02402186 = weight(_text_:des in 4343) [ClassicSimilarity], result of:
      0.02402186 = score(doc=4343,freq=2.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.18358089 = fieldWeight in 4343, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.046875 = fieldNorm(doc=4343)
  0.33333334 = coord(2/6)
```
Abstract

Viele auch kleinere Unternehmen, Verwaltungen und Organisationen sind angesichts eines wachsenden Berges von digitalen Informationen mit dem Ordnen und Strukturieren ihrer Ablagen beschäftigt. In den meisten Organisationen besteht ein Konzept der Dokumentenlenkung. Records Management verfolgt vor allem in zwei Punkten einen weiterführenden Ansatz. Zum einen stellt es über den Geschäftsalltag hinaus den Kontext und den Entstehungszusammenhang ins Zentrum und zum anderen gibt es Regeln vor, wie mit ungenutzten oder inaktiven Dokumenten zu verfahren ist. Mit den «Model Requirements for the Management of Electronic Records» - MoReq - wurde von der europäischen Kommission ein Standard geschaffen, der alle Kernbereiche des Records Managements und damit den gesamten Entstehungs-, Nutzungs-, Archivierungsund Aussonderungsbereich von Dokumenten abdeckt. In der «Anleitung zur Umsetzung» wird die umfangreiche Anforderungsliste von MoReq2 (August 2008) zusammengefasst und durch erklärende Abschnitte ergänzt, mit dem Ziel, als griffiges Instrument bei der Einführung eines Record Management Systems zu dienen.

Imprint

Chur : Hochschule für Technik und Wirtschaft / Arbeitsbereich Informationswissenschaft
Mair, M.: Increasing the value of meta data by using associative semantic networks (2002) 0.02
```
0.018265136 = product of:
  0.054795407 = sum of:
    0.030773548 = weight(_text_:und in 4972) [ClassicSimilarity], result of:
      0.030773548 = score(doc=4972,freq=8.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.29385152 = fieldWeight in 4972, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.046875 = fieldNorm(doc=4972)
    0.02402186 = weight(_text_:des in 4972) [ClassicSimilarity], result of:
      0.02402186 = score(doc=4972,freq=2.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.18358089 = fieldWeight in 4972, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.046875 = fieldNorm(doc=4972)
  0.33333334 = coord(2/6)
```
Abstract

Momentan verbreitete Methoden zur Strukturierung von Information können ihre Aufgabe immer schlechter befriedigend erfüllen. Der Grund dafür ist das explosive Wachstum menschlichen Wissens. Diese Diplomarbeit schlägt als einen möglichen Ausweg die Verwendung assoziativer semantischer Netzwerke vor. Maschinelles Wissensmanagement kann wesentlich intuitiver und einfacher benutzbar werden, wenn man sich die Art und Weise zunutze macht, mit der das menschliche Gehirn Informationen verarbeitet (im Speziellen assoziative Verbindungen). Der theoretische Teil dieser Arbeit diskutiert verschiedene Aspekte eines möglichen Designs eines semantischen Netzwerks mit assoziativen Verbindungen. Außer den Grundelementen und Problemen der Visualisierung werden hauptsächlich Verbesserungen ausgearbeitet, welche ein leistungsstarkes Arbeiten mit einem solchen Netzwerk erlauben. Im praktischen Teil wird ein Netzwerk-Prototyp mit den wichtigsten herausgearbeiteten Merkmalen implementiert. Die Basis der Applikation bildet der Hyperwave Information Server. Dieser detailiiertere Design-Teil gewährt tieferen Einblick in Software Requirements, Use Cases und teilweise auch in Klassendetails. Am Ende wird eine kurze Einführung in die Benutzung des implementierten Prototypen gegeben.
Haslhofer, B.: ¬A Web-based mapping technique for establishing metadata interoperability (2008) 0.01
```
0.011823257 = product of:
  0.03546977 = sum of:
    0.018133488 = weight(_text_:und in 3173) [ClassicSimilarity], result of:
      0.018133488 = score(doc=3173,freq=16.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.17315367 = fieldWeight in 3173, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.01953125 = fieldNorm(doc=3173)
    0.017336285 = weight(_text_:des in 3173) [ClassicSimilarity], result of:
      0.017336285 = score(doc=3173,freq=6.0), product of:
        0.13085164 = queryWeight, product of:
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.04725067 = queryNorm
        0.1324881 = fieldWeight in 3173, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.7693076 = idf(docFreq=7536, maxDocs=44218)
          0.01953125 = fieldNorm(doc=3173)
  0.33333334 = coord(2/6)
```
Content

Die Integration von Metadaten aus unterschiedlichen, heterogenen Datenquellen erfordert Metadaten-Interoperabilität, eine Eigenschaft die nicht standardmäßig gegeben ist. Metadaten Mapping Verfahren ermöglichen es Domänenexperten Metadaten-Interoperabilität in einem bestimmten Integrationskontext herzustellen. Mapping Lösungen sollen dabei die notwendige Unterstützung bieten. Während diese für den etablierten Bereich interoperabler Datenbanken bereits existieren, ist dies für Web-Umgebungen nicht der Fall. Betrachtet man das Ausmaß ständig wachsender strukturierter Metadaten und Metadatenschemata im Web, so zeichnet sich ein Bedarf nach Web-basierten Mapping Lösungen ab. Den Kern einer solchen Lösung bildet ein Mappingmodell, das die zur Spezifikation von Mappings notwendigen Sprachkonstrukte definiert. Existierende Semantic Web Sprachen wie beispielsweise RDFS oder OWL bieten zwar grundlegende Mappingelemente (z.B.: owl:equivalentProperty, owl:sameAs), adressieren jedoch nicht das gesamte Sprektrum möglicher semantischer und struktureller Heterogenitäten, die zwischen unterschiedlichen, inkompatiblen Metadatenobjekten auftreten können. Außerdem fehlen technische Lösungsansätze zur Überführung zuvor definierter Mappings in ausfu¨hrbare Abfragen. Als zentraler wissenschaftlicher Beitrag dieser Dissertation, wird ein abstraktes Mappingmodell pr¨asentiert, welches das Mappingproblem auf generischer Ebene reflektiert und Lösungsansätze zum Abgleich inkompatibler Schemata bietet. Instanztransformationsfunktionen und URIs nehmen in diesem Modell eine zentrale Rolle ein. Erstere überbrücken ein breites Spektrum möglicher semantischer und struktureller Heterogenitäten, während letztere das Mappingmodell in die Architektur des World Wide Webs einbinden. Auf einer konkreten, sprachspezifischen Ebene wird die Anbindung des abstrakten Modells an die RDF Vocabulary Description Language (RDFS) präsentiert, wodurch ein Mapping zwischen unterschiedlichen, in RDFS ausgedrückten Metadatenschemata ermöglicht wird. Das Mappingmodell ist in einen zyklischen Mappingprozess eingebunden, der die Anforderungen an Mappinglösungen in vier aufeinanderfolgende Phasen kategorisiert: mapping discovery, mapping representation, mapping execution und mapping maintenance. Im Rahmen dieser Dissertation beschäftigen wir uns hauptsächlich mit der Representation-Phase sowie mit der Transformation von Mappingspezifikationen in ausführbare SPARQL-Abfragen. Zur Unterstützung der Discovery-Phase bietet das Mappingmodell eine Schnittstelle zur Einbindung von Schema- oder Ontologymatching-Algorithmen. Für die Maintenance-Phase präsentieren wir ein einfaches, aber seinen Zweck erfüllendes Mapping-Registry Konzept. Auf Basis des Mappingmodells stellen wir eine Web-basierte Mediator-Wrapper Architektur vor, die Domänenexperten die Möglichkeit bietet, SPARQL-Mediationsschnittstellen zu definieren. Die zu integrierenden Datenquellen müssen dafür durch Wrapper-Komponenen gekapselt werden, welche die enthaltenen Metadaten im Web exponieren und SPARQL-Zugriff ermöglichen. Als beipielhafte Wrapper Komponente präsentieren wir den OAI2LOD Server, mit dessen Hilfe Datenquellen eingebunden werden können, die ihre Metadaten über das Open Archives Initative Protocol for Metadata Harvesting (OAI-PMH) exponieren. Im Rahmen einer Fallstudie zeigen wir, wie Mappings in Web-Umgebungen erstellt werden können und wie unsere Mediator-Wrapper Architektur nach wenigen, einfachen Konfigurationsschritten Metadaten aus unterschiedlichen, heterogenen Datenquellen integrieren kann, ohne dass dadurch die Notwendigkeit entsteht, eine Mapping Lösung in einer lokalen Systemumgebung zu installieren.

Farazi, M.: Faceted lightweight ontologies : a formalization and some experiments (2010) 0.01

0.010423139 = product of:
  0.06253883 = sum of:
    0.06253883 = product of:
      0.18761648 = sum of:
        0.18761648 = weight(_text_:3a in 4997) [ClassicSimilarity], result of:
          0.18761648 = score(doc=4997,freq=2.0), product of:
            0.4005917 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04725067 = queryNorm
            0.46834838 = fieldWeight in 4997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Content: PhD Dissertation at International Doctorate School in Information and Communication Technology. Vgl.: https%3A%2F%2Fcore.ac.uk%2Fdownload%2Fpdf%2F150083013.pdf&usg=AOvVaw2n-qisNagpyT0lli_6QbAQ.

Francu, V.: Multilingual access to information using an intermediate language (2003) 0.01
```
0.009831674 = product of:
  0.058990043 = sum of:
    0.058990043 = product of:
      0.117980085 = sum of:
        0.117980085 = weight(_text_:thesaurus in 1742) [ClassicSimilarity], result of:
          0.117980085 = score(doc=1742,freq=14.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.5403279 = fieldWeight in 1742, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.03125 = fieldNorm(doc=1742)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)
```
Abstract

While being theoretically so widely available, information can be restricted from a more general use by linguistic barriers. The linguistic aspects of the information languages and particularly the chances of an enhanced access to information by means of multilingual access facilities will make the substance of this thesis. The main problem of this research is thus to demonstrate that information retrieval can be improved by using multilingual thesaurus terms based on an intermediate or switching language to search with. Universal classification systems in general can play the role of switching languages for reasons dealt with in the forthcoming pages. The Universal Decimal Classification (UDC) in particular is the classification system used as example of a switching language for our objectives. The question may arise: why a universal classification system and not another thesaurus? Because the UDC like most of the classification systems uses symbols. Therefore, it is language independent and the problems of compatibility between such a thesaurus and different other thesauri in different languages are avoided. Another question may still arise? Why not then, assign running numbers to the descriptors in a thesaurus and make a switching language out of the resulting enumerative system? Because of some other characteristics of the UDC: hierarchical structure and terminological richness, consistency and control. One big problem to find an answer to is: can a thesaurus be made having as a basis a classification system in any and all its parts? To what extent this question can be given an affirmative answer? This depends much on the attributes of the universal classification system which can be favourably used to this purpose. Examples of different situations will be given and discussed upon beginning with those classes of UDC which are best fitted for building a thesaurus structure out of them (classes which are both hierarchical and faceted)...

Content

Inhalt: INFORMATION LANGUAGES: A LINGUISTIC APPROACH MULTILINGUAL ASPECTS IN INFORMATION STORAGE AND RETRIEVAL COMPATIBILITY AND CONVERTIBILITY OF INFORMATION LANGUAGES CURRENT TRENDS IN MULTILINGUAL ACCESS BUILDING UDC-BASED MULTILINGUAL THESAURI ONLINE APPLICATIONS OF THE UDC-BASED MULTILINGUAL THESAURI THE IMPACT OF SPECIFICITY ON THE RETRIEVAL POWER OF A UDC-BASED MULTILINGUAL THESAURUS FINAL REMARKS AND GENERAL CONCLUSIONS Proefschrift voorgelegd tot het behalen van de graad van doctor in de Taal- en Letterkunde aan de Universiteit Antwerpen. - Vgl.: http://dlist.sir.arizona.edu/1862/.

Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.01

0.008338511 = product of:
  0.050031062 = sum of:
    0.050031062 = product of:
      0.15009318 = sum of:
        0.15009318 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
          0.15009318 = score(doc=5820,freq=2.0), product of:
            0.4005917 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04725067 = queryNorm
            0.3746787 = fieldWeight in 5820, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Content: Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Vgl.: https%3A%2F%2Fwww.cs.cmu.edu%2F~cx%2Fpapers%2Fknowledge_based_text_representation.pdf&usg=AOvVaw0SaTSvhWLTh__Uz_HtOtl3.

Gordon, T.J.; Helmer-Hirschberg, O.: Report on a long-range forecasting study (1964) 0.01

0.0060356874 = product of:
  0.036214124 = sum of:
    0.036214124 = product of:
      0.07242825 = sum of:
        0.07242825 = weight(_text_:22 in 4204) [ClassicSimilarity], result of:
          0.07242825 = score(doc=4204,freq=4.0), product of:
            0.16546379 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04725067 = queryNorm
            0.4377287 = fieldWeight in 4204, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4204)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)

Date: 22. 6.2018 13:24:08
22. 6.2018 13:54:52

Schwarz, K.: Domain model enhanced search : a comparison of taxonomy, thesaurus and ontology (2005) 0.01
```
0.005255251 = product of:
  0.031531505 = sum of:
    0.031531505 = product of:
      0.06306301 = sum of:
        0.06306301 = weight(_text_:thesaurus in 4569) [ClassicSimilarity], result of:
          0.06306301 = score(doc=4569,freq=4.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.2888174 = fieldWeight in 4569, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.03125 = fieldNorm(doc=4569)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)
```
Abstract

The results of this thesis are intended to support the information architect in designing a solution for improved search in a corporate environment. Specifically we have examined the type of search problems that require a domain model to enhance the search process. There are several approaches to modeling a domain. We have considered 3 different types of domain modeling schemes; taxonomy, thesaurus and ontology. The intention is to support the information architect in making an informed choice between one or more of these schemes. In our opinion the main criteria for this choice are the modeling characteristics of a scheme and the suitability for application in the search process. The second chapter is a discussion of modeling characteristics of each scheme, followed by a comparison between them. This should give an information architect an idea of which aspects of a domain can be modeled with each scheme. What is missing here is an indication of the effort required to model a domain with each scheme. There are too many factors that influence the amount of required effort, ranging from measurable factors like domain size and resource characteristics to cultural matters such as the willingness to share knowledge and the existence of a project champion in the team to keep the project running. The third chapter shows what role domain models can play in each part of the search process. This gives an idea of the problems that domain models can solve. We have split the search process into individual parts to show that domain models can be applied very differently in the process. The fourth chapter makes recommendations about the suitability of each individualdomain modeling scheme for improving search. Each scheme has particular characteristics that make it especially suitable for a domain or a search problem. In the appendix each case study is described in detail. These descriptions are intended to serve as a benchmark. The current problem of the enterprise can be compared to those described to see which case study is most similar, which solution was chosen, which problems arose and how they were dealt with. An important issue that we have not touched upon in this thesis is that of maintenance. The real problems of a domain model are revealed when it is applied in a search system and its deficits and wrong assumptions become clear. Adaptation and maintenance are always required. Unfortunately we have not been able to glean sufficient information about maintenance issues from our case studies to draw any meaningful conclusions.

Seidlmayer, E.: ¬An ontology of digital objects in philosophy : an approach for practical use in research (2018) 0.00

0.0042311475 = product of:
  0.025386883 = sum of:
    0.025386883 = weight(_text_:und in 5496) [ClassicSimilarity], result of:
      0.025386883 = score(doc=5496,freq=4.0), product of:
        0.104724824 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.04725067 = queryNorm
        0.24241515 = fieldWeight in 5496, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5496)
  0.16666667 = coord(1/6)

Footnote: Master thesis Library and Information Science, Fakultät für Informations- und Kommunikationswissenschaften, Technische Hochschule Köln. Schön auch: Bei Google Scholar unter 'Eva, S.' nachgewiesen.
Imprint: Köln : Technische Hochschule / Fakultät für Informations- und Kommunikationswissenschaften

Ziemba, L.: Information retrieval with concept discovery in digital collections for agriculture and natural resources (2011) 0.00
```
0.0037160232 = product of:
  0.022296138 = sum of:
    0.022296138 = product of:
      0.044592276 = sum of:
        0.044592276 = weight(_text_:thesaurus in 4728) [ClassicSimilarity], result of:
          0.044592276 = score(doc=4728,freq=2.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.20422474 = fieldWeight in 4728, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.03125 = fieldNorm(doc=4728)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)
```
Abstract

The amount and complexity of information available in a digital form is already huge and new information is being produced every day. Retrieving information relevant to address a particular need becomes a significant issue. This work utilizes knowledge organization systems (KOS), such as thesauri and ontologies and applies information extraction (IE) and computational linguistics (CL) techniques to organize, manage and retrieve information stored in digital collections in the agricultural domain. Two real world applications of the approach have been developed and are available and actively used by the public. An ontology is used to manage the Water Conservation Digital Library holding a dynamic collection of various types of digital resources in the domain of urban water conservation in Florida, USA. The ontology based back-end powers a fully operational web interface, available at http://library.conservefloridawater.org. The system has demonstrated numerous benefits of the ontology application, including accurate retrieval of resources, information sharing and reuse, and has proved to effectively facilitate information management. The major difficulty encountered with the approach is that large and dynamic number of concepts makes it difficult to keep the ontology consistent and to accurately catalog resources manually. To address the aforementioned issues, a combination of IE and CL techniques, such as Vector Space Model and probabilistic parsing, with the use of Agricultural Thesaurus were adapted to automatically extract concepts important for each of the texts in the Best Management Practices (BMP) Publication Library--a collection of documents in the domain of agricultural BMPs in Florida available at http://lyra.ifas.ufl.edu/LIB. A new approach of domain-specific concept discovery with the use of Internet search engine was developed. Initial evaluation of the results indicates significant improvement in precision of information extraction. The approach presented in this work focuses on problems unique to agriculture and natural resources domain, such as domain specific concepts and vocabularies, but should be applicable to any collection of texts in digital format. It may be of potential interest for anyone who needs to effectively manage a collection of digital resources.
Markó, K.G.: Foundation, implementation and evaluation of the MorphoSaurus system (2008) 0.00
```
0.0032515205 = product of:
  0.019509122 = sum of:
    0.019509122 = product of:
      0.039018244 = sum of:
        0.039018244 = weight(_text_:thesaurus in 4415) [ClassicSimilarity], result of:
          0.039018244 = score(doc=4415,freq=2.0), product of:
            0.21834905 = queryWeight, product of:
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.04725067 = queryNorm
            0.17869665 = fieldWeight in 4415, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6210785 = idf(docFreq=1182, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4415)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)
```
Abstract

The proper handling of acronyms plays a crucial role in medical texts, e.g. in patient records, as well as in scientific literature. Chapter six presents an approach, in which acronyms are automatically acquired from (bio-) medical literature. Furthermore, acronyms and their definitions in different languages are linked to each other using the MorphoSaurus text processing system. Automatic word sense disambiguation is still one of the most challenging tasks in Natural Language Processing. In Chapter seven, cross-lingual considerations lead to a new methodology for automatic disambiguation applied to subwords. Beginning with Chapter eight, a series of applications based onMorphoSaurus are introduced. Firstly, the implementation of the subword approach within a crosslanguage information retrieval setting for the medical domain is described and evaluated on standard test document collections. In Chapter nine, this methodology is extended to multilingual information retrieval in the Web, for which user queries are translated into target languages based on the segmentation into subwords and their interlingual mappings. The cross-lingual, automatic assignment of document descriptors to documents is the topic of Chapter ten. A large-scale evaluation of a heuristic, as well as a statistical algorithm is carried out using a prominent medical thesaurus as a controlled vocabulary. In Chapter eleven, it will be shown how MorphoSaurus can be used to map monolingual, lexical resources across different languages. As a result, a large multilingual medical lexicon with high coverage and complete lexical information is built and evaluated against a comparable, already available and commonly used lexical repository for the medical domain. Chapter twelve sketches a few applications based on MorphoSaurus. The generality and applicability of the subword approach to other domains is outlined, and proof-of-concepts in real-world scenarios are presented. Finally, Chapter thirteen recapitulates the most important aspects of MorphoSaurus and the potential benefit of its employment in medical information systems is carefully assessed, both for medical experts in their everyday life, but also with regard to health care consumers and their existential information needs.

Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.00

0.0032009068 = product of:
  0.01920544 = sum of:
    0.01920544 = product of:
      0.03841088 = sum of:
        0.03841088 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
          0.03841088 = score(doc=563,freq=2.0), product of:
            0.16546379 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04725067 = queryNorm
            0.23214069 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)

Date: 10. 1.2013 19:22:47

Search (22 results, page 1 of 2)

Authors

Years

Themes

Classifications