Search (332 results, page 1 of 17)

Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.30

0.29584056 = product of:
  0.5916811 = sum of:
    0.01802184 = weight(_text_:information in 5777) [ClassicSimilarity], result of:
      0.01802184 = score(doc=5777,freq=12.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.2850541 = fieldWeight in 5777, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=5777)
    0.05779738 = weight(_text_:retrieval in 5777) [ClassicSimilarity], result of:
      0.05779738 = score(doc=5777,freq=14.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.5305404 = fieldWeight in 5777, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=5777)
    0.34320274 = weight(_text_:mathematisches in 5777) [ClassicSimilarity], result of:
      0.34320274 = score(doc=5777,freq=8.0), product of:
        0.30533072 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.036014426 = queryNorm
        1.1240361 = fieldWeight in 5777, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=5777)
    0.17265914 = weight(_text_:modell in 5777) [ClassicSimilarity], result of:
      0.17265914 = score(doc=5777,freq=8.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.79725945 = fieldWeight in 5777, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.046875 = fieldNorm(doc=5777)
  0.5 = coord(4/8)
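
The explain tree above can be cross-checked by hand. The sketch below recomputes the score of doc 5777 from the raw statistics in the tree, assuming the ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf values printed above. The function and variable names are illustrative only and are not part of any Lucene API.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explain tree
QUERY_NORM = 0.036014426  # queryNorm from the explain tree
FIELD_NORM = 0.046875     # fieldNorm(doc=5777)


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assuming 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq):
    """score = queryWeight * fieldWeight for one matching term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


# (termFreq, docFreq) per matched term, copied from the tree for doc 5777
matched_terms = {
    "information": (12.0, 20772),
    "retrieval": (14.0, 5836),
    "mathematisches": (8.0, 24),
    "modell": (8.0, 293),
}

total = sum(term_score(freq, df) for freq, df in matched_terms.values())
score = total * (4 / 8)   # coord(4/8): 4 of 8 query clauses matched

print(round(score, 4))    # 0.2958
```

Small differences in the last digits are expected, since the values in the tree appear to be single-precision floats while Python computes in double precision.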

Abstract: This book discusses many of the key design issues for building search engines and emphasizes the important role that applied mathematics can play in improving information retrieval. The authors discuss not only important data structures, algorithms, and software but also user-centered issues such as interfaces, manual indexing, and document preparation. They also present some of the current problems in information retrieval that may not be familiar to applied mathematicians and computer scientists and some of the driving computational methods (SVD, SDD) for automated conceptual indexing.
RSWK: Suchmaschine / Information Retrieval
World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
Subject: Suchmaschine / Information Retrieval
World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)

Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.15
```
0.14867443 = product of:
  0.29734886 = sum of:
    0.012977208 = weight(_text_:information in 7) [ClassicSimilarity], result of:
      0.012977208 = score(doc=7,freq=14.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.20526241 = fieldWeight in 7, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=7)
    0.041192 = weight(_text_:retrieval in 7) [ClassicSimilarity], result of:
      0.041192 = score(doc=7,freq=16.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.37811437 = fieldWeight in 7, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=7)
    0.16178733 = weight(_text_:mathematisches in 7) [ClassicSimilarity], result of:
      0.16178733 = score(doc=7,freq=4.0), product of:
        0.30533072 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.036014426 = queryNorm
        0.5298757 = fieldWeight in 7, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=7)
    0.0813923 = weight(_text_:modell in 7) [ClassicSimilarity], result of:
      0.0813923 = score(doc=7,freq=4.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.37583172 = fieldWeight in 7, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.03125 = fieldNorm(doc=7)
  0.5 = coord(4/8)
```
Abstract

The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example, a new chapter covers link-structure algorithms used in search engines such as Google. The chapter on user interface has been rewritten to focus specifically on search engine usability. In addition, the authors have added new recommendations for further reading, expanded the bibliography, and updated and streamlined the index to make it more reader-friendly.

Content

Contents:
Introduction: Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format
Document File Preparation: Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures
Vector Space Models: Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations
Matrix Decompositions: QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques
Query Management: Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries
Ranking and Relevance Feedback: Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback
Searching by Link Structure: HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary
User Interface Considerations: General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations
Further Reading

RSWK

Suchmaschine / Information Retrieval
Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)

Subject

Suchmaschine / Information Retrieval
Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)

Fuhr, N.: Theorie des Information Retrieval I : Modelle (2004) 0.05
```
0.053959094 = product of:
  0.14389092 = sum of:
    0.010619472 = weight(_text_:information in 2912) [ClassicSimilarity], result of:
      0.010619472 = score(doc=2912,freq=6.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.16796975 = fieldWeight in 2912, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2912)
    0.03153106 = weight(_text_:retrieval in 2912) [ClassicSimilarity], result of:
      0.03153106 = score(doc=2912,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.28943354 = fieldWeight in 2912, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2912)
    0.10174038 = weight(_text_:modell in 2912) [ClassicSimilarity], result of:
      0.10174038 = score(doc=2912,freq=4.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.46978965 = fieldWeight in 2912, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2912)
  0.375 = coord(3/8)
```
Abstract

Information retrieval (IR) models specify how, for a given query, the answer documents are determined from a document collection. Each model makes certain assumptions about the structure of documents and queries and then defines the so-called retrieval function, which determines the retrieval weight of a document with respect to a query; in Boolean retrieval, for example, this is one of the weights 0 or 1. The documents are then sorted by decreasing weight and presented to the user. Some basic characteristics of retrieval models are described first, before the individual models are discussed in more detail. As mentioned at the outset, each model makes assumptions about the structure of documents and queries. A document can be viewed either as a set or as a multiset of so-called terms; in the second case, multiple occurrences are taken into account. Here 'term' subsumes any search concept, which can be a single word, a multi-word expression, or a complex free-text pattern. This document representation is in turn mapped to a so-called document description, in which the individual terms may be weighted; this is the task of the indexing models described in Chapter B 5. In what follows we distinguish only between unweighted indexing (the weight of a term is either 0 or 1) and weighted indexing (the weight is a non-negative real number). As with documents, the terms in the query can be either unweighted or weighted. In addition, one distinguishes between linear queries (the query as a set of terms, unweighted or weighted) and Boolean queries.
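
The distinctions drawn in this abstract (set versus multiset documents, unweighted versus weighted indexing, Boolean versus linear queries, ranking by decreasing weight) can be made concrete with a small sketch. The toy documents, the use of raw term counts as weights, and the scalar product as the retrieval function for linear queries are assumptions chosen for illustration; they are not taken from Fuhr's chapter.

```python
from collections import Counter

# Toy collection (hypothetical): each document is a list of terms.
docs = {
    "d1": "retrieval model retrieval function".split(),
    "d2": "boolean retrieval".split(),
    "d3": "indexing model".split(),
}

# Unweighted indexing: document as a set of terms, every weight is 0 or 1.
set_index = {d: {t: 1.0 for t in set(terms)} for d, terms in docs.items()}

# Weighted indexing: document as a multiset; the raw count serves as the
# weight here (an assumption, any non-negative real weight would do).
bag_index = {d: {t: float(c) for t, c in Counter(terms).items()}
             for d, terms in docs.items()}


def boolean_and(query_terms, doc_terms):
    """Boolean retrieval function: 1 if all query terms occur, else 0."""
    return 1 if set(query_terms) <= set(doc_terms) else 0


def linear_score(query_weights, doc_weights):
    """Linear query: scalar product of query and document term weights."""
    return sum(w * doc_weights.get(t, 0.0) for t, w in query_weights.items())


def rank(query_weights, index):
    """Sort documents by decreasing retrieval weight (ties by document id)."""
    scored = [(linear_score(query_weights, w), d) for d, w in index.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


query = {"retrieval": 1.0, "model": 1.0}
print(rank(query, set_index))   # [(2.0, 'd1'), (1.0, 'd2'), (1.0, 'd3')]
print(rank(query, bag_index))   # [(3.0, 'd1'), (1.0, 'd2'), (1.0, 'd3')]
print([boolean_and(query, docs[d]) for d in ("d1", "d2", "d3")])  # [1, 0, 0]
```

With the multiset view, the repeated occurrence of "retrieval" in d1 raises its weight from 2.0 to 3.0, while the Boolean function only separates matching from non-matching documents.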

Source

Grundlagen der praktischen Information und Dokumentation. 5th, completely revised edition. 2 vols. Ed. by R. Kuhlen, Th. Seeger and D. Strauch. Founded by Klaus Laisiepen, Ernst Lutterbeck, Karl-Heinrich Meyer-Uhlenried. Vol. 1: Handbuch zur Einführung in die Informationswissenschaft und -praxis

Mandl, T.: Tolerantes Information Retrieval : Neuronale Netze zur Erhöhung der Adaptivität und Flexibilität bei der Informationssuche (2001) 0.04
```
0.043297358 = product of:
  0.11545962 = sum of:
    0.008842477 = weight(_text_:information in 5965) [ClassicSimilarity], result of:
      0.008842477 = score(doc=5965,freq=26.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.13986275 = fieldWeight in 5965, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.015625 = fieldNorm(doc=5965)
    0.025224846 = weight(_text_:retrieval in 5965) [ClassicSimilarity], result of:
      0.025224846 = score(doc=5965,freq=24.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.23154683 = fieldWeight in 5965, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.015625 = fieldNorm(doc=5965)
    0.0813923 = weight(_text_:modell in 5965) [ClassicSimilarity], result of:
      0.0813923 = score(doc=5965,freq=16.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.37583172 = fieldWeight in 5965, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.015625 = fieldNorm(doc=5965)
  0.375 = coord(3/8)
```
Abstract

A fundamental need in human-machine interaction is the search for information. Soft computing models lend themselves to designing information retrieval (IR) systems that are cognitively adequate and adapted to people. A comprehensive state-of-the-art report on neural networks in IR shows that most existing models do not exploit the potential of neural networks. The COSIMIR model (Cognitive Similarity learning in Information Retrieval) presented here is based on neural networks and learns to compute the similarity between query and document. It thus brings cognitive modelling into the core of an IR system. The transformation network is a further neural network, which learns how to handle heterogeneity from expert judgments. The COSIMIR model and the transformation network are discussed in detail and evaluated on real data sets.

Content

Chapters: 1 Introduction - 2 Foundations of information retrieval - 3 Foundations of neural networks - 4 Neural networks in information retrieval - 5 Heterogeneity and its treatment in information retrieval - 6 The COSIMIR model - 7 Experiments with the COSIMIR model and the transformation network - 8 Conclusion

Footnote

Rev. in: nfd - Information 54(2003) no.6, pp.379-380 (U. Thiel): "Did G. Salton, when developing the vector space model, know of the cybernetically oriented experiments with associative memory structures? The subject of this book reminded me of this and similar conjectures, which I discussed some years ago with Reginald Ferber and other colleagues. It can at least be said that the vector representation is a brilliantly simple way of representing both the "inverted files" used as the basic data structure in information retrieval (IR) and the associative memory matrices that evolved over time, via perceptrons, into neural networks (NN). This formal connection subsequently stimulated a series of approaches to using these networks in retrieval, and hybrid approaches that combine methods from both disciplines, as in the present volume, prove very suitable. But first things first... The book was submitted by the author as a dissertation to Department IV "Sprachen und Technik" (Languages and Technology) of the University of Hildesheim and results from a series of research contributions to several projects in which the author was involved at various locations between 1995 and 2000. This explains the unusual breadth of applications, scenarios and domains in which the results were obtained. Thus the COSIMIR model (COgnitive SIMilarity learning in Information Retrieval) developed in the thesis is not only evaluated on the classic Cranfield collection but is also used in the WING project of the University of Regensburg for fact retrieval from a materials database. Further experiments with the component called the "transformation network", whose task is to map weighting functions between two term spaces, round off the spectrum of experiments.
But it is not only the results presented that are diverse; the state-of-the-art overview offered to the reader also summarizes, in highly informative breadth, the essentials of the fields of IR and NN and illuminates the points where the two areas intersect. Alongside the foundations of text and fact retrieval, the approaches to improving adaptivity and to coping with heterogeneity are presented, while the foundations of neural networks cover, besides a general introduction to the basic concepts, the backpropagation model, Kohonen networks and Adaptive Resonance Theory (ART), among others. A further chapter presents the NN-oriented approaches in IR to date and rounds off the outline of the relevant research landscape. In preparation for the presentation of the COSIMIR model, the author inserts at this point a discursive chapter on heterogeneity in IR, in which the goals and basic assumptions of the work are reflected on once more. The dimensions of heterogeneity named are the object type, the quality of the objects and of their indexing, and multilinguality. Even though this classification essentially places the emphasis on problems from the projects touched on here, and aims less at a comprehensive treatment of, e.g., the literature on the problem of relevance, it is nevertheless helpful for understanding the design decisions behind the prototypes developed, which the following chapters often address only implicitly. The approach of handling heterogeneity through transformations is made concrete in the specific context of NN, while other possibilities, e.g. using the instruments of logic and probability theory, are discussed only briefly. A more extensive analysis would probably have stretched the scope of the work too far, since now, after almost 200 pages, the main part of the dissertation follows: the presentation and evaluation of the COSIMIR model already mentioned. The COSIMIR model "computes the similarity between the two input vectors applied" (p.194). The output of the network is read off at a single node, at which a so-called relevance value settles once the computations of the weightings of the internal nodes have completed. These weightings depend on the input vectors applied, from which the weights of the first layer of nodes are determined, and on the edge weights given in the network. The weighting of edges is the core of the neural approach: by analogy with the biological original (dendrite with synapses), the weight of an edge grows with every activation during a training phase. If, in this phase, two input vectors, e.g. document vector and query, are applied together with the relevance judgment as the value of the output node, the backpropagation process distributes the weights along the paths that exist between the nodes involved. Since all nodes are connected to one another, clearly different edge weights already emerge after several training examples, because the actively involved edges store the changes cumulatively. A variation of the procedure uses the NN as a "transformation network", where the two input vectors are assigned a document representation and an associated index entry (provided by an expert). Besides the need for training already noted, neural networks have a further intrinsic problem: the more outer nodes are needed, the more internal edges (and, when intermediate layers are used, also nodes) have to be managed, and their number grows non-linearly.
This algorithmic finding quickly sets limits to naive uses of NN models in practice, so it is all the more commendable that the author is able to propose an innovative way of solving the problem with the means of IR. He uses latent semantic indexing, which maps document representations from a high-dimensional vector space into a low-dimensional one, in order to reduce the number of nodes significantly. This achieves a very elegant synthesis, which demonstrates and exploits the formal correspondences between vector space models in IR and NN hinted at in the beginning.
In the concluding chapter of the book, the author reports on a series of experiments carried out in the context of different applications. The evaluations were carried out very carefully and are competently commented, so that the reader can form a picture of the complexity of the investigations. In terms of content the results are mixed; the usefulness of the NN approach depends heavily on the quantity and quality of the training material (thus the results on the Cranfield collection are worse than those of the traditional methods because of the small number of relevance judgments available). The experiment with materials information in the WING project is a rather traditional NN application: the "application similarity" of materials is to be inferred from feature vectors, which apparently works well. Here, however, the competing methods are to be sought less in IR than in the field of data mining. The experiments with text data are a stimulus to undertake further, more systematic investigations. For example, a comparison with classic one-shot IR methods is not enough; much more interesting and informative is the comparison of NN systems with IR systems capable of learning, which accumulate knowledge e.g. via relevance feedback (comparable to NN in the training phase). In the end there could be not only a unified model but also insights into which learning method is preferable when. Conclusion: the book is an outstanding example of the "Schriften zur Informationswissenschaft" series, with which the HI (Hochschulverband für Informationswissenschaft) has for many years been presenting the results of information science research to a wider audience. It offers a comprehensive overview of the dynamically developing field of neural networks in IR, which are setting out to make a "tolerant information retrieval" possible."
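
The reviewer's remark that the number of internal edges grows non-linearly with the number of outer nodes, and that reducing the dimensionality of the input vectors eases this, can be illustrated with simple arithmetic. The sketch below counts edges in a fully connected feed-forward network; the layer layout (two concatenated input vectors, one hidden layer of the same width, one output node) and the dimensions are assumptions for illustration and do not reproduce the architecture evaluated in the book.

```python
def edge_count(layer_sizes):
    """Edges in a fully connected feed-forward net with the given layers."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def two_input_layers(dim):
    """Hypothetical layout: query and document vectors of length `dim`
    side by side, a hidden layer of equal width, and one output node."""
    return [2 * dim, 2 * dim, 1]


for dim in (100, 1000, 10000):
    print(dim, edge_count(two_input_layers(dim)))
# 100 40200
# 1000 4002000
# 10000 400020000
```

Under these assumptions a tenfold increase in vector length yields roughly a hundredfold increase in edges, which is why mapping the vectors into a low-dimensional space before feeding them to the network reduces the cost so sharply.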

RSWK

Information Retrieval / Neuronales Netz

Subject

Information Retrieval / Neuronales Netz

Thompson, P.: Looking back: on relevance, probabilistic indexing and information retrieval (2008) 0.04

0.04068966 = product of:
  0.108505756 = sum of:
    0.024029119 = weight(_text_:information in 2074) [ClassicSimilarity], result of:
      0.024029119 = score(doc=2074,freq=12.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.38007212 = fieldWeight in 2074, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=2074)
    0.07134663 = weight(_text_:retrieval in 2074) [ClassicSimilarity], result of:
      0.07134663 = score(doc=2074,freq=12.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.6549133 = fieldWeight in 2074, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=2074)
    0.013130001 = product of:
      0.03939 = sum of:
        0.03939 = weight(_text_:29 in 2074) [ClassicSimilarity], result of:
          0.03939 = score(doc=2074,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.31092256 = fieldWeight in 2074, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=2074)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)
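
This tree differs from the earlier ones in that one clause is itself a nested "product of" a sum and its own coord factor. A minimal sketch of how such a tree combines, assuming it is simply evaluated bottom-up as products and sums of the numbers shown, follows. The tuple representation and the `evaluate` helper are hypothetical and not part of Lucene or Solr.

```python
from math import prod


def evaluate(node):
    """Evaluate an explain node: a number, or (op, children) where op is
    'product' or 'sum'."""
    if isinstance(node, (int, float)):
        return float(node)
    op, children = node
    values = [evaluate(child) for child in children]
    return prod(values) if op == "product" else sum(values)


# Leaf scores copied from the explain tree for doc 2074.
doc_2074 = ("product", [
    ("sum", [
        0.024029119,              # weight(_text_:information)
        0.07134663,               # weight(_text_:retrieval)
        ("product", [             # nested clause matching the term "29"
            ("sum", [0.03939]),
            1 / 3,                # coord(1/3)
        ]),
    ]),
    3 / 8,                        # coord(3/8)
])

score = evaluate(doc_2074)
print(round(score, 5))            # 0.04069
```

The inner coord(1/3) scales the nested clause before it is added to the outer sum, and the outer coord(3/8) then scales the whole sum.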

Abstract: Forty-eight years ago Maron and Kuhns published their paper, "On Relevance, Probabilistic Indexing and Information Retrieval" (1960). This was the first paper to present a probabilistic approach to information retrieval, and perhaps the first paper on ranked retrieval. Although it is one of the most widely cited papers in the field of information retrieval, many researchers today may not be familiar with its influence. This paper describes the Maron and Kuhns article and the influence that it has had on the field of information retrieval.
Date: 31. 7.2008 19:58:29
Source: Information processing and management. 44(2008) no.2, pp.963-970

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.04

0.038961656 = product of:
  0.10389775 = sum of:
    0.019619694 = weight(_text_:information in 402) [ClassicSimilarity], result of:
      0.019619694 = score(doc=402,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.3103276 = fieldWeight in 402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=402)
    0.058254283 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
      0.058254283 = score(doc=402,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.5347345 = fieldWeight in 402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.125 = fieldNorm(doc=402)
    0.026023773 = product of:
      0.07807132 = sum of:
        0.07807132 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.07807132 = score(doc=402,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Source: Information processing and management. 22(1986) no.6, pp.465-476

Effektive Information Retrieval Verfahren in Theorie und Praxis : ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005 (2006) 0.03
```
0.034723014 = product of:
  0.0925947 = sum of:
    0.0120145595 = weight(_text_:information in 5973) [ClassicSimilarity], result of:
      0.0120145595 = score(doc=5973,freq=48.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.19003606 = fieldWeight in 5973, product of:
          6.928203 = tf(freq=48.0), with freq of:
            48.0 = termFreq=48.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
    0.039883982 = weight(_text_:retrieval in 5973) [ClassicSimilarity], result of:
      0.039883982 = score(doc=5973,freq=60.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.36610767 = fieldWeight in 5973, product of:
          7.745967 = tf(freq=60.0), with freq of:
            60.0 = termFreq=60.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
    0.04069615 = weight(_text_:modell in 5973) [ClassicSimilarity], result of:
      0.04069615 = score(doc=5973,freq=4.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.18791586 = fieldWeight in 5973, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
  0.375 = coord(3/8)
```
Abstract

Information retrieval has developed into a key technology in the knowledge society. The number of daily queries to Internet search engines is only one indicator of the great importance of this topic. The volume covers topics such as information retrieval fundamentals, retrieval systems, digital libraries, evaluation and multilingual systems, describes application scenarios, and engages with new challenges for information retrieval. The contributions deal with current topics. The intensive participation of the information science group at the University of Hildesheim in the Cross Language Evaluation Forum (CLEF), a European evaluation initiative for research into multilingual retrieval systems, touches on several of the contributions. Application scenarios and the engagement with current and practical questions likewise play a major role.

Content

Contents:
Jan-Hendrik Scheufen: RECOIN: Modell offener Schnittstellen für Information-Retrieval-Systeme und -Komponenten
Markus Nick, Klaus-Dieter Althoff: Designing Maintainable Experience-based Information Systems
Gesine Quint, Steffen Weichert: Die benutzerzentrierte Entwicklung des Produkt-Retrieval-Systems EIKON der Blaupunkt GmbH
Claus-Peter Klas, Sascha Kriewel, André Schaefer, Gudrun Fischer: Das DAFFODIL System - Strategische Literaturrecherche in Digitalen Bibliotheken
Matthias Meiert: Entwicklung eines Modells zur Integration digitaler Dokumente in die Universitätsbibliothek Hildesheim
Daniel Harbig, René Schneider: Ontology Learning im Rahmen von MyShelf
Michael Kluck, Marco Winter: Topic-Entwicklung und Relevanzbewertung bei GIRT: ein Werkstattbericht
Thomas Mandl: Neue Entwicklungen bei den Evaluierungsinitiativen im Information Retrieval
Joachim Pfister: Clustering von Patent-Dokumenten am Beispiel der Datenbanken des Fachinformationszentrums Karlsruhe
Ralph Kölle, Glenn Langemeier, Wolfgang Semar: Programmieren lernen in kollaborativen Lernumgebungen
Olga Tartakovski, Margaryta Shramko: Implementierung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten
Nina Kummer: Indexierungstechniken für das japanische Retrieval
Suriya Na Nhongkai, Hans-Joachim Bentz: Bilinguale Suche mittels Konzeptnetzen
Robert Strötgen, Thomas Mandl, René Schneider: Entwicklung und Evaluierung eines Question Answering Systems im Rahmen des Cross Language Evaluation Forum (CLEF)
Niels Jensen: Evaluierung von mehrsprachigem Web-Retrieval: Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF)

Footnote

Rev. in: Information - Wissenschaft und Praxis 57(2006) no.5, pp.290-291 (C. Schindler): "Less than a year after the "Vierter Hildesheimer Evaluierungs- und Retrievalworkshop" (HIER 2005) in July 2005, the accompanying proceedings volume has appeared. The Hildesheim information science group had issued the invitation in order to present its research results, and those of some external experts, on the topic of information retrieval to a specialist audience and to put them up for discussion. Under the title "Effektive Information Retrieval Verfahren in Theorie und Praxis", almost all of the workshop's contributions are collected in the volume now published, which comprises 15 contributions. With its focus on information retrieval (IR), it presents a subfield of information science that has always been at the centre of information science research. Whether because of the increase in performance of processors and storage media, the spread of the Internet across national borders, or the steady growth in knowledge production, it remains the case that, in an increasingly interconnected world, orientation and the finding of documents in large knowledge collections have become a central challenge. The new volume presents current approaches to this topic, information retrieval, by means of practice-oriented projects and theoretical discussions. The core topic of information retrieval is subdivided in the volume into the areas Retrieval Systems, Digital Library, Evaluation and Multilingual Systems. The articles in the individual sections are on the whole quite heterogeneous and therefore do not overlap in content. However, complete thematic coverage of the different areas is not achieved either, which can only be expected to a limited extent when the research results of one institute and its cooperation partners are being presented.
Thus both the structure and the individual contributions reveal a thematic concentration that reflects the particular profile and distinctiveness of Hildesheim's information science in the field of information retrieval. Part of this is its multilingual and interdisciplinary orientation, which focuses on the interfaces between information science, linguistics, and computer science in its practice-oriented and international research.
The first chapter, "Retrieval Systems", presents various information retrieval systems and discusses methods for designing them. Jan-Hendrik Scheufen introduces RECOIN, a meta-framework for information retrieval research, which is distinguished by its flexible handling of a wide range of applications and thereby enables centralized logging and control of retrieval processes. This concept of an open, component-based system was implemented as a plug-in for the Java-based open-source platform Eclipse. In their contribution, which is incidentally the only English-language text in the book, Markus Nick and Klaus-Dieter Althoff explain the DILLEBIS method for the maintenance of experience-based information systems. They describe this approach as a Maintainable Experience-based Information System and argue that experience-based systems should be aligned with this model. Gesine Quint and Steffen Weichert, by contrast, present the user-centered development of the product retrieval system EIKON, which was realized in cooperation with Blaupunkt GmbH. Group-specific interaction options for a car multimedia accessory system were designed in an iterative design cycle. In the second chapter, several authors deal more specifically with the application area "Digital Library". Claus-Peter Klas, Sascha Kriewel, André Schaefer, and Gudrun Fischer of the University of Duisburg-Essen present the DAFFODIL system, which uses a variety of tools to provide strategic support for literature searches in digital libraries. In addition, the logging of all events allows the system to be used as an evaluation platform.
Matthias Meiert's paper explains the implementation of electronic publication processes at universities, using as an example the theses of the International Information Management degree program at the University of Hildesheim. In addition to the general conditions, both the current state and the target state of scholarly electronic publishing are presented in the form of group-specific recommendations. Daniel Harbig and René Schneider describe two approaches to the machine learning of ontologies, applied to the virtual library shelf MyShelf. After evaluating both approaches, the authors argue for a semi-automated procedure for creating ontologies.
"Evaluation", the topic of the third chapter, is not limited in its breadth to information retrieval but also includes individual aspects of human-machine interaction and e-learning. Michael Kluck and Marco Winter, of the Stiftung Wissenschaft und Politik and the Informationszentrum Sozialwissenschaften, address the influence of the question formulation (topic) on relevance assessment and show procedures for topic creation that are used at the Cross Language Evaluation Forum (CLEF). In the following paper, Thomas Mandl presents various evaluation initiatives in information retrieval and current developments. Joachim Pfister explains the automated grouping, known as clustering, of patent documents in the databases of the Fachinformationszentrum Karlsruhe and evaluates different clustering methods on the basis of user ratings. Ralph Kölle, Glenn Langemeier, and Wolfgang Semar turn to collaborative learning under the special conditions of programming. Here the VitaminL system for synchronous work on programming tasks and the K-3 metrics system for assessing collaborative work are applied in a course. The current research focus of Hildesheim's information science emerges in the fourth chapter under the heading "Multilingual Systems". Most of the volume's contributions are found here. Olga Tartakovski and Margaryta Shramko describe and test the LangIdent system, which identifies the language of monolingual and multilingual texts. Nina Kummer presents the peculiarities of Japanese characters and experimentally compares the different indexing techniques.
Suriya Na Nhongkai and Hans-Joachim Bentz present and test a bilingual search based on concept networks, in which the concept structure is the element connecting the two text collections. The development and evaluation of a multilingual question answering system within the Cross Language Evaluation Forum (CLEF), which allows specific questions to be formulated in everyday language, is the subject of the contribution by Robert Strötgen, Thomas Mandl, and René Schneider. The volume closes with the paper by Niels Jensen, who evaluates a multilingual web retrieval system, likewise in connection with CLEF, using the multilingual EuroGOV corpus.
In conclusion, the proceedings give a successful overview of the information retrieval projects of Hildesheim's information science and its cooperation partners. The individual contributions are very stimulating and of a high standard. Orientation within the volume, in terms of both content and structure, is a small obstacle for the reader. The relation of the individual articles to the topic of their chapter is briefly explained in the preface. However, orientation in the book is made harder by the lack of chapter headings at the start of the individual sections. It should also be mentioned that one of the articles bears a different title from the one announced in the table of contents. Readers who look past these formal shortcomings are richly rewarded with practice-oriented and theoretically grounded project descriptions and research results. This is especially so because current topics of information science are not only taken up but also developed further and shaped by the particular interdisciplinary and international conditions in Hildesheim. The various projects also show how well Hildesheim's information science is integrated into the community of supraregional information institutions and other German information science research groups. With a further opening of the group of experts, the workshop thus has the potential to become an institution in its own right in the field of information retrieval. In this sense, one can hope for further fruitful workshops and their publications. A next workshop of the University of Hildesheim on information retrieval, organized with the Information Retrieval special interest group of the Gesellschaft für Informatik, is already announced for 9 to 13 October 2006."

Crestani, F.: Combination of similarity measures for effective spoken document retrieval (2003) 0.03

0.034168962 = product of:
  0.09111723 = sum of:
    0.017167233 = weight(_text_:information in 4690) [ClassicSimilarity], result of:
      0.017167233 = score(doc=4690,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.27153665 = fieldWeight in 4690, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.109375 = fieldNorm(doc=4690)
    0.0509725 = weight(_text_:retrieval in 4690) [ClassicSimilarity], result of:
      0.0509725 = score(doc=4690,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.46789268 = fieldWeight in 4690, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.109375 = fieldNorm(doc=4690)
    0.022977501 = product of:
      0.0689325 = sum of:
        0.0689325 = weight(_text_:29 in 4690) [ClassicSimilarity], result of:
          0.0689325 = score(doc=4690,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.5441145 = fieldWeight in 4690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.109375 = fieldNorm(doc=4690)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Source: Journal of information science. 29(2003) no.2, S.87-96
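
The score breakdown for the Crestani record above can be checked by hand. The following is a minimal Python sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). These formulas are inferred from the numbers in the listing rather than stated in it, and the function and variable names are illustrative only.

```python
import math

# Constants copied from the explain output for doc 4690.
MAX_DOCS = 44218
QUERY_NORM = 0.036014426


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed formula, consistent with the listing)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """score = queryWeight * fieldWeight, as laid out in the explain tree."""
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


information = term_score(2.0, 20772, 0.109375)
retrieval = term_score(2.0, 5836, 0.109375)
# The "29" clause sits in a nested query with its own coord(1/3).
term_29 = term_score(2.0, 3565, 0.109375) * (1 / 3)

# Outer coord(3/8): three of the eight query clauses matched.
total = (information + retrieval + term_29) * (3 / 8)
print(round(total, 4))  # 0.0342, in line with the listed 0.034168962
```

The same function should reproduce the other records by swapping in their freq, docFreq, and fieldNorm values. Small differences in the last digits are expected, since the listing appears to use single-precision arithmetic.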

Cole, C.: Intelligent information retrieval: diagnosing information need : Part II: uncertainty expansion in a prototype of a diagnostic IR tool (1998) 0.03

0.03332717 = product of:
  0.08887245 = sum of:
    0.025486732 = weight(_text_:information in 6432) [ClassicSimilarity], result of:
      0.025486732 = score(doc=6432,freq=6.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.40312737 = fieldWeight in 6432, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=6432)
    0.043690715 = weight(_text_:retrieval in 6432) [ClassicSimilarity], result of:
      0.043690715 = score(doc=6432,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.40105087 = fieldWeight in 6432, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=6432)
    0.019695 = product of:
      0.059085 = sum of:
        0.059085 = weight(_text_:29 in 6432) [ClassicSimilarity], result of:
          0.059085 = score(doc=6432,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.46638384 = fieldWeight in 6432, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.09375 = fieldNorm(doc=6432)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Date: 11. 8.2001 14:48:29
Source: Information processing and management. 34(1998) no.6, S.721-731

Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.03

0.031155478 = product of:
  0.083081275 = sum of:
    0.019619694 = weight(_text_:information in 1422) [ClassicSimilarity], result of:
      0.019619694 = score(doc=1422,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.3103276 = fieldWeight in 1422, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=1422)
    0.05044969 = weight(_text_:retrieval in 1422) [ClassicSimilarity], result of:
      0.05044969 = score(doc=1422,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.46309367 = fieldWeight in 1422, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=1422)
    0.013011887 = product of:
      0.03903566 = sum of:
        0.03903566 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
          0.03903566 = score(doc=1422,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.30952093 = fieldWeight in 1422, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
Date: 22. 3.2003 19:27:23
Footnote: Contribution to a special issue: Mathematical, logical, and formal methods in information retrieval
Source: Journal of the American Society for Information Science and technology. 54(2003) no.4, S.285-301

Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.03

0.030484024 = product of:
  0.08129073 = sum of:
    0.01802184 = weight(_text_:information in 1451) [ClassicSimilarity], result of:
      0.01802184 = score(doc=1451,freq=12.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.2850541 = fieldWeight in 1451, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1451)
    0.053509977 = weight(_text_:retrieval in 1451) [ClassicSimilarity], result of:
      0.053509977 = score(doc=1451,freq=12.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.49118498 = fieldWeight in 1451, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1451)
    0.009758915 = product of:
      0.029276744 = sum of:
        0.029276744 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
          0.029276744 = score(doc=1451,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.23214069 = fieldWeight in 1451, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
Date: 22. 3.2003 19:27:36
Footnote: Introduction to the contributions of a special issue: Mathematical, logical, and formal methods in information retrieval
Source: Journal of the American Society for Information Science and technology. 54(2003) no.4, S.281-284

Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.03

0.02776444 = product of:
  0.11105776 = sum of:
    0.08828696 = weight(_text_:retrieval in 2134) [ClassicSimilarity], result of:
      0.08828696 = score(doc=2134,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.8104139 = fieldWeight in 2134, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.109375 = fieldNorm(doc=2134)
    0.0227708 = product of:
      0.0683124 = sum of:
        0.0683124 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
          0.0683124 = score(doc=2134,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.5416616 = fieldWeight in 2134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
      0.33333334 = coord(1/3)
  0.25 = coord(2/8)

Date: 30. 3.2001 13:32:22
Theme: Semantic environment in indexing and retrieval

Faloutsos, C.: Signature files (1992) 0.03

0.027476784 = product of:
  0.07327142 = sum of:
    0.009809847 = weight(_text_:information in 3499) [ClassicSimilarity], result of:
      0.009809847 = score(doc=3499,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.1551638 = fieldWeight in 3499, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=3499)
    0.05044969 = weight(_text_:retrieval in 3499) [ClassicSimilarity], result of:
      0.05044969 = score(doc=3499,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.46309367 = fieldWeight in 3499, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=3499)
    0.013011887 = product of:
      0.03903566 = sum of:
        0.03903566 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
          0.03903566 = score(doc=3499,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.30952093 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: Presents a survey and discussion of signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods; provides a classification of the signature methods that have appeared in the literature; describes the main representatives of each class, together with their relative advantages and drawbacks; and gives a list of applications as well as commercial or university prototypes that use the signature approach.
Date: 7. 5.1999 15:22:48
Source: Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates

Qi, Q.; Hessen, D.J.; Heijden, P.G.M. van der: Improving information retrieval through correspondence analysis instead of latent semantic analysis (2023) 0.03

0.026246186 = product of:
  0.06998983 = sum of:
    0.016451614 = weight(_text_:information in 1045) [ClassicSimilarity], result of:
      0.016451614 = score(doc=1045,freq=10.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.2602176 = fieldWeight in 1045, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1045)
    0.043690715 = weight(_text_:retrieval in 1045) [ClassicSimilarity], result of:
      0.043690715 = score(doc=1045,freq=8.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.40105087 = fieldWeight in 1045, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1045)
    0.0098475 = product of:
      0.0295425 = sum of:
        0.0295425 = weight(_text_:29 in 1045) [ClassicSimilarity], result of:
          0.0295425 = score(doc=1045,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.23319192 = fieldWeight in 1045, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=1045)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: The initial dimensions extracted by latent semantic analysis (LSA) of a document-term matrix have been shown to mainly display marginal effects, which are irrelevant for information retrieval. To improve the performance of LSA, usually the elements of the raw document-term matrix are weighted and the weighting exponent of singular values can be adjusted. An alternative information retrieval technique that ignores the marginal effects is correspondence analysis (CA). In this paper, the information retrieval performance of LSA and CA is empirically compared. Moreover, it is explored whether the two weightings also improve the performance of CA. The results for four empirical datasets show that CA always performs better than LSA. Weighting the elements of the raw data matrix can improve CA; however, it is data dependent and the improvement is small. Adjusting the singular value weighting exponent often improves the performance of CA; however, the extent of the improvement depends on the dataset and the number of dimensions.
Date: 15. 9.2023 12:28:29
Source: Journal of intelligent information systems [https://doi.org/10.1007/s10844-023-00815-y]

Kwok, K.L.: ¬A network approach to probabilistic information retrieval (1995) 0.03

0.025594871 = product of:
  0.06825299 = sum of:
    0.014714771 = weight(_text_:information in 5696) [ClassicSimilarity], result of:
      0.014714771 = score(doc=5696,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.23274569 = fieldWeight in 5696, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=5696)
    0.043690715 = weight(_text_:retrieval in 5696) [ClassicSimilarity], result of:
      0.043690715 = score(doc=5696,freq=8.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.40105087 = fieldWeight in 5696, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=5696)
    0.0098475 = product of:
      0.0295425 = sum of:
        0.0295425 = weight(_text_:29 in 5696) [ClassicSimilarity], result of:
          0.0295425 = score(doc=5696,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.23319192 = fieldWeight in 5696, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=5696)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: Shows how probabilistic information retrieval based on document components may be implemented as a feedforward (feedbackward) artificial neural network. The network supports adaptation of connection weights as well as the growing of new edges between queries and terms based on user relevance feedback data for training, and it reflects query modification and expansion in information retrieval. A learning rule is applied that can also be viewed as supporting sequential learning using a harmonic sequence learning rate. Experimental results with 4 standard small collections and a large Wall Street Journal collection show that small query expansion levels of about 30 terms can achieve most of the gains at the low-recall high-precision region, while larger expansion levels continue to provide gains at the high-recall low-precision region of a precision recall curve
Date: 29. 1.1996 18:42:14
Source: ACM transactions on information systems. 13(1995) no.3, S.324-353
Theme: Semantic environment in indexing and retrieval

Burgin, R.: ¬The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.02

0.024657592 = product of:
  0.06575358 = sum of:
    0.0061311545 = weight(_text_:information in 3365) [ClassicSimilarity], result of:
      0.0061311545 = score(doc=3365,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.09697737 = fieldWeight in 3365, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3365)
    0.051489998 = weight(_text_:retrieval in 3365) [ClassicSimilarity], result of:
      0.051489998 = score(doc=3365,freq=16.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.47264296 = fieldWeight in 3365, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3365)
    0.00813243 = product of:
      0.024397288 = sum of:
        0.024397288 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
          0.024397288 = score(doc=3365,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.19345059 = fieldWeight in 3365, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity but fail to find similar patterns for other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering in a retrieval environment. The poor performance of single link clustering appears to derive from that method's tendency to produce a small number of large, ill-defined document clusters. By contrast, the data examined here found the retrieval performance of the other clustering methods to be generally comparable. The data presented also provide an opportunity to examine the theoretical limits of cluster-based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations was found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigation.
Date: 22. 2.1996 11:20:06
Source: Journal of the American Society for Information Science. 46(1995) no.8, S.562-572

Baeza-Yates, R.A.: Introduction to data structures and algorithms related to information retrieval (1992) 0.02

0.024335619 = product of:
  0.097342476 = sum of:
    0.024524618 = weight(_text_:information in 3082) [ClassicSimilarity], result of:
      0.024524618 = score(doc=3082,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.38790947 = fieldWeight in 3082, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=3082)
    0.072817855 = weight(_text_:retrieval in 3082) [ClassicSimilarity], result of:
      0.072817855 = score(doc=3082,freq=8.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.6684181 = fieldWeight in 3082, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=3082)
  0.25 = coord(2/8)

Abstract: In this chapter we review the main concepts and data structures used in information retrieval, and we classify information retrieval-related algorithms.
Source: Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates

Ayadi, H.; Torjmen-Khemakhem, M.; Daoud, M.; Xiangji Huang, J.; Ben Jemaa, M.: MF-Re-Rank : a modality feature-based re-ranking model for medical image retrieval (2018) 0.02
```
0.02425378 = product of:
  0.06467675 = sum of:
    0.009809847 = weight(_text_:information in 4459) [ClassicSimilarity], result of:
      0.009809847 = score(doc=4459,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.1551638 = fieldWeight in 4459, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=4459)
    0.0483019 = weight(_text_:retrieval in 4459) [ClassicSimilarity], result of:
      0.0483019 = score(doc=4459,freq=22.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.44337842 = fieldWeight in 4459, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=4459)
    0.0065650004 = product of:
      0.019695 = sum of:
        0.019695 = weight(_text_:29 in 4459) [ClassicSimilarity], result of:
          0.019695 = score(doc=4459,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.15546128 = fieldWeight in 4459, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03125 = fieldNorm(doc=4459)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)
```
Abstract: One of the main challenges in medical image retrieval is the increasing volume of image data, which makes it difficult for domain experts to find relevant information in large data sets. Effective and efficient medical image retrieval systems are required to better manage medical image information. Text-based image retrieval (TBIR) was very successful in retrieving images with textual descriptions. Several TBIR approaches rely on models based on bag-of-words approaches, in which the image retrieval problem turns into one of standard text-based information retrieval, where the meanings and values of specific medical entities in the text and metadata are ignored in the image representation and retrieval process. However, we believe that TBIR should extract specific medical entities and terms and then exploit these elements to achieve better image retrieval results. Therefore, we propose a novel reranking method based on medical-image-dependent features. These features are manually selected by a medical expert from imaging modalities and medical terminology. First, we represent queries and images using only medical-image-dependent features such as image modality and image scale. Second, we exploit the defined features in a new reranking method for medical image retrieval. Our motivation is the large influence of image modality in medical image retrieval and its impact on image-relevance scores. To evaluate our approach, we performed a series of experiments on the medical ImageCLEF data sets from 2009 to 2013. The BM25 model, a language model, and an image-relevance feedback model are used as baselines to evaluate our approach. The experimental results show that, compared to the BM25 model, the proposed model significantly enhances image retrieval performance. We also compared our approach with other state-of-the-art approaches and show that it performs comparably to the top three runs in the official ImageCLEF competition.
Date: 29. 9.2018 11:43:31
Source: Journal of the Association for Information Science and Technology. 69(2018) no.9, pp.1095-1108

Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.02

0.024223361 = product of:
  0.06459563 = sum of:
    0.017167233 = weight(_text_:information in 1319) [ClassicSimilarity], result of:
      0.017167233 = score(doc=1319,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.27153665 = fieldWeight in 1319, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1319)
    0.036043 = weight(_text_:retrieval in 1319) [ClassicSimilarity], result of:
      0.036043 = score(doc=1319,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.33085006 = fieldWeight in 1319, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1319)
    0.0113854 = product of:
      0.0341562 = sum of:
        0.0341562 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
          0.0341562 = score(doc=1319,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.2708308 = fieldWeight in 1319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: Keyword-based querying is an immediate and efficient way to specify and retrieve the information a user is looking for. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes integrating two existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web.
Date: 1. 8.1996 22:08:06
Theme: Semantic context in indexing and retrieval

Paris, L.A.H.; Tibbo, H.R.: Freestyle vs. Boolean : a comparison of partial and exact match retrieval systems (1998) 0.02

0.024080943 = product of:
  0.06421585 = sum of:
    0.008583616 = weight(_text_:information in 3329) [ClassicSimilarity], result of:
      0.008583616 = score(doc=3329,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.13576832 = fieldWeight in 3329, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3329)
    0.04414348 = weight(_text_:retrieval in 3329) [ClassicSimilarity], result of:
      0.04414348 = score(doc=3329,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.40520695 = fieldWeight in 3329, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3329)
    0.011488751 = product of:
      0.03446625 = sum of:
        0.03446625 = weight(_text_:29 in 3329) [ClassicSimilarity], result of:
          0.03446625 = score(doc=3329,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.27205724 = fieldWeight in 3329, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3329)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: Compares the performance of a partial match option, LEXIS/NEXIS's Freestyle, with that of traditional Boolean retrieval. Defines natural language and the natural language search engines currently available. Although the Boolean searches had better results more often than the Freestyle searches, neither mechanism demonstrated superior performance for every query. These results do not in any way prove the superiority of partial match techniques or exact match techniques, but they do suggest that different queries demand different techniques. Further study and analysis are needed to determine which elements of a query make it best suited for partial match or exact match retrieval.
Date: 12. 3.1999 10:29:27
Source: Information processing and management. 34(1998) nos.2/3, pp.175-190
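
The score breakdown for this entry (doc 3329) can be recomputed by hand from the numbers it lists. The sketch below is a minimal illustration, assuming the formulas that the labels in the explanation imply: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is idf times queryNorm, fieldWeight is tf times idf times fieldNorm, and each coord(m/n) factor multiplies the sum it closes. The function names are illustrative only, and queryNorm is taken as given from the listing rather than derived. The listing appears to use single-precision arithmetic, so the results agree only up to rounding.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency as implied by the listing:
    # 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explanation for doc 3329.
MAX_DOCS = 44218
QUERY_NORM = 0.036014426  # taken as given, not derived here
FIELD_NORM = 0.0546875

information = term_score(2.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)
retrieval = term_score(6.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "29" clause sits in a nested sum with its own coord(1/3) factor.
term_29 = term_score(2.0, 3565, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 3)

# The outer sum is scaled by coord(3/8): 3 of 8 query clauses matched.
total = (information + retrieval + term_29) * (3 / 8)

print(information, retrieval, term_29, total)
```

Up to rounding, the result should agree with the 0.024080943 listed at the top of the breakdown, which the results page displays as 0.02 next to the title.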
