Search (107 results, page 1 of 6)

Effektive Information Retrieval Verfahren in Theorie und Praxis : ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005 (2006) 0.02
```
0.023282979 = product of:
  0.08149042 = sum of:
    0.03902347 = weight(_text_:beteiligung in 5973) [ClassicSimilarity], result of:
      0.03902347 = score(doc=5973,freq=2.0), product of:
        0.2379984 = queryWeight, product of:
          7.4202213 = idf(docFreq=71, maxDocs=44218)
          0.0320743 = queryNorm
        0.16396527 = fieldWeight in 5973, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          7.4202213 = idf(docFreq=71, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
    0.014372586 = weight(_text_:open in 5973) [ClassicSimilarity], result of:
      0.014372586 = score(doc=5973,freq=2.0), product of:
        0.14443703 = queryWeight, product of:
          4.5032015 = idf(docFreq=1330, maxDocs=44218)
          0.0320743 = queryNorm
        0.09950763 = fieldWeight in 5973, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.5032015 = idf(docFreq=1330, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
    0.017419131 = weight(_text_:source in 5973) [ClassicSimilarity], result of:
      0.017419131 = score(doc=5973,freq=2.0), product of:
        0.15900996 = queryWeight, product of:
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0320743 = queryNorm
        0.10954742 = fieldWeight in 5973, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
    0.010675229 = weight(_text_:web in 5973) [ClassicSimilarity], result of:
      0.010675229 = score(doc=5973,freq=4.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.1019847 = fieldWeight in 5973, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
  0.2857143 = coord(4/14)
```
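The score tree above can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of Lucene's API, and all constants are copied from the explain output for doc 5973.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # ClassicSimilarity term frequency: square root of the raw frequency
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, mirroring the two "product of" branches
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight

# Constants copied from the explain output for doc 5973
MAX_DOCS = 44218
QUERY_NORM = 0.0320743
FIELD_NORM = 0.015625

# term -> (freq in document, docFreq in index)
terms = {
    "beteiligung": (2.0, 71),
    "open": (2.0, 1330),
    "source": (2.0, 844),
    "web": (4.0, 4597),
}

partial = {
    term: term_score(freq, df, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for term, (freq, df) in terms.items()
}

# coord(4/14): 4 of the 14 query clauses matched this document
total = sum(partial.values()) * (4 / 14)
```

Because Lucene computes with 32-bit floats and the printed constants are rounded, `total` should agree with the reported 0.023282979 only to within a small tolerance.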
Abstract

Information retrieval has become a key technology in the knowledge society. The number of daily queries sent to Internet search engines is only one indicator of the topic's importance. The anthology covers information retrieval fundamentals, retrieval systems, digital libraries, evaluation, and multilingual systems; it describes application scenarios and addresses current topics and new challenges for information retrieval. The intensive participation of information science at the University of Hildesheim in the Cross Language Evaluation Forum (CLEF), a European evaluation initiative for research on multilingual retrieval systems, touches several of the contributions. Application scenarios and the engagement with current, practical questions likewise play a major role.

Content

Contents:
Jan-Hendrik Scheufen: RECOIN: Modell offener Schnittstellen für Information-Retrieval-Systeme und -Komponenten
Markus Nick, Klaus-Dieter Althoff: Designing Maintainable Experience-based Information Systems
Gesine Quint, Steffen Weichert: Die benutzerzentrierte Entwicklung des Produkt-Retrieval-Systems EIKON der Blaupunkt GmbH
Claus-Peter Klas, Sascha Kriewel, André Schaefer, Gudrun Fischer: Das DAFFODIL System - Strategische Literaturrecherche in Digitalen Bibliotheken
Matthias Meiert: Entwicklung eines Modells zur Integration digitaler Dokumente in die Universitätsbibliothek Hildesheim
Daniel Harbig, René Schneider: Ontology Learning im Rahmen von MyShelf
Michael Kluck, Marco Winter: Topic-Entwicklung und Relevanzbewertung bei GIRT: ein Werkstattbericht
Thomas Mandl: Neue Entwicklungen bei den Evaluierungsinitiativen im Information Retrieval
Joachim Pfister: Clustering von Patent-Dokumenten am Beispiel der Datenbanken des Fachinformationszentrums Karlsruhe
Ralph Kölle, Glenn Langemeier, Wolfgang Semar: Programmieren lernen in kollaborativen Lernumgebungen
Olga Tartakovski, Margaryta Shramko: Implementierung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten
Nina Kummer: Indexierungstechniken für das japanische Retrieval
Suriya Na Nhongkai, Hans-Joachim Bentz: Bilinguale Suche mittels Konzeptnetzen
Robert Strötgen, Thomas Mandl, René Schneider: Entwicklung und Evaluierung eines Question Answering Systems im Rahmen des Cross Language Evaluation Forum (CLEF)
Niels Jensen: Evaluierung von mehrsprachigem Web-Retrieval: Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF)

Footnote

The first chapter, "Retrieval-Systeme" (retrieval systems), presents several information retrieval systems and discusses approaches to their design. Jan-Hendrik Scheufen introduces RECOIN, a meta-framework for information retrieval research that handles a wide range of applications flexibly and thereby enables centralized logging and control of retrieval processes. This concept of an open, component-based system was implemented as a plug-in for the Java-based open-source platform Eclipse. Markus Nick and Klaus-Dieter Althoff, in the only English-language text in the book, explain the DILLEBIS method for the maintenance of experience-based information systems. They call this approach a Maintainable Experience-based Information System and argue that experience-based systems should be aligned with this model. Gesine Quint and Steffen Weichert, by contrast, present the user-centered development of the product retrieval system EIKON, which was realized in cooperation with Blaupunkt GmbH. Group-specific interaction options for a car multimedia accessory system were designed in an iterative design cycle. In the second chapter, several authors deal more specifically with the application area "Digitale Bibliothek" (digital library). Claus-Peter Klas, Sascha Kriewel, André Schaefer and Gudrun Fischer of the University of Duisburg-Essen present the DAFFODIL system, whose many tools provide strategic support for literature searches in digital libraries. In addition, the logging of all events allows the system to be used as an evaluation platform.
Matthias Meiert's paper explains the implementation of electronic publication processes at universities, using the example of theses from the International Information Management degree program at the University of Hildesheim. In addition to the general conditions, it presents both the current state and the target state of scholarly electronic publishing in the form of group-specific recommendations. Daniel Harbig and René Schneider describe two approaches to the machine learning of ontologies, applied to the virtual library shelf MyShelf. After evaluating both approaches, the authors argue for a semi-automated procedure for creating ontologies.
"Evaluierung" (evaluation), the topic of the third chapter, is broad: it is not restricted to information retrieval but also covers individual aspects of human-computer interaction and e-learning. Michael Kluck and Marco Winter, of the Stiftung Wissenschaft und Politik and the Informationszentrum Sozialwissenschaften, address the influence of the query formulation (topic) on relevance assessment and describe procedures for topic creation that are used at the Cross Language Evaluation Forum (CLEF). In the following paper, Thomas Mandl presents various evaluation initiatives in information retrieval and current developments. Joachim Pfister explains the automated grouping (clustering) of patent documents in the databases of the Fachinformationszentrum Karlsruhe and evaluates different clustering methods on the basis of user ratings. Ralph Kölle, Glenn Langemeier and Wolfgang Semar turn to collaborative learning under the particular conditions of programming. They apply the VitaminL system for synchronous work on programming exercises and the K-3 indicator system for assessing collaboration in a course. The current research focus of information science at Hildesheim emerges in the fourth chapter, "Multilinguale Systeme" (multilingual systems), which contains the largest number of contributions in the proceedings. Olga Tartakovski and Margaryta Shramko describe and test the LangIdent system, which identifies the language of monolingual and multilingual texts. Nina Kummer describes the peculiarities of Japanese characters and experimentally compares the different indexing techniques.
Suriya Na Nhongkai and Hans-Joachim Bentz present and test a bilingual search based on concept networks, in which the concept structure is the element linking the two text collections. The contribution by Robert Strötgen, Thomas Mandl and René Schneider covers the development and evaluation of a multilingual question answering system within the Cross Language Evaluation Forum (CLEF) that allows concrete questions to be phrased in everyday language. The volume closes with the paper by Niels Jensen, who evaluates a multilingual web retrieval system, likewise in connection with CLEF, using the multilingual EuroGOV corpus.
Meghabghab, G.: Google's Web page ranking applied to different topological Web graph structures (2001) 0.02
```
0.019150212 = product of:
  0.13405149 = sum of:
    0.04354783 = weight(_text_:source in 6028) [ClassicSimilarity], result of:
      0.04354783 = score(doc=6028,freq=2.0), product of:
        0.15900996 = queryWeight, product of:
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0320743 = queryNorm
        0.27386856 = fieldWeight in 6028, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6028)
    0.090503655 = weight(_text_:web in 6028) [ClassicSimilarity], result of:
      0.090503655 = score(doc=6028,freq=46.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.86461735 = fieldWeight in 6028, product of:
          6.78233 = tf(freq=46.0), with freq of:
            46.0 = termFreq=46.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6028)
  0.14285715 = coord(2/14)
```
Abstract

This research is part of the ongoing study to better understand web page ranking on the web. It looks at a web page as a graph structure or a web graph, and tries to classify different web graphs in the new coordinate space: (out-degree, in-degree). The out-degree coordinate od is defined as the number of outgoing web pages from a given web page. The in-degree coordinate id is the number of web pages that point to a given web page. In this new coordinate space a metric is built to classify how close or far apart different web graphs are. Google's web ranking algorithm (Brin & Page, 1998) for ranking web pages is applied in this new coordinate space. The results of the algorithm have been modified to fit different topological web graph structures. The algorithm was also not successful in the case of general web graphs, and new web ranking algorithms have to be considered. This study does not look at enhancing web ranking by adding any contextual information. It only considers web links as a source for web page ranking. The author believes that understanding the underlying web page as a graph will help design better web ranking algorithms and enhance retrieval and web performance, and recommends using graphs as part of a visual aid for browsing engine designers.
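As an illustration of the (out-degree, in-degree) coordinate space described in the abstract, here is a minimal sketch that maps each page of a small link graph to its coordinates. The edge list and the function name `degree_coordinates` are hypothetical; the abstract does not specify the metric built on top of these coordinates, so none is attempted here.

```python
from collections import defaultdict

def degree_coordinates(edges):
    """Map each page to its (out-degree, in-degree) pair.

    `edges` is an iterable of (source_page, target_page) hyperlinks.
    """
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1   # src has one more outgoing link
        in_deg[dst] += 1    # dst has one more incoming link
        nodes.update((src, dst))
    return {n: (out_deg[n], in_deg[n]) for n in sorted(nodes)}

# Hypothetical four-link web graph
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
coords = degree_coordinates(edges)
```

Here page A links to two pages and is linked from one, so it sits at (2, 1) in this coordinate space.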
Jiang, J.-D.; Jiang, J.-Y.; Cheng, P.-J.: Cocluster hypothesis and ranking consistency for relevance ranking in web search (2019) 0.01
```
0.01309183 = product of:
  0.091642804 = sum of:
    0.018871317 = weight(_text_:web in 5247) [ClassicSimilarity], result of:
      0.018871317 = score(doc=5247,freq=2.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.18028519 = fieldWeight in 5247, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5247)
    0.07277149 = weight(_text_:log in 5247) [ClassicSimilarity], result of:
      0.07277149 = score(doc=5247,freq=2.0), product of:
        0.205552 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0320743 = queryNorm
        0.3540296 = fieldWeight in 5247, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5247)
  0.14285715 = coord(2/14)
```
Abstract

Conventional approaches to relevance ranking typically optimize ranking models for each query separately. The traditional cluster hypothesis also does not consider the dependency between related queries. The goal of this paper is to leverage similar search intents to perform ranking consistency so that the search performance can be improved accordingly. Unlike the previous supervised approach, which learns relevance from click-through data, we propose a novel cocluster hypothesis to bridge the gap between relevance ranking and ranking consistency. A nearest-neighbors test is also designed to measure the extent to which the cocluster hypothesis holds. Based on the hypothesis, we further propose a two-stage unsupervised approach, in which two ranking heuristics and a cost function are developed to optimize the combination of consistency and uniqueness (or inconsistency). Extensive experiments have been conducted on a real and large-scale search engine log. The experimental results not only verify the applicability of the proposed cocluster hypothesis but also show that our approach is effective in boosting the retrieval performance of the commercial search engine and reaches a comparable performance to the supervised approach.

Habernal, I.; Konopík, M.; Rohlík, O.: Question answering (2012) 0.01

0.012040441 = product of:
  0.08428308 = sum of:
    0.052257393 = weight(_text_:source in 101) [ClassicSimilarity], result of:
      0.052257393 = score(doc=101,freq=2.0), product of:
        0.15900996 = queryWeight, product of:
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0320743 = queryNorm
        0.32864225 = fieldWeight in 101, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.046875 = fieldNorm(doc=101)
    0.032025687 = weight(_text_:web in 101) [ClassicSimilarity], result of:
      0.032025687 = score(doc=101,freq=4.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.3059541 = fieldWeight in 101, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=101)
  0.14285715 = coord(2/14)

Abstract: Question Answering is an area of information retrieval with the added challenge of applying sophisticated techniques to identify the complex syntactic and semantic relationships present in text in order to provide a more sophisticated and satisfactory response to the user's information needs. For this reason, the authors see question answering as the next step beyond standard information retrieval. In this chapter state of the art question answering is covered focusing on providing an overview of systems, techniques and approaches that are likely to be employed in the next generations of search engines. Special attention is paid to question answering using the World Wide Web as the data source and to question answering exploiting the possibilities of Semantic Web. Considerations about the current issues and prospects for promising future research are also provided.

Henzinger, M.R.: Link analysis in Web information retrieval (2000) 0.01
```
0.01179705 = product of:
  0.082579345 = sum of:
    0.034838263 = weight(_text_:source in 801) [ClassicSimilarity], result of:
      0.034838263 = score(doc=801,freq=2.0), product of:
        0.15900996 = queryWeight, product of:
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0320743 = queryNorm
        0.21909484 = fieldWeight in 801, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.03125 = fieldNorm(doc=801)
    0.047741078 = weight(_text_:web in 801) [ClassicSimilarity], result of:
      0.047741078 = score(doc=801,freq=20.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.45608947 = fieldWeight in 801, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=801)
  0.14285715 = coord(2/14)
```
Abstract

The analysis of the hyperlink structure of the web has led to significant improvements in web information retrieval. This survey describes two successful link analysis algorithms and the state of the art of the field.

Content

The goal of information retrieval is to find all documents relevant for a user query in a collection of documents. Decades of research in information retrieval were successful in developing and refining techniques that are solely word-based (see e.g., [2]). With the advent of the web, new sources of information became available, one of them being the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information for web information retrieval, as we will show in this article. This area of information retrieval is commonly called link analysis. Why would one expect hyperlinks to be useful? A hyperlink is a reference to a web page B that is contained in a web page A. When the hyperlink is clicked on in a web browser, the browser displays page B. This functionality alone is not helpful for web information retrieval. However, the way hyperlinks are typically used by authors of web pages can give them valuable information content. Typically, authors create links because they think they will be useful for the readers of the pages. Thus, links are usually either navigational aids that, for example, bring the reader back to the homepage of the site, or links that point to pages whose content augments the content of the current page. The second kind of link tends to point to high-quality pages that might be on the same topic as the page containing the link.
Chen, Z.; Fu, B.: On the complexity of Rocchio's similarity-based relevance feedback algorithm (2007) 0.01
```
0.011623001 = product of:
  0.162722 = sum of:
    0.162722 = weight(_text_:log in 578) [ClassicSimilarity], result of:
      0.162722 = score(doc=578,freq=10.0), product of:
        0.205552 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0320743 = queryNorm
        0.79163426 = fieldWeight in 578, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0390625 = fieldNorm(doc=578)
  0.071428575 = coord(1/14)
```
Abstract

Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive learning algorithm from examples in searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in the literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d**2(log d + log n)) over the discretized vector space {0, ... , n - 1}**d when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier (q, 0) over {0, ... , n - 1}**d can be improved to, at most, 1 + 2k (n - 1) (log d + log(n - 1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound Omega((d choose 2) log n) on its learning complexity over the Boolean vector space {0,1}**d.
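For readers unfamiliar with the algorithm being analysed, the following is a minimal sketch of the common textbook form of Rocchio's query update with the inner product as similarity measure. The abstract does not give the exact variant studied in the article, so the weights `alpha`, `beta`, `gamma`, their default values, and the toy vectors are all assumptions made for illustration.

```python
def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant and away from non-relevant documents.

    Textbook form: q' = alpha*q + beta*centroid(relevant) - gamma*centroid(non_relevant).
    The weight values are illustrative assumptions, not taken from the article.
    """
    dims = len(query)

    def centroid(docs):
        # Mean vector of a document set; zero vector if the set is empty
        if not docs:
            return [0.0] * dims
        return [sum(d[i] for d in docs) / len(docs) for i in range(dims)]

    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    return [alpha * query[i] + beta * rel_c[i] - gamma * non_c[i]
            for i in range(dims)]

def inner_product(u, v):
    # Similarity measure named in the abstract
    return sum(a * b for a, b in zip(u, v))

# Toy example in a 3-dimensional term space (hypothetical data)
q = [1.0, 0.0, 0.0]
rel = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
non = [[0.0, 0.0, 1.0]]
q2 = rocchio_update(q, rel, non)
```

After one round of feedback, the updated query `q2` scores the relevant documents higher than the non-relevant one under the inner product.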
Kleinberg, J.M.: Authoritative sources in a hyperlinked environment (1998) 0.01
```
0.010700425 = product of:
  0.074902974 = sum of:
    0.052257393 = weight(_text_:source in 5) [ClassicSimilarity], result of:
      0.052257393 = score(doc=5,freq=2.0), product of:
        0.15900996 = queryWeight, product of:
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0320743 = queryNorm
        0.32864225 = fieldWeight in 5, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.046875 = fieldNorm(doc=5)
    0.02264558 = weight(_text_:web in 5) [ClassicSimilarity], result of:
      0.02264558 = score(doc=5,freq=2.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.21634221 = fieldWeight in 5, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=5)
  0.14285715 = coord(2/14)
```
Abstract

The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of "authoritative" information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
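The relationship between authoritative pages and hub pages can be sketched as a mutual-reinforcement iteration, as it is commonly presented for this approach: a page's authority score sums the hub scores of pages linking to it, and its hub score sums the authority scores of the pages it links to. The code below is an illustrative sketch only; the function name `hits`, the fixed iteration count, and the toy link graph are assumptions, not details taken from the abstract.

```python
import math

def hits(edges, iterations=50):
    """Iteratively compute hub and authority scores for a directed link graph."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of hub scores of pages that link in
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub update: sum of authority scores of pages linked to
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        # Normalize both vectors to unit Euclidean length
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {n: v / a_norm for n, v in auth.items()}
        hub = {n: v / h_norm for n, v in new_hub.items()}
    return hub, auth

# Hypothetical graph: three hub-like pages pointing at two candidate authorities
edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a1"), ("h2", "a2"), ("h3", "a1")]
hub, auth = hits(edges)
```

With repeated normalization the score vectors settle toward principal eigenvectors of matrices derived from the link graph, which is the eigenvector connection the abstract mentions.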
Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
```
0.0093277525 = product of:
  0.065294266 = sum of:
    0.052257393 = weight(_text_:source in 2419) [ClassicSimilarity], result of:
      0.052257393 = score(doc=2419,freq=2.0), product of:
        0.15900996 = queryWeight, product of:
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0320743 = queryNorm
        0.32864225 = fieldWeight in 2419, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.046875 = fieldNorm(doc=2419)
    0.013036874 = product of:
      0.026073748 = sum of:
        0.026073748 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
          0.026073748 = score(doc=2419,freq=2.0), product of:
            0.11231873 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0320743 = queryNorm
            0.23214069 = fieldWeight in 2419, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)
```
Abstract

The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection over information seeking to the representation, organisation and reuse of information. By embedding high level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil followed by a qualitative evaluation. The evaluation has been conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.

Date

16.11.2008 16:22:48

Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.01

0.00871003 = product of:
  0.060970202 = sum of:
    0.045760516 = weight(_text_:web in 1319) [ClassicSimilarity], result of:
      0.045760516 = score(doc=1319,freq=6.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.43716836 = fieldWeight in 1319, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1319)
    0.015209687 = product of:
      0.030419374 = sum of:
        0.030419374 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
          0.030419374 = score(doc=1319,freq=2.0), product of:
            0.11231873 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0320743 = queryNorm
            0.2708308 = fieldWeight in 1319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)

Abstract: Keyword-based querying has been an immediate and efficient way to specify and retrieve the related information that the user inquires about. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes an idea to integrate 2 existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web.
Date: 1. 8.1996 22:08:06
Footnote: Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia

Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.01

0.008332577 = product of:
  0.058328032 = sum of:
    0.04529116 = weight(_text_:web in 2239) [ClassicSimilarity], result of:
      0.04529116 = score(doc=2239,freq=8.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.43268442 = fieldWeight in 2239, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=2239)
    0.013036874 = product of:
      0.026073748 = sum of:
        0.026073748 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
          0.026073748 = score(doc=2239,freq=2.0), product of:
            0.11231873 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0320743 = queryNorm
            0.23214069 = fieldWeight in 2239, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2239)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)

Abstract: Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR task (discovery of ranking functions for Web search) and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is well known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs on GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations on the design of fitness functions for genetic-based information retrieval experiments.
Date: 31. 5.2004 19:22:06

Reimer, U.: Empfehlungssysteme (2023) 0.01
```
0.008149719 = product of:
  0.11409606 = sum of:
    0.11409606 = weight(_text_:benutzer in 519) [ClassicSimilarity], result of:
      0.11409606 = score(doc=519,freq=4.0), product of:
        0.18291734 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0320743 = queryNorm
        0.6237575 = fieldWeight in 519, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0546875 = fieldNorm(doc=519)
  0.071428575 = coord(1/14)
```
Abstract

As the flood of information grows, so do the demands on information systems to select, from the mass of potentially relevant information, the items most relevant in a given context. Recommender systems play a special role here because they can filter out relevant information in a personalized way, that is, context-specifically and individually for each user. Definition: a recommender system recommends to a user, in a defined context, a subset of a given set of recommendation objects as relevant. Recommender systems draw users' attention to objects they might never have found, either because they would not have searched for them or because the objects would have been lost in the sheer volume of relevant information.
Hancock-Beaulieu, M.; Walker, S.: An evaluation of automatic query expansion in an online library catalogue (1992) 0.01
```
0.00727715 = product of:
  0.101880096 = sum of:
    0.101880096 = weight(_text_:log in 2731) [ClassicSimilarity], result of:
      0.101880096 = score(doc=2731,freq=2.0), product of:
        0.205552 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0320743 = queryNorm
        0.49564147 = fieldWeight in 2731, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2731)
  0.071428575 = coord(1/14)
```
Abstract

An automatic query expansion (AQE) facility in an online catalogue was evaluated in an operational library setting. The OKAPI experimental system had other features including: ranked output 'best match' keyword searching, automatic stemming, spelling normalisation and cross referencing as well as relevance feedback. A combination of transaction log analysis, search replays, questionnaires and interviews was used for data collection. Findings show that contrary to previous results, AQE was beneficial in a substantial number of searches. Use intentions, the effectiveness of the 'best match' search and user interaction were identified as the main factors affecting the take-up of the query expansion facility.
Fuhr, N.: Zur Überwindung der Diskrepanz zwischen Retrievalforschung und -praxis (1990) 0.01
```
0.0065859677 = product of:
  0.09220354 = sum of:
    0.09220354 = weight(_text_:benutzer in 6625) [ClassicSimilarity], result of:
      0.09220354 = score(doc=6625,freq=2.0), product of:
        0.18291734 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0320743 = queryNorm
        0.5040722 = fieldWeight in 6625, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0625 = fieldNorm(doc=6625)
  0.071428575 = coord(1/14)
```
Abstract

This contribution presents some research results from information retrieval that can be used directly to improve retrieval quality for existing databases: linguistic algorithms for reduction to base forms and stems support the search for inflected and derived forms of search terms. Ranking algorithms that weight query and document terms lead to significantly better retrieval results than Boolean retrieval. Relevance feedback can further improve retrieval quality and also support users in successively modifying their query formulation. A user-friendly interface is presented for a system based on these concepts.
Lanvent, A.: Praxis - Windows-Suche und Indexdienst : Auch Windows kann bei der Suche den Turbo einlegen: mit dem Indexdienst (2004) 0.01
```
0.0058212285 = product of:
  0.08149719 = sum of:
    0.08149719 = weight(_text_:benutzer in 3316) [ClassicSimilarity], result of:
      0.08149719 = score(doc=3316,freq=4.0), product of:
        0.18291734 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0320743 = queryNorm
        0.44554108 = fieldWeight in 3316, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3316)
  0.071428575 = coord(1/14)
```
Content

"For a 4-GByte hard disk with several partitions, Windows XP takes well over two hours to search in full-text mode. The indexing service cuts this search time drastically, by more than an hour. Unlike the indexes of commercial search tools, the Windows indexing service covers only text, HTML and Office files, via correspondingly integrated document filters. Because it neither recognizes ZIP files or PDFs nor scans e-mails, it is quickly overwhelmed by complex queries. By default the indexing service is installed but not activated. The user activates it via Start/Arbeitsplatz and the command Verwalten from the context menu. In Computerverwaltung, the user selects the entry Indexdienst and chooses Starten from the context menu. Windows manages the items to be indexed through so-called catalogs, with which the user determines which file types from which folders are to be indexed. The user can set up further catalogs in addition to the System catalog. In most cases, however, it is sufficient to add further indexing folders to the System catalog via the commands Neu/Verzeichnis. If the user then right-clicks one of the indexing folders and chooses Alle Tasks/Erneut prüfen (Vollständig), the sometimes lengthy indexing process begins. Resource usage can, however, be throttled via the Eigenschaften dialog. An incremental indexing run, in which Windows examines only new items in the respective directory, is started via Alle Tasks/Erneut prüfen (inkrementell). The indexing service can also be switched on via the properties of a folder and the command Erweitert/Inhalt für schnelle Dateisuche indizieren. For information about the folders and drives assigned to the indexing service, start the Windows search and click Weitere Optionen/Andere Suchoptionen/Bevorzugte Einstellungen ändern/Indexdienst verwenden."
Oberhauser, O.; Labner, J.: Relevance Ranking in Online-Katalogen : Informationsstand und Perspektiven (2003) 0.01
```
0.005762722 = product of:
  0.080678105 = sum of:
    0.080678105 = weight(_text_:benutzer in 2188) [ClassicSimilarity], result of:
      0.080678105 = score(doc=2188,freq=2.0), product of:
        0.18291734 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0320743 = queryNorm
        0.44106317 = fieldWeight in 2188, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2188)
  0.071428575 = coord(1/14)
```
Abstract

As is well known, search engines such as Google & Co. "rank" their search results by "relevance" when listing them, i.e. the documents are output in descending order according to how well they meet relevance criteria. In online catalogues (OPACs) this is not yet common practice, but the Aleph 500 system used in the Austrian Library Network (Österreichischer Bibliothekenverbund) does in fact offer such a ranking option (which is also implemented in the union catalogue). So far, however, hardly any information is available on how this feature works, in particular with regard to guidance for users. With this contribution we therefore want to try to extend the state of information in our network on the topic of "relevance ranking". Both the use of a ranking option in OPACs in general and the concrete possibilities offered under Aleph 500 are examined in more detail below.
Cross-language information retrieval (1998) 0.01
```
0.0057469467 = product of:
  0.040228624 = sum of:
    0.030792965 = weight(_text_:source in 6299) [ClassicSimilarity], result of:
      0.030792965 = score(doc=6299,freq=4.0), product of:
        0.15900996 = queryWeight, product of:
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.0320743 = queryNorm
        0.19365431 = fieldWeight in 6299, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.9575505 = idf(docFreq=844, maxDocs=44218)
          0.01953125 = fieldNorm(doc=6299)
    0.009435658 = weight(_text_:web in 6299) [ClassicSimilarity], result of:
      0.009435658 = score(doc=6299,freq=2.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.09014259 = fieldWeight in 6299, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.01953125 = fieldNorm(doc=6299)
  0.14285715 = coord(2/14)
```
Footnote

Review in: Machine translation review: 1999, no.10, pp. 26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davies (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts.
Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process.
Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
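
The recall and precision measures named in the review above can be illustrated with a few lines of code. This is only a sketch of the textbook set-based definitions; the function name and the document identifiers are made up for illustration and do not come from the reviewed volume.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision: share of the retrieved documents that are relevant
    recall:    share of the relevant documents that were retrieved
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: four documents returned, three known to be relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(p, r)
```

Here two of the four returned documents are relevant (precision 0.5) and two of the three relevant documents were found (recall of two thirds).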
Wilhelmy, A.: Phonetische Ähnlichkeitssuche in Datenbanken (1991) 0.00
```
0.004939476 = product of:
  0.06915266 = sum of:
    0.06915266 = weight(_text_:benutzer in 5684) [ClassicSimilarity], result of:
      0.06915266 = score(doc=5684,freq=2.0), product of:
        0.18291734 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0320743 = queryNorm
        0.37805414 = fieldWeight in 5684, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.046875 = fieldNorm(doc=5684)
  0.071428575 = coord(1/14)
```
Abstract

In dialogue-driven information retrieval systems (IRS), the interplay between human and computer can be understood, roughly speaking, as an iterative process for increasing precision on the one hand and the recall of the retrieved references on the other. A machine-applicable method is presented that goes back to the phonological studies of the linguist Nikolaj S. Trubetzkoy (1890-1938). In its main features it can contribute considerably to improving recall. Because it includes the 'similarity neighbourhoods' of search terms in the search, it proves advantageous above all for systems with coordinate machine indexing. For alphabetic terms, the introduction of such a method, initially oriented only towards the user, also proves favourable from a technical point of view, since it keeps the number of accesses during searches low even for large data volumes.
Mayr, P.: Bradfordizing mit Katalogdaten : Alternative Sicht auf Suchergebnisse und Publikationsquellen durch Re-Ranking (2010) 0.00
```
0.004939476 = product of:
  0.06915266 = sum of:
    0.06915266 = weight(_text_:benutzer in 4301) [ClassicSimilarity], result of:
      0.06915266 = score(doc=4301,freq=2.0), product of:
        0.18291734 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0320743 = queryNorm
        0.37805414 = fieldWeight in 4301, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.046875 = fieldNorm(doc=4301)
  0.071428575 = coord(1/14)
```
Abstract

For literature searches in scholarly search systems, users expect the highest possible share of relevant, high-quality documents in the result lists. In particular the order and structure of the listed results (ranking) now plays a decisive role for many users, alongside direct full-text access to the documents. Ranking, or relevance ranking, is distinguished from so-called sortings, for example by publication year, although the boundary to lists ranked "by content relevance" cannot be drawn cleanly in conceptual terms. Ranking documents ultimately leads users to concentrate on the top hits of a search result. The middle and lower parts of a result list are often no longer considered. Given the multitude of relevant and available information sources, it is therefore necessary to identify core areas in the search spaces and then present them to the user in highlighted form. Philipp Mayr summarizes here the results of his dissertation on "Re-Ranking auf Basis von Bradfordizing für die verteilte Suche in Digitalen Bibliotheken".
Oberhauser, O.: Relevance Ranking in den Online-Katalogen der "nächsten Generation" (2010) 0.00
```
0.004939476 = product of:
  0.06915266 = sum of:
    0.06915266 = weight(_text_:benutzer in 4308) [ClassicSimilarity], result of:
      0.06915266 = score(doc=4308,freq=2.0), product of:
        0.18291734 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0320743 = queryNorm
        0.37805414 = fieldWeight in 4308, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.046875 = fieldNorm(doc=4308)
  0.071428575 = coord(1/14)
```
Abstract

Relevance ranking in online catalogues is not a new topic, yet there is not much literature on it that deserves to be called "serious". One reason is that interest in ranked result lists has traditionally been low among all parties involved (librarians, software vendors, users). Another is that the criticism of existing OPACs that has become popular in recent years often started from an inadequate knowledge base and frequently produced only polemical or emotionally coloured contributions that added little on the topic of ranking. ... The test described here is of course in no way exhaustive or representative. Nevertheless, I believe it gives grounds for some hope. It suggests that the "new" OPACs are on the right track, at least as far as relevance ranking is concerned. How well they will really manage to catch up with the ranking performance of search engines such as Google, which work under completely different conditions, only the future will show.
Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.00
```
0.004890775 = product of:
  0.03423542 = sum of:
    0.018871317 = weight(_text_:web in 2591) [ClassicSimilarity], result of:
      0.018871317 = score(doc=2591,freq=2.0), product of:
        0.10467481 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0320743 = queryNorm
        0.18028519 = fieldWeight in 2591, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2591)
    0.015364104 = product of:
      0.030728208 = sum of:
        0.030728208 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
          0.030728208 = score(doc=2591,freq=4.0), product of:
            0.11231873 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0320743 = queryNorm
            0.27358043 = fieldWeight in 2591, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)
```
Abstract

Purpose: In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment through human assessors is infeasible. Due to the large amount of documents that requires judgment, there are possible errors introduced by human assessors because of disagreements. The paper aims to discuss these issues. Design/methodology/approach: This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human efforts. These methods overcome problems with large amounts of documents for judgment while avoiding human disagreement errors during the judgment process. This study utilizes two key factors: number of occurrences of each document per topic from all the system runs; and document rankings to generate the alternate methods. Findings: The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value: Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.

Date

20. 1.2015 18:30:22
18. 9.2018 18:22:56
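
The occurrence-count factor mentioned in the abstract above (how often a document appears for a topic across all system runs, within a given pool depth) can be sketched as follows. This is one simple reading of that idea and not the authors' actual method: the function name, the `min_share` threshold and the toy runs are all assumptions made for illustration.

```python
from collections import Counter


def pseudo_judgments(runs, pool_depth=100, min_share=0.5):
    """Return documents treated as pseudo-relevant for one topic.

    runs:       one ranked list of document ids per retrieval system
    pool_depth: how many top-ranked documents of each run are pooled
    min_share:  assumed threshold; a document counts as pseudo-relevant
                if at least this share of the runs returned it in the pool
    """
    counts = Counter()
    for ranking in runs:
        # Count each document at most once per run.
        counts.update(set(ranking[:pool_depth]))
    needed = min_share * len(runs)
    return {doc for doc, n in counts.items() if n >= needed}


# Hypothetical runs from three systems for a single topic.
runs = [
    ["d1", "d2", "d3"],
    ["d2", "d4", "d1"],
    ["d5", "d2", "d6"],
]
judged = pseudo_judgments(runs, pool_depth=2, min_share=0.5)
print(judged)
```

With a pool depth of 2, only `d2` appears in the pooled part of at least half of the runs, so it is the only document judged pseudo-relevant in this toy example.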
