Search (10 results, page 1 of 1)

Larsen, B.; Ingwersen, P.; Lund, B.: Data fusion according to the principle of polyrepresentation (2009) 0.02
```
0.018375382 = product of:
  0.06431384 = sum of:
    0.05418802 = weight(_text_:interpretation in 2752) [ClassicSimilarity], result of:
      0.05418802 = score(doc=2752,freq=2.0), product of:
        0.21405315 = queryWeight, product of:
          5.7281795 = idf(docFreq=390, maxDocs=44218)
          0.037368443 = queryNorm
        0.25315216 = fieldWeight in 2752, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7281795 = idf(docFreq=390, maxDocs=44218)
          0.03125 = fieldNorm(doc=2752)
    0.010125816 = product of:
      0.020251632 = sum of:
        0.020251632 = weight(_text_:22 in 2752) [ClassicSimilarity], result of:
          0.020251632 = score(doc=2752,freq=2.0), product of:
            0.13085791 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037368443 = queryNorm
            0.15476047 = fieldWeight in 2752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2752)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```
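The explain tree above can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the classic Lucene TF-IDF form in which `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`, which is consistent with the numbers shown. The `queryNorm` and `fieldNorm` values are copied from the tree rather than derived, and the function names are illustrative.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed classic TF-IDF form; it matches the idf values in the explain tree.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                        # tf(freq=2.0) = 1.4142135
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # queryWeight line
    field_weight = tf * term_idf * field_norm   # fieldWeight line
    return query_weight * field_weight


# Values copied from the explain tree for doc 2752.
MAX_DOCS = 44218
QUERY_NORM = 0.037368443
FIELD_NORM = 0.03125

interpretation = term_score(2.0, 390, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "22" clause is nested and scaled by coord(1/2);
# the whole sum is then scaled by coord(2/7).
total = (interpretation + term_22 * (1 / 2)) * (2 / 7)
print(total)
```

Up to floating-point rounding, `total` should agree with the 0.018375382 reported at the top of the tree.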
Abstract

We report data fusion experiments carried out on the four best-performing retrieval models from TREC 5. Three were conceptually/algorithmically very different from one another; one was algorithmically similar to one of the former. The objective of the test was to observe the performance of the 11 logical data fusion combinations compared to the performance of the four individual models and their intermediate fusions when following the principle of polyrepresentation. This principle is based on the cognitive IR perspective (Ingwersen & Järvelin, 2005) and implies that each retrieval model is regarded as a representation of a unique interpretation of information retrieval (IR). It predicts that only fusions of very different, but equally good, IR models may outperform each constituent as well as their intermediate fusions. Two kinds of experiments were carried out. One tested restricted fusions, which entails that only the inner disjoint overlap documents between fused models are ranked. The second set of experiments was based on traditional data fusion methods. The experiments involved the 30 TREC 5 topics that contain more than 44 relevant documents. In all tests, the Borda and CombSUM scoring methods were used. Performance was measured by precision and recall, with document cutoff values (DCVs) at 100 and 15 documents, respectively. Results show that restricted fusions made of two, three, or four cognitively/algorithmically very different retrieval models perform significantly better than do the individual models at DCV100. At DCV15, however, the results of polyrepresentative fusion were less predictable. The traditional fusion method based on polyrepresentation principles demonstrates a clear picture of performance at both DCV levels and verifies the polyrepresentation predictions for data fusion in IR.
Data fusion improves retrieval performance over its constituent IR models only if the models are all quite conceptually/algorithmically dissimilar and perform both equally and well, in that order of importance.
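The two scoring methods named in the abstract, CombSUM and Borda, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' experimental setup: the runs, document IDs, and scores are invented, scores are assumed to be already normalised to a common range, and the Borda variant gives zero points to documents a run did not retrieve (other variants share the leftover points among them).

```python
from collections import defaultdict


def comb_sum(runs):
    """CombSUM: add up each document's scores across all runs."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in run.items():
            fused[doc] += score
    # Highest fused score first; ties broken by document ID.
    return sorted(fused, key=lambda d: (-fused[d], d))


def borda(runs):
    """Borda count: each run awards points by rank position."""
    candidates = set().union(*runs)
    n = len(candidates)
    points = defaultdict(int)
    for run in runs:
        ranking = sorted(run, key=lambda d: (-run[d], d))
        for rank, doc in enumerate(ranking):
            points[doc] += n - rank  # top-ranked document gets n points
    # Simplification: documents missing from a run get no points from it.
    return sorted(candidates, key=lambda d: (-points[d], d))


# Hypothetical runs from two retrieval models (doc ID -> normalised score).
run_a = {"d1": 0.9, "d2": 0.5, "d3": 0.1}
run_b = {"d2": 0.8, "d3": 0.7, "d4": 0.2}

print(comb_sum([run_a, run_b]))
print(borda([run_a, run_b]))
```

On this toy input the two methods agree on the top document but order `d1` and `d3` differently, because CombSUM rewards the size of a single high score while Borda only sees rank positions.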

Date

22. 3.2009 18:48:28

Lioma, C.; Ounis, I.: A syntactically-based query reformulation technique for information retrieval (2008) 0.01
```
0.0077411463 = product of:
  0.05418802 = sum of:
    0.05418802 = weight(_text_:interpretation in 2031) [ClassicSimilarity], result of:
      0.05418802 = score(doc=2031,freq=2.0), product of:
        0.21405315 = queryWeight, product of:
          5.7281795 = idf(docFreq=390, maxDocs=44218)
          0.037368443 = queryNorm
        0.25315216 = fieldWeight in 2031, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7281795 = idf(docFreq=390, maxDocs=44218)
          0.03125 = fieldNorm(doc=2031)
  0.14285715 = coord(1/7)
```
Abstract

Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society for Information Science, 25(5), 312-318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283-289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11-21; Yu, C., & Salton, G. (1976). Precision weighting - an effective automatic indexing method. Journal of the Association for Computing Machinery (ACM), 23(1), 76-88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We apply this finding to Information Retrieval as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples and is used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models.
Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high frequency shallow syntactic fragments correspond to content-bearing lexical fragments.

Voorhees, E.M.; Harman, D.: Overview of the Sixth Text REtrieval Conference (TREC-6) (2000) 0.01

```
0.005062908 = product of:
  0.035440356 = sum of:
    0.035440356 = product of:
      0.07088071 = sum of:
        0.07088071 = weight(_text_:22 in 6438) [ClassicSimilarity], result of:
          0.07088071 = score(doc=6438,freq=2.0), product of:
            0.13085791 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037368443 = queryNorm
            0.5416616 = fieldWeight in 6438, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6438)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```

Date: 11. 8.2001 16:22:19

Scherer, B.: Automatische Indexierung und ihre Anwendung im DFG-Projekt "Gemeinsames Portal für Bibliotheken, Archive und Museen (BAM)" (2003) 0.00
```
0.0048879073 = product of:
  0.03421535 = sum of:
    0.03421535 = product of:
      0.0684307 = sum of:
        0.0684307 = weight(_text_:anwendung in 4283) [ClassicSimilarity], result of:
          0.0684307 = score(doc=4283,freq=4.0), product of:
            0.1809185 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.037368443 = queryNorm
            0.3782405 = fieldWeight in 4283, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4283)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```
Abstract

Automatic indexing has attracted growing interest for some years owing to the rising flood of information. However, there are still reservations about it in comparison with intellectual indexing, concerning quality and the greater effort of system implementation and maintenance. Newer developments from the field of knowledge management, such as methods from artificial intelligence, information extraction, text mining, and automatic classification, are intended to upgrade and improve automatic indexing. This should deliver more intelligent and more content-based subject indexing. Besides presenting the fundamentals and methods of automatic indexing and newer developments, this master's thesis also presents options for evaluation. The possible application of automatic indexing in the DFG project "Gemeinsames Portal für Bibliotheken, Archive und Museen (BAM)" forms the focus of the thesis. In the portal, library-style subject indexing of texts is in the foreground. In an extensive test, three German linguistic systems are combined with statistical methods (some of which are already integrated in the systems) and evaluated, though only on the basis of the index terms they output. In conclusion, the results, and thus the quality (with respect to the index terms), of intellectual and automatic indexing still differ significantly. The reasons lie in semantic problems that remain to be solved and in the matching against terms from a thesaurus, which an automatic indexing system cannot always reproduce. Enriching content with the index terms to the benefit of retrieval can be achieved, depending on the system or through the integration of a thesaurus.

Dresel, R.; Hörnig, D.; Kaluza, H.; Peter, A.; Roßmann, A.; Sieber, W.: Evaluation deutscher Web-Suchwerkzeuge : Ein vergleichender Retrievaltest (2001) 0.00

```
0.0028930905 = product of:
  0.020251632 = sum of:
    0.020251632 = product of:
      0.040503263 = sum of:
        0.040503263 = weight(_text_:22 in 261) [ClassicSimilarity], result of:
          0.040503263 = score(doc=261,freq=2.0), product of:
            0.13085791 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037368443 = queryNorm
            0.30952093 = fieldWeight in 261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=261)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```

Abstract: The German search engines Abacho, Acoon, Fireball, and Lycos and the web catalogs Web.de and Yahoo! are subjected to a quality test of relative recall, precision, and availability. The methods of the retrieval tests are presented. On average, at a cut-off value of 25, a recall of about 22%, a precision of just under 19%, and an availability of 24% are achieved.

The Eleventh Text Retrieval Conference, TREC 2002 (2003) 0.00

```
0.0028930905 = product of:
  0.020251632 = sum of:
    0.020251632 = product of:
      0.040503263 = sum of:
        0.040503263 = weight(_text_:22 in 4049) [ClassicSimilarity], result of:
          0.040503263 = score(doc=4049,freq=2.0), product of:
            0.13085791 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037368443 = queryNorm
            0.30952093 = fieldWeight in 4049, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4049)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```

Abstract: Proceedings of the 11th TREC conference held in Gaithersburg, Maryland (USA), November 19-22, 2002. The aim of the conference was the discussion of retrieval and related information-seeking tasks for large test collections. 93 research groups used different techniques for information retrieval from the same large database. This procedure makes it possible to compare the results. The tasks are: cross-language searching, filtering, interactive searching, searching for novelty, question answering, searching for video shots, and Web searching.

Leininger, K.: Interindexer consistency in PsychINFO (2000) 0.00

```
0.0021698177 = product of:
  0.015188723 = sum of:
    0.015188723 = product of:
      0.030377446 = sum of:
        0.030377446 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
          0.030377446 = score(doc=2552,freq=2.0), product of:
            0.13085791 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037368443 = queryNorm
            0.23214069 = fieldWeight in 2552, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2552)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```

Date: 9. 2.1997 18:44:22

King, D.W.: Blazing new trails : in celebration of an audacious career (2000) 0.00

```
0.0018081815 = product of:
  0.01265727 = sum of:
    0.01265727 = product of:
      0.02531454 = sum of:
        0.02531454 = weight(_text_:22 in 1184) [ClassicSimilarity], result of:
          0.02531454 = score(doc=1184,freq=2.0), product of:
            0.13085791 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037368443 = queryNorm
            0.19345059 = fieldWeight in 1184, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1184)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```

Date: 22. 9.1997 19:16:05

Petrelli, D.: On the role of user-centred evaluation in the advancement of interactive information retrieval (2008) 0.00

```
0.0018081815 = product of:
  0.01265727 = sum of:
    0.01265727 = product of:
      0.02531454 = sum of:
        0.02531454 = weight(_text_:22 in 2026) [ClassicSimilarity], result of:
          0.02531454 = score(doc=2026,freq=2.0), product of:
            0.13085791 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037368443 = queryNorm
            0.19345059 = fieldWeight in 2026, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2026)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```

Source: Information processing and management. 44(2008) no.1, S.22-38

Effektive Information Retrieval Verfahren in Theorie und Praxis : ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005 (2006) 0.00
```
0.001382509 = product of:
  0.009677563 = sum of:
    0.009677563 = product of:
      0.019355126 = sum of:
        0.019355126 = weight(_text_:anwendung in 5973) [ClassicSimilarity], result of:
          0.019355126 = score(doc=5973,freq=2.0), product of:
            0.1809185 = queryWeight, product of:
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.037368443 = queryNorm
            0.10698257 = fieldWeight in 5973, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.8414783 = idf(docFreq=948, maxDocs=44218)
              0.015625 = fieldNorm(doc=5973)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```
Footnote

"Evaluation", the topic of the third chapter, is broad in scope: it is not restricted to information retrieval but also covers individual aspects of human-computer interaction and e-learning. Michael Muck and Marco Winter, of the Stiftung Wissenschaft und Politik and the Informationszentrum Sozialwissenschaften, address in their contribution the influence of the question (topic) on the assessment of relevance and present procedures for topic creation that are used at the Cross Language Evaluation Forum (CLEF). In the following paper, Thomas Mandl presents various evaluation initiatives in information retrieval and current developments. Joachim Pfister explains in his contribution the automated grouping, known as clustering, of patent documents in the databases of the Fachinformationszentrum Karlsruhe and evaluates different clustering methods on the basis of user ratings. Ralph Kölle, Glenn Langemeier, and Wolfgang Semar address collaborative learning under the special conditions of programming. They apply the system VitaminL for synchronous work on programming tasks and the indicator system K-3 for assessing collaborative work in a course. The current research focus of information science at Hildesheim emerges in the fourth chapter under the theme "Multilingual Systems". Most of the contributions in the proceedings volume are found here. Olga Tartakovski and Margaryta Shramko describe and test the system LangIdent, which identifies the language of mono- and multilingual texts. Nina Kummer presents the peculiarities of Japanese characters and experimentally compares the different indexing techniques.
Suriya Na Nhongkai and Hans-Joachim Bentz present and test a bilingual search based on concept networks, in which the concept structure is the connecting element between the two text collections. The development and evaluation of a multilingual question-answering system within the Cross Language Evaluation Forum (CLEF), which allows concrete questions to be formulated in everyday language, is the subject of the contribution by Robert Strötgen, Thomas Mandl, and Rene Schneider. The volume closes with the paper by Niels Jensen, who evaluates a multilingual web retrieval system, likewise in connection with CLEF, using the multilingual EuroGOV corpus.
