Search (16 results, page 1 of 1)

Voorhees, E.M.; Harman, D.: Overview of the Sixth Text REtrieval Conference (TREC-6) (2000) 0.02

0.02155007 = product of:
  0.04310014 = sum of:
    0.04310014 = product of:
      0.08620028 = sum of:
        0.08620028 = weight(_text_:22 in 6438) [ClassicSimilarity], result of:
          0.08620028 = score(doc=6438,freq=2.0), product of:
            0.15914047 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544495 = queryNorm
            0.5416616 = fieldWeight in 6438, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6438)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 11. 8.2001 16:22:19
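
The score trees in this listing all follow the same arithmetic, so one worked example may help in reading them. The sketch below recomputes the first record's score from the values printed in its tree. It assumes the classic Lucene TF-IDF formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))), which agree with the numbers shown here. The function names `classic_idf` and `classic_term_score` are illustrative and are not part of any library.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as it appears in the trees: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm,
                       coords=(0.5, 0.5)):
    """Recompute one single-term score tree from its leaf values."""
    tf = math.sqrt(freq)                    # tf(freq=...)
    idf = classic_idf(doc_freq, max_docs)   # idf(docFreq=..., maxDocs=...)
    query_weight = idf * query_norm         # queryWeight
    field_weight = tf * idf * field_norm    # fieldWeight
    score = query_weight * field_weight     # weight(_text_:term in doc)
    for coord in coords:                    # the two coord(1/2) factors
        score *= coord
    return score


# Leaf values copied from the first record (doc=6438, term "22").
score = classic_term_score(
    freq=2.0,
    doc_freq=3622,
    max_docs=44218,
    query_norm=0.04544495,
    field_norm=0.109375,
)
print(f"{score:.8f}")
```

The result should land very close to the 0.02155007 reported for doc 6438. Small differences in the last digits are expected, because the engine works in single precision while Python uses double precision. Because queryNorm and idf are constant for a given term, records matching the same term with the same frequency differ only through fieldNorm.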

Rapke, K.: Automatische Indexierung von Volltexten für die Gruner+Jahr Pressedatenbank (2001) 0.02
```
0.021249607 = product of:
  0.042499214 = sum of:
    0.042499214 = product of:
      0.08499843 = sum of:
        0.08499843 = weight(_text_:g in 6386) [ClassicSimilarity], result of:
          0.08499843 = score(doc=6386,freq=8.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.49797297 = fieldWeight in 6386, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=6386)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Retrieval tests are the most widely recognized method for justifying new content-indexing procedures against traditional ones. As part of a diploma thesis, two fundamentally different systems for automatic content indexing were tested and evaluated on the press database of the publishing house Gruner + Jahr (G+J). The study examined natural-language retrieval in comparison with Boolean retrieval. The two systems are Autonomy, from Autonomy Inc., and DocCat, which IBM adapted to the database structure of the G+J press database. The former is a probabilistic system based on natural-language retrieval. DocCat, by contrast, is based on Boolean retrieval and is a learning system that indexes on the basis of an intellectually created training set. Methodologically, the evaluation starts from the real application context of text documentation at G+J. The tests are assessed from both statistical and qualitative points of view. One result of the tests is that DocCat shows some shortcomings compared with intellectual content indexing that still need to be remedied, while Autonomy's natural-language retrieval cannot be used as it stands in this setting and for the specific requirements of G+J text documentation.
Rapke, K.: Automatische Indexierung von Volltexten für die Gruner+Jahr Pressedatenbank (2001) 0.02
```
0.017708007 = product of:
  0.035416014 = sum of:
    0.035416014 = product of:
      0.07083203 = sum of:
        0.07083203 = weight(_text_:g in 5863) [ClassicSimilarity], result of:
          0.07083203 = score(doc=5863,freq=8.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.4149775 = fieldWeight in 5863, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5863)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Retrieval tests are the most widely recognized method for justifying new content-indexing procedures against traditional ones. As part of a diploma thesis, two fundamentally different systems for automatic content indexing were tested and evaluated on the press database of the publishing house Gruner + Jahr (G+J). The study examined natural-language retrieval in comparison with Boolean retrieval. The two systems are Autonomy, from Autonomy Inc., and DocCat, which IBM adapted to the database structure of the G+J press database. The former is a probabilistic system based on natural-language retrieval. DocCat, by contrast, is based on Boolean retrieval and is a learning system that indexes on the basis of an intellectually created training set. Methodologically, the evaluation starts from the real application context of text documentation at G+J. The tests are assessed from both statistical and qualitative points of view. One result of the tests is that DocCat shows some shortcomings compared with intellectual content indexing that still need to be remedied, while Autonomy's natural-language retrieval cannot be used as it stands in this setting and for the specific requirements of G+J text documentation.

Kazai, G.; Lalmas, M.: ¬The overlap problem in content-oriented XML retrieval evaluation (2004) 0.02

0.017708007 = product of:
  0.035416014 = sum of:
    0.035416014 = product of:
      0.07083203 = sum of:
        0.07083203 = weight(_text_:g in 4083) [ClassicSimilarity], result of:
          0.07083203 = score(doc=4083,freq=2.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.4149775 = fieldWeight in 4083, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.078125 = fieldNorm(doc=4083)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Binder, G.; Stahl, M.; Faulborn, L.: Vergleichsuntersuchung MESSENGER-FULCRUM (2000) 0.01

0.0123956045 = product of:
  0.024791209 = sum of:
    0.024791209 = product of:
      0.049582418 = sum of:
        0.049582418 = weight(_text_:g in 4885) [ClassicSimilarity], result of:
          0.049582418 = score(doc=4885,freq=2.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.29048425 = fieldWeight in 4885, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4885)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Dresel, R.; Hörnig, D.; Kaluza, H.; Peter, A.; Roßmann, A.; Sieber, W.: Evaluation deutscher Web-Suchwerkzeuge : Ein vergleichender Retrievaltest (2001) 0.01

0.012314326 = product of:
  0.024628652 = sum of:
    0.024628652 = product of:
      0.049257305 = sum of:
        0.049257305 = weight(_text_:22 in 261) [ClassicSimilarity], result of:
          0.049257305 = score(doc=261,freq=2.0), product of:
            0.15914047 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544495 = queryNorm
            0.30952093 = fieldWeight in 261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=261)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: The German search engines Abacho, Acoon, Fireball and Lycos, as well as the web directories Web.de and Yahoo!, are subjected to a quality test based on relative recall, precision and availability. The methods of the retrieval tests are presented. On average, at a cut-off value of 25, a recall of around 22%, a precision of just under 19% and an availability of 24% are achieved.

¬The Eleventh Text Retrieval Conference, TREC 2002 (2003) 0.01

0.012314326 = product of:
  0.024628652 = sum of:
    0.024628652 = product of:
      0.049257305 = sum of:
        0.049257305 = weight(_text_:22 in 4049) [ClassicSimilarity], result of:
          0.049257305 = score(doc=4049,freq=2.0), product of:
            0.15914047 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544495 = queryNorm
            0.30952093 = fieldWeight in 4049, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4049)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Proceedings of the 11th TREC conference, held in Gaithersburg, Maryland (USA), November 19-22, 2002. The aim of the conference was to discuss retrieval and related information-seeking tasks for large test collections. 93 research groups used different techniques for information retrieval from the same large database. This procedure makes it possible to compare the results. The tasks are: cross-language searching, filtering, interactive searching, searching for novelty, question answering, searching for video shots, and Web searching.

Ferret, O.; Grau, B.; Hurault-Plantet, M.; Illouz, G.; Jacquemin, C.; Monceaux, L.; Robba, I.; Vilnat, A.: How NLP can improve question answering (2002) 0.01

0.010624804 = product of:
  0.021249607 = sum of:
    0.021249607 = product of:
      0.042499214 = sum of:
        0.042499214 = weight(_text_:g in 1850) [ClassicSimilarity], result of:
          0.042499214 = score(doc=1850,freq=2.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.24898648 = fieldWeight in 1850, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=1850)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Blandford, A.; Adams, A.; Attfield, S.; Buchanan, G.; Gow, J.; Makri, S.; Rimmer, J.; Warwick, C.: ¬The PRET A Rapporter framework : evaluating digital libraries from the perspective of information work (2008) 0.01

0.010624804 = product of:
  0.021249607 = sum of:
    0.021249607 = product of:
      0.042499214 = sum of:
        0.042499214 = weight(_text_:g in 2021) [ClassicSimilarity], result of:
          0.042499214 = score(doc=2021,freq=2.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.24898648 = fieldWeight in 2021, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=2021)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abdou, S.; Savoy, J.: Searching in Medline : query expansion and manual indexing evaluation (2008) 0.01
```
0.010624804 = product of:
  0.021249607 = sum of:
    0.021249607 = product of:
      0.042499214 = sum of:
        0.042499214 = weight(_text_:g in 2062) [ClassicSimilarity], result of:
          0.042499214 = score(doc=2062,freq=2.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.24898648 = fieldWeight in 2062, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=2062)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Based on a relatively large subset representing one third of the Medline collection, this paper evaluates ten different IR models, including recent developments in both probabilistic and language models. We show that the best performing IR model is a probabilistic model developed within the Divergence from Randomness framework [Amati, G., & van Rijsbergen, C.J. (2002) Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM-Transactions on Information Systems 20(4), 357-389], which results in 170% enhancements in mean average precision when compared to the classical tf-idf vector-space model. This paper also reports on our impact evaluations on the retrieval effectiveness of manually assigned descriptors (MeSH or Medical Subject Headings), showing that by including these terms retrieval performance can improve from 2.4% to 13.5%, depending on the underlying IR model. Finally, we design a new general blind-query expansion approach showing improved retrieval performances compared to those obtained using the Rocchio approach.

Leininger, K.: Interindexer consistency in PsychINFO (2000) 0.01

0.009235744 = product of:
  0.018471489 = sum of:
    0.018471489 = product of:
      0.036942977 = sum of:
        0.036942977 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
          0.036942977 = score(doc=2552,freq=2.0), product of:
            0.15914047 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544495 = queryNorm
            0.23214069 = fieldWeight in 2552, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2552)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 9. 2.1997 18:44:22

Keenan, S.; Smeaton, A.F.; Keogh, G.: ¬The effect of pool depth on system evaluation in TREC (2001) 0.01

0.008854004 = product of:
  0.017708007 = sum of:
    0.017708007 = product of:
      0.035416014 = sum of:
        0.035416014 = weight(_text_:g in 5908) [ClassicSimilarity], result of:
          0.035416014 = score(doc=5908,freq=2.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.20748875 = fieldWeight in 5908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5908)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

King, D.W.: Blazing new trails : in celebration of an audacious career (2000) 0.01

0.007696454 = product of:
  0.015392908 = sum of:
    0.015392908 = product of:
      0.030785816 = sum of:
        0.030785816 = weight(_text_:22 in 1184) [ClassicSimilarity], result of:
          0.030785816 = score(doc=1184,freq=2.0), product of:
            0.15914047 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544495 = queryNorm
            0.19345059 = fieldWeight in 1184, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1184)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 9.1997 19:16:05

Petrelli, D.: On the role of user-centred evaluation in the advancement of interactive information retrieval (2008) 0.01

0.007696454 = product of:
  0.015392908 = sum of:
    0.015392908 = product of:
      0.030785816 = sum of:
        0.030785816 = weight(_text_:22 in 2026) [ClassicSimilarity], result of:
          0.030785816 = score(doc=2026,freq=2.0), product of:
            0.15914047 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544495 = queryNorm
            0.19345059 = fieldWeight in 2026, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2026)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Information processing and management. 44(2008) no.1, S.22-38

Lioma, C.; Ounis, I.: ¬A syntactically-based query reformulation technique for information retrieval (2008) 0.01
```
0.0070832022 = product of:
  0.0141664045 = sum of:
    0.0141664045 = product of:
      0.028332809 = sum of:
        0.028332809 = weight(_text_:g in 2031) [ClassicSimilarity], result of:
          0.028332809 = score(doc=2031,freq=2.0), product of:
            0.17068884 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.04544495 = queryNorm
            0.165991 = fieldWeight in 2031, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03125 = fieldNorm(doc=2031)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society of Information Science, 25(5), 312-318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283-289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11-21; Yu, C., & Salton, G. (1976). Precision weighting - an effective automatic indexing method. Journal of the Association for Computing Machinery (ACM), 23(1), 76-88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We apply this finding to Information Retrieval as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples, and used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance, when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models.
Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high frequency shallow syntactic fragments correspond to content-bearing lexical fragments.

Larsen, B.; Ingwersen, P.; Lund, B.: Data fusion according to the principle of polyrepresentation (2009) 0.01

0.006157163 = product of:
  0.012314326 = sum of:
    0.012314326 = product of:
      0.024628652 = sum of:
        0.024628652 = weight(_text_:22 in 2752) [ClassicSimilarity], result of:
          0.024628652 = score(doc=2752,freq=2.0), product of:
            0.15914047 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04544495 = queryNorm
            0.15476047 = fieldWeight in 2752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2752)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 3.2009 18:48:28
