Search (140 results, page 1 of 7)

  • Filter: theme_ss:"Retrievalstudien"
  1. Breuer, T.; Tavakolpoursaleh, N.; Schaer, P.; Hienert, D.; Schaible, J.; Castro, L.J.: Online Information Retrieval Evaluation using the STELLA Framework (2022) 0.06
    0.05759575 = product of:
      0.18718618 = sum of:
        0.08662128 = weight(_text_:log in 640) [ClassicSimilarity], result of:
          0.08662128 = score(doc=640,freq=2.0), product of:
            0.20389368 = queryWeight, product of:
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.031815533 = queryNorm
            0.42483553 = fieldWeight in 640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.046875 = fieldNorm(doc=640)
        0.031159312 = weight(_text_:world in 640) [ClassicSimilarity], result of:
          0.031159312 = score(doc=640,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.25480178 = fieldWeight in 640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.046875 = fieldNorm(doc=640)
        0.022462882 = weight(_text_:web in 640) [ClassicSimilarity], result of:
          0.022462882 = score(doc=640,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.21634221 = fieldWeight in 640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=640)
        0.046942707 = weight(_text_:software in 640) [ClassicSimilarity], result of:
          0.046942707 = score(doc=640,freq=4.0), product of:
            0.12621705 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031815533 = queryNorm
            0.3719205 = fieldWeight in 640, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=640)
      0.30769232 = coord(4/13)
    
    Abstract
Involving users in early phases of software development has become a common strategy as it enables developers to consider user needs from the beginning. Once a system is in production, new opportunities to observe, evaluate and learn from users emerge as more information becomes available. Gathering information from users to continuously evaluate their behavior is common practice for commercial software, while the Cranfield paradigm remains the preferred option for Information Retrieval (IR) and recommendation systems in the academic world. Here we introduce the Infrastructures for Living Labs STELLA project, which aims to create an evaluation infrastructure that allows experimental systems to run alongside production web-based academic search systems with real users. STELLA combines user interactions and log file analyses to enable large-scale A/B experiments for academic search.
  2. Agata, T.: A measure for evaluating search engines on the World Wide Web : retrieval test with ESL (Expected Search Length) (1997) 0.04
    0.043858655 = product of:
      0.19005416 = sum of:
        0.062318623 = weight(_text_:world in 3892) [ClassicSimilarity], result of:
          0.062318623 = score(doc=3892,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.50960356 = fieldWeight in 3892, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.09375 = fieldNorm(doc=3892)
        0.08280977 = weight(_text_:wide in 3892) [ClassicSimilarity], result of:
          0.08280977 = score(doc=3892,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.5874411 = fieldWeight in 3892, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.09375 = fieldNorm(doc=3892)
        0.044925764 = weight(_text_:web in 3892) [ClassicSimilarity], result of:
          0.044925764 = score(doc=3892,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.43268442 = fieldWeight in 3892, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=3892)
      0.23076923 = coord(3/13)
    
  3. Lazonder, A.W.; Biemans, H.J.A.; Wopereis, I.G.J.H.: Differences between novice and experienced users in searching information on the World Wide Web (2000) 0.03
    0.028336786 = product of:
      0.122792736 = sum of:
        0.031159312 = weight(_text_:world in 4598) [ClassicSimilarity], result of:
          0.031159312 = score(doc=4598,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.25480178 = fieldWeight in 4598, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.046875 = fieldNorm(doc=4598)
        0.041404884 = weight(_text_:wide in 4598) [ClassicSimilarity], result of:
          0.041404884 = score(doc=4598,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.29372054 = fieldWeight in 4598, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4598)
        0.050228536 = weight(_text_:web in 4598) [ClassicSimilarity], result of:
          0.050228536 = score(doc=4598,freq=10.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.48375595 = fieldWeight in 4598, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4598)
      0.23076923 = coord(3/13)
    
    Abstract
Searching for information on the WWW basically comes down to locating an appropriate Web site and to retrieving relevant information from that site. This study examined the effect of a user's WWW experience on both phases of the search process. 35 students from 2 schools for Dutch pre-university education were observed while performing 3 search tasks. The results indicate that subjects with WWW experience are more proficient in locating Web sites than are novice WWW users. The observed differences were ascribed to the experts' superior skills in operating Web search engines. However, on tasks that required subjects to locate information on specific Web sites, the performance of experienced and novice users was equivalent - a result that is in line with hypertext research. Based on these findings, implications for training and supporting students in searching for information on the WWW are identified. Finally, the role of the subjects' level of domain expertise is discussed and directions for future research are proposed.
  4. Wu, C.-J.: Experiments on using the Dublin Core to reduce the retrieval error ratio (1998) 0.03
    0.025584215 = product of:
      0.11086493 = sum of:
        0.03635253 = weight(_text_:world in 5201) [ClassicSimilarity], result of:
          0.03635253 = score(doc=5201,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.29726875 = fieldWeight in 5201, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5201)
        0.0483057 = weight(_text_:wide in 5201) [ClassicSimilarity], result of:
          0.0483057 = score(doc=5201,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.342674 = fieldWeight in 5201, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5201)
        0.026206696 = weight(_text_:web in 5201) [ClassicSimilarity], result of:
          0.026206696 = score(doc=5201,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.25239927 = fieldWeight in 5201, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5201)
      0.23076923 = coord(3/13)
    
    Abstract
In order to test the power of metadata in information retrieval, an experiment was designed and conducted on a group of 7 graduate students using the Dublin Core as the cataloguing metadata. Results show that, on average, the retrieval error rate is only 2.9 per cent for the MES system (http://140.136.85.194), which utilizes the Dublin Core to describe the documents on the World Wide Web, in contrast to 20.7 per cent for 7 well-known search engines: HOTBOT, GAIS, LYCOS, EXCITE, INFOSEEK, YAHOO, and OCTOPUS. The very low error rate indicates that users can rely on Dublin Core information to decide whether or not to retrieve a document.
  5. Khan, K.; Locatis, C.: Searching through cyberspace : the effects of link display and link density on information retrieval from hypertext on the World Wide Web (1998) 0.02
    0.021929327 = product of:
      0.09502708 = sum of:
        0.031159312 = weight(_text_:world in 446) [ClassicSimilarity], result of:
          0.031159312 = score(doc=446,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.25480178 = fieldWeight in 446, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.046875 = fieldNorm(doc=446)
        0.041404884 = weight(_text_:wide in 446) [ClassicSimilarity], result of:
          0.041404884 = score(doc=446,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.29372054 = fieldWeight in 446, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=446)
        0.022462882 = weight(_text_:web in 446) [ClassicSimilarity], result of:
          0.022462882 = score(doc=446,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.21634221 = fieldWeight in 446, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=446)
      0.23076923 = coord(3/13)
    
  6. Griesbaum, J.: Evaluierung hybrider Suchsysteme im WWW (2000) 0.02
    0.021929327 = product of:
      0.09502708 = sum of:
        0.031159312 = weight(_text_:world in 2482) [ClassicSimilarity], result of:
          0.031159312 = score(doc=2482,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.25480178 = fieldWeight in 2482, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.046875 = fieldNorm(doc=2482)
        0.041404884 = weight(_text_:wide in 2482) [ClassicSimilarity], result of:
          0.041404884 = score(doc=2482,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.29372054 = fieldWeight in 2482, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=2482)
        0.022462882 = weight(_text_:web in 2482) [ClassicSimilarity], result of:
          0.022462882 = score(doc=2482,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.21634221 = fieldWeight in 2482, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2482)
      0.23076923 = coord(3/13)
    
    Abstract
The starting point of this thesis is the search problem on the World Wide Web. Search engines are indispensable for successful information retrieval, yet at the same time they are accused of mediocre performance. The topic of this thesis is the retrieval effectiveness of German-language search engines; the aim is to establish what retrieval effectiveness users can currently expect. One approach to increasing the retrieval effectiveness of search engines is to blend editorially compiled and automatically generated results in a single hit list. The goal of this work is to evaluate the retrieval effectiveness of such hybrid systems in comparison with purely robot-based search engines. To this end, the fundamental problem areas in the evaluation of retrieval systems are analyzed first. Following the methodology proposed by Tague-Sutcliffe, a feasible procedure is derived that takes web-specific particularities into account. Building on this, the concrete setting for carrying out the evaluation is worked out, and a retrieval effectiveness test is conducted on the search engines Lycos.de, AltaVista.de and QualiGo.
  7. Voorhees, E.M.; Harman, D.K.: The Text REtrieval Conference (2005) 0.02
    0.018617572 = product of:
      0.060507108 = sum of:
        0.018176265 = weight(_text_:world in 5082) [ClassicSimilarity], result of:
          0.018176265 = score(doc=5082,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.14863437 = fieldWeight in 5082, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5082)
        0.02415285 = weight(_text_:wide in 5082) [ClassicSimilarity], result of:
          0.02415285 = score(doc=5082,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.171337 = fieldWeight in 5082, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5082)
        0.013103348 = weight(_text_:web in 5082) [ClassicSimilarity], result of:
          0.013103348 = score(doc=5082,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.12619963 = fieldWeight in 5082, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5082)
        0.0050746426 = product of:
          0.015223928 = sum of:
            0.015223928 = weight(_text_:29 in 5082) [ClassicSimilarity], result of:
              0.015223928 = score(doc=5082,freq=2.0), product of:
                0.11191709 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.031815533 = queryNorm
                0.13602862 = fieldWeight in 5082, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=5082)
          0.33333334 = coord(1/3)
      0.30769232 = coord(4/13)
    
    Abstract
    Text retrieval technology targets a problem that is all too familiar: finding relevant information in large stores of electronic documents. The problem is an old one, with the first research conference devoted to the subject held in 1958 [11]. Since then the problem has continued to grow as more information is created in electronic form and more people gain electronic access. The advent of the World Wide Web, where anyone can publish so everyone must search, is a graphic illustration of the need for effective retrieval technology. The Text REtrieval Conference (TREC) is a workshop series designed to build the infrastructure necessary for the large-scale evaluation of text retrieval technology, thereby accelerating its transfer into the commercial sector. The series is sponsored by the U.S. National Institute of Standards and Technology (NIST) and the U.S. Department of Defense. At the time of this writing, there have been twelve TREC workshops and preparations for the thirteenth workshop are under way. Participants in the workshops have been drawn from the academic, commercial, and government sectors, and have included representatives from more than twenty different countries. These collective efforts have accomplished a great deal: a variety of large test collections have been built for both traditional ad hoc retrieval and related tasks such as cross-language retrieval, speech retrieval, and question answering; retrieval effectiveness has approximately doubled; and many commercial retrieval systems now contain technology first developed in TREC.
    Date
    29. 3.1996 18:16:49
  8. Kantor, P.; Kim, M.H.; Ibraev, U.; Atasoy, K.: Estimating the number of relevant documents in enormous collections (1999) 0.02
    0.01827444 = product of:
      0.07918923 = sum of:
        0.025966093 = weight(_text_:world in 6690) [ClassicSimilarity], result of:
          0.025966093 = score(doc=6690,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.21233483 = fieldWeight in 6690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.03450407 = weight(_text_:wide in 6690) [ClassicSimilarity], result of:
          0.03450407 = score(doc=6690,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.24476713 = fieldWeight in 6690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.01871907 = weight(_text_:web in 6690) [ClassicSimilarity], result of:
          0.01871907 = score(doc=6690,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.18028519 = fieldWeight in 6690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
      0.23076923 = coord(3/13)
    
    Abstract
In assessing information retrieval systems, it is important to know not only the precision of the retrieved set, but also to compare the number of retrieved relevant items to the total number of relevant items. For large collections, such as the TREC test collections, or the World Wide Web, it is not possible to enumerate the entire set of relevant documents. If the retrieved documents are evaluated, a variant of the statistical "capture-recapture" method can be used to estimate the total number of relevant documents, provided that the several retrieval systems used are sufficiently independent. We show that the underlying signal detection model supporting such an analysis can be extended in two ways. First, assuming that there are two distinct performance characteristics (corresponding to the chance of retrieving a relevant document, and the chance of retrieving a given non-relevant document), we show that if there are three or more independent systems available it is possible to estimate the number of relevant documents without actually having to decide whether each individual document is relevant. We report applications of this 3-system method to the TREC data, leading to the conclusion that the independence assumptions are not satisfied. We then extend the model to a multi-system, multi-problem model, and show that it is possible to include statistical dependencies of all orders in the model, and determine the number of relevant documents for each of the problems in the set. Application to the TREC setting will be presented.
  9. Sarigil, E.; Sengor Altingovde, I.; Blanco, R.; Barla Cambazoglu, B.; Ozcan, R.; Ulusoy, Ö.: Characterizing, predicting, and handling web search queries that match very few or no results (2018) 0.02
    0.015178026 = product of:
      0.09865716 = sum of:
        0.0721844 = weight(_text_:log in 4039) [ClassicSimilarity], result of:
          0.0721844 = score(doc=4039,freq=2.0), product of:
            0.20389368 = queryWeight, product of:
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.031815533 = queryNorm
            0.3540296 = fieldWeight in 4039, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4039)
        0.026472762 = weight(_text_:web in 4039) [ClassicSimilarity], result of:
          0.026472762 = score(doc=4039,freq=4.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.25496176 = fieldWeight in 4039, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4039)
      0.15384616 = coord(2/13)
    
    Abstract
A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that users are dissatisfied after submitting a query that matches no results at least 88.5% of the time. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful for scenarios such as mobile or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of bandwidth usage, power consumption, and/or monetary costs. Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) up to 0.95.
  10. Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003) 0.01
    0.011416695 = product of:
      0.07420851 = sum of:
        0.029950509 = weight(_text_:web in 1734) [ClassicSimilarity], result of:
          0.029950509 = score(doc=1734,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.2884563 = fieldWeight in 1734, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0625 = fieldNorm(doc=1734)
        0.044258006 = weight(_text_:software in 1734) [ClassicSimilarity], result of:
          0.044258006 = score(doc=1734,freq=2.0), product of:
            0.12621705 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031815533 = queryNorm
            0.35064998 = fieldWeight in 1734, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0625 = fieldNorm(doc=1734)
      0.15384616 = coord(2/13)
    
    Abstract
The amount of data on the Internet continues to grow rapidly. With it grows the need for high-quality information retrieval services for orientation and problem-oriented search. The decision to use or procure information retrieval software requires meaningful evaluation results. This article presents recent developments in the evaluation of information retrieval systems and shows a trend towards the specialization and diversification of evaluation studies, which increases the realism of their results. The focus is on the retrieval of specialized texts, web pages and multimedia objects.
  11. Hawking, D.; Craswell, N.: The very large collection and Web tracks (2005) 0.01
    0.009588391 = product of:
      0.06232454 = sum of:
        0.044925764 = weight(_text_:web in 5085) [ClassicSimilarity], result of:
          0.044925764 = score(doc=5085,freq=2.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.43268442 = fieldWeight in 5085, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=5085)
        0.017398775 = product of:
          0.052196324 = sum of:
            0.052196324 = weight(_text_:29 in 5085) [ClassicSimilarity], result of:
              0.052196324 = score(doc=5085,freq=2.0), product of:
                0.11191709 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.031815533 = queryNorm
                0.46638384 = fieldWeight in 5085, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5085)
          0.33333334 = coord(1/3)
      0.15384616 = coord(2/13)
    
    Date
    29. 3.1996 18:16:49
  12. Harter, S.P.: Variations in relevance assessments and the measurement of retrieval effectiveness (1996) 0.01
    0.009303102 = product of:
      0.060470164 = sum of:
        0.025966093 = weight(_text_:world in 3004) [ClassicSimilarity], result of:
          0.025966093 = score(doc=3004,freq=2.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.21233483 = fieldWeight in 3004, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3004)
        0.03450407 = weight(_text_:wide in 3004) [ClassicSimilarity], result of:
          0.03450407 = score(doc=3004,freq=2.0), product of:
            0.14096694 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.031815533 = queryNorm
            0.24476713 = fieldWeight in 3004, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3004)
      0.15384616 = coord(2/13)
    
    Abstract
The purpose of this article is to bring attention to the problem of variations in relevance assessments and the effects that these may have on measures of retrieval effectiveness. Through an analytical review of the literature, I show that despite known wide variations in relevance assessments in experimental test collections, their effects on the measurement of retrieval performance are almost completely unstudied. I will further argue that what we know about the many variables that have been found to affect relevance assessments under experimental conditions, as well as our new understanding of psychological, situational, user-based relevance, points to a single conclusion. We can no longer rest the evaluation of information retrieval systems on the assumption that such variations do not significantly affect the measurement of information retrieval performance. A series of thorough, rigorous, and extensive tests is needed, of precisely how, and under what conditions, variations in relevance assessments do, and do not, affect measures of retrieval performance. We need to develop approaches to evaluation that are sensitive to these variations and to human factors and individual differences more generally. Our approaches to evaluation must reflect the real world of real users.
  13. Tonta, Y.: Analysis of search failures in document retrieval systems : a review (1992) 0.01
    0.008884234 = product of:
      0.11549504 = sum of:
        0.11549504 = weight(_text_:log in 4611) [ClassicSimilarity], result of:
          0.11549504 = score(doc=4611,freq=2.0), product of:
            0.20389368 = queryWeight, product of:
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.031815533 = queryNorm
            0.5664474 = fieldWeight in 4611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.0625 = fieldNorm(doc=4611)
      0.07692308 = coord(1/13)
    
    Abstract
This paper examines search failures in document retrieval systems. Since search failures are closely related to overall document retrieval system performance, the paper briefly discusses retrieval effectiveness measures such as precision and recall. It examines 4 methods used to study retrieval failures: retrieval effectiveness measures, user satisfaction measures, transaction log analysis, and the critical incident technique. It summarizes the findings of major failure analysis studies and identifies the types of failures that usually occur in document retrieval systems.
  14. Dresel, R.; Hörnig, D.; Kaluza, H.; Peter, A.; Roßmann, A.; Sieber, W.: Evaluation deutscher Web-Suchwerkzeuge : Ein vergleichender Retrievaltest (2001) 0.01
    0.008284809 = product of:
      0.053851258 = sum of:
        0.042356417 = weight(_text_:web in 261) [ClassicSimilarity], result of:
          0.042356417 = score(doc=261,freq=4.0), product of:
            0.10383032 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031815533 = queryNorm
            0.4079388 = fieldWeight in 261, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0625 = fieldNorm(doc=261)
        0.011494841 = product of:
          0.034484524 = sum of:
            0.034484524 = weight(_text_:22 in 261) [ClassicSimilarity], result of:
              0.034484524 = score(doc=261,freq=2.0), product of:
                0.11141258 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031815533 = queryNorm
                0.30952093 = fieldWeight in 261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=261)
          0.33333334 = coord(1/3)
      0.15384616 = coord(2/13)
    
    Abstract
The German search engines Abacho, Acoon, Fireball and Lycos, as well as the web directories Web.de and Yahoo!, are subjected to a quality test measuring relative recall, precision and availability. The retrieval test methods are presented. On average, at a cut-off value of 25, a recall of about 22%, a precision of just under 19% and an availability of 24% are achieved.
  15. Crestani, F.; Rijsbergen, C.J. van: Information retrieval by imaging (1996) 0.01
    0.008105701 = product of:
      0.052687053 = sum of:
        0.044065922 = weight(_text_:world in 6967) [ClassicSimilarity], result of:
          0.044065922 = score(doc=6967,freq=4.0), product of:
            0.122288436 = queryWeight, product of:
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.031815533 = queryNorm
            0.36034414 = fieldWeight in 6967, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.8436708 = idf(docFreq=2573, maxDocs=44218)
              0.046875 = fieldNorm(doc=6967)
        0.008621131 = product of:
          0.025863392 = sum of:
            0.025863392 = weight(_text_:22 in 6967) [ClassicSimilarity], result of:
              0.025863392 = score(doc=6967,freq=2.0), product of:
                0.11141258 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031815533 = queryNorm
                0.23214069 = fieldWeight in 6967, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6967)
          0.33333334 = coord(1/3)
      0.15384616 = coord(2/13)
    
    Abstract
Explains briefly what constitutes the imaging process and how imaging can be used in information retrieval. Proposes an approach based on the concept that 'a term is a possible world', which enables the exploitation of term-to-term relationships that are estimated using an information-theoretic measure. Reports results of an evaluation exercise to compare the performance of imaging retrieval, using possible-world semantics, with a benchmark, using the Cranfield 2 document collection to measure precision and recall. Initially, the performance of imaging retrieval was seen to be better, but statistical analysis proved that the difference was not significant. The problem with imaging retrieval lies in the amount of computation that needs to be performed at run time, and a later experiment investigated the possibility of reducing this amount. Notes lines of further investigation.
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  16. Qiu, L.: Markov models of search state patterns in a hypertext information retrieval system (1993) 0.01
    0.0077737053 = product of:
      0.10105816 = sum of:
        0.10105816 = weight(_text_:log in 5296) [ClassicSimilarity], result of:
          0.10105816 = score(doc=5296,freq=2.0), product of:
            0.20389368 = queryWeight, product of:
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.031815533 = queryNorm
            0.49564147 = fieldWeight in 5296, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5296)
      0.07692308 = coord(1/13)
    
    Abstract
    The objective of this research is to discover the search state patterns through which users retrieve information in hypertext systems. The Markov model is used to describe users' search behavior. As determined by the log-linear model test, the second-order Markov model is the best model. Search patterns of different user groups were studied by comparing the corresponding transition probability matrices. The comparisons were made based on the following factors: gender, search experience, search task, and the user's academic background. The statistical tests revealed that there were significant differences between all the groups being compared
  17. Nicholas, D.: Are information professionals really better online searchers than end-users? : (and whose story do you believe?) (1995) 0.01
    0.0077737053 = product of:
      0.10105816 = sum of:
        0.10105816 = weight(_text_:log in 3871) [ClassicSimilarity], result of:
          0.10105816 = score(doc=3871,freq=2.0), product of:
            0.20389368 = queryWeight, product of:
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.031815533 = queryNorm
            0.49564147 = fieldWeight in 3871, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3871)
      0.07692308 = coord(1/13)
    
    Abstract
Examines the searching behaviour of Guardian journalists searching the FT PROFILE online system. Using transaction log analysis, compares the searching styles of journalists with those of Guardian librarians. In some respects end users conform to the picture that professionals have of them - they search with a very limited range of commands - but in other respects they confound that image: they are very quick and economical searchers. Their behaviour relates to their general information-seeking behaviour, and their searching styles should be seen in this light.
  18. Jascó, P.: CD-ROM commentaries : the speed of the retrieval software (1996) 0.01
    0.0072967964 = product of:
      0.09485835 = sum of:
        0.09485835 = weight(_text_:software in 6992) [ClassicSimilarity], result of:
          0.09485835 = score(doc=6992,freq=12.0), product of:
            0.12621705 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031815533 = queryNorm
            0.75154936 = fieldWeight in 6992, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6992)
      0.07692308 = coord(1/13)
    
    Abstract
Smart retrieval software can have more impact on response time than a faster CD-ROM drive. This can be shown easily now that more commercial databases are being released in several versions with practically identical content but different software. Illustrates this with 3 versions of the PAIS database, comparing the original OCSI software, the SPIRS software, and the EBSCO software. Whereas the OCSI version took no more than a few seconds, the other 2 versions took around 10 times longer, occasionally on the order of minutes. Considers how the construction of these products can account for such speed differences. However, search facilities and other features should be considered as well as speed. Offers advice for testing retrieval software.
  19. Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.01
    0.007133602 = product of:
      0.046368413 = sum of:
        0.039118923 = weight(_text_:software in 4152) [ClassicSimilarity], result of:
          0.039118923 = score(doc=4152,freq=4.0), product of:
            0.12621705 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031815533 = queryNorm
            0.30993375 = fieldWeight in 4152, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4152)
        0.0072494904 = product of:
          0.02174847 = sum of:
            0.02174847 = weight(_text_:29 in 4152) [ClassicSimilarity], result of:
              0.02174847 = score(doc=4152,freq=2.0), product of:
                0.11191709 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.031815533 = queryNorm
                0.19432661 = fieldWeight in 4152, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4152)
          0.33333334 = coord(1/3)
      0.15384616 = coord(2/13)
    
    Abstract
A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining the relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary, and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 75% of the unique words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination values were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid) methods. An end-user evaluation showed that the generated terms were largely distinct and had values of word precision consistent with the values of search precision.
    Date
    23.11.1995 11:29:46
  20. Ding, C.H.Q.: ¬A probabilistic model for Latent Semantic Indexing (2005) 0.01
    0.0066631753 = product of:
      0.08662128 = sum of:
        0.08662128 = weight(_text_:log in 3459) [ClassicSimilarity], result of:
          0.08662128 = score(doc=3459,freq=2.0), product of:
            0.20389368 = queryWeight, product of:
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.031815533 = queryNorm
            0.42483553 = fieldWeight in 3459, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.046875 = fieldNorm(doc=3459)
      0.07692308 = coord(1/13)
    
    Abstract
Latent Semantic Indexing (LSI), when applied to the semantic space built on text collections, improves information retrieval, information filtering, and word sense disambiguation. A new dual probability model based on the similarity concepts is introduced to provide a deeper understanding of LSI. Semantic associations can be quantitatively characterized by their statistical significance, the likelihood. Semantic dimensions containing redundant and noisy information can be separated out and should be ignored because of their negative contribution to the overall statistical significance. LSI is the optimal solution of the model. The peak in the likelihood curve indicates the existence of an intrinsic semantic dimension. The importance of LSI dimensions follows the Zipf distribution, indicating that LSI dimensions represent latent concepts. Document frequency of words follows the Zipf distribution, and the number of distinct words follows a log-normal distribution. Experiments on five standard document collections confirm and illustrate the analysis.

Types

  • a 127
  • s 8
  • m 5
  • el 4
  • x 2
  • p 1
  • r 1