Search (58 results, page 1 of 3)

  • × theme_ss:"Retrievalstudien"
  1. Clarke, S.J.; Willett, P.: Estimating the recall performance of Web search engines (1997) 0.05
    0.051100858 = product of:
      0.20440343 = sum of:
        0.20440343 = weight(_text_:engines in 760) [ClassicSimilarity], result of:
          0.20440343 = score(doc=760,freq=8.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.8981709 = fieldWeight in 760, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=760)
      0.25 = coord(1/4)
    
    Abstract
    Reports a comparison of the retrieval effectiveness of the AltaVista, Excite and Lycos Web search engines. Describes a method for comparing the recall of the 3 sets of searches, despite the fact that they are carried out on non-identical sets of Web pages. It is thus possible, unlike previous comparative studies of Web search engines, to consider both recall and precision when evaluating the effectiveness of search engines
  2. Mettrop, W.; Nieuwenhuysen, P.: Internet search engines : fluctuations in document accessibility (2001) 0.04
    0.039115943 = product of:
      0.15646377 = sum of:
        0.15646377 = weight(_text_:engines in 4481) [ClassicSimilarity], result of:
          0.15646377 = score(doc=4481,freq=12.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.68751884 = fieldWeight in 4481, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4481)
      0.25 = coord(1/4)
    
    Abstract
    An empirical investigation of the consistency of retrieval through Internet search engines is reported. Thirteen engines are evaluated: AltaVista, EuroFerret, Excite, HotBot, InfoSeek, Lycos, MSN, NorthernLight, Snap, WebCrawler and three national Dutch engines: Ilse, Search.nl and Vindex. The focus is on a characteristic related to size: the degree of consistency to which an engine retrieves documents. Does an engine always present the same relevant documents that are, or were, available in its databases? We observed and identified three types of fluctuations in the result sets of several kinds of searches, many of them significant. These should be taken into account by users who apply an Internet search engine, for instance to retrieve as many relevant documents as possible, or to retrieve a document that was already found in a previous search, or to perform scientometric/bibliometric measurements. The fluctuations should also be considered as a complication of other research on the behaviour and performance of Internet search engines. In conclusion: in view of the increasing importance of the Internet as a publication/communication medium, the fluctuations in the result sets of Internet search engines can no longer be neglected.
  3. Agata, T.: ¬A measure for evaluating search engines on the World Wide Web : retrieval test with ESL (Expected Search Length) (1997) 0.04
    0.03832564 = product of:
      0.15330257 = sum of:
        0.15330257 = weight(_text_:engines in 3892) [ClassicSimilarity], result of:
          0.15330257 = score(doc=3892,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.67362815 = fieldWeight in 3892, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.09375 = fieldNorm(doc=3892)
      0.25 = coord(1/4)
    
  4. Oppenheim, C.; Morris, A.; McKnight, C.: ¬The evaluation of WWW search engines (2000) 0.04
    0.03832564 = product of:
      0.15330257 = sum of:
        0.15330257 = weight(_text_:engines in 4546) [ClassicSimilarity], result of:
          0.15330257 = score(doc=4546,freq=8.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.67362815 = fieldWeight in 4546, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=4546)
      0.25 = coord(1/4)
    
    Abstract
    The literature of the evaluation of Internet search engines is reviewed. Although there have been many studies, there has been little consistency in the way such studies have been carried out. This problem is exacerbated by the fact that recall is virtually impossible to calculate in the fast changing Internet environment, and therefore the traditional Cranfield type of evaluation is not usually possible. A variety of alternative evaluation methods has been suggested to overcome this difficulty. The authors recommend that a standardised set of tools is developed for the evaluation of web search engines so that, in future, comparisons can be made between search engines more effectively, and that variations in performance of any given search engine over time can be tracked. The paper itself does not provide such a standard set of tools, but it investigates the issues and makes preliminary recommendations of the types of tools needed
  5. MacFarlane, A.: Evaluation of web search for the information practitioner (2007) 0.03
    0.027100323 = product of:
      0.10840129 = sum of:
        0.10840129 = weight(_text_:engines in 817) [ClassicSimilarity], result of:
          0.10840129 = score(doc=817,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.47632706 = fieldWeight in 817, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=817)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The aim of the paper is to put forward a structured mechanism for web search evaluation. The paper seeks to point to useful scientific research and show how information practitioners can use these methods in evaluation of search on the web for their users. Design/methodology/approach - The paper puts forward an approach which utilizes traditional laboratory-based evaluation measures such as average precision/precision at N documents, augmented with diagnostic measures such as link broken, etc., which are used to show why precision measures are depressed as well as the quality of the search engine's crawling mechanism. Findings - The paper shows how to use diagnostic measures in conjunction with precision in order to evaluate web search. Practical implications - The methodology presented in this paper will be useful to any information professional who regularly uses web search as part of their information seeking and needs to evaluate web search services. Originality/value - The paper argues that the use of diagnostic measures is essential in web search, as precision measures on their own do not allow a searcher to understand why search results differ between search engines.
  6. Bar-Ilan, J.: Methods for measuring search engine performance over time (2002) 0.03
    0.025550429 = product of:
      0.102201715 = sum of:
        0.102201715 = weight(_text_:engines in 305) [ClassicSimilarity], result of:
          0.102201715 = score(doc=305,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 305, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=305)
      0.25 = coord(1/4)
    
    Abstract
    This study introduces methods for evaluating search engine performance over a time period. Several measures are defined, which as a whole describe search engine functionality over time. The necessary setup for such studies is described, and the use of these measures is illustrated through a specific example. The set of measures introduced here may serve as a guideline for the search engines for testing and improving their functionality. We recommend setting up a standard suite of measures for evaluating search engine performance.
  7. Carterette, B.: Test collections (2009) 0.03
    0.025550429 = product of:
      0.102201715 = sum of:
        0.102201715 = weight(_text_:engines in 3891) [ClassicSimilarity], result of:
          0.102201715 = score(doc=3891,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 3891, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=3891)
      0.25 = coord(1/4)
    
    Abstract
    Research and development of search engines and other information retrieval (IR) systems proceeds by a cycle of design, implementation, and experimentation, with the results of each experiment influencing design decisions in the next iteration of the cycle. Batch experiments on test collections help ensure that this process goes as smoothly and as quickly as possible. A test collection comprises a collection of documents, a set of information needs, and judgments of the relevance of documents to those needs.
  8. Balog, K.; Schuth, A.; Dekker, P.; Tavakolpoursaleh, N.; Schaer, P.; Chuang, P.-Y.: Overview of the TREC 2016 Open Search track Academic Search Edition (2016) 0.03
    0.025550429 = product of:
      0.102201715 = sum of:
        0.102201715 = weight(_text_:engines in 43) [ClassicSimilarity], result of:
          0.102201715 = score(doc=43,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 43, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=43)
      0.25 = coord(1/4)
    
    Abstract
    We present the TREC Open Search track, which represents a new evaluation paradigm for information retrieval. It offers the possibility for researchers to evaluate their approaches in a live setting, with real, unsuspecting users of an existing search engine. The first edition of the track focuses on the academic search domain and features the ad-hoc scientific literature search task. We report on experiments with three different academic search engines: Cite-SeerX, SSOAR, and Microsoft Academic Search.
  9. Radev, D.R.; Libner, K.; Fan, W.: Getting answers to natural language questions on the Web (2002) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 5204) [ClassicSimilarity], result of:
          0.09033441 = score(doc=5204,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 5204, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5204)
      0.25 = coord(1/4)
    
    Abstract
    Seven hundred natural language questions from TREC-8 and TREC-9 were sent by Radev, Libner, and Fan to each of nine web search engines. The top 40 sites returned by each system were stored for evaluation of their productivity of correct answers. Each question per engine was scored as the sum of the reciprocal ranks of identified correct answers. The large number of zero scores gave a positive skew violating the normality assumption for ANOVA, so values were transformed to zero for no hit and one for one or more hits. The non-zero values were then square-root transformed to remove the remaining positive skew. Interactions were observed between search engine and answer type (name, place, date, et cetera), search engine and number of proper nouns in the query, search engine and the need for time limitation, and search engine and total query words. All effects were significant. Shortest queries had the highest mean scores. One or more proper nouns present provides a significant advantage. Non-time dependent queries have an advantage. Place, name, person, and text description had mean scores between .85 and .9 with date at .81 and number at .59. There were significant differences in score by search engine. Search engines found at least one correct answer in between 87.7 and 75.45 of the cases. Google and Northern Light were just short of a 90% hit rate. No evidence indicated that a particular engine was better at answering any particular sort of question.
  10. Sarigil, E.; Sengor Altingovde, I.; Blanco, R.; Barla Cambazoglu, B.; Ozcan, R.; Ulusoy, Ö.: Characterizing, predicting, and handling web search queries that match very few or no results (2018) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 4039) [ClassicSimilarity], result of:
          0.09033441 = score(doc=4039,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 4039, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4039)
      0.25 = coord(1/4)
    
    Abstract
    A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that the users are dissatisfied after submitting a query that matches no results at least 88.5% of the time. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful for scenarios such as the mobile- or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of the bandwidth usage, power consumption, and/or monetary costs. Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) up to 0.95.
  11. Vegt, A. van der; Zuccon, G.; Koopman, B.: Do better search engines really equate to better clinical decisions? : If not, why not? (2021) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 150) [ClassicSimilarity], result of:
          0.09033441 = score(doc=150,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 150, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=150)
      0.25 = coord(1/4)
    
    Abstract
    Previous research has found that improved search engine effectiveness, evaluated using a batch-style approach, does not always translate to significant improvements in user task performance; however, these prior studies focused on simple recall and precision-based search tasks. We investigated the same relationship, but for realistic, complex search tasks required in clinical decision making. One hundred and nine clinicians and final year medical students answered 16 clinical questions. Although the search engine did improve answer accuracy by 20 percentage points, there was no significant difference when participants used a more effective, state-of-the-art search engine. We also found that the search engine effectiveness difference, identified in the lab, was diminished by around 70% when the search engines were used with real users. Despite the aid of the search engine, half of the clinical questions were answered incorrectly. We further identified the relative contribution of search engine effectiveness to the overall end task success. We found that the ability to interpret documents correctly was a much more important factor impacting task success. If these findings are representative, information retrieval research may need to reorient its emphasis towards helping users to better understand information, rather than just finding it for them.
  12. Harter, S.P.; Hert, C.A.: Evaluation of information retrieval systems : approaches, issues, and methods (1997) 0.02
    0.022356624 = product of:
      0.089426495 = sum of:
        0.089426495 = weight(_text_:engines in 2264) [ClassicSimilarity], result of:
          0.089426495 = score(doc=2264,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39294976 = fieldWeight in 2264, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2264)
      0.25 = coord(1/4)
    
    Abstract
    State of the art review of information retrieval systems, defined as systems retrieving documents as opposed to numerical data. Explains the classic Cranfield studies that have served as a standard for retrieval testing since the 1960s and discusses the Cranfield model and its relevance based measures of retrieval effectiveness. Details some of the problems with the Cranfield instruments and issues of validity and reliability, generalizability, usefulness and basic concepts. Discusses the evaluation of the Internet search engines in light of the Cranfield model, noting the very real differences between batch systems (Cranfield) and interactive systems (Internet). Because the Internet collection is not fixed, it is impossible to determine recall as a measure of retrieval effectiveness. Considers future directions in evaluating information retrieval systems
  13. Wu, C.-J.: Experiments on using the Dublin Core to reduce the retrieval error ratio (1998) 0.02
    0.022356624 = product of:
      0.089426495 = sum of:
        0.089426495 = weight(_text_:engines in 5201) [ClassicSimilarity], result of:
          0.089426495 = score(doc=5201,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39294976 = fieldWeight in 5201, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5201)
      0.25 = coord(1/4)
    
    Abstract
    In order to test the power of metadata on information retrieval, an experiment was designed and conducted on a group of 7 graduate students using the Dublin Core as the cataloguing metadata. Results show that, on average, the retrieval error rate is only 2.9 per cent for the MES system (http://140.136.85.194), which utilizes the Dublin Core to describe the documents on the World Wide Web, in contrast to 20.7 per cent for the 7 famous search engines including HOTBOT, GAIS, LYCOS, EXCITE, INFOSEEK, YAHOO, and OCTOPUS. The very low error rate indicates that the users can use the information of the Dublin Core to decide whether to retrieve the documents or not
  14. Bar-Ilan, J.: ¬The Web as an information source on informetrics? : A content analysis (2000) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 4587) [ClassicSimilarity], result of:
          0.07665128 = score(doc=4587,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 4587, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=4587)
      0.25 = coord(1/4)
    
    Abstract
    This article addresses the question of whether the Web can serve as an information source for research. Specifically, it analyzes by way of content analysis the Web pages retrieved by the major search engines on a particular date (June 7, 1998), as a result of the query 'informetrics OR informetric'. In 807 out of the 942 retrieved pages, the search terms were mentioned in the context of information science. Over 70% of the pages contained only indirect information on the topic, in the form of hypertext links and bibliographical references without annotation. The bibliographical references extracted from the Web pages were analyzed, and lists of most productive authors, most cited authors, works, and sources were compiled. The list of references obtained from the Web was also compared to data retrieved from commercial databases. For most cases, the list of references extracted from the Web outperformed the commercial, bibliographic databases. The results of these comparisons indicate that valuable, freely available data is hidden in the Web waiting to be extracted from the millions of Web pages
  15. Lazonder, A.W.; Biemans, H.J.A.; Wopereis, I.G.J.H.: Differences between novice and experienced users in searching information on the World Wide Web (2000) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 4598) [ClassicSimilarity], result of:
          0.07665128 = score(doc=4598,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 4598, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=4598)
      0.25 = coord(1/4)
    
    Abstract
    Searching for information on the WWW basically comes down to locating an appropriate Web site and to retrieving relevant information from that site. This study examined the effect of a user's WWW experience on both phases of the search process. 35 students from 2 schools for Dutch pre-university education were observed while performing 3 search tasks. The results indicate that subjects with WWW-experience are more proficient in locating Web sites than are novice WWW-users. The observed differences were ascribed to the experts' superior skills in operating Web search engines. However, on tasks that required subjects to locate information on specific Web sites, the performance of experienced and novice users was equivalent - a result that is in line with hypertext research. Based on these findings, implications for training and supporting students in searching for information on the WWW are identified. Finally, the role of the subjects' level of domain expertise is discussed and directions for future research are proposed
  16. Serrano Cobos, J.; Quintero Orta, A.: Design, development and management of an information recovery system for an Internet Website : from documentary theory to practice (2003) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 2726) [ClassicSimilarity], result of:
          0.07665128 = score(doc=2726,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 2726, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=2726)
      0.25 = coord(1/4)
    
    Abstract
    A real case study is presented, explaining as a timeline the whole process of design, development and evaluation of a search engine used as a navigational help tool for end users and clients on an e-commerce-driven content website. The website is a community website, which determines the core design of the information service. The study involves several steps: information recovery system analysis, comparative analysis of other commercial search engines, service design, functionalities and scope, software selection, design of the project, project management, future service administration and conclusions.
  17. Landoni, M.; Bell, S.: Information retrieval techniques for evaluating search engines : a critical overview (2000) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 716) [ClassicSimilarity], result of:
          0.07665128 = score(doc=716,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 716, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=716)
      0.25 = coord(1/4)
    
  18. Savoy, J.: Cross-language information retrieval : experiments based on CLEF 2000 corpora (2003) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 1034) [ClassicSimilarity], result of:
          0.07665128 = score(doc=1034,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 1034, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=1034)
      0.25 = coord(1/4)
    
    Abstract
    Search engines play an essential role in the usability of Internet-based information systems and without them the Web would be much less accessible, and at the very least would develop at a much slower rate. Given that non-English users now tend to make up the majority in this environment, our main objective is to analyze and evaluate the retrieval effectiveness of various indexing and search strategies based on test-collections written in four different languages: English, French, German, and Italian. Our second objective is to describe and evaluate various approaches that might be implemented in order to effectively access document collections written in another language. As a third objective, we will explore the underlying problems involved in searching document collections written in the four different languages, and we will suggest and evaluate different database merging strategies capable of providing the user with a single unique result list.
  19. Eastman, C.M.: 30,000 hits may be better than 300 : precision anomalies in Internet searches (2002) 0.02
    0.015969018 = product of:
      0.06387607 = sum of:
        0.06387607 = weight(_text_:engines in 5231) [ClassicSimilarity], result of:
          0.06387607 = score(doc=5231,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.2806784 = fieldWeight in 5231, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5231)
      0.25 = coord(1/4)
    
    Abstract
    In this issue we begin with a paper where Eastman points out that conventional narrower queries (the use of conjunctions and phrases) in a web engine search will reduce the number of hits returned but not necessarily increase precision in the top ranked documents. Thus by precision anomalies Eastman means that search narrowing activity results in no precision change or a decrease in precision. Multiple queries with multiple engines were run by students for a three-year period and the formulation/engine combination was recorded as was the number of hits. Relevance was also recorded for the top ten and top twenty ranked retrievals. While narrower searches reduced total hits they did not usually improve precision. Initial high precision and poor query reformulation account for some of the results, as did Alta Vista's failure to use the ranking algorithm incorporated in its regular search in its advanced search feature. However, since the top listed returns often reoccurred in all formulations, it would seem that the ranking algorithms are doing a consistent job of practical precision ranking that is not improved by reformulation.
  20. Schaer, P.; Mayr, P.; Sünkler, S.; Lewandowski, D.: How relevant is the long tail? : a relevance assessment study on million short (2016) 0.02
    0.015969018 = product of:
      0.06387607 = sum of:
        0.06387607 = weight(_text_:engines in 3144) [ClassicSimilarity], result of:
          0.06387607 = score(doc=3144,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.2806784 = fieldWeight in 3144, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3144)
      0.25 = coord(1/4)
    
    Abstract
    Users of web search engines are known to mostly focus on the top ranked results of the search engine result page. While many studies support this well known information seeking pattern, only a few studies concentrate on the question of what users are missing by neglecting lower ranked results. To learn more about the relevance distributions in the so-called long tail we conducted a relevance assessment study with the Million Short long-tail web search engine. While we see a clear difference in the content between the head and the tail of the search engine result list, we see no statistically significant differences in the binary relevance judgments and weakly significant differences when using graded relevance. The tail contains different but still valuable results. We argue that the long tail can be a rich source for the diversification of web search engine result lists but it needs more evaluation to clearly describe the differences.
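
The score breakdowns listed under each result follow the classic TF-IDF pattern of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and the final score is queryWeight × fieldWeight × coord. The sketch below recomputes the displayed score for results 1 and 2 from the numbers shown in their explain trees. The function names are illustrative assumptions, not part of any Lucene API, and fieldNorm, queryNorm and coord are taken as given from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  field_norm: float, query_norm: float, coord: float) -> float:
    """Recombine the explain-tree factors into a single score.

    field_norm, query_norm and coord are copied from the listing;
    this sketch does not attempt to derive them.
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # queryWeight line
    field_weight = classic_tf(freq) * idf * field_norm   # fieldWeight line
    return query_weight * field_weight * coord


# Values copied from the explain trees of results 1 (doc 760) and 2 (doc 4481).
score_doc_760 = classic_score(freq=8.0, doc_freq=746, max_docs=44218,
                              field_norm=0.0625, query_norm=0.04479146,
                              coord=0.25)
score_doc_4481 = classic_score(freq=12.0, doc_freq=746, max_docs=44218,
                               field_norm=0.0390625, query_norm=0.04479146,
                               coord=0.25)

print(f"doc 760:  {score_doc_760:.6f}")
print(f"doc 4481: {score_doc_4481:.6f}")
```

Because idf and queryNorm are identical for every hit on this page, the ranking is driven only by tf and fieldNorm: a short record that mentions "engines" twice can outrank a longer one that mentions it four times. Small differences in the last digits against the listing are expected, since the listing appears to use single-precision arithmetic.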

Languages

  • e 51
  • d 3
  • chi 1
  • f 1
  • ja 1

Types

  • a 52
  • m 4
  • s 4
  • el 1