Search (13 results, page 1 of 1)

  • author_ss:"Croft, W.B."
  1. Croft, W.B.; Metzler, D.; Strohman, T.: Search engines : information retrieval in practice (2010) 0.14
    0.13513544 = product of:
      0.20270315 = sum of:
        0.10651682 = weight(_text_:search in 2605) [ClassicSimilarity], result of:
          0.10651682 = score(doc=2605,freq=14.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.6095997 = fieldWeight in 2605, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=2605)
        0.09618632 = product of:
          0.19237264 = sum of:
            0.19237264 = weight(_text_:engines in 2605) [ClassicSimilarity], result of:
              0.19237264 = score(doc=2605,freq=10.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.75313926 = fieldWeight in 2605, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2605)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
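The explain tree above can be re-derived by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names (`tf`, `idf`, `term_score`) are illustrative and not part of any library. The `queryNorm`, `fieldNorm`, and `coord` values are copied from the listing rather than recomputed.

```python
import math


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term clause: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm            # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain output for doc 2605.
MAX_DOCS = 44218
QUERY_NORM = 0.05027291
FIELD_NORM = 0.046875

search = term_score(14.0, 3718, MAX_DOCS, QUERY_NORM, FIELD_NORM)
engines = term_score(10.0, 746, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "engines" clause sits in a nested query with coord(1/2);
# the outer query matched 2 of 3 clauses, hence coord(2/3).
total = (search + engines * (1 / 2)) * (2 / 3)

print(f"search={search:.8f} engines={engines:.8f} total={total:.8f}")
```

The printed total should agree with the 0.13513544 shown for this record up to floating-point rounding, since the listing's values were computed in single precision.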
    
    Abstract
    For introductory information retrieval courses at the undergraduate and graduate level in computer science, information science and computer engineering departments. Written by a leader in the field of information retrieval, Search Engines: Information Retrieval in Practice is designed to give undergraduate students the understanding and tools they need to evaluate, compare and modify search engines. Coverage of the underlying IR and mathematical models reinforces key concepts. The book's numerous programming exercises make extensive use of Galago, a Java-based open source search engine. SUPPLEMENTS / Extensive lecture slides (in PDF and PPT format) / Solutions to selected end of chapter problems (Instructors only) / Test collections for exercises / Galago search engine
    LCSH
    Search engines / Programming
    Subject
    Search engines / Programming
  2. Kim, Y.; Seo, J.; Croft, W.B.; Smith, D.A.: Automatic suggestion of phrasal-concept queries for literature search (2014) 0.07
    0.06863054 = product of:
      0.102945805 = sum of:
        0.06709928 = weight(_text_:search in 2692) [ClassicSimilarity], result of:
          0.06709928 = score(doc=2692,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3840117 = fieldWeight in 2692, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2692)
        0.03584652 = product of:
          0.07169304 = sum of:
            0.07169304 = weight(_text_:engines in 2692) [ClassicSimilarity], result of:
              0.07169304 = score(doc=2692,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.2806784 = fieldWeight in 2692, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2692)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Both general and domain-specific search engines have adopted query suggestion techniques to help users formulate effective queries. In the specific domain of literature search (e.g., finding academic papers), the initial queries are usually based on a draft paper or abstract, rather than short lists of keywords. In this paper, we investigate phrasal-concept query suggestions for literature search. These suggestions explicitly specify important phrasal concepts related to an initial detailed query. The merits of phrasal-concept query suggestions for this domain are their readability and retrieval effectiveness: (1) phrasal concepts are natural for academic authors because of their frequent use of terminology and subject-specific phrases and (2) academic papers describe their key ideas via these subject-specific phrases, and thus phrasal concepts can be used effectively to find those papers. We propose a novel phrasal-concept query suggestion technique that generates queries by identifying key phrasal-concepts from pseudo-labeled documents and combines them with related phrases. Our proposed technique is evaluated in terms of both user preference and retrieval effectiveness. We conduct user experiments to verify a preference for our approach, in comparison to baseline query suggestion methods, and demonstrate the effectiveness of the technique with retrieval experiments.
  3. Croft, W.B.: Combining approaches to information retrieval (2000) 0.07
    0.066634305 = product of:
      0.09995145 = sum of:
        0.056935627 = weight(_text_:search in 6862) [ClassicSimilarity], result of:
          0.056935627 = score(doc=6862,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 6862, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=6862)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 6862) [ClassicSimilarity], result of:
              0.08603165 = score(doc=6862,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 6862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6862)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The combination of different text representations and search strategies has become a standard technique for improving the effectiveness of information retrieval. Combination, for example, has been studied extensively in the TREC evaluations and is the basis of the "meta-search" engines used on the Web. This paper examines the development of this technique, including both experimental results and the retrieval models that have been proposed as formal frameworks for combination. We show that combining approaches for information retrieval can be modeled as combining the outputs of multiple classifiers based on one or more representations, and that this simple model can provide explanations for many of the experimental results. We also show that this view of combination is very similar to the inference net model, and that a new approach to retrieval based on language models supports combination and can be integrated with the inference net model.
  4. Croft, W.B.; Harper, D.J.: Using probabilistic models of document retrieval without relevance information (1979) 0.05
    0.047340807 = product of:
      0.14202242 = sum of:
        0.14202242 = weight(_text_:search in 4520) [ClassicSimilarity], result of:
          0.14202242 = score(doc=4520,freq=14.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.8127996 = fieldWeight in 4520, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=4520)
      0.33333334 = coord(1/3)
    
    Abstract
    Based on a probabilistic model, proposes strategies for the initial search and an intermediate search. Retrieval experiences with the Cranfield collection of 1,400 documents show that this initial search strategy is better than conventional search strategies both in terms of retrieval effectiveness and in terms of the number of queries that retrieve relevant documents. The intermediate search is a useful substitute for a relevance feedback search. A cluster search would be an effective alternative strategy.
  5. Shneiderman, B.; Byrd, D.; Croft, W.B.: Clarifying search : a user-interface framework for text searches (1997) 0.04
    0.035786286 = product of:
      0.10735885 = sum of:
        0.10735885 = weight(_text_:search in 1471) [ClassicSimilarity], result of:
          0.10735885 = score(doc=1471,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.6144187 = fieldWeight in 1471, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.125 = fieldNorm(doc=1471)
      0.33333334 = coord(1/3)
    
  6. Luk, R.W.P.; Leong, H.V.; Dillon, T.S.; Chan, A.T.S.; Croft, W.B.; Allen, J.: A survey in indexing and searching XML documents (2002) 0.04
    0.035505608 = product of:
      0.10651682 = sum of:
        0.10651682 = weight(_text_:search in 460) [ClassicSimilarity], result of:
          0.10651682 = score(doc=460,freq=14.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.6095997 = fieldWeight in 460, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=460)
      0.33333334 = coord(1/3)
    
    Abstract
    XML holds the promise to yield (1) a more precise search by providing additional information in the elements, (2) a better integrated search of documents from heterogeneous sources, (3) a powerful search paradigm using structural as well as content specifications, and (4) data and information exchange to share resources and to support cooperative search. We survey several indexing techniques for XML documents, grouping them into flatfile, semistructured, and structured indexing paradigms. Searching techniques and supporting techniques for searching are reviewed, including full text search and multistage search. Because searching XML documents can be very flexible, various search result presentations are discussed, as well as database and information retrieval system integration and XML query languages. We also survey various retrieval models, examining how they would be used or extended for retrieving XML documents. To conclude the article, we discuss various open issues that XML poses with respect to information retrieval and database research.
  7. Shneiderman, B.; Byrd, D.; Croft, W.B.: Clarifying search : a user-interface framework for text searches (1997) 0.03
    0.025304725 = product of:
      0.075914174 = sum of:
        0.075914174 = weight(_text_:search in 1258) [ClassicSimilarity], result of:
          0.075914174 = score(doc=1258,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.43445963 = fieldWeight in 1258, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=1258)
      0.33333334 = coord(1/3)
    
    Abstract
    Current user interfaces for textual database searching leave much to be desired: individually, they are often confusing, and as a group, they are seriously inconsistent. We propose a four-phase framework for user-interface design: the framework provides common structure and terminology for searching while preserving the distinct features of individual collections and search mechanisms. Users will benefit from faster learning, increased comprehension, and better control, leading to more effective searches and higher satisfaction.
  8. Belkin, N.J.; Croft, W.B.: Retrieval techniques (1987) 0.02
    0.01816343 = product of:
      0.054490287 = sum of:
        0.054490287 = product of:
          0.10898057 = sum of:
            0.10898057 = weight(_text_:22 in 334) [ClassicSimilarity], result of:
              0.10898057 = score(doc=334,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.61904186 = fieldWeight in 334, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=334)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Annual review of information science and technology. 22(1987), pp. 109-145
  9. Rajashekar, T.B.; Croft, W.B.: Combining automatic and manual index representations in probabilistic retrieval (1995) 0.02
    0.015656501 = product of:
      0.0469695 = sum of:
        0.0469695 = weight(_text_:search in 2418) [ClassicSimilarity], result of:
          0.0469695 = score(doc=2418,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.2688082 = fieldWeight in 2418, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2418)
      0.33333334 = coord(1/3)
    
    Abstract
    Results from research in information retrieval have suggested that significant improvements in retrieval effectiveness can be obtained by combining results from multiple index representations, query formulations, and search strategies. The inference net model of retrieval, which was designed from this point of view, treats information retrieval as an evidential reasoning process where multiple sources of evidence about document and query content are combined to estimate relevance probabilities. Uses a system based on this model to study the retrieval effectiveness benefits of combining these types of document and query information that are found in typical commercial databases and information services. The results indicate that substantial real benefits are possible.
  10. Xu, J.; Croft, W.B.: Topic-based language models for distributed retrieval (2000) 0.01
    0.013419857 = product of:
      0.04025957 = sum of:
        0.04025957 = weight(_text_:search in 38) [ClassicSimilarity], result of:
          0.04025957 = score(doc=38,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 38, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=38)
      0.33333334 = coord(1/3)
    
    Abstract
    Effective retrieval in a distributed environment is an important but difficult problem. Lack of effectiveness appears to have two major causes. First, existing collection selection algorithms do not work well on heterogeneous collections. Second, relevant documents are scattered over many collections and searching a few collections misses many relevant documents. We propose a topic-oriented approach to distributed retrieval. With this approach, we structure the document set of a distributed retrieval environment around a set of topics. Retrieval for a query involves first selecting the right topics for the query and then dispatching the search process to collections that contain such topics. The content of a topic is characterized by a language model. In environments where the labeling of documents by topics is unavailable, document clustering is employed for topic identification. Based on these ideas, three methods are proposed to suit different environments. We show that all three methods improve effectiveness of distributed retrieval.
  11. Murdock, V.; Kelly, D.; Croft, W.B.; Belkin, N.J.; Yuan, X.: Identifying and improving retrieval for procedural questions (2007) 0.01
    0.013419857 = product of:
      0.04025957 = sum of:
        0.04025957 = weight(_text_:search in 902) [ClassicSimilarity], result of:
          0.04025957 = score(doc=902,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 902, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=902)
      0.33333334 = coord(1/3)
    
    Abstract
    People use questions to elicit information from other people in their everyday lives, and yet the most common method of obtaining information from a search engine is by posing keywords. There has been research that suggests users are better at expressing their information needs in natural language; however, the vast majority of work to improve document retrieval has focused on queries posed as sets of keywords or Boolean queries. This paper focuses on improving document retrieval for the subset of natural language questions asking about how something is done. We classify questions as asking either for a description of a process or for a statement of fact, with better than 90% accuracy. Further, we identify non-content features of documents relevant to questions asking about a process. Finally, we demonstrate that we can use these features to significantly improve the precision of document retrieval results for questions asking about a process. Our approach, based on exploiting the structure of documents, shows a significant improvement in precision at rank one for questions asking about how something is done.
  12. Tavakoli, L.; Zamani, H.; Scholer, F.; Croft, W.B.; Sanderson, M.: Analyzing clarification in asynchronous information-seeking conversations (2022) 0.01
    0.013419857 = product of:
      0.04025957 = sum of:
        0.04025957 = weight(_text_:search in 496) [ClassicSimilarity], result of:
          0.04025957 = score(doc=496,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 496, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=496)
      0.33333334 = coord(1/3)
    
    Abstract
    This research analyzes human-generated clarification questions to provide insights into how they are used to disambiguate and provide a better understanding of information needs. A set of clarification questions is extracted from posts on the Stack Exchange platform. A novel taxonomy is defined for the annotation of the questions and their responses. We investigate the clarification questions in terms of whether they add any information to the post (the initial question posted by the asker) and the accepted answer, which is the answer chosen by the asker. After identifying which clarification questions are more useful, we investigate the characteristics of these questions in terms of their types and patterns. Non-useful clarification questions are identified, and their patterns are compared with useful clarifications. Our analysis indicates that the most useful clarification questions have similar patterns, regardless of topic. This research contributes to an understanding of clarification in conversations and can provide insight for clarification dialogues in conversational search scenarios and for the possible system generation of clarification requests in information-seeking conversations.
  13. Allan, J.; Callan, J.P.; Croft, W.B.; Ballesteros, L.; Broglio, J.; Xu, J.; Shu, H.: INQUERY at TREC-5 (1997) 0.01
    0.011352143 = product of:
      0.03405643 = sum of:
        0.03405643 = product of:
          0.06811286 = sum of:
            0.06811286 = weight(_text_:22 in 3103) [ClassicSimilarity], result of:
              0.06811286 = score(doc=3103,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.38690117 = fieldWeight in 3103, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3103)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    27. 2.1999 20:55:22