This database contains over 40,000 documents on topics in the areas of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 3 March 2020)
1 Kim, Y. ; Seo, J. ; Croft, W.B. ; Smith, D.A.: Automatic suggestion of phrasal-concept queries for literature search.
In: Information processing and management. 50(2014) no.4, pp. 568-583.
Abstract: Both general and domain-specific search engines have adopted query suggestion techniques to help users formulate effective queries. In the specific domain of literature search (e.g., finding academic papers), the initial queries are usually based on a draft paper or abstract, rather than short lists of keywords. In this paper, we investigate phrasal-concept query suggestions for literature search. These suggestions explicitly specify important phrasal concepts related to an initial detailed query. The merits of phrasal-concept query suggestions for this domain are their readability and retrieval effectiveness: (1) phrasal concepts are natural for academic authors because of their frequent use of terminology and subject-specific phrases and (2) academic papers describe their key ideas via these subject-specific phrases, and thus phrasal concepts can be used effectively to find those papers. We propose a novel phrasal-concept query suggestion technique that generates queries by identifying key phrasal-concepts from pseudo-labeled documents and combines them with related phrases. Our proposed technique is evaluated in terms of both user preference and retrieval effectiveness. We conduct user experiments to verify a preference for our approach, in comparison to baseline query suggestion methods, and demonstrate the effectiveness of the technique with retrieval experiments.
Content: Cf.: doi: 10.1016/j.ipm.2014.03.003.
2 Croft, W.B. ; Metzler, D. ; Strohman, T.: Search engines : information retrieval in practice.
Boston : Addison-Wesley, 2010. xxv, 524 pp.
Abstract: For introductory information retrieval courses at the undergraduate and graduate level in computer science, information science and computer engineering departments. Written by a leader in the field of information retrieval, Search Engines: Information Retrieval in Practice is designed to give undergraduate students the understanding and tools they need to evaluate, compare and modify search engines. Coverage of the underlying IR and mathematical models reinforces key concepts. The book's numerous programming exercises make extensive use of Galago, a Java-based open source search engine. SUPPLEMENTS / Extensive lecture slides (in PDF and PPT format) / Solutions to selected end of chapter problems (Instructors only) / Test collections for exercises / Galago search engine
LCSH: Search engines / Programming ; Information retrieval ; Information Storage and Retrieval ; Knowledge Bases
RSWK: Suchmaschine / Information Retrieval
BK: 54.64 / Datenbanken ; 54.75 / Sprachverarbeitung ; 06.74 / Informationssysteme
DDC: 025.04 ; 005.75/8
GHBS: TWX (FH K)
LCC: TK 5105.884 .C765 2010
RVK: ST 205 ; ST 270
3 Li, X. ; Croft, W.B.: An information-pattern-based approach to novelty detection.
In: Information processing and management. 44(2008) no.3, pp. 1159-1188.
Abstract: In this paper, a new novelty detection approach based on the identification of sentence level information patterns is proposed. First, "novelty" is redefined based on the proposed information patterns, and several different types of information patterns are given corresponding to different types of users' information needs. Second, a thorough analysis of sentence level information patterns is elaborated using data from the TREC novelty tracks, including sentence lengths, named entities (NEs), and sentence level opinion patterns. Finally, a unified information-pattern-based approach to novelty detection (ip-BAND) is presented for both specific NE topics and more general topics. Experiments on novelty detection on data from the TREC 2002, 2003 and 2004 novelty tracks show that the proposed approach significantly improves the performance of novelty detection in terms of precision at top ranks. Future research directions are suggested.
4 Murdock, V. ; Kelly, D. ; Croft, W.B. ; Belkin, N.J. ; Yuan, X.: Identifying and improving retrieval for procedural questions.
In: Information processing and management. 43(2007) no.1, pp. 181-203.
Abstract: People use questions to elicit information from other people in their everyday lives, and yet the most common method of obtaining information from a search engine is by posing keywords. There has been research that suggests users are better at expressing their information needs in natural language; however, the vast majority of work to improve document retrieval has focused on queries posed as sets of keywords or Boolean queries. This paper focuses on improving document retrieval for the subset of natural language questions asking about how something is done. We classify questions as asking either for a description of a process or asking for a statement of fact, with better than 90% accuracy. Further, we identify non-content features of documents relevant to questions asking about a process. Finally, we demonstrate that we can use these features to significantly improve the precision of document retrieval results for questions asking about a process. Our approach, based on exploiting the structure of documents, shows a significant improvement in precision at rank one for questions asking about how something is done.
7 Liu, X. ; Croft, W.B.: Statistical language modeling for information retrieval.
In: Annual review of information science and technology. 39(2005), pp. 3-32.
Abstract: This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
Subject area: Literature review ; Computational linguistics
8 Luk, R.W.P. ; Leong, H.V. ; Dillon, T.S. ; Chan, A.T.S. ; Croft, W.B. ; Allan, J.: A survey in indexing and searching XML documents.
In: Journal of the American Society for Information Science and Technology. 53(2002) no.6, pp. 415-437.
Abstract: XML holds the promise to yield (1) a more precise search by providing additional information in the elements, (2) a better integrated search of documents from heterogeneous sources, (3) a powerful search paradigm using structural as well as content specifications, and (4) data and information exchange to share resources and to support cooperative search. We survey several indexing techniques for XML documents, grouping them into flatfile, semistructured, and structured indexing paradigms. Searching techniques and supporting techniques for searching are reviewed, including full text search and multistage search. Because searching XML documents can be very flexible, various search result presentations are discussed, as well as database and information retrieval system integration and XML query languages. We also survey various retrieval models, examining how they would be used or extended for retrieving XML documents. To conclude the article, we discuss various open issues that XML poses with respect to information retrieval and database research.
9 Croft, W.B.: Advances in information retrieval : Recent research from the Center for Intelligent Information Retrieval.
Boston, MA : Kluwer Academic Publ., 2000. XI, 306 pp.
(The Kluwer international series on information retrieval; 7)
Content: Contains the contributions: CROFT, W.B.: Combining approaches to information retrieval; GREIFF, W.R.: The use of exploratory data analysis in information retrieval research; PONTE, J.M.: Language models for relevance feedback; PAPKA, R. and J. ALLAN: Topic detection and tracking: event clustering as a basis for first story detection; CALLAN, J.: Distributed information retrieval; XU, J. and W.B. CROFT: Topic-based language models for distributed retrieval; LU, Z. and K.S. McKINLEY: The effect of collection organization and query locality on information retrieval system performance; BALLESTEROS, L.A.: Cross-language retrieval via transitive translation; SANDERSON, M. and D. LAWRIE: Building, testing, and applying concept hierarchies; RAVELA, S. and C. LUO: Appearance-based global similarity retrieval of images
Note: Information retrieval - Relevance - Information retrieval systems - Distributed systems - Multimedia - Image processing
LCSH: Database management ; Multimedia systems ; Information retrieval
DDC: 005.74 / dc21
LCC: QA76.9.D3 A34835 2000
10 Croft, W.B.: Combining approaches to information retrieval.
In: Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft. Boston, MA : Kluwer Academic Publ., 2000. pp. 1-36.
(The Kluwer international series on information retrieval; 7)
Abstract: The combination of different text representations and search strategies has become a standard technique for improving the effectiveness of information retrieval. Combination, for example, has been studied extensively in the TREC evaluations and is the basis of the "meta-search" engines used on the Web. This paper examines the development of this technique, including both experimental results and the retrieval models that have been proposed as formal frameworks for combination. We show that combining approaches for information retrieval can be modeled as combining the outputs of multiple classifiers based on one or more representations, and that this simple model can provide explanations for many of the experimental results. We also show that this view of combination is very similar to the inference net model, and that a new approach to retrieval based on language models supports combination and can be integrated with the inference net model.
Subject area: Distributed bibliographic databases ; Search engines
11 Xu, J. ; Croft, W.B.: Topic-based language models for distributed retrieval.
In: Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft. Boston, MA : Kluwer Academic Publ., 2000. pp. 151-172.
(The Kluwer international series on information retrieval; 7)
Abstract: Effective retrieval in a distributed environment is an important but difficult problem. Lack of effectiveness appears to have two major causes. First, existing collection selection algorithms do not work well on heterogeneous collections. Second, relevant documents are scattered over many collections and searching a few collections misses many relevant documents. We propose a topic-oriented approach to distributed retrieval. With this approach, we structure the document set of a distributed retrieval environment around a set of topics. Retrieval for a query involves first selecting the right topics for the query and then dispatching the search process to collections that contain such topics. The content of a topic is characterized by a language model. In environments where the labeling of documents by topics is unavailable, document clustering is employed for topic identification. Based on these ideas, three methods are proposed to suit different environments. We show that all three methods improve effectiveness of distributed retrieval.
Subject area: Distributed bibliographic databases
12 Ballesteros, L. ; Croft, W.B.: Statistical methods for cross-language information retrieval.
In: Cross-language information retrieval. Ed.: G. Grefenstette. Boston, MA : Kluwer Academic Publ., 1998. pp. 41-50.
(The Kluwer International series on information retrieval)
Subject area: Multilingual issues
13 Croft, W.B.: What do people want from information retrieval?
In: From classification to 'knowledge organization': Dorking revisited or 'past is prelude'. A collection of reprints to commemorate the forty year span between the Dorking Conference (First International Study Conference on Classification Research 1957) and the Sixth International Study Conference on Classification Research (London 1997). Ed.: A. Gilchrist. The Hague : International Federation for Information and Documentation (FID), 1997. pp. 181-185.
(FID publication; no.714)(FIC occasional paper; no.14)
Note: Reprinted from: D-LIB Magazine, Nov. 1995
15 Allan, J. ; Callan, J.P. ; Croft, W.B. ; Ballesteros, L. ; Broglio, J. ; Xu, J. ; Shu, H.: INQUERY at TREC-5.
In: The Fifth Text Retrieval Conference (TREC-5). Ed.: E.M. Voorhees and D.K. Harman. Gaithersburg, MD : National Institute of Standards and Technology, 1997. pp. 191-197.
(NIST special publication;)
Object: TREC ; INQUERY
16 Shneiderman, B. ; Byrd, D. ; Croft, W.B.: Clarifying search : a user-interface framework for text searches.
In: D-Lib magazine. 3(1997) no.1, xx pp.
Abstract: Current user interfaces for textual database searching leave much to be desired: individually, they are often confusing, and as a group, they are seriously inconsistent. We propose a four-phase framework for user-interface design: the framework provides common structure and terminology for searching while preserving the distinct features of individual collections and search mechanisms. Users will benefit from faster learning, increased comprehension, and better control, leading to more effective searches and higher satisfaction.
Note: Cf.: http://dlib.ukoln.ac.uk/dlib/january97/retrieval/01shneiderman.html.
Subject area: Search interfaces ; OPAC ; Search engines
17 Allan, J. ; Ballesteros, L. ; Callan, J.P. ; Croft, W.B. ; Lu, Z.: Recent experiments with INQUERY.
In: The Fourth Text Retrieval Conference (TREC-4). Ed.: K. Harman. Gaithersburg, MD : National Institute of Standards and Technology, 1996. pp. 49-63.
(NIST special publication; 500-236)
Object: INQUERY ; TREC
18 Croft, W.B.: What do people want from information retrieval? : the top 10 research issues for companies that use and sell IR systems.
Note: Also in: D-LIB Magazine, November 1995
19 Croft, W.B.: Effective retrieval based on combining evidence from the corpus and users.
In: IEEE expert. 10(1995) no.6, pp. 59-63.
Abstract: Inquery is a text retrieval system that is the basis of a number of WWW applications, including the Thomas system supported by the Library of Congress. Surveys the representation, query processing, and retrieval techniques used in the system. By combining evidence about relevance from the corpus, individual documents, and users, Inquery achieves effective overall recall and precision evaluation while avoiding occasional major failures.
Note: See also: http://ciir.cs.umass.edu/inqueryhomepage.html
20 Callan, J. ; Croft, W.B. ; Broglio, J.: TREC and TIPSTER experiments with INQUERY.
In: Information processing and management. 31(1995) no.3, pp. 327-343.
Note: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp. 436-439.
Object: TREC ; TIPSTER ; INQUERY