This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Sparck Jones, K.: Automatic summarising : the state of the art.
In: Information processing and management. 43(2007) no.6, pp.1449-1481.
Abstract: This paper reviews research on automatic summarising in the last decade. This work has grown, stimulated by technology and by evaluation programmes. The paper uses several frameworks to organise the review, for summarising itself, for the factors affecting summarising, for systems, and for evaluation. The review examines the evaluation strategies applied to summarising, the issues they raise, and the major programmes. It considers the input, purpose and output factors investigated in recent summarising research, and discusses the classes of strategy, extractive and non-extractive, that have been explored, illustrating the range of systems built. The conclusions drawn are that automatic summarisation has made valuable progress, with useful applications, better evaluation, and more task understanding. But summarising systems are still poorly motivated in relation to the factors affecting them, and evaluation needs taking much further to engage with the purposes summaries are intended to serve and the contexts in which they are used.
Subject area: Automatic abstracting
2. Sparck Jones, K.: Revisiting classification for retrieval.
In: Journal of documentation. 61(2005) no.5, pp.598-601.
Abstract: Purpose - This short note seeks to respond to Hjørland and Pedersen's paper "A substantive theory of classification for information retrieval", which starts from Sparck Jones's "Some thoughts on classification for retrieval", originally published in 1970. Design/methodology/approach - The note comments on the context in which the 1970 paper was written, and on Hjørland and Pedersen's views, emphasising the need for well-grounded classification theory and application. Findings - The note maintains that text-based, a posteriori, classification, as increasingly found in applications, is likely to be more useful, in general, than a priori classification. Originality/value - The note elaborates on points made in a well-received earlier paper.
Note: See also: http://www.emeraldinsight.com/10.1108/00220410510625813
Subject area: Classification systems in online retrieval
3. Sparck Jones, K.: Some thoughts on classification for retrieval.
In: Journal of documentation. 61(2005) no.5, pp.571-581.
Abstract: Purpose - This paper, originally published in 1970 (Journal of documentation. 26(1970), pp.89-101), considered the suggestion that classifications for retrieval should be constructed automatically and raised some serious problems concerning the sorts of classification which were required, and the way in which formal classification theories should be exploited, given that a retrieval classification is required for a purpose. These difficulties had not been sufficiently considered, and the paper, therefore, aims to attempt an analysis of them, though no solutions of immediate application could be suggested. Design/methodology/approach - Starting with the illustrative proposition that a polythetic, multiple, unordered classification is required in automatic thesaurus construction, this is considered in the context of classification in general, where eight sorts of classification can be distinguished, each covering a range of class definitions and class-finding algorithms. Findings - Since there is generally no natural or best classification of a set of objects as such, the evaluation of alternative classifications requires either formal criteria of goodness of fit, or, if a classification is required for a purpose, a precise statement of that purpose. In any case a substantive theory of classification is needed, which does not exist; and, since sufficiently precise specifications of retrieval requirements are also lacking, the only currently available approach to automatic classification experiments for information retrieval is to do enough of them. Originality/value - Gives insights into the classification of material for information retrieval.
Note: See also: http://www.emeraldinsight.com/10.1108/00220410510625796
Subject area: Classification systems in online retrieval
5. Sparck Jones, K.: A statistical interpretation of term specificity and its application in retrieval.
In: Journal of documentation. 60(2004) no.5, pp.493-502.
Abstract: The exhaustivity of document descriptions and the specificity of index terms are usually regarded as independent. It is suggested that specificity should be interpreted statistically, as a function of term use rather than of term meaning. The effects on retrieval of variations in term specificity are examined, experiments with three test collections showing, in particular, that frequently-occurring terms are required for good overall performance. It is argued that terms should be weighted according to collection frequency, so that matches on less frequent, more specific, terms are of greater value than matches on frequent terms. Results for the test collections show that considerable improvements in performance are obtained with this very simple procedure.
Note: See also: http://www.emeraldinsight.com/10.1108/00220410410560573.
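Illustration: the collection-frequency weighting described in the abstract of entry 5 can be sketched in a few lines of Python. This is a minimal sketch, not the paper's own procedure: it assumes the now-common logarithmic form log(N/n), where N is the number of documents and n the number of documents containing the term, since the abstract does not state the exact formula. The function names and the toy collection are hypothetical.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Return a weight per term: log(N / n).

    N is the number of documents, n the number of documents that
    contain the term. The log(N / n) form is an assumption made for
    this sketch, not a formula quoted from the paper.
    """
    n_docs = len(docs)
    # Count each term once per document (document frequency).
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / n) for term, n in doc_freq.items()}


def match_score(query, doc, weights):
    """Sum the weights of the query terms that also occur in the document."""
    return sum(weights.get(term, 0.0) for term in set(query) & set(doc))


# Hypothetical toy collection: each document is a list of index terms.
docs = [
    ["retrieval", "index", "term"],
    ["retrieval", "classification"],
    ["retrieval", "term", "weighting"],
    ["retrieval", "summarising"],
]
weights = idf_weights(docs)

# "retrieval" occurs in every document, so its weight is log(4/4) = 0;
# rarer terms such as "classification" receive larger weights.
for term, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{term:15s} {weight:.3f}")
```

Under this weighting, a match on a rare term contributes more to the score than a match on a term that occurs throughout the collection, which is the behaviour the abstract argues for.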
6. Sparck Jones, K.: IDF term weighting and IR research lessons.
In: Journal of documentation. 60(2004) no.5, pp.521-523.
Abstract: Robertson comments on the theoretical status of IDF term weighting. Its history illustrates how ideas develop in a specific research context, in theory/experiment interaction, and in operational practice.
Note: See also: http://www.emeraldinsight.com/10.1108/00220410410560591.
10. Lewis, D.D. ; Sparck Jones, K.: Natural language processing for information retrieval.
In: From classification to 'knowledge organization': Dorking revisited or 'past is prelude'. A collection of reprints to commemorate the forty year span between the Dorking Conference (First International Study Conference on Classification Research 1957) and the Sixth International Study Conference on Classification Research (London 1997). Ed.: A. Gilchrist. The Hague : International Federation for Information and Documentation (FID), 1997. pp.49-61.
(FID publication; no.714)(FID occasional paper; no.14)
Note: Reprinted from: Communications of the ACM 39(1996) no.1, pp.92-101
11. Sparck Jones, K.: Reflections on TREC.
In: From classification to 'knowledge organization': Dorking revisited or 'past is prelude'. A collection of reprints to commemorate the forty year span between the Dorking Conference (First International Study Conference on Classification Research 1957) and the Sixth International Study Conference on Classification Research (London 1997). Ed.: A. Gilchrist. The Hague : International Federation for Information and Documentation (FID), 1997. pp.101-125.
(FID publication; no.714)(FID occasional paper; no.14)
Abstract: This paper discusses the Text REtrieval Conferences (TREC) programme as a major enterprise in information retrieval research. It reviews its structure as an evaluation exercise, characterises the methods of indexing and retrieval being tested within it in terms of the approaches to system performance factors these represent; analyses the test results for solid, overall conclusions that can be drawn from them; and, in the light of the particular features of the test data, assesses TREC both for generally applicable findings that emerge from it and for directions it offers for future research.
Note: Reprinted from: Information processing and management 31(1995) no.3, pp.291-314
13. Robertson, S.E. ; Sparck Jones, K.: Simple, proven approaches to text retrieval. May 1997, update of 1994 and 1996 versions.
(Technical Report TR356, University of Cambridge, Computer Laboratory)
Abstract: This technical note describes straightforward techniques for document indexing and retrieval that have been solidly established through extensive testing and are easy to apply. They are useful for many different types of text material, are viable for very large files, and have the advantage that they do not require special skills or training for searching, but are easy for end users. The document and text retrieval methods described here have a sound theoretical basis, are well established by extensive testing, and the ideas involved are now implemented in some commercial retrieval systems. Testing in the last few years has, in particular, shown that the methods presented here work very well with full texts, not only title and abstracts, and with large files of texts containing three quarters of a million documents. These tests, the TREC Tests (see Harman 1993 - 1997; IP&M 1995), have been rigorous comparative evaluations involving many different approaches to information retrieval. These techniques depend on the use of simple terms for indexing both request and document texts; on term weighting exploiting statistical information about term occurrences; on scoring for request-document matching, using these weights, to obtain a ranked search output; and on relevance feedback to modify request weights or term sets in iterative searching. The normal implementation is via an inverted file organisation using a term list with linked document identifiers, plus counting data, and pointers to the actual texts. The user's request can be a word list, phrases, sentences or extended text.
Note: Also available at: http://www.ftp.cl.cam.ac.uk/ftp/papers/reports/.
Subject area: Retrieval algorithms ; Retrieval studies
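Illustration: the inverted file organisation and ranked matching described in the abstract of entry 13 can be sketched as follows. This is a deliberately simplified sketch under stated assumptions, not the report's actual method: terms are whitespace-separated lower-cased words, the term weight is the plain log(N/n) collection-frequency weight, and relevance feedback and any within-document frequency or length normalisation are left out. All function names and the sample texts are hypothetical.

```python
import math
from collections import defaultdict


def build_inverted_file(texts):
    """Build a term list with linked document identifiers and counts.

    Returns a mapping: term -> {doc_id: occurrences of term in that doc}.
    Tokenisation by whitespace is an assumption made for this sketch.
    """
    index = defaultdict(dict)
    for doc_id, text in enumerate(texts):
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def ranked_search(request, index, n_docs):
    """Score documents against a request and return them best first.

    Each matching request term adds log(n_docs / n) to a document's
    score, where n is the number of documents containing the term.
    """
    scores = defaultdict(float)
    for term in set(request.lower().split()):
        postings = index.get(term)
        if not postings:
            continue  # term does not occur in the collection
        weight = math.log(n_docs / len(postings))
        for doc_id in postings:
            scores[doc_id] += weight
    # Highest score first; ties broken by document identifier.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample collection.
texts = [
    "spoken document retrieval",
    "document retrieval with term weighting",
    "automatic summarising of document text",
]
index = build_inverted_file(texts)
results = ranked_search("term weighting retrieval", index, len(texts))
for doc_id, score in results:
    print(doc_id, round(score, 3), texts[doc_id])
```

Only the postings of the request terms are visited, which is why this organisation remains practical for large files; documents that share no term with the request never appear in the ranked output.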
15. Sparck Jones, K. ; Jones, G.J.F. ; Foote, J.T. ; Young, S.J.: Experiments in spoken document retrieval.
In: Information processing and management. 32(1996) no.4, pp.399-417.
Abstract: Describes experiments in the retrieval of spoken documents in multimedia systems. Speech documents pose a particular problem for retrieval since their words as well as contents are unknown. Addresses this problem, for a video mail application, by combining state of the art speech recognition with established document retrieval technologies so as to provide an effective and efficient retrieval tool. Tests with a small spoken message collection show that retrieval precision for the spoken file can reach 90% of that obtained when the same file is used, as a benchmark, in text transcription form.
Note: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp.493-502.
Form covered: Spoken-word recordings
16. Sparck Jones, K. ; Galliers, J.R.: Evaluating natural language processing systems : an analysis and review.
Berlin : Springer, 1996. XV, 228 pp.
(Lecture notes in artificial intelligence; vol.1083)
Abstract: This comprehensive state-of-the-art book is the first devoted to the important and timely issue of evaluating NLP systems. It addresses the whole area of NLP system evaluation, including aims and scope, problems and methodology. The authors provide a wide-ranging and careful analysis of evaluation concepts, reinforced with extensive illustrations; they relate systems to their environments and develop a framework for proper evaluation. The discussion of principles is completed by a detailed review of practice and strategies in the field, covering both systems for specific tasks, like translation, and core language processors. The methodology lessons drawn from the analysis and review are applied in a series of example cases. A comprehensive bibliography, a subject index, and term glossary are included.
Note: Reviewed in: Machine translation 12(1997) no.4, pp.375-379 (D. Estival)
17. Sparck Jones, K.: Reflections on TREC : TREC-2.
In: Information processing and management. 31(1995) no.3, pp.291-314.
Abstract: Discusses the TREC programme as a major enterprise in information retrieval research. It reviews its structure as an evaluation exercise, characterises the methods of indexing and retrieval being tested within it in terms of the approaches to system performance factors these represent; analyses the test results for solid, overall conclusions that can be drawn from them; and, in the light of the particular features of the test data, assesses TREC both for generally applicable findings that emerge from it and for directions it offers for future research.
18. Sparck Jones, K. ; Endres-Niggemeyer, B.: Introduction: automatic summarizing.
In: Information processing and management. 31(1995) no.5, pp.625-630.
Abstract: Automatic summarizing is a research topic whose time has come. The papers illustrate some of the relevant work already under way. Places these papers in their wider context: why research and development on automatic summarizing is timely, what areas of work and ideas it should draw on, how future investigations and experiments can be effectively framed.
Subject area: Automatic abstracting
19. Sparck Jones, K.: The role of artificial intelligence in information retrieval.
In: Journal of the American Society for Information Science. 42(1991) no.8, pp.558-565.
Abstract: Presents a view of the scope of artificial intelligence (AI) in information retrieval (IR). Considers potential roles of AI and IR, evaluating AI from a realistic point of view and within a wide information management potential, not just because AI is itself insufficiently developed, but because many information management tasks are properly shallow information processing ones. There is nevertheless an important place for specific applications of AI or AI-derived technology when particular constraints can be placed on the information management tasks involved.
20. Sparck Jones, K.: Fashionable trends and feasible strategies in information management.
In: Information processing and management. 24(1988), pp.703-711.
Abstract: This article analyzes current trends in information management, considers the problems they involve, and suggests some strategies for tackling these problems. The current goal is integrated, personalized information systems, to be reached via artificial intelligence. The argument is that the extent to which this goal can be achieved is limited because these systems are intrinsically heterogeneous, are for access to information, and deal in linguistically expressed information; so the best strategy for building the systems that can be attained is via linguistically oriented knowledge and inference. Evaluating these systems also presents problems because each use is unique, but evaluation is much needed and large-sample strategies for performance study can be devised.