This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 16 December 2019)
1. Ferret, O. ; Grau, B. ; Hurault-Plantet, M. ; Illouz, G. ; Jacquemin, C. ; Monceaux, L. ; Robba, I. ; Vilnat, A.: How NLP can improve question answering.
In: Knowledge organization. 29(2002) nos.3/4, pp.135-155.
Abstract: Answering open-domain factual questions requires natural language processing for refining document selection and answer identification. With our system QALC, we have participated in the Question Answering track of the TREC8, TREC9 and TREC10 evaluations. QALC performs an analysis of documents relying on multiword term searches and their linguistic variation, both to minimize the number of documents selected and to provide additional clues when comparing question and sentence representations. This comparison process also makes use of the results of a syntactic parsing of the questions and of Named Entity recognition functionalities. Answer extraction relies on the application of syntactic patterns chosen according to the kind of information that is sought, and categorized depending on the syntactic form of the question. These patterns allow QALC to nicely handle linguistic variations at the answer level.
Subject area: Computational linguistics ; Retrieval studies ; Language retrieval
2. Jacquemin, C.: Spotting and discovering terms through natural language processing.
Cambridge, MA : MIT Press, 2001. VIII, 378 pp.
Abstract: In this book Christian Jacquemin shows how the power of natural language processing (NLP) can be used to advance text indexing and information retrieval (IR). Jacquemin's novel tool is FASTR, a parser that normalizes terms and recognizes term variants. Since there are more meanings in a language than there are words, FASTR uses a metagrammar composed of shallow linguistic transformations that describe the morphological, syntactic, semantic, and pragmatic variations of words and terms. The acquired parsed terms can then be applied for precise retrieval and assembly of information. The use of a corpus-based unification grammar to define, recognize, and combine term variants from their base forms allows for intelligent information access to, or "linguistic data tuning" of, heterogeneous texts. FASTR can be used to do automatic controlled indexing, to carry out content-based Web searches through conceptually related alternative query formulations, to abstract scientific and technical extracts, and even to translate and collect terms from multilingual material. Jacquemin provides a comprehensive account of the method and implementation of this innovative retrieval technique for text processing.
Note: Reviewed in: KO 28(2001) no.3, pp.152-154 (L. Da Sylva)
LCSH: Language and languages / Variation / Data processing ; Terms and phrases / Data processing
RSWK: Automatische Indexierung / Computerlinguistik / Information Retrieval ; Syntaktische Analyse (GBV) ; Textverstehendes System (HBZ) ; Computerlinguistik / Sprachvariante (HBZ)
BK: 54.75 ; 18.04 ; 17.52 ; 17.46
GHBS: BFP (FH K) ; BFP (DU) ; TZF (DU) ; TVV (DU)
LCC: P305.18.D38J33 2001
RVK: ES 965
3. Jacquemin, C.: What is the tree that we see through the window : a linguistic approach to windowing and term variation.
In: Information processing and management. 32(1996) no.4, pp.445-458.
Abstract: Provides a linguistic approach to text windowing through an extraction of term variants with the help of a partial parser. The syntactic grounding of the method ensures that words observed within restricted spans are lexically related and that spurious word co-occurrences are ruled out with a good level of confidence. The system is computationally tractable on large corpora and large lists of terms. Gives illustrative examples of term variation from a large medical corpus. An experimental evaluation of the method shows that only a small proportion of co-occurring words are lexically related and motivates the call for natural language parsing techniques in text windowing.