Literature on Information Indexing (Literatur zur Informationserschließung)
This database contains over 40,000 documents on topics in descriptive cataloging, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft
/
Powered by litecat, BIS Oldenburg
(As of: 28 April 2022)
Results 1–13 of 13
-
1. Yang, T.-H. ; Hsieh, Y.-L. ; Liu, S.-H. ; Chang, Y.-C. ; Hsu, W.-L.: A flexible template generation and matching method with applications for publication reference metadata extraction.
In: Journal of the Association for Information Science and Technology. 72(2021) no.1, pp. 32-45.
Abstract: Conventional rule-based approaches use exact template matching to capture linguistic information and necessarily need to enumerate all variations. We propose a novel flexible template generation and matching scheme called the principle-based approach (PBA) based on sequence alignment, and employ it for reference metadata extraction (RME) to demonstrate its effectiveness. The main contributions of this research are threefold. First, we propose an automatic template generation that can capture prominent patterns using the dominating set algorithm. Second, we devise an alignment-based template-matching technique that uses a logistic regression model, which makes it more general and flexible than pure rule-based approaches. Last, we apply PBA to RME on extensive cross-domain corpora and demonstrate its robustness and generality. Experiments reveal that the same set of templates produced by the PBA framework not only delivers consistent performance on various unseen domains, but also surpasses hand-crafted knowledge (templates). We use four independent journal-style test sets and one conference-style test set in the experiments. When compared to renowned machine learning methods, such as conditional random fields (CRF), as well as recent deep learning methods (i.e., bi-directional long short-term memory with a CRF layer, Bi-LSTM-CRF), PBA has the best performance for all datasets.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24391.
Subject area: Automatic indexing ; Metadata
-
2. Wu, S. ; Liu, S. ; Wang, Y. ; Timmons, T. ; Uppili, H. ; Bedrick, S. ; Hersh, W. ; Liu, H.: Intrainstitutional EHR collections for patient-level information retrieval.
In: Journal of the Association for Information Science and Technology. 68(2017) no.11, pp. 2636-2648.
Abstract: Research in clinical information retrieval has long been stymied by the lack of open resources. However, both clinical information retrieval research innovation and legitimate privacy concerns can be served by the creation of intrainstitutional, fully protected resources. In this article, we provide some principles and tools for information retrieval resource-building in the unique problem setting of patient-level information retrieval, following the tradition of the Cranfield paradigm. We further include an analysis of parallel information retrieval resources at Oregon Health & Science University and Mayo Clinic that were built on these principles.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23884/full.
Note: Contribution to a special issue on biomedical information retrieval.
Discipline: Medicine
-
3. Liu, S. ; Chen, C.: The differences between latent topics in abstracts and citation contexts of citing papers.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.3, pp. 627-639.
Abstract: Although it is commonly expected that the citation context of a reference is likely to provide more detailed and direct information about the nature of a citation, few studies in the literature have specifically addressed the extent to which the information in different parts of a scientific publication differs. Do abstracts tend to use conceptually broader terms than sentences in a citation context in the body of a publication? In this article, we propose a method to analyze and compare latent topics in scientific publications, in particular, from abstracts of papers that cited a target reference and from sentences that cited the target reference. We conducted an experiment and applied topical modeling techniques to full-text papers in eight biomedicine journals. Topics derived from the two sources are compared in terms of their similarities and broad-narrow relationships defined based on information entropy. The results show that abstracts and citation contexts are characterized by distinct sets of topics with moderate overlaps. Furthermore, the results confirm that topics from abstracts of citing papers have broader terms than topics from citation contexts formed by citing sentences. The method and the findings could be used to enhance and extend the current methodologies for research evaluation and citation evaluation.
Subject area: Informetrics
-
4. Wei, F. ; Li, W. ; Liu, S.: iRANK: a rank-learn-combine framework for unsupervised ensemble ranking.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.6, pp. 1232-1243.
Abstract: The authors address the problem of unsupervised ensemble ranking. Traditional approaches either combine multiple ranking criteria into a unified representation to obtain an overall ranking score or utilize certain rank fusion or aggregation techniques to combine the ranking results. Beyond the aforementioned combine-then-rank and rank-then-combine approaches, the authors propose a novel rank-learn-combine ranking framework, called Interactive Ranking (iRANK), which allows two base rankers to teach each other before combination during the ranking process by providing their own ranking results as feedback to each other to boost the ranking performance. This mutual ranking refinement process continues until the two base rankers cannot learn from each other any more. The overall performance is improved by the enhancement of the base rankers through the mutual learning mechanism. The authors further design two ranking refinement strategies to efficiently and effectively use the feedback based on reasonable assumptions and rational analysis. Although iRANK is applicable to many applications, as a case study, they apply this framework to the sentence ranking problem in query-focused summarization and evaluate its effectiveness on the DUC 2005 and 2006 data sets. The results are encouraging with consistent and promising improvements.
Subject area: Retrieval algorithms
Object: iRANK
-
5. Cao, N. ; Sun, J. ; Lin, Y.-R. ; Gotz, D. ; Liu, S. ; Qu, H.: FacetAtlas : Multifaceted visualization for rich text corpora.
In: IEEE Transactions on Visualization and Computer Graphics. InfoVis 2010. [http://systemg.research.ibm.com/apps/facetatlas/cao_infovis10_paper.pdf].
Abstract: Documents in rich text corpora usually contain multiple facets of information. For example, an article about a specific disease often consists of different facets such as symptom, treatment, cause, diagnosis, prognosis, and prevention. Thus, documents may have different relations based on different facets. Powerful search tools have been developed to help users locate lists of individual documents that are most related to specific keywords. However, there is a lack of effective analysis tools that reveal the multifaceted relations of documents within or across document clusters. In this paper, we present FacetAtlas, a multifaceted visualization technique for visually analyzing rich text corpora. FacetAtlas combines search technology with advanced visual analytical tools to convey both global and local patterns simultaneously. We describe several unique aspects of FacetAtlas, including (1) node cliques and multifaceted edges, (2) an optimized density map, (3) automated opacity pattern enhancement for highlighting visual patterns, and (4) interactive context switch between facets. In addition, we demonstrate the power of FacetAtlas through a case study that targets patient education in the health care domain. Our evaluation shows the benefits of this work, especially in support of complex multifaceted data analysis.
Content: See also: FacetAtlas: Visualizing multifaceted text documents as graphs. At: http://systemg.research.ibm.com/apps/facetatlas/index.html.
Subject area: Visualization ; Knowledge representation ; Semantic context in indexing and retrieval
Discipline: Medicine
Object: FacetAtlas ; InfoVis
-
6. Liu, S. ; Liu, F. ; Yu, C. ; Meng, W.: An effective approach to document retrieval via utilizing WordNet and recognizing phrases.
In: SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin et al. New York, NY : ACM Press, 2004. pp. 266-272.
Subject area: Computational linguistics
Object: WordNet
-
7. Liu, S.Q. ; Shen, Z.G.: The development of cataloging in China.
In: Historical aspects of cataloging and classification. Ed.: M.D. Joachim. New York : Haworth Information Press, 2003. pp. 137-154.
Note: Also published as Cataloging and Classification Quarterly, 35(2002/03)1/2 and 35(2002/03)3/4.
Subject area: History of catalogs
Country/Place: China
-
8. Liu, S. ; Shen, Z.: The development of cataloging in China.
In: Cataloging and classification quarterly. 35(2002) nos.1/2, pp. 137-154.
Abstract: With a long history, cataloging has evolved with changes in society, economy, and technology in China. This paper presents Chinese cataloging history in four parts, with emphasis on the last two parts: the founding of the People's Republic of China in 1949 and the development of cataloging after 1979 when China opened its doors to the world. Particularly important has been the rapid growth of online cataloging in recent years. The China Academic Library and Information System (CALIS), as a successful online cataloging model, is emphasized. Through investigation of the entire history of Chinese cataloging, three distinct features can be stated: (1) Standardization: switching from the Chinese traditional way to aligning with international standards; (2) Cooperation: from decentralized and self-supporting systems to sharing systems; (3) Computerization and networking: from manual operation to computer-based online operation. At the end of this paper, a set of means by which to enhance online cataloging and resource sharing is suggested.
Note: Contribution to a special issue: Historical aspects of cataloging and classification; Part I
Subject area: History of catalogs
Country/Place: China
-
9. Liu, S.: Decomposing DDC synthesized numbers.
In: International cataloguing and bibliographic control. 26(1997) no.3, pp. 58-62.
Abstract: Some empirical studies have explored the direct use of traditional classification schemes in the online environment; none has manipulated these manual classifications in such a way as to take full advantage of the power of both the classification and computer. It has been suggested that this power could be realized if the individual components of synthesized DDC numbers could be identified and indexed. Looks at the feasibility of automatically decomposing DDC synthesized numbers and the implications of such decompositions for information retrieval. 1,701 synthesized numbers were decomposed by a computer system called DND (Dewey Number Decomposer). 600 were randomly selected for examination by 3 judges, each evaluating 200 numbers. The decomposition success rate was 100% and it was concluded that synthesized DDC numbers can be accurately decomposed automatically. The study has implications for information retrieval, expert systems for assigning DDC numbers, automatic indexing, switching language development and other important areas of cataloguing and classification.
Note: Related to: Liu, Songqiao. "The Automatic Decomposition of DDC Synthesized Numbers." Ph.D. diss., University of California, Los Angeles, 1993.
Subject area: Classification systems in online retrieval
Object: DDC
-
10. Liu, S.: Decomposing DDC synthesized numbers.
In: http://www.ifla.org/IV/ifla62/62-sonl.htm.
Abstract: Much literature has been written speculating upon how classification can be used in online catalogs to improve information retrieval. While some empirical studies have been done exploring whether the direct use of traditional classification schemes designed for a manual environment is effective and efficient in the online environment, none has manipulated these manual classifications in such a way as to take full advantage of the power of both the classification and computer. It has been suggested by some authors, such as Wajenberg and Drabenstott, that this power could be realized if the individual components of synthesized DDC numbers could be identified and indexed. This paper looks at the feasibility of automatically decomposing DDC synthesized numbers and the implications of such decomposition for information retrieval. Based on an analysis of the instructions for synthesizing numbers in the main class Arts (700) and all DDC Tables, 17 decomposition rules were defined, 13 covering the Add Notes and four the Standard Subdivisions. 1,701 DDC synthesized numbers were decomposed by a computer system called DND (Dewey Number Decomposer), developed by the author. From the 1,701 numbers, 600 were randomly selected for examination by three judges, each evaluating 200 numbers. The decomposition success rate was 100% and it was concluded that synthesized DDC numbers can be accurately decomposed automatically. The study has implications for information retrieval, expert systems for assigning DDC numbers, automatic indexing, switching language development, enhancing classifiers' work, teaching library school students, and providing quality control for DDC number assignments. These implications were explored using a prototype retrieval system.
Content: Related to: Liu, Songqiao. "The Automatic Decomposition of DDC Synthesized Numbers." Ph.D. diss., University of California, Los Angeles, 1993.
Subject area: Classification systems in online retrieval
Object: DDC
-
11. Svenonius, E. ; Liu, S. ; Subrahmanyam, B.: Automation of chain indexing.
In: Classification research for knowledge representation and organization. Proc. 5th Int. Study Conf. on Classification Research, Toronto, Canada, 24-28 June 1991. Ed. by N.J. Williamson and M. Hudon. Amsterdam : Elsevier, 1992. pp. 351-364.
(FID; 698)
Abstract: The last several years have seen the evolution of prototype systems exploiting the use of the Dewey Decimal Classification (DDC) as an interface to online catalogs. One such system, called DORS (Dewey Online Retrieval System), was developed at the University of California, Los Angeles by the authors. The feature distinguishing this system is an automatically generated chain index. The paper describes this index, in particular the algorithms that were created for its automatic generation and the problems that were encountered. The problems were of three kinds: those that were overcome, those that could have been overcome but were not for lack of time and resources, and those that we believe cannot be overcome. The paper concludes with suggestions for future research and possible formatting changes to the DDC feature headings that would facilitate chain-index generation.
Subject area: Classification systems in online retrieval
Object: DDC ; DORS ; Chain indexing
-
12. Liu, S. ; Svenonius, E.: DORS: DDC online retrieval system.
In: Library resources and technical services. 35(1991), pp. 359-375.
Abstract: A model system, the Dewey Online Retrieval System (DORS), was implemented as an interface to an online catalog for the purpose of experimenting with classification-based search strategies and generally seeking further understanding of the role of traditional classifications in automated information retrieval. Specifications for a classification retrieval interface were enumerated and rationalized and the system was developed in accordance with them. The feature that particularly distinguishes the system and enables it to meet its stated specifications is an automatically generated chain index.
Subject area: Classification systems in online retrieval
Object: DDC ; DORS
-
13. Liu, S.: Online classification notation : proposal for a flexible faceted notation system (FFNS).
In: International classification. 17(1990), pp. 14-20.
Subject area: Notations / Call numbers