Literatur zur Informationserschließung (Literature on Information Indexing)
This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
Search results
Results 1–13 of 13
-
1. Ma, R. ; Li, K.: Digital humanities as a cross-disciplinary battleground : an examination of inscriptions in journal publications.
In: Journal of the Association for Information Science and Technology. 73(2022) no.2, pp.172-187.
(JASIST special issue on digital humanities (DH): A. Landscapes of DH)
Abstract: Inscriptions are defined as traces of scientific research production that are embodied in material artifacts and media, which encompass a wide variety of nonverbal forms such as graphs, diagrams, and tables. Inscription serves as a fundamental rhetorical device in research outputs and practices. As many inscriptions are deeply rooted in a scientific research paradigm, they can be used to evaluate the level of scientificity of a scientific field. This is specifically helpful to understand the relationships between research traditions in digital humanities (DH), a highly cross-disciplinary field between various humanities and scientific traditions. This paper presents a quantitative, community-focused examination of how inscriptions are used in English-language research articles in DH journals. We randomly selected 252 articles published between 2011 and 2020 from a representative DH journal list, and manually classified the inscriptions and author domains in these publications. We found that inscriptions have been increasingly used during the past decade, and their uses are more intensive in publications led by STEM authors compared to other domains. This study offers a timely survey of the disciplinary landscape of DH from the perspective of inscriptions and sheds light on how different research approaches collaborate and combat in the field of DH.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24534.
Subject area: Electronic publishing
Discipline: Humanities
-
2. Wu, C. ; Yan, E. ; Zhu, Y. ; Li, K.: Gender imbalance in the productivity of funded projects : a study of the outputs of National Institutes of Health R01 grants.
In: Journal of the Association for Information Science and Technology. 72(2021) no.11, pp.1386-1399.
Abstract: This study examines the relationship between a team's gender composition and the outputs of funded projects using a large data set of National Institutes of Health (NIH) R01 grants and their associated publications between 1990 and 2017. This study finds that while the women investigators' presence in NIH grants is generally low, higher women investigator presence is on average related to a slightly lower number of publications. This study finds empirically that women investigators elect to work in fields in which fewer publications per million dollars of funding is the norm. For fields where women investigators are relatively well represented, they are as productive as men. The overall lower productivity of women investigators may be attributed to the low representation of women in high-productivity fields dominated by men investigators. The findings shed light on possible reasons for gender disparity in grant productivity.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24487.
Subject area: Informetrics
Discipline: Medicine
Country/Place: USA
-
3. Li, K. ; Greenberg, J. ; Dunic, J.: Data objects and documenting scientific processes : an analysis of data events in biodiversity data papers.
In: Journal of the Association for Information Science and Technology. 71(2020) no.2, pp.172-182.
Abstract: The data paper, an emerging scholarly genre, describes research data sets and is intended to bridge the gap between the publication of research data and scientific articles. Research examining how data papers report data events, such as data transactions and manipulations, is limited. The research reported on in this article addresses this limitation and investigated how data events are inscribed in data papers. A content analysis was conducted examining the full texts of 82 data papers, drawn from the curated list of data papers connected to the Global Biodiversity Information Facility. Data events recorded for each paper were organized into a set of 17 categories. Many of these categories are described together in the same sentence, which indicates the messiness of data events in the laboratory space. The findings challenge the degree to which data papers are a distinct genre compared to research articles and the degree to which they describe data-centric research processes in a thorough way. This article also discusses how our results could inform a better data publication ecosystem in the future.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24226.
-
4. Yan, E. ; Chen, Z. ; Li, K.: Authors' status and the perceived quality of their work : measuring citation sentiment change in Nobel articles.
In: Journal of the Association for Information Science and Technology. 71(2020) no.3, pp.314-324.
Abstract: Prior research in status ordering has used numeric indicators to examine the impact of a status change on the perception of a scientist's work. This study measures the perception change directly as reflected in citation sentiment, with the attainment of a Nobel Prize in Chemistry or a Nobel Prize in Physiology or Medicine considered the status change. The article identifies 12,393 citances to 25 Nobel articles in PubMed Central and includes a control article set of 75 articles with 30,851 citances. The results show a moderate increase in citation sentiment toward Nobel articles postaward. Dynamically, for Nobel articles there is a steady sentiment increase, and a Nobel Prize seems to co-occur with this trend. This trend, however, is not evident in the control article set.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24237.
Subject area: Informetrics
-
5. Zhao, M. ; Yan, E. ; Li, K.: Data set mentions and citations : a content analysis of full-text publications.
In: Journal of the Association for Information Science and Technology. 69(2018) no.1, pp.32-46.
Abstract: This study provides evidence of data set mentions and citations in multiple disciplines based on a content analysis of 600 publications in PLoS One. We find that data set mentions and citations varied greatly among disciplines in terms of how data sets were collected, referenced, and curated. While a majority of articles provided free access to data, formal ways of data attribution such as DOIs and data citations were used in a limited number of articles. In addition, data reuse took place in less than 30% of the publications that used data, suggesting that researchers are still inclined to create and use their own data sets, rather than reusing previously curated data. This paper provides a comprehensive understanding of how data sets are used in science and helps institutions and publishers make useful data policies.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23919/full.
Subject area: Informetrics
Object: PLoS One
-
6. Yan, E. ; Li, K.: Which domains do open-access journals do best in? : a 5-year longitudinal study.
In: Journal of the Association for Information Science and Technology. 69(2018) no.6, pp.844-856.
Abstract: Although researchers have begun to investigate the difference in scientific impact between closed-access and open-access journals, studies that focus specifically on dynamic and disciplinary differences remain scarce. This study serves to fill this gap by using a large longitudinal dataset to examine these differences. Using CiteScore as a proxy for journal scientific impact, we employ a series of statistical tests to identify the quartile categories and disciplinary areas in which impact trends differ notably between closed- and open-access journals. We find that closed-access journals have a noticeable advantage in social sciences (for example, business and economics), whereas open-access journals perform well in medical and healthcare domains (for example, health profession and nursing). Moreover, we find that after controlling for a journal's rank and disciplinary differences, there are statistically more closed-access journals in the top 10%, Quartile 1, and Quartile 2 categories as measured by CiteScore; in contrast, more open-access journals in Quartile 4 gained scientific impact from 2011 to 2015. Considering dynamic and disciplinary trends in tandem, we find that more closed-access journals in Social Sciences gained in impact, whereas in biochemistry and medicine, more open-access journals experienced such gains.
Content: Cf.: https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.24002.
Subject area: Informetrics
-
7. Li, K.W. ; Yang, C.C.: Conceptual analysis of parallel corpus collected from the Web.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.5, pp.632-644.
Abstract: As illustrated by the World Wide Web, the volume of information in languages other than English has grown significantly in recent years. This highlights the importance of multilingual corpora. Much effort has been devoted to the compilation of multilingual corpora for the purpose of cross-lingual information retrieval and machine translation. Existing parallel corpora mostly involve European languages, such as English-French and English-Spanish. There is still a lack of parallel corpora between European languages and Asian languages. In the authors' previous work, an alignment method to identify one-to-one Chinese and English title pairs was developed to construct an English-Chinese parallel corpus that works automatically from the World Wide Web, and a 100% precision and 87% recall were obtained. Careful analysis of these results has helped the authors to understand how the alignment method can be improved. A conceptual analysis was conducted, which includes the analysis of conceptual equivalent and conceptual information alternation in the aligned and nonaligned English-Chinese title pairs that are obtained by the alignment method. The result of the analysis not only reflects the characteristics of parallel corpora, but also gives insight into the strengths and weaknesses of the alignment method. In particular, conceptual alternation, such as omission and addition, is found to have a significant impact on the performance of the alignment method.
Note: Contribution to a special topic section on multilingual information systems
Subject area: Multilingual issues
-
8. Li, K.W. ; Yang, C.C.: Automatic crosslingual thesaurus generated from the Hong Kong SAR Police Department Web Corpus for Crime Analysis.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.3, pp.272-281.
Abstract: For the sake of national security, very large volumes of data and information are generated and gathered daily. Much of this data and information is written in different languages, stored in different locations, and may be seemingly unconnected. Crosslingual semantic interoperability is a major challenge to generate an overview of this disparate data and information so that it can be analyzed, shared, searched, and summarized. The recent terrorist attacks and the tragic events of September 11, 2001 have prompted increased attention on national security and criminal analysis. Many Asian countries and cities, such as Japan, Taiwan, and Singapore, have been advised that they may become the next targets of terrorist attacks. Semantic interoperability has been a focus in digital library research. Traditional information retrieval (IR) approaches normally require a document to share some common keywords with the query. Generating the associations for the related terms between the two term spaces of users and documents is an important issue. The problem can be viewed as the creation of a thesaurus. Apart from this, terrorists and criminals may communicate through letters, e-mails, and faxes in languages other than English. The translation ambiguity significantly exacerbates the retrieval problem. The problem is expanded to crosslingual semantic interoperability. In this paper, we focus on the English/Chinese crosslingual semantic interoperability problem. However, the developed techniques are not limited to English and Chinese languages but can be applied to many other languages. English and Chinese are popular languages in the Asian region. Much information about national security or crime is communicated in these languages. An efficient automatically generated thesaurus between these languages is important to crosslingual information retrieval between English and Chinese languages.
To facilitate crosslingual information retrieval, a corpus-based approach uses the term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model to cross the language boundary. In this paper, the text-based approach to align English/Chinese Hong Kong Police press release documents from the Web is first presented. We also introduce an algorithmic approach to generate a robust knowledge base based on statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus. The research output consisted of a thesaurus-like, semantic network knowledge base, which can aid in semantics-based crosslingual information management and retrieval.
Note: Contribution to a special issue on 'Intelligence and security informatics'
Subject area: Multilingual issues ; Design and application of the thesaurus principle ; Semantic interoperability
-
9. Yang, C.C. ; Li, K.W.: A heuristic method based on a statistical approach for Chinese text segmentation.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.13, pp.1438-1447.
Abstract: The authors propose a heuristic method for Chinese automatic text segmentation based on a statistical approach. This method is developed based on statistical information about the association among adjacent characters in Chinese text. Mutual information of bi-grams and significant estimation of tri-grams are utilized. A heuristic method with six rules is then proposed to determine the segmentation points in a Chinese sentence. No dictionary is required in this method. Chinese text segmentation is important in Chinese text indexing and thus greatly affects the performance of Chinese information retrieval. Due to the lack of delimiters of words in Chinese text, Chinese text segmentation is more difficult than English text segmentation. Besides, segmentation ambiguities and occurrences of out-of-vocabulary words (i.e., unknown words) are the major challenges in Chinese segmentation. Many research studies dealing with the problem of word segmentation have focused on the resolution of segmentation ambiguities. The problem of unknown word identification has not drawn much attention. The experimental result shows that the proposed heuristic method is promising to segment the unknown words as well as the known words. The authors further investigated the distribution of the errors of commission and the errors of omission caused by the proposed heuristic method and benchmarked the proposed heuristic method with a previously proposed technique, boundary detection. It is found that the heuristic method outperformed the boundary detection method.
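Illustration (not part of the catalog record): the abstract above relies on mutual information between adjacent characters. The sketch below shows one common way to compute pointwise mutual information for character bi-grams and to cut a sentence where the association is weak. It is a simplified assumption for illustration only: the function names, the single threshold rule, and the probability estimates are hypothetical and do not reproduce the authors' six rules or their tri-gram estimation.

```python
import math
from collections import Counter


def bigram_mutual_information(corpus):
    """Map each adjacent character pair in the corpus to its pointwise MI.

    MI(a, b) = log2( P(ab) / (P(a) * P(b)) ), with probabilities estimated
    from raw counts (an assumption; the paper may estimate them differently).
    """
    char_counts = Counter()
    pair_counts = Counter()
    for sentence in corpus:
        char_counts.update(sentence)
        pair_counts.update(sentence[i:i + 2] for i in range(len(sentence) - 1))
    n_chars = sum(char_counts.values())
    n_pairs = sum(pair_counts.values())
    mi = {}
    for pair, count in pair_counts.items():
        p_pair = count / n_pairs
        p_a = char_counts[pair[0]] / n_chars
        p_b = char_counts[pair[1]] / n_chars
        mi[pair] = math.log2(p_pair / (p_a * p_b))
    return mi


def segment(sentence, mi, threshold=0.0):
    """Split between adjacent characters whose MI falls below the threshold.

    Hypothetical single-rule segmenter; unseen pairs count as minus infinity.
    """
    if not sentence:
        return []
    words = [sentence[0]]
    for prev, cur in zip(sentence, sentence[1:]):
        if mi.get(prev + cur, float("-inf")) >= threshold:
            words[-1] += cur      # strong association: extend current word
        else:
            words.append(cur)     # weak association: start a new word
    return words
```

No dictionary is consulted here, which mirrors the dictionary-free property stated in the abstract; the threshold would have to be tuned on real data.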
-
10. Yang, C.C. ; Li, K.W.: Automatic construction of English/Chinese parallel corpora.
In: Journal of the American Society for Information Science and Technology. 54(2003) no.8, pp.730-742.
Abstract: As the demand for global information increases significantly, multilingual corpora have become a valuable linguistic resource for applications to cross-lingual information retrieval and natural language processing. In order to cross the boundaries that exist between different languages, dictionaries are the most typical tools. However, the general-purpose dictionary is less sensitive in both genre and domain. It is also impractical to manually construct tailored bilingual dictionaries or sophisticated multilingual thesauri for large applications. Corpus-based approaches, which do not have the limitation of dictionaries, provide a statistical translation model with which to cross the language boundary. There are many domain-specific parallel or comparable corpora that are employed in machine translation and cross-lingual information retrieval. Most of these are corpora between Indo-European languages, such as English/French and English/Spanish. The Asian/Indo-European corpus, especially the English/Chinese corpus, is relatively sparse. The objective of the present research is to construct an English/Chinese parallel corpus automatically from the World Wide Web. In this paper, an alignment method is presented which is based on dynamic programming to identify the one-to-one Chinese and English title pairs. The method includes alignment at title level, word level and character level. The longest common subsequence (LCS) is applied to find the most reliable Chinese translation of an English word. As one word for a language may translate into two or more words repetitively in another language, the edit operation, deletion, is used to resolve redundancy. A score function is then proposed to determine the optimal title pairs. Experiments have been conducted to investigate the performance of the proposed method using the daily press release articles by the Hong Kong SAR government as the test bed. The precision of the result is 0.998 while the recall is 0.806. The release articles and speech articles, published by Hongkong & Shanghai Banking Corporation Limited, are also used to test our method; the precision is 1.00, and the recall is 0.948.
Subject area: Computational linguistics
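Illustration (not part of the catalog record): the abstract above names the longest common subsequence (LCS) as a building block of the alignment. A minimal sketch of the standard dynamic-programming LCS follows. It shows only the textbook algorithm on plain strings; how the authors apply it to candidate translations, and their score function, are not described in enough detail here to reproduce, so the function name and interface are assumptions.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of strings a and b.

    Textbook dynamic programming: table[i][j] holds the LCS length of
    a[:i] and b[:j]; the subsequence is recovered by backtracking.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to collect matched characters.
    i, j = len(a), len(b)
    matched = []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            matched.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(matched))
```

When several subsequences share the maximum length, the one returned depends on the tie-breaking in the backtracking step, so callers should rely on the length rather than on a particular string.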
-
11. Dilevko, J. ; Dali, K.: The challenge of building multilingual collections in Canadian public libraries.
In: Library resources and technical services. 46(2002) no.4, pp.116-137.
Abstract: A Web-based survey was conducted to determine the extent to which Canadian public libraries are collecting multilingual materials (foreign languages other than English and French), the methods that they use to select these materials, and whether public librarians are sufficiently prepared to provide their multilingual clientele with an adequate range of materials and services. There is room for improvement with regard to collection development of multilingual materials in Canadian public libraries, as well as in educating staff about keeping multilingual collections current, diverse, and of sufficient interest to potential users to keep such materials circulating. The main constraints preventing public libraries from developing better multilingual collections are addressed, and recommendations for improving the state of multilingual holdings are provided.
Subject area: Multilingual issues
Country/Place: CAN
Field of application: Public libraries
-
12. Broccoli, K. ; Ravenswaay, G.V.: Web indexing : anchors away!
In: Beyond book indexing: how to get started in Web indexing, embedded indexing and other computer-based media. Ed. by D. Brenner and M. Rowland. Phoenix, AZ : American Society of Indexers / Information Today, 2000. pp.37-42.
Abstract: In this chapter we turn to embedded indexing for the Internet, frequently called Web indexing. We will define Web indexes; describe the structure of entries for Web indexes; present some of the challenges that Web indexers face; and compare Web indexes to search engines. One of the difficulties in defining Web indexes is their relative newness. The first pages were placed on the World Wide Web in 1991 when Tim Berners-Lee, its founder, uploaded four files. We are in a period of transition, moving from using well-established forms of writing and communications to others that are still in their infancy. Paramount among these is the Web. For indexers, this is an uncharted voyage where we must jettison firmly established ideas while developing new ones. Where the voyage will end is anyone's guess.
Subject area: Indexes ; Internet
Object: WWW
-
13. Yee, K.-P. ; Swearingen, K. ; Li, K. ; Hearst, M.: Faceted metadata for image search and browsing.
In: http://bailando.sims.berkeley.edu/papers/flamenco-chi03.pdf.
Abstract: There are currently two dominant interface types for searching and browsing large image collections: keyword-based search, and searching by overall similarity to sample images. We present an alternative based on enabling users to navigate along conceptual dimensions that describe the images. The interface makes use of hierarchical faceted metadata and dynamically generated query previews. A usability study, in which 32 art history students explored a collection of 35,000 fine arts images, compares this approach to a standard image search interface. Despite the unfamiliarity and power of the interface (attributes that often lead to rejection of new search interfaces), the study results show that 90% of the participants preferred the metadata approach overall, 97% said that it helped them learn more about the collection, 75% found it more flexible, and 72% found it easier to use than a standard baseline system. These results indicate that a category-based approach is a successful way to provide access to image collections.
Content: See also: http://flamenco.berkeley.edu/.
Subject area: Images ; User studies
Object: Flamenco