This database contains over 40,000 documents on topics from the fields of descriptive cataloging, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of: 28 April 2022)
1. Lackner, K. ; Schilhan, L.: Der Einzug der EDV im österreichischen Bibliothekswesen am Beispiel der Universitätsbibliothek Graz.
In: Mitteilungen der Vereinigung Österreichischer Bibliothekarinnen und Bibliothekare. 74(2021) H.2, S.171-204.
Abstract: The use of electronic data processing (EDP) systems brought about a radical change in the use and administration of university libraries from the 1970s onward. The Graz University Library was the first library in Austria to develop and deploy an electronic library system, which made it one of the pioneers in Europe. This article provides a historical overview of the beginnings, development, and spread of electronic library systems in general and at the Graz University Library in particular. It presents the library systems used at the Graz University Library over the decades - GRIBS, EMILE, FBInfo, BIBOS, ALEPH, and ALMA - as well as the development from the first online databases via CD-ROM databases to modern database retrieval.
Content: Cf.: https://doi.org/10.31263/voebm.v74i2.6395.
Country/Place: A ; Graz
2. Sjögårde, P. ; Ahlgren, P. ; Waltman, L.: Algorithmic labeling in hierarchical classifications of publications : evaluation of bibliographic fields and term weighting approaches.
In: Journal of the Association for Information Science and Technology. 72(2021) no.7, S.853-869.
Abstract: Algorithmic classifications of research publications can be used to study many different aspects of the science system, such as the organization of science into fields, the growth of fields, interdisciplinarity, and emerging topics. How to label the classes in these classifications is a problem that has not been thoroughly addressed in the literature. In this study, we evaluate different approaches to label the classes in algorithmically constructed classifications of research publications. We focus on two important choices: the choice of (a) different bibliographic fields and (b) different approaches to weight the relevance of terms. To evaluate the different choices, we created two baselines: one based on the Medical Subject Headings in MEDLINE and another based on the Science-Metrix journal classification. We tested to what extent different approaches yield the desired labels for the classes in the two baselines. Based on our results, we recommend extracting terms from titles and keywords to label classes at high levels of granularity (e.g., topics). At low levels of granularity (e.g., disciplines) we recommend extracting terms from journal names and author addresses. We recommend the use of a new approach, term frequency to specificity ratio, to calculate the relevance of terms.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24452.
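The term weighting idea summarized in the abstract above can be illustrated with a small sketch. The paper defines the exact "term frequency to specificity ratio"; here specificity is only approximated by the number of classes a term occurs in, so the scoring below is purely illustrative, and the toy data are invented.

```python
from collections import Counter

def label_class(class_docs, all_classes, top_n=3):
    """Rank candidate label terms for one class by a term-frequency-to-
    specificity-style ratio (illustrative approximation, not the paper's
    exact measure).

    class_docs: list of token lists for the target class.
    all_classes: list of classes, each a list of token lists.
    """
    # term frequency within the target class
    tf = Counter(t for doc in class_docs for t in doc)
    # specificity proxy: in how many classes does the term occur at all?
    df = Counter()
    for cls in all_classes:
        for t in {t for doc in cls for t in doc}:
            df[t] += 1
    scores = {t: tf[t] / df[t] for t in tf}
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_n]]
```

Terms frequent in the class but widespread across classes (e.g. "model") are pushed down, while class-specific terms rise to the top.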
3. Bogaard, T. ; Hollink, L. ; Wielemaker, J. ; Ossenbruggen, J. van ; Hardman, L.: Metadata categorization for identifying search patterns in a digital library.
In: Journal of documentation. 75(2019) no.2, S.270-286.
Abstract: Purpose: For digital libraries, it is useful to understand how users search in a collection. Investigating search patterns can help them to improve the user interface, collection management and search algorithms. However, search patterns may vary widely in different parts of a collection. The purpose of this paper is to demonstrate how to identify these search patterns within a well-curated historical newspaper collection using the existing metadata. Design/methodology/approach: The authors analyzed search logs combined with metadata records describing the content of the collection, using this metadata to create subsets in the logs corresponding to different parts of the collection. Findings: The study shows that faceted search is more prevalent than non-faceted search in terms of number of unique queries, time spent, clicks and downloads. Distinct search patterns are observed in different parts of the collection, corresponding to historical periods, geographical regions or subject matter. Originality/value: First, this study provides deeper insights into search behavior at a fine granularity in a historical newspaper collection, by the inclusion of the metadata in the analysis. Second, it demonstrates how to use metadata categorization as a way to analyze distinct search patterns in a collection.
Content: Cf.: https://www.emeraldinsight.com/doi/full/10.1108/JD-06-2018-0087.
Form treated: Newspapers
4. Yuan, Q. ; Xu, S. ; Jian, L.: A new method for retrieving batik shape patterns.
In: Journal of the Association for Information Science and Technology. 69(2018) no.4, S.578-599.
Abstract: Batik as a traditional art is well regarded due to its high aesthetic quality and cultural heritage values. It is not uncommon to reuse versatile decorative shape patterns across batiks. General-purpose image retrieval methods often fail to pay sufficient attention to such a frequent reuse of shape patterns in the graphical compositions of batiks, leading to suboptimal retrieval results, in particular for identifying batiks that use copyrighted shape patterns without proper authorization for law-enforcement purposes. To address the lack of an optimized image retrieval method suited for batiks, this study proposes a new method for retrieving salient shape patterns in batiks using a rich combination of global and local features. The global features deployed were extracted according to the Zernike moments (ZMs); the local features adopted were extracted through curvelet transformations that characterize shape contours embedded in batiks. The method subsequently incorporated both types of features via matching a weighted bipartite graph to measure the visual similarity between any pair of batik shape patterns through supervised distance metric learning. The derived similarity metric can then be used to detect and retrieve similar shape patterns appearing across batiks, which in turn can be employed as a reliable similarity metric for retrieving batiks. To explore the usefulness of the proposed method, the performance of the new retrieval method is compared against that of three peer methods as well as two variants of the proposed method. The experimental results consistently and convincingly demonstrate that the new method indeed outperforms the state-of-the-art methods in retrieving salient shape patterns in batiks.
Content: Cf.: https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.23977.
5. Colavizza, G. ; Boyack, K.W. ; Eck, N.J. van ; Waltman, L.: The closer the better : similarity of publication pairs at different cocitation levels.
In: Journal of the Association for Information Science and Technology. 69(2018) no.4, S.600-609.
Abstract: We investigated the similarities of pairs of articles that are cocited at the different cocitation levels of the journal, article, section, paragraph, sentence, and bracket. Our results indicate that textual similarity, intellectual overlap (shared references), author overlap (shared authors), and proximity in publication time all rise monotonically as the cocitation level gets lower (from journal to bracket). While the main gain in similarity happens when moving from journal to article cocitation, all level changes entail an increase in similarity, especially section to paragraph and paragraph to sentence/bracket levels. We compared the results from four journals over the years 2010-2015: Cell, the European Journal of Operational Research, Physics Letters B, and Research Policy, with consistent general outcomes and some interesting differences. Our findings motivate the use of granular cocitation information as defined by meaningful units of text, with implications for, among others, the elaboration of maps of science and the retrieval of scholarly literature.
Content: Cf.: https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.23981.
6. Vaughan, L. ; Ninkov, A.: A new approach to web co-link analysis.
In: Journal of the Association for Information Science and Technology. 69(2018) no.6, S.820-831.
Abstract: Numerous web co-link studies have analyzed a wide variety of websites ranging from those in the academic and business arena to those dealing with politics and governments. Such studies uncover rich information about these organizations. In recent years, however, there has been a dearth of co-link analysis, mainly due to the lack of sources from which co-link data can be collected directly. Although several commercial services such as Alexa provide inlink data, none provide co-link data. We propose a new approach to web co-link analysis that can alleviate this problem so that researchers can continue to mine the valuable information contained in co-link data. The proposed approach has two components: (a) generating co-link data from inlink data using a computer program; (b) analyzing co-link data at the site level in addition to the page level that previous co-link analyses have used. The site-level analysis has the potential of expanding co-link data sources. We tested this proposed approach by analyzing a group of websites focused on vaccination using Moz inlink data. We found that the approach is feasible, as we were able to generate co-link data from inlink data and analyze the co-link data with multidimensional scaling.
Content: Cf.: https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.24000.
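The first component described in the abstract above, generating co-link data from inlink data, reduces to set intersection: two targets are co-linked once for every source that links to both. A minimal sketch, assuming inlink data have already been aggregated to the site level (the site names below are hypothetical):

```python
from collections import Counter
from itertools import combinations

def colinks(inlinks):
    """Derive co-link counts from inlink data.

    inlinks: dict mapping target site -> set of linking source sites.
    Returns a Counter mapping each (target_a, target_b) pair, in sorted
    order, to the number of sources linking to both.
    """
    counts = Counter()
    for a, b in combinations(sorted(inlinks), 2):
        shared = len(inlinks[a] & inlinks[b])
        if shared:
            counts[(a, b)] = shared
    return counts

inlinks = {
    "vaxfacts.example": {"s1", "s2", "s3"},
    "immunize.example": {"s2", "s3"},
    "unrelated.example": {"s9"},
}
pairs = colinks(inlinks)
```

The resulting pair counts form the similarity matrix that the study then analyzed with multidimensional scaling.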
7. Zhu, J. ; Han, L. ; Gou, Z. ; Yuan, X.: A fuzzy clustering-based denoising model for evaluating uncertainty in collaborative filtering recommender systems.
In: Journal of the Association for Information Science and Technology. 69(2018) no.9, S.1109-1121.
Abstract: Recommender systems are effective in predicting the most suitable products for users, such as movies and books. To facilitate personalized recommendations, the quality of item ratings should be guaranteed. However, a few ratings might not be accurate enough due to the uncertainty of user behavior and are referred to as natural noise. In this article, we present a novel fuzzy clustering-based method for detecting noisy ratings. The entropy of a subset of the original ratings dataset is used to indicate the data-driven uncertainty, and evaluation metrics are adopted to represent the prediction-driven uncertainty. After the repetition of resampling and the execution of a recommendation algorithm, the entropy and evaluation metrics vectors are obtained and are empirically categorized to identify the proportion of the potential noise. Then, the fuzzy C-means-based denoising (FCMD) algorithm is performed to verify the natural noise under the assumption that natural noise is primarily the result of the exceptional behavior of users. Finally, a case study is performed using two real-world datasets. The experimental results show that our proposal outperforms previous proposals and has an advantage in dealing with natural noise.
Content: Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24036.
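The FCMD algorithm itself is the authors' own; as background, the standard fuzzy C-means updates (Bezdek) that this family of methods builds on can be sketched as follows. This is the generic algorithm only, not the paper's denoising procedure:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Standard fuzzy C-means: returns soft memberships u (n x c) and
    cluster centers. Background sketch; the paper's FCMD step adds a
    noise-verification layer on top of clustering of this kind."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(iters):
        w = u ** m                              # fuzzified memberships
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
        # distances of every sample to every center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                   # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))           # membership update rule
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, centers
```

The soft memberships, rather than hard cluster assignments, are what make a fuzzy approach attractive for flagging borderline (potentially noisy) ratings.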
8. Rajan, L.: Historical ambiguity : a lens for approaching outdated terms.
In: Challenges and opportunities for knowledge organization in the digital age: proceedings of the Fifteenth International ISKO Conference, 9-11 July 2018, Porto, Portugal / organized by: International Society for Knowledge Organization (ISKO), ISKO Spain and Portugal Chapter, University of Porto - Faculty of Arts and Humanities, Research Centre in Communication, Information and Digital Culture (CIC.digital) - Porto. Eds.: F. Ribeiro u. M.E. Cerveira. Baden-Baden : Ergon Verlag, 2018. S.256-264.
(Advances in knowledge organization; vol.16)
9. Fang, L. ; Tuan, L.A. ; Hui, S.C. ; Wu, L.: Syntactic based approach for grammar question retrieval.
In: Information processing and management. 54(2018) no.2, S.184-202.
Abstract: With the popularity of online educational platforms, English learners can learn and practice no matter where they are and what they do. English grammar is one of the important components in learning English. To learn English grammar effectively, it requires students to practice questions containing focused grammar knowledge. In this paper, we study a novel problem of retrieving English grammar questions with similar grammatical focus. Since the grammatical focus similarity is different from textual similarity or sentence syntactic similarity, existing approaches cannot be applied directly to our problem. To address this problem, we propose a syntactic based approach for English grammar question retrieval which can retrieve related grammar questions with similar grammatical focus effectively. In the proposed syntactic based approach, we first propose a new syntactic tree, namely parse-key tree, to capture English grammar questions' grammatical focus. Next, we propose two kernel functions, namely relaxed tree kernel and part-of-speech order kernel, to compute the similarity between two parse-key trees of the query and grammar questions in the collection. Then, the retrieved grammar questions are ranked according to the similarity between the parse-key trees. In addition, if a query is submitted together with answer choices, conceptual similarity and textual similarity are also incorporated to further improve the retrieval accuracy. The performance results have shown that our proposed approach outperforms the state-of-the-art methods based on statistical analysis and syntactic analysis.
Content: Cf.: https://doi.org/10.1016/j.ipm.2017.11.004.
10. Xu, A. ; Hess, K. ; Akerman, L.: From MARC to BIBFRAME 2.0 : Crosswalks.
In: Cataloging and classification quarterly. 56(2018) no.2/3, S.224-250.
Abstract: One of the big challenges facing academic libraries today is to increase their relevance to their user communities. If libraries can increase the visibility of their resources on the open web, they improve their chances of reaching user communities at the point of the user's first search experience. BIBFRAME and library Linked Data will enable libraries to publish their resources in a way that the web understands, to consume Linked Data to enrich their resources relevant to their user communities, and to visualize networks across collections. However, one of the important steps in transitioning to BIBFRAME and library Linked Data involves crosswalks: mapping MARC fields and subfields across data models and performing the data reformatting necessary to comply with the specifications of the new model, which is currently BIBFRAME 2.0. This article looks into how the Library of Congress has mapped library bibliographic data from the MARC format to the BIBFRAME 2.0 model and vocabulary (published and updated since April 2016, available from http://www.loc.gov/bibframe/docs/index.html), based on the recently released conversion specifications and converter developed by the Library of Congress with input from many community members. The BIBFRAME 2.0 standard and conversion tools will enable libraries to transform bibliographic data from MARC into BIBFRAME 2.0, which introduces a Linked Data model as the improved method of bibliographic control for the future, and to make bibliographic information more useful within and beyond library communities.
Content: Cf.: https://doi.org/10.1080/01639374.2017.1388326.
Note: Contribution in an issue: 'Setting standards to work and live by: A memorial Festschrift for Valerie Bross'.
Subject area: Descriptive cataloging ; Data formats
Object: MARC ; BIBFRAME 2.0
11. Ninkov, A. ; Vaughan, L.: A webometric analysis of the online vaccination debate.
In: Journal of the Association for Information Science and Technology. 68(2017) no.5, S.1285-1294.
Abstract: Webometrics research methods can be effectively used to measure and analyze information on the web. One topic discussed vehemently online that could benefit from this type of analysis is vaccines. We carried out a study analyzing the web presence of both sides of this debate. We collected a variety of webometric data and analyzed the data both quantitatively and qualitatively. The study found far more anti- than pro-vaccine web domains. The anti and pro sides had similar web visibility as measured by the number of links coming from general websites and Tweets. However, the links to the pro domains were of higher quality measured by PageRank scores. The result from the qualitative content analysis confirmed this finding. The analysis of site ages revealed that the battle between the two sides had a long history and is still ongoing. The web scene was polarized with either pro or anti views and little neutral ground. The study suggests ways that professional information can be promoted more effectively on the web. The study demonstrates that webometrics analysis is effective in studying online information dissemination. This kind of analysis can be used to study not only health information but other information as well.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23758/full.
12. Rajan, L.: Ambiguity in knowledge organization : four proposed types.
In: http://www.iskocus.org/NASKO2017papers/NASKO2017_paper_26.pdf [NASKO 2017, June 15-16, 2017, Champaign, IL, USA].
Abstract: Classification and categorization order by creating or seeking certainty. Yet inevitably we encounter things that defy ready placement, which we may label other or miscellaneous, or force into another category. The literature of knowledge organization recognizes the consequences of classification and misrepresentation, but has not systematically outlined what circumstances or conditions render a thing ambiguous to those who would seek to describe it. This paper proposes four major sources, or types, of ambiguity in classification. While examples of these types may be found in many disciplines and settings, they have in common similar requirements for accurate or improved representation. Multiplicity is a source of ambiguity when a resource or object requires more terms to describe than the system allows. Emergence is ambiguity that arises when a phenomenon, from medical observation to literary genre, is at an early stage of description and thus unstable. Privacy-related ambiguity is that which stems from a gap of understanding or trust between those classifying and what is being classified, particularly in human communities. Conditional ambiguity arises when something requires narrative due to conditional contexts such as temporality or geography; this term also describes things that have dichotomous or fragmentary identities that are not easily represented by most systems. These types of ambiguity may arise in formal and informal organization systems. While observing these types of ambiguity may not offer immediate or feasible solutions, it may allow us to discuss their unique challenges and to better understand their manifestations across disciplines.
Content: Contribution at: NASKO 2017: Visualizing Knowledge Organization: Bringing Focus to Abstract Realities. The sixth North American Symposium on Knowledge Organization (NASKO 2017), June 15-16, 2017, in Champaign, IL, USA.
13. Vaughan, L.: Uncovering information from social media hyperlinks.
In: Journal of the Association for Information Science and Technology. 67(2016) no.5, S.1105-1120.
Abstract: Analyzing hyperlink patterns has been a major research topic since the early days of the web. Numerous studies reported uncovering rich information and methodological advances. However, very few studies thus far examined hyperlinks in the rapidly developing sphere of social media. This paper reports a study that helps fill this gap. The study analyzed links originating from tweets to the websites of 3 types of organizations (government, education, and business). Data were collected over an 8-month period to observe the fluctuation and reliability of the individual data set. Hyperlink data from the general web (not social media sites) were also collected and compared with social media data. The study found that the 2 types of hyperlink data correlated significantly and that analyzing the 2 together can help organizations see their relative strength or weakness in the two platforms. The study also found that both types of inlink data correlated with offline measures of organizations' performance. Twitter data from a relatively short period were fairly reliable in estimating performance measures. The timelier nature of social media data as well as the date/time stamps on tweets make this type of data potentially more valuable than that from the general web.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23486/abstract.
14. Tenopir, C. ; Levine, K. ; Allard, S. ; Christian, L. ; Volentine, R. ; Boehm, R. ; Nichols, F. ; Nicholas, D. ; Jamali, H.R. ; Herman, E. ; Watkinson, A.: Trustworthiness and authority of scholarly information in a digital age : results of an international questionnaire.
In: Journal of the Association for Information Science and Technology. 67(2016) no.10, S.2344-2361.
Abstract: An international survey of over 3,600 researchers examined how trustworthiness and quality are determined for making decisions on scholarly reading, citing, and publishing and how scholars perceive changes in trust with new forms of scholarly communication. Although differences in determining trustworthiness and authority of scholarly resources exist among age groups and fields of study, traditional methods and criteria remain important across the board. Peer review is considered the most important factor for determining the quality and trustworthiness of research. Researchers continue to read abstracts, check content for sound arguments and credible data, and rely on journal rankings when deciding whether to trust scholarly resources in reading, citing, or publishing. Social media outlets and open access publications are still often not trusted, although many researchers believe that open access has positive implications for research, especially if the open access journals are peer reviewed.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23598/full.
15. Hicks, D. ; Wouters, P. ; Waltman, L. ; Rijcke, S. de ; Rafols, I.: The Leiden Manifesto for research metrics : 10 principles to guide research evaluation.
In: Nature. 520(2015), 23.04.2015, S.429-431.
Abstract: Research evaluation has become routine and often relies on metrics. But it is increasingly driven by data and not by expert judgement. As a result, the procedures that were designed to increase the quality of research are now threatening to damage the scientific system. To support researchers and managers, five experts led by Diana Hicks, professor in the School of Public Policy at Georgia Institute of Technology, and Paul Wouters, director of CWTS at Leiden University, have proposed ten principles for the measurement of research performance: the Leiden Manifesto for Research Metrics published as a comment in Nature.
Content: Cf.: http://www.nature.com/polopoly_fs/1.17351!/menu/main/topColumns/topLeftColumn/pdf/520429a.pdf. http://www.leidenmanifesto.org/uploads/4/1/6/0/41603901/leiden_manifesto_german__leidener_manifest.pdf. Video at: https://vimeo.com/133683418.
16. Vaughan, L. ; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index.
In: Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22.
Abstract: Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23201/abstract.
Subject area: Data mining ; Search engines
Object: Google ; Baidu
17. Tan, L.K.-W. ; Na, J.-C. ; Ding, Y.: Influence diffusion detection using the influence style (INFUSE) model.
In: Journal of the Association for Information Science and Technology. 66(2015) no.8, S.1717-1733.
Abstract: Blogs are readily available sources of opinions and sentiments that in turn could influence the opinions of the blog readers. Previous studies have attempted to infer influence from blog features, but they have ignored the possible influence styles that describe the different ways in which influence is exerted. We propose a novel approach to analyzing bloggers' influence styles and using the influence styles as features to improve the performance of influence diffusion detection among linked bloggers. The proposed influence style (INFUSE) model describes bloggers' influence through their engagement style, persuasion style, and persona. Methods used include similarity analysis to detect the creating-sharing aspect of engagement style, subjectivity analysis to measure persuasion style, and sentiment analysis to identify persona style. We further extend the INFUSE model to detect influence diffusion among linked bloggers based on the bloggers' influence styles. The INFUSE model performed well with an average F1 score of 76% compared with the in-degree and sentiment-value baseline approaches. Previous studies have focused on the existence of influence among linked bloggers in detecting influence diffusion, but our INFUSE model is shown to provide a fine-grained description of the manner in which influence is diffused based on the bloggers' influence styles.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23287/abstract.
18. Waltman, L. ; Costas, R.: F1000 Recommendations as a potential new data source for research evaluation : a comparison with citations.
In: Journal of the Association for Information Science and Technology. 65(2014) no.3, S.433-445.
Abstract: F1000 is a postpublication peer review service for biological and medical research. F1000 recommends important publications in the biomedical literature, and from this perspective F1000 could be an interesting tool for research evaluation. By linking the complete database of F1000 recommendations to the Web of Science bibliographic database, we are able to make a comprehensive comparison between F1000 recommendations and citations. We find that about 2% of the publications in the biomedical literature receive at least one F1000 recommendation. Recommended publications on average receive 1.30 recommendations, and more than 90% of the recommendations are given within half a year after a publication has appeared. There turns out to be a clear correlation between F1000 recommendations and citations. However, the correlation is relatively weak, at least weaker than the correlation between journal impact and citations. More research is needed to identify the main reasons for differences between recommendations and citations in assessing the impact of publications.
19. Vaughan, L. ; Romero-Frías, E.: Web search volume as a predictor of academic fame : an exploration of Google trends.
In: Journal of the Association for Information Science and Technology. 65(2014) no.4, S.707-720.
Abstract: Searches conducted on web search engines reflect the interests of users and society. Google Trends, which provides information about the queries searched by users of the Google web search engine, is a rich data source from which a wealth of information can be mined. We investigated the possibility of using web search volume data from Google Trends to predict academic fame. As queries are language-dependent, we studied universities from two countries with different languages, the United States and Spain. We found a significant correlation between the search volume of a university name and the university's academic reputation or fame. We also examined the effect of some Google Trends features, namely, limiting the search to a specific country or topic category on the search volume data. Finally, we examined the effect of university sizes on the correlations found to gain a deeper understanding of the nature of the relationships.
Object: Google Trends
20. Waltman, L. ; Schreiber, M.: On the calculation of percentile-based bibliometric indicators.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.2, S.372-379.
Abstract: A percentile-based bibliometric indicator is an indicator that values publications based on their position within the citation distribution of their field. The most straightforward percentile-based indicator is the proportion of frequently cited publications, for instance, the proportion of publications that belong to the top 10% most frequently cited of their field. Recently, more complex percentile-based indicators have been proposed. A difficulty in the calculation of percentile-based indicators is caused by the discrete nature of citation distributions combined with the presence of many publications with the same number of citations. We introduce an approach to calculating percentile-based indicators that deals with this difficulty in a more satisfactory way than earlier approaches suggested in the literature. We show in a formal mathematical framework that our approach leads to indicators that do not suffer from biases in favor of or against particular fields of science.
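The tie-handling difficulty described in the abstract above can be made concrete. One way to assign fractional top-10% scores to publications tied at the citation threshold, so that the scores average to exactly 10%, is sketched below. This follows the spirit of the approach, not necessarily the authors' exact formulation:

```python
from bisect import bisect_left, bisect_right

def top_share_scores(citations, share=0.10):
    """Score each publication's membership in the top `share` of its field.

    Publications clearly above the threshold score 1.0, those clearly
    below score 0.0, and publications tied at the threshold receive a
    fractional score so that the field-level mean equals `share` exactly
    (illustrative sketch of the tie-handling idea)."""
    n = len(citations)
    s = sorted(citations)
    cut = 1.0 - share                       # e.g. 0.9 for the top 10%
    scores = []
    for c in citations:
        below = bisect_left(s, c) / n       # fraction cited strictly less
        upto = bisect_right(s, c) / n       # fraction cited at most c
        if below >= cut:
            scores.append(1.0)              # entirely above the threshold
        elif upto <= cut:
            scores.append(0.0)              # entirely below the threshold
        else:                               # tied group straddles the cut
            scores.append((upto - cut) / (upto - below))
    return scores
```

For example, a field where every publication has the same citation count gives each a score of 0.10 rather than arbitrarily declaring all (or none) of them "top 10%", which is the bias the paper's approach is designed to avoid.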