This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Thellefsen, M.M. ; Thellefsen, T. ; Sørensen, B.: Information as signs : a semiotic analysis of the information concept, determining its ontological and epistemological foundations.
In: Journal of documentation. 74(2018) no.2, pp. 372-382.
Abstract: The purpose of this paper is to formulate an analytical framework for the information concept based on semiotic theory. Design/methodology/approach: The paper is motivated by the apparent controversy that still surrounds the information concept. Information, being a key concept within LIS, suffers from being anchored in various incompatible theories. The paper suggests that information is signs, and it demonstrates how the concept of information can be understood within C.S. Peirce's phenomenologically rooted semiotic. From there, certain ontological conditions as well as epistemological consequences of the information concept can be deduced. Findings: The paper argues that an understanding of information as either objective or subjective/discursive leads either to objective reductionism and signal processing, which fails to explain how information becomes meaningful at all, or, conversely, to information being understood only relative to subjective/discursive intentions, agendas, etc. To overcome the limitations of defining information as either objective or subjective/discursive, a semiotic analysis shows that information understood as signs is consistently sensitive to both objective and subjective/discursive features of information. It is consequently argued that information as a concept should be defined in relation to ontological conditions having certain epistemological consequences. Originality/value: The paper presents an analytical framework, derived from semiotics, that adds to the developments of the philosophical dimensions of information within LIS.
Content: Cf.: https://www.emeraldinsight.com/doi/full/10.1108/JD-05-2017-0078.
Subject area: Concept theory ; Information
2. Brunswicker, S. ; Jensen, B. ; Song, Z. ; Majchrzak, A.: Transparency as design choice of open data contests.
In: Journal of the Association for Information Science and Technology. 69(2018) no.10, pp. 1205-1222.
Abstract: Open data contests have become popular virtual events that motivate civic hackers to design high performing software applications that are useful and useable for citizens. However, such contests stir up controversy among scholars and practitioners about the role of transparency, or more specifically, the unrestricted access and observability of the applications submitted throughout the contest. In one view, transparency may reduce performance because it causes excessive replication, whereas another view argues that transparency can encourage novel forms of reuse, namely recombination. This article proposes a new perspective towards transparency as a design choice in open data contest architectures. We introduce a 2-dimensional view towards transparency, defined as observability of information about each submitted (a) solution (how it works) and its (b) performance (how high it scores). We design a sociotechnical contest architecture that jointly affords both transparency dimensions, and evaluate it in the field during a 21-day contest involving 28 participants. The results suggest that the joint instantiation of both transparency dimensions increases performance by triggering different kinds of recombination. Findings advance literature on sociotechnical architectures for civic design. Furthermore, they guide practitioners in implementing open data contests and balancing the tension between individual versus collective benefits.
Content: Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24033.
3. Liu, Z. ; Jansen, B.J.: ASK: A taxonomy of accuracy, social, and knowledge information seeking posts in social question and answering.
In: Journal of the Association for Information Science and Technology. 68(2017) no.2, pp. 333-347.
Abstract: Many people turn to their social networks to find information through the practice of question and answering. We believe it is necessary to use different answering strategies based on the type of questions to accommodate the different information needs. In this research, we propose the ASK taxonomy that categorizes questions posted on social networking sites into three types according to the nature of the questioner's inquiry of accuracy, social, or knowledge. To automatically decide which answering strategy to use, we develop a predictive model based on ASK question types using question features from the perspectives of lexical, topical, contextual, and syntactic as well as answer features. By applying the classifier on an annotated data set, we present a comprehensive analysis to compare questions in terms of their word usage, topical interests, temporal and spatial restrictions, syntactic structure, and response characteristics. Our research results show that the three types of questions exhibited different characteristics in the way they are asked. Our automatic classification algorithm achieves an 83% correct labeling result, showing the value of the ASK taxonomy for the design of social question and answering systems.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23655/full.
4. Wakeling, S. ; Clough, P. ; Connaway, L.S. ; Sen, B. ; Tomás, D.: Users and uses of a global union catalog : a mixed-methods study of WorldCat.org.
In: Journal of the Association for Information Science and Technology. 68(2017) no.9, pp. 2166-2181.
Abstract: This paper presents the first large-scale investigation of the users and uses of WorldCat.org, the world's largest bibliographic database and global union catalog. Using a mixed-methods approach involving focus group interviews with 120 participants, an online survey with 2,918 responses, and an analysis of transaction logs of approximately 15 million sessions from WorldCat.org, the study provides a new understanding of the context for global union catalog use. We find that WorldCat.org is accessed by a diverse population, with the three primary user groups being librarians, students, and academics. Use of the system is found to fall within three broad types of work-task (professional, academic, and leisure), and we also present an emergent taxonomy of search tasks that encompass known-item, unknown-item, and institutional information searches. Our results support the notion that union catalogs are primarily used for known-item searches, although the volume of traffic to WorldCat.org means that unknown-item searches nonetheless represent an estimated 250,000 sessions per month. Search engine referrals account for almost half of all traffic, but although WorldCat.org effectively connects users referred from institutional library catalogs to other libraries holding a sought item, users arriving from a search engine are less likely to connect to a library.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23708/full. The article is freely available.
Subject area: Descriptive cataloguing ; Catalogue issues in general
5. Coughlin, D.M. ; Campbell, M.C. ; Jansen, B.J.: A web analytics approach for appraising electronic resources in academic libraries.
In: Journal of the Association for Information Science and Technology. 67(2016) no.3, pp. 518-534.
Abstract: University libraries provide access to thousands of journals and spend millions of dollars annually on electronic resources. With several commercial entities providing these electronic resources, the result can be silo systems and processes to evaluate cost and usage of these resources, making it difficult to provide meaningful analytics. In this research, we examine a subset of journals from a large research library using a web analytics approach with the goal of developing a framework for the analysis of library subscriptions. This foundational approach is implemented by comparing the impact to the cost, titles, and usage for the subset of journals and by assessing the funding area. Overall, the results highlight the benefit of a web analytics evaluation framework for university libraries and the impact of classifying titles based on the funding area. Furthermore, they show the statistical difference in both use and cost among the various funding areas when ranked by cost, eliminating the outliers of heavily used and highly expensive journals. Future work includes refining this model for a larger scale analysis tying metrics to library organizational objectives and for the creation of an online application to automate this analysis.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23407/abstract.
Field of application: Academic libraries
6. Coughlin, D.M. ; Jansen, B.J.: Modeling journal bibliometrics to predict downloads and inform purchase decisions at university research libraries.
In: Journal of the Association for Information Science and Technology. 67(2016) no.9, pp. 2263-2273.
Abstract: University libraries provide access to thousands of online journals and other content, spending millions of dollars annually on these electronic resources. Providing access to these online resources is costly, and it is difficult both to analyze the value of this content to the institution and to discern those journals that comparatively provide more value. In this research, we examine 1,510 journals from a large research university library, representing more than 40% of the university's annual subscription cost for electronic resources at the time of the study. We utilize a web analytics approach for the creation of a linear regression model to predict usage among these journals. We categorize metrics into two classes: global (journal focused) and local (institution dependent). Using 275 journals for our training set, our analysis shows that a combination of global and local metrics creates the strongest model for predicting full-text downloads. Our linear regression model has an accuracy of more than 80% in predicting downloads for the 1,235 journals in our test set. The implications of the findings are that university libraries that use local metrics have better insight into the value of a journal and therefore more efficient cost content management.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23549/full.
8. Ortiz-Cordova, A. ; Yang, Y. ; Jansen, B.J.: External to internal search : associating searching on search engines with searching on sites.
In: Information processing and management. 51(2015) no.5, pp. 718-736.
Abstract: We analyze the transitions from external search, searching on web search engines, to internal search, searching on websites. We categorize 295,571 search episodes composed of a query submitted to web search engines and the subsequent queries submitted to a single website search by the same users. There are a total of 1,136,390 queries from all searches, of which 295,571 are external search queries and 840,819 are internal search queries. We algorithmically classify queries into states and then use n-grams to categorize search patterns. We cluster the searching episodes into major patterns and identify the most commonly occurring, which are: (1) Explorers (43% of all patterns) with a broad external search query and then broad internal search queries, (2) Navigators (15%) with an external search query containing a URL component and then specific internal search queries, and (3) Shifters (15%) with different, seemingly unrelated query types when transitioning from external to internal search. The implications of this research are that external search and internal search sessions are part of a single search episode and that online businesses can leverage these search episodes to more effectively target potential customers.
Content: Cf.: doi: 10.1016/j.ipm.2015.06.009.
9. Sen, B.K.: Ranganathan's contribution to bibliometrics.
In: Annals of library and information studies. 62(2015) no.4, pp. 222-225.
Abstract: Traces the origin of the term librametry. Shows how librametry helped Ranganathan to develop the staff formula for different libraries, and how it can help in decision making relating to the establishment of rural and branch libraries as well as dormitory and service libraries. His maintenance of statistics of various library activities showed the growth pattern of the library collection, use of the collection by users, busy and very busy hours in the circulation and reference sections, and so on. He also developed a method for optimal procurement of books for every department in the university. Ranganathan also showed statistically that on average Colon class numbers are shorter than DC class numbers. With the passage of time bibliometrics overshadowed librametrics. Ranganathan did not define librametrics, nor did he isolate its components. These lacunae have been filled in this article. It has also been shown that a substantial part of librametrics is occupied by bibliometrics.
Content: See also: http://op.niscair.res.in/index.php/ALIS/article/view/11399.
Note: Article in a special issue on the life and work of S.R. Ranganathan.
10. Thellefsen, M. ; Thellefsen, T. ; Sørensen, B.: The fallacy of the cognitive free fall in communication metaphor : a semiotic analysis.
In: Library trends. 63(2015) no.3, pp. 512-527.
Abstract: This paper is a theoretical analysis of the cognitive free-fall metaphor, used within the cognitive view, as a model for explaining the communication process between a generator and a receiver of a message. Its aim is to demonstrate that the idea of a cognitive free fall taking place within this communication process leads to apparent theoretical paradoxes, partly fostered by unclear definitions of key information-science concepts (namely, tokens, signs, information, and knowledge and their interrelatedness) and a naïve theoretical framework. The paper promotes a semiotically inspired model of communication that demonstrates that what takes place in communication is not a cognitive free fall, but rather a fall from a pragmatic level of knowing or knowledge to a level of representation or information. The paper further argues that the communication process more ideally can be expressed as a complex interrelation of emotion, information, and cognition.
Content: Article in a special issue: 'Exploring Philosophies of Information'.
Note: Cf.: 10.1353/lib.2015.0011.
11. Thellefsen, T. ; Sørensen, B. ; Thellefsen, M.: The information concept of Nicholas Belkin revisited : some semeiotic comments.
In: Journal of documentation. 70(2014) no.1, pp. 74-92.
Abstract: Purpose - The purpose of the paper is to examine and compare Nicholas Belkin's information concept and his concept of communication with the authors' semeiotic inspired communication model - the Dynacom. Design/methodology/approach - The authors compare the two communication models by comparing the requirements given by Belkin and the conditions of the Dynacom. Findings - The authors conclude that Belkin's idea of information and his idea of communication lack the social aspect. Based on his theory, he is unable to point out how information becomes knowledge. These are two major issues the authors believe they can elaborate on by introducing the Dynacom and their semeiotic inspired concept of information. Originality/value - No one has previously specifically analyzed Nicholas Belkin's concept of information and compared it to a semeiotic ditto.
12. Thellefsen, T. ; Thellefsen, M. ; Soerensen, B.: Emotion, information, and cognition, and some possible consequences for library and information science.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.8, pp. 1735-1750.
Abstract: We present our semeiotic-inspired concept of information as 1 of 3 important elements in meaning creation, the 2 other concepts being emotion and cognition. We have the inner world (emotion); we have the outer world (information); and cognition mediates between the two. We analyze the 3 elements in relation to communication and discuss the semeiotics-inspired communication model, the Dynacom; then, we discuss our semeiotic perspective on the meaning-creation process and communication with regard to a few, but central, elements in library and information science, namely, the systems-oriented perspective, the user-oriented perspective, and a domain-oriented perspective.
13. Jansen, B.J. ; Liu, Z. ; Simon, Z.: The effect of ad rank on the performance of keyword advertising campaigns.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.10, pp. 2115-2132.
Abstract: The goal of this research is to evaluate the effect of ad rank on the performance of keyword advertising campaigns. We examined a large-scale data file comprised of nearly 7,000,000 records spanning 33 consecutive months of a major US retailer's search engine marketing campaign. The theoretical foundation is serial position effect to explain searcher behavior when interacting with ranked ad listings. We control for temporal effects and use one-way analysis of variance (ANOVA) with Tamhane's T2 tests to examine the effect of ad rank on critical keyword advertising metrics, including clicks, cost-per-click, sales revenue, orders, items sold, and advertising return on investment. Our findings show significant ad rank effect on most of those metrics, although less effect on conversion rates. A primacy effect was found on both clicks and sales, indicating a general compelling performance of top-ranked ads listed on the first results page. Conversion rates, on the other hand, follow a relatively stable distribution except for the top 2 ads, which had significantly higher conversion rates. However, examining conversion potential (the effect of both clicks and conversion rate), we show that ad rank has a significant effect on the performance of keyword advertising campaigns. Conversion potential is a more accurate measure of the impact of an ad's position. In fact, the first ad position generates about 80% of the total profits, after controlling for advertising costs. In addition to providing theoretical grounding, the research results reported in this paper are beneficial to companies using search engine marketing as they strive to design more effective advertising campaigns.
14. Ortiz-Cordova, A. ; Jansen, B.J.: Classifying web search queries to identify high revenue generating customers.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.7, pp. 1426-1441.
Abstract: Traffic from search engines is important for most online businesses, with the majority of visitors to many websites being referred by search engines. Therefore, an understanding of this search engine traffic is critical to the success of these websites. Understanding search engine traffic means understanding the underlying intent of the query terms and the corresponding user behaviors of searchers submitting keywords. In this research, using 712,643 query keywords from a popular Spanish music website relying on contextual advertising as its business model, we use a k-means clustering algorithm to categorize the referral keywords with similar characteristics of onsite customer behavior, including attributes such as clickthrough rate and revenue. We identified 6 clusters of consumer keywords. Clusters range from a large number of users who are low impact to a small number of high impact users. We demonstrate how online businesses can leverage this segmentation clustering approach to provide a more tailored consumer experience. Implications are that businesses can effectively segment customers to develop better business models to increase advertising conversion rates.
15. Jansen, B.J. ; Rieh, S.Y.: The seventeen theoretical constructs of information searching and information retrieval.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.8, pp. 1517-1534.
Abstract: In this article, we identify, compare, and contrast theoretical constructs for the fields of information searching and information retrieval to emphasize the uniqueness of and synergy between the fields. Theoretical constructs are the foundational elements that underpin a field's core theories, models, assumptions, methodologies, and evaluation metrics. We provide a framework to compare and contrast the theoretical constructs in the fields of information searching and information retrieval using intellectual perspective and theoretical orientation. The intellectual perspectives are information searching, information retrieval, and cross-cutting; and the theoretical orientations are information, people, and technology. Using this framework, we identify 17 significant constructs in these fields contrasting the differences and comparing the similarities. We discuss the impact of the interplay among these constructs for moving research forward within both fields. Although there is tension between the fields due to contradictory constructs, an examination shows a trend toward convergence. We discuss the implications for future research within the information searching and information retrieval fields.
16. Zhang, Y. ; Jansen, B.J. ; Spink, A.: Identification of factors predicting clickthrough in Web searching using neural network analysis.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.3, pp. 557-570.
Abstract: In this research, we aim to identify factors that significantly affect the clickthrough of Web searchers. Our underlying goal is to determine more efficient methods to optimize the clickthrough rate. We devise a clickthrough metric for measuring customer satisfaction of search engine results using the number of links visited, number of queries a user submits, and rank of clicked links. We use a neural network to detect the significant influence of searching characteristics on future user clickthrough. Our results show that high occurrences of query reformulation, lengthy searching duration, longer query length, and the higher ranking of prior clicked links correlate positively with future clickthrough. We provide recommendations for leveraging these findings for improving the performance of search engine retrieval and result ranking, along with implications for search engine marketing.
Subject area: Internet ; Informetrics
17. Larsen, B. ; Ingwersen, P. ; Lund, B.: Data fusion according to the principle of polyrepresentation.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.4, pp. 646-654.
Abstract: We report data fusion experiments carried out on the four best-performing retrieval models from TREC 5. Three were conceptually/algorithmically very different from one another; one was algorithmically similar to one of the former. The objective of the test was to observe the performance of the 11 logical data fusion combinations compared to the performance of the four individual models and their intermediate fusions when following the principle of polyrepresentation. This principle is based on cognitive IR perspective (Ingwersen & Järvelin, 2005) and implies that each retrieval model is regarded as a representation of a unique interpretation of information retrieval (IR). It predicts that only fusions of very different, but equally good, IR models may outperform each constituent as well as their intermediate fusions. Two kinds of experiments were carried out. One tested restricted fusions, which entails that only the inner disjoint overlap documents between fused models are ranked. The second set of experiments was based on traditional data fusion methods. The experiments involved the 30 TREC 5 topics that contain more than 44 relevant documents. In all tests, the Borda and CombSUM scoring methods were used. Performance was measured by precision and recall, with document cutoff values (DCVs) at 100 and 15 documents, respectively. Results show that restricted fusions made of two, three, or four cognitively/algorithmically very different retrieval models perform significantly better than do the individual models at DCV100. At DCV15, however, the results of polyrepresentative fusion were less predictable. The traditional fusion method based on polyrepresentation principles demonstrates a clear picture of performance at both DCV levels and verifies the polyrepresentation predictions for data fusion in IR. 
Data fusion improves retrieval performance over their constituent IR models only if the models all are quite conceptually/algorithmically dissimilar and equally and well performing, in that order of importance.
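The Borda and CombSUM scoring methods named in this abstract, and the restriction to overlap documents, can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' implementation: the function names, the toy runs, the alphabetical tie-breaking, and the assumption that scores are already normalized to a comparable range are all hypothetical.

```python
from collections import defaultdict


def comb_sum(runs):
    """CombSUM: sum each document's scores over all runs (dicts doc_id -> score)."""
    totals = defaultdict(float)
    for run in runs:
        for doc_id, score in run.items():
            totals[doc_id] += score
    # Highest total first; ties broken alphabetically (an assumption).
    return sorted(totals, key=lambda d: (-totals[d], d))


def borda(rankings):
    """Borda count over ranked lists: with a pool of n documents, a document
    at 0-based rank r earns n - r points; unretrieved documents earn none."""
    pool = {doc_id for ranking in rankings for doc_id in ranking}
    n = len(pool)
    points = defaultdict(int)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            points[doc_id] += n - rank
    return sorted(pool, key=lambda d: (-points[d], d))


def overlap_only(fused, runs):
    """Restricted fusion as read from the abstract: keep only documents
    that every fused model retrieved."""
    common = set.intersection(*(set(run) for run in runs))
    return [doc_id for doc_id in fused if doc_id in common]


# Two hypothetical runs with scores assumed to be normalized already.
run_a = {"d1": 0.9, "d2": 0.5, "d3": 0.1}
run_b = {"d2": 0.8, "d3": 0.7, "d4": 0.2}

fused_sum = comb_sum([run_a, run_b])
fused_borda = borda([["d1", "d2", "d3"], ["d2", "d3", "d4"]])
restricted = overlap_only(fused_sum, [run_a, run_b])

print(fused_sum)    # ['d2', 'd1', 'd3', 'd4']
print(fused_borda)  # ['d2', 'd3', 'd1', 'd4']
print(restricted)   # ['d2', 'd3']
```

In this toy case the two methods disagree on the order of d1 and d3. CombSUM rewards one very high score, whereas Borda rewards appearing in both lists.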
18. Harmsen, B. ; Leiter, A.: Fraunhofer-Publica : Kompetenzdatenbank der angewandten Forschung.
In: Information - Wissenschaft und Praxis. 60(2009) no.3, pp. 151-154.
Abstract: Fraunhofer-Publica is the multidisciplinary bibliographic reference database for the publications of the Fraunhofer-Gesellschaft, the largest European organization for applied research, and of its staff. The database was launched in 1988 and extended to include patents in 1991. Freely available on the World Wide Web since 1995, it has also contained full-text documents since 1999. Since 2005 it has been an "Open Archive Data Provider". The web interface was recently revised, which considerably improved access for search engine robots, so that Publica citations are now easier to find in Google, MSN, and other web databases. 80 percent of current Publica use is referred by search engines. The Fraunhofer-Publica team is guided by four quality criteria for database production: availability of the original documents, completeness and currency, consistency of the metadata, and dissemination of the Publica content. This requires accurate source citations, identifiers such as DOI or URN wherever possible, and full-text links that are as direct as possible. Free downloads are the optimum in terms of availability. As for the "dissemination" criterion, both harvesting of the database and direct indexing by robots are supported. The Google ranking of the lists and individual documents is poor, however, because they have no individual titles. The most important planned improvement is therefore to generate individual titles in the HTML header for list and single-record displays. Although harvesting of Fraunhofer-Publica has been possible since 2005, there are still no "data sets", i.e. subject-specific selection options, which would be important for supplying scholarly subject portals. To make this possible, broad DDC classes down to the third level must be assigned retrospectively to about 112,000 Publica documents.
Object: Fraunhofer-Publica ; DDC
19. Jansen, B.J. ; Booth, D.L. ; Spink, A.: Patterns of query reformulation during Web searching.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.7, pp. 1358-1371.
Abstract: Query reformulation is a key user behavior during Web search. Our research goal is to develop predictive models of query reformulation during Web searching. This article reports results from a study in which we automatically classified the query-reformulation patterns for 964,780 Web searching sessions, composed of 1,523,072 queries, to predict the next query reformulation. We employed an n-gram modeling approach to describe the probability of users transitioning from one query-reformulation state to another to predict their next state. We developed first-, second-, third-, and fourth-order models and evaluated each model for accuracy of prediction, coverage of the dataset, and complexity of the possible pattern set. The results show that Reformulation and Assistance account for approximately 45% of all query reformulations; furthermore, the results demonstrate that the first- and second-order models provide the best predictability, between 28 and 40% overall and higher than 70% for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance.
Subject area: Search tactics ; User studies
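The n-gram approach described in this abstract (estimating the probability of moving from one query-reformulation state to the next) might look roughly like the sketch below. It is a minimal illustration in Python, not the authors' model: the function names and the toy sessions are hypothetical, and apart from Reformulation and Assistance, which the abstract names, the state labels are assumptions.

```python
from collections import Counter, defaultdict


def train(sessions, order=1):
    """Count how often each state follows each context of `order` preceding states."""
    counts = defaultdict(Counter)
    for states in sessions:
        for i in range(order, len(states)):
            context = tuple(states[i - order:i])
            counts[context][states[i]] += 1
    return counts


def predict(counts, history, order=1):
    """Return (most likely next state, estimated probability),
    or None if the context was never observed in training."""
    context = tuple(history[-order:])
    if context not in counts:
        return None
    followers = counts[context]
    state, n = followers.most_common(1)[0]
    return state, n / sum(followers.values())


# Hypothetical sessions, each a sequence of already-classified query states.
sessions = [
    ["New", "Reformulation", "Assistance"],
    ["New", "Reformulation", "Reformulation"],
    ["New", "Assistance", "Reformulation"],
    ["New", "Reformulation", "Assistance"],
]

first_order = train(sessions, order=1)
second_order = train(sessions, order=2)

print(predict(first_order, ["New"]))  # ('Reformulation', 0.75)
print(predict(second_order, ["New", "Reformulation"], order=2))
```

Higher-order models condition on longer histories, so they see fewer examples of each context. That is one plausible reading of the trade-off between prediction accuracy, dataset coverage, and pattern-set complexity that the abstract evaluates.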
20. Jansen, B.J. ; Zhang, M. ; Schultz, C.D.: Brand and its effect on user perception of search engine performance.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.8, pp. 1572-1595.
Abstract: In this research we investigate the effect of search engine brand on the evaluation of searching performance. Our research is motivated by the large amount of search traffic directed to a handful of Web search engines, even though many have similar interfaces and performance. We conducted a laboratory experiment with 32 participants using a 4² factorial design confounded in four blocks to measure the effect of four search engine brands (Google, MSN, Yahoo!, and a locally developed search engine) while controlling for the quality and presentation of search engine results. We found brand indeed played a role in the searching process. Brand effect varied in different domains. Users seemed to place a high degree of trust in major search engine brands; however, they were more engaged in the searching process when using lesser-known search engines. It appears that branding affects overall Web search at four stages: (a) search engine selection, (b) search engine results page evaluation, (c) individual link evaluation, and (d) evaluation of the landing page. We discuss the implications for search engine marketing and the design of empirical studies measuring search engine performance.