This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Vaughan, L. ; Ninkov, A.: A new approach to web co-link analysis.
In: Journal of the Association for Information Science and Technology. 69(2018) no.6, pp. 820-831.
Abstract: Numerous web co-link studies have analyzed a wide variety of websites ranging from those in the academic and business arena to those dealing with politics and governments. Such studies uncover rich information about these organizations. In recent years, however, there has been a dearth of co-link analysis, mainly due to the lack of sources from which co-link data can be collected directly. Although several commercial services such as Alexa provide inlink data, none provide co-link data. We propose a new approach to web co-link analysis that can alleviate this problem so that researchers can continue to mine the valuable information contained in co-link data. The proposed approach has two components: (a) generating co-link data from inlink data using a computer program; (b) analyzing co-link data at the site level in addition to the page level that previous co-link analyses have used. The site-level analysis has the potential of expanding co-link data sources. We tested this proposed approach by analyzing a group of websites focused on vaccination using Moz inlink data. We found that the approach is feasible, as we were able to generate co-link data from inlink data and analyze the co-link data with multidimensional scaling.
Content: Cf.: https://onlinelibrary.wiley.com/doi/abs/10.1002/asi.24000.
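The first component described in the abstract above, deriving co-link data from inlink data, could be sketched roughly as follows. This is an illustrative Python sketch, not the authors' program: the function names `to_site` and `colink_counts`, the toy URLs, and the reading of "site level" as collapsing linking pages to their hostnames are all assumptions.

```python
from itertools import combinations
from urllib.parse import urlsplit


def to_site(url):
    """Reduce a page URL to its hostname (assumed stand-in for 'site level')."""
    return urlsplit(url).hostname


def colink_counts(inlinks, site_level=False):
    """Count, for each pair of targets, the sources that link to both.

    inlinks maps a target site to the set of page URLs linking to it.
    With site_level=True the sources are collapsed to hostnames first.
    """
    if site_level:
        inlinks = {t: {to_site(u) for u in urls} for t, urls in inlinks.items()}
    return {
        (a, b): len(inlinks[a] & inlinks[b])
        for a, b in combinations(sorted(inlinks), 2)
    }


# Hypothetical toy inlink data, invented for illustration only.
toy_inlinks = {
    "target-a.example": {
        "http://x.example/p1",
        "http://x.example/p2",
        "http://y.example/",
    },
    "target-b.example": {
        "http://x.example/p3",
        "http://y.example/",
    },
}

page_level = colink_counts(toy_inlinks)
site_level = colink_counts(toy_inlinks, site_level=True)
```

In this toy data the site-level count is higher than the page-level count, because pages on `x.example` link to both targets although no single page on it does. The resulting pair counts could then be arranged as a matrix and passed to a multidimensional scaling routine.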
2. Ninkov, A. ; Vaughan, L.: A webometric analysis of the online vaccination debate.
In: Journal of the Association for Information Science and Technology. 68(2017) no.5, pp. 1285-1294.
Abstract: Webometrics research methods can be effectively used to measure and analyze information on the web. One topic discussed vehemently online that could benefit from this type of analysis is vaccines. We carried out a study analyzing the web presence of both sides of this debate. We collected a variety of webometric data and analyzed the data both quantitatively and qualitatively. The study found far more anti- than pro-vaccine web domains. The anti and pro sides had similar web visibility as measured by the number of links coming from general websites and Tweets. However, the links to the pro domains were of higher quality measured by PageRank scores. The result from the qualitative content analysis confirmed this finding. The analysis of site ages revealed that the battle between the two sides had a long history and is still ongoing. The web scene was polarized with either pro or anti views and little neutral ground. The study suggests ways that professional information can be promoted more effectively on the web. The study demonstrates that webometrics analysis is effective in studying online information dissemination. This kind of analysis can be used to study not only health information but other information as well.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23758/full.
3. Vaughan, L.: Uncovering information from social media hyperlinks.
In: Journal of the Association for Information Science and Technology. 67(2016) no.5, pp. 1105-1120.
Abstract: Analyzing hyperlink patterns has been a major research topic since the early days of the web. Numerous studies reported uncovering rich information and methodological advances. However, very few studies thus far examined hyperlinks in the rapidly developing sphere of social media. This paper reports a study that helps fill this gap. The study analyzed links originating from tweets to the websites of 3 types of organizations (government, education, and business). Data were collected over an 8-month period to observe the fluctuation and reliability of the individual data set. Hyperlink data from the general web (not social media sites) were also collected and compared with social media data. The study found that the 2 types of hyperlink data correlated significantly and that analyzing the 2 together can help organizations see their relative strength or weakness in the two platforms. The study also found that both types of inlink data correlated with offline measures of organizations' performance. Twitter data from a relatively short period were fairly reliable in estimating performance measures. The timelier nature of social media data as well as the date/time stamps on tweets make this type of data potentially more valuable than that from the general web.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23486/abstract.
4. Vaughan, L. ; Chen, Y.: Data mining from web search queries : a comparison of Google Trends and Baidu Index.
In: Journal of the Association for Information Science and Technology. 66(2015) no.1, pp. 13-22.
Abstract: Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23201/abstract.
Subject area: Data mining ; Search engines
Object: Google ; Baidu
5. Vaughan, L. ; Romero-Frías, E.: Web search volume as a predictor of academic fame : an exploration of Google Trends.
In: Journal of the Association for Information Science and Technology. 65(2014) no.4, pp. 707-720.
Abstract: Searches conducted on web search engines reflect the interests of users and society. Google Trends, which provides information about the queries searched by users of the Google web search engine, is a rich data source from which a wealth of information can be mined. We investigated the possibility of using web search volume data from Google Trends to predict academic fame. As queries are language-dependent, we studied universities from two countries with different languages, the United States and Spain. We found a significant correlation between the search volume of a university name and the university's academic reputation or fame. We also examined the effect of some Google Trends features, namely, limiting the search to a specific country or topic category on the search volume data. Finally, we examined the effect of university sizes on the correlations found to gain a deeper understanding of the nature of the relationships.
Object: Google Trends
6. Romero-Frías, E. ; Vaughan, L.: Exploring the relationships between media and political parties through web hyperlink analysis : the case of Spain.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.5, pp. 967-976.
Abstract: The study focuses on the web presence of the main Spanish media and seeks to determine whether hyperlink analysis of media and political parties can provide insight into their political orientation. The research included all major national media and political parties in Spain. Inlink and co-link data about these organizations were collected and analyzed using multidimensional scaling (MDS) and other statistical methods. In the MDS map, media are clustered based on their political orientation. There are significantly more co-links between media and parties with the same political orientation than there are between those with different political orientations. Findings from the study suggest the potential of using link analysis to gain new insights into the interactions among media and political parties.
7. Vaughan, L. ; Yang, R.: Web data as academic and business quality estimates : a comparison of three data sources.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.10, pp. 1960-1972.
Abstract: Earlier studies found that web hyperlink data contain various types of information, ranging from academic to political, that can be used to analyze a variety of social phenomena. Specifically, the numbers of inlinks to academic websites are associated with academic performance, while the counts of inlinks to company websites correlate with business variables. However, the scarcity of sources from which to collect inlink data in recent years has required us to seek new data sources. The recent demise of the inlink search function of Yahoo! made this need more pressing. Different alternative variables or data sources have been proposed. This study compared three types of web data to determine which are better as academic and business quality estimates, and what are the relationships among the three data sources. The study found that Alexa inlink and Google URL citation data can replace Yahoo! inlink data and that the former is better than the latter. Alexa is even better than Yahoo!, which has been the main data source in recent years. The unique nature of Alexa data could explain its relative advantages over other data sources.
8. Romero-Frías, E. ; Vaughan, L.: European political trends viewed through patterns of Web linking.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.10, pp. 2109-2121.
Abstract: This study explored the feasibility of using Web hyperlink data to study European political Web sites. Ninety-six European Union (EU) political parties belonging to a wide range of ideological, historical, and linguistic backgrounds were included in the study. Various types of data on Web links to party Web sites were collected. The Web colink data were visualized using multidimensional scaling (MDS), while the inlink data were analyzed with a 2-way analysis of variance test. The results showed that Web hyperlink data did reflect some political patterns in the EU. The MDS maps showed clusters of political parties along ideological, historical, linguistic, and social lines. Statistical analysis based on inlink counts further confirmed that there was a significant difference along the line of the political history of a country, such that left-wing parties in the former communist countries received considerably fewer inlinks to their Web sites than left-wing parties in countries without a history of communism did. The study demonstrated the possibility of using Web hyperlink data to gain insights into political situations in the EU. This suggests the richness of Web hyperlink data and its potential in studying social-political phenomena.
9. Leydesdorff, L. ; Vaughan, L.: Co-occurrence matrices and their applications in information science : extending ACA to the Web environment.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.12, pp. 1616-1628.
Abstract: Co-occurrence matrices, such as cocitation, coword, and colink matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This article discusses the difference between a symmetrical cocitation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical cocitation matrix but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co-occurrence matrices to the Web environment, in which the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed by using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory.
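The abstract's central point, that similarity measures belong on the asymmetrical occurrence matrix rather than on the symmetrical co-occurrence matrix, can be illustrated with a small sketch. It is written in Python under stated assumptions: the toy binary matrix (rows as hypothetical citing documents, columns as hypothetical cited items) and the helper names `cosine` and `column_cosine_matrix` are invented for illustration and do not come from the article.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def column_cosine_matrix(occurrence):
    """Derive a proximity matrix by comparing the columns of an occurrence matrix."""
    columns = list(zip(*occurrence))
    return [[cosine(a, b) for b in columns] for a in columns]


# Hypothetical asymmetrical occurrence matrix: 4 citing documents x 3 cited items.
occurrence = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]

proximity = column_cosine_matrix(occurrence)
```

The cosine is computed on the columns of the rectangular matrix directly. Multiplying the matrix by its transpose first would yield the symmetrical co-occurrence counts, which is the matrix the abstract argues should not be fed into such similarity measures.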
10. Vaughan, L.: Visualizing linguistic and cultural differences using Web co-link data.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.9, pp. 1178-1193.
Abstract: The study examined Web co-links to Canadian university Web sites. Multidimensional scaling (MDS) was used to analyze and visualize co-link data as was done in co-citation analysis. Co-link data were collected in ways that would reflect three different views, the global view, the French Canada view, and the English Canada view. Mapping results of the three data sets accurately reflected the ways Canadians see the universities and clearly showed the linguistic and cultural differences within Canadian society. This shows that Web co-linking is not a random phenomenon and that co-link data contain useful information for Web data mining. It is proposed that the method developed in the study can be applied to other contexts such as analyzing relationships of different organizations or countries. This kind of research is promising because of the dynamics and the diversity of the Web.
Subject area: Informetrics ; Internet
11. Vaughan, L. ; Hahn, T.B.: Profile, needs, and expectations of information professionals : what we learned from the 2003 ASIST membership survey.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.1, pp. 95-105.
Abstract: A survey of American Society for Information Science and Technology (ASIST) members was administered via the Web in May 2003. The survey gathered demographic data about members and their preferences and expectations in regard to conferences and other ASIST products and services. With about a 32% return rate, findings were compared with an earlier survey conducted in 1979, which provides a glimpse of how the Society has changed and what needs to be done to ensure a healthy future development. The gender split has remained the same but members are about 5 years older on average than they were in 1979. A significant shift has occurred in members' institutional affiliations, from the largest group being in the industrial sector to the largest group being in educational institutions. Members on average reported slightly higher incomes (after adjusting for inflation) in 2003 than in 1979. Since 1979, a larger percentage of members have earned a doctoral degree. The most common field of study is library and information science. About half of the respondents reported that ASIST is their primary professional society. Their primary reason for maintaining ASIST membership is "learning about new developments/issues in the field." The most common responses to the question about what factors would make ASIST conferences more appealing related to lowering costs. Other responses related to attitudes about the ASIST Bulletin and the value of other proposed products and services are summarized and reported. Detailed analyses of relationships among different variables made possible a deeper understanding of members' needs and expectations, which provides directions for design of programs and services.
12. Vaughan, L. ; Shaw, D.: Web citation data for impact assessment : a comparison of four science disciplines.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.10, pp. 1075-1087.
Abstract: The number and type of Web citations to journal articles in four areas of science are examined: biology, genetics, medicine, and multidisciplinary sciences. For a sample of 5,972 articles published in 114 journals, the median Web citation counts per journal article range from 6.2 in medicine to 10.4 in genetics. About 30% of Web citations in each area indicate intellectual impact (citations from articles or class readings, in contrast to citations from bibliographic services or the author's or journal's home page). Journals receiving more Web citations also have higher percentages of citations indicating intellectual impact. There is significant correlation between the number of citations reported in the databases from the Institute for Scientific Information (ISI, now Thomson Scientific) and the number of citations retrieved using the Google search engine (Web citations). The correlation is much weaker for journals published outside the United Kingdom or United States and for multidisciplinary journals. Web citation numbers are higher than ISI citation counts, suggesting that Web searches might be conducted for an earlier or a more fine-grained assessment of an article's impact. The Web-evident impact of non-UK/USA publications might provide a balance to the geographic or cultural biases observed in ISI's data, although the stability of Web citation counts is debatable.
Subject area: Informetrics ; Citation indexing ; Internet
Discipline: Biology ; Medicine ; Molecular biology
13. Vaughan, L. ; Thelwall, M.: A modelling approach to uncover hyperlink patterns : the case of Canadian universities.
In: Information processing and management. 41(2005) no.2, pp. 347-360.
Abstract: Hyperlink patterns between Canadian university Web sites were analyzed by a mathematical modeling approach. A multiple regression model was developed which shows that faculty quality and the language of the university are important predictors for links to a university Web site. Higher faculty quality means more links. French universities received lower numbers of links to their Web sites than comparable English universities. Analysis of interlinking between pairs of universities also showed that English universities are advantaged. Universities are more likely to link to each other when the geographical distance between them is less than 3000 km, possibly reflecting the east vs. west divide that exists in Canadian society.
14. Thelwall, M. ; Vaughan, L.: Webometrics : an introduction to the special issue.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.14, pp. 1213-1215.
Abstract: Webometrics, the quantitative study of Web phenomena, is a field encompassing contributions from information science, computer science, and statistical physics. Its methodology draws especially from bibliometrics. This special issue presents contributions that both push forward the field and illustrate a wide range of webometric approaches.
Note: Introduction to a special issue on webometrics
Subject area: Internet ; Informetrics
15. Vaughan, L.: New measurements for search engine evaluation proposed and tested.
In: Information processing and management. 40(2004) no.4, pp. 677-691.
Abstract: A set of measurements is proposed for evaluating Web search engine performance. Some measurements are adapted from the concepts of recall and precision, which are commonly used in evaluating traditional information retrieval systems. Others are newly developed to evaluate search engine stability, an issue unique to Web information retrieval systems. An experiment was conducted to test these new measurements by applying them to a performance comparison of three commercial search engines: Google, AltaVista, and Teoma. Twenty-four subjects ranked four sets of Web pages and their rankings were used as benchmarks against which to compare search engine performance. Results show that the proposed measurements are able to distinguish search engine performance very well.
16. Vaughan, L. ; Thelwall, M.: Search engine coverage bias : evidence and possible causes.
In: Information processing and management. 40(2004) no.4, pp. 693-708.
Abstract: Commercial search engines are now playing an increasingly important role in Web information dissemination and access. Of particular interest to business and national governments is whether the big engines have coverage biased towards the US or other countries. In our study we tested for national biases in three major search engines and found significant differences in their coverage of commercial Web sites. The US sites were much better covered than the others in the study: sites from China, Taiwan and Singapore. We then examined the possible technical causes of the differences and found that the language of a site does not affect its coverage by search engines. However, the visibility of a site, measured by the number of links to it, affects its chance to be covered by search engines. We conclude that the coverage bias does exist but this is due not to deliberate choices of the search engines but occurs as a natural result of cumulative advantage effects of US sites on the Web. Nevertheless, the bias remains a cause for international concern.
17. Thelwall, M. ; Vaughan, L. ; Björneborn, L.: Webometrics.
In: Annual review of information science and technology. 39(2005), pp. 81-138.
Abstract: Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
Subject area: Literature review ; Internet ; Informetrics ; Citation indexing
18. Thelwall, M. ; Vaughan, L.: New versions of PageRank employing alternative Web document models.
In: Aslib proceedings. 56(2004) no.1, pp. 24-33.
Abstract: Introduces several new versions of PageRank (the link based Web page ranking algorithm), based on an information science perspective on the concept of the Web document. Although the Web page is the typical indivisible unit of information in search engine results and most Web information retrieval algorithms, other research has suggested that aggregating pages based on directories and domains gives promising alternatives, particularly when Web links are the object of study. The new algorithms introduced based on these alternatives were used to rank four sets of Web pages. The ranking results were compared with human subjects' rankings. The results of the tests were somewhat inconclusive: the new approach worked well for the set that includes pages from different Web sites; however, it does not work well in ranking pages that are from the same site. It seems that the new algorithms may be effective for some tasks but not for others, especially when only low numbers of links are involved or the pages to be ranked are from the same site or directory.
Subject area: Search engines ; Retrieval algorithms
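The idea of ranking aggregated units instead of individual pages, as described in the abstract of record 18, might be sketched as follows. This is a minimal Python sketch under stated assumptions and not the algorithms from the article: it collapses page-to-page links to hostname-to-hostname links and then runs a textbook PageRank power iteration. The function names, the toy URLs, the damping factor of 0.85, and the fixed iteration count are all illustrative choices.

```python
from urllib.parse import urlsplit


def aggregate_by_domain(page_links):
    """Collapse page-to-page links into domain-to-domain links, dropping links within a domain."""
    graph = {}
    for source, target in page_links:
        s, t = urlsplit(source).hostname, urlsplit(target).hostname
        graph.setdefault(s, set())
        graph.setdefault(t, set())
        if s != t:
            graph[s].add(t)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration; rank of nodes without outlinks is spread uniformly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new_rank[w] += damping * rank[v] / len(graph[v])
        rank = new_rank
    return rank


# Hypothetical page-level links, invented for illustration only.
page_links = [
    ("http://a.example/1", "http://b.example/1"),
    ("http://a.example/2", "http://b.example/2"),
    ("http://a.example/1", "http://a.example/2"),  # same domain, dropped
    ("http://c.example/1", "http://b.example/1"),
    ("http://b.example/1", "http://a.example/1"),
]

domain_graph = aggregate_by_domain(page_links)
ranks = pagerank(domain_graph)
```

Aggregating by directory instead of domain would only change the key extracted from each URL. Because every page of a domain receives the same aggregated score, a sketch like this cannot distinguish pages within one site, which is consistent with the limitation the abstract reports for same-site rankings.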
19. Vaughan, L. ; Thelwall, M.: Scholarly use of the Web : what are the key inducers of links to journal Web sites?
In: Journal of the American Society for Information Science and Technology. 54(2003) no.1, pp. 29-38.
Abstract: Web links have been studied by information scientists for at least six years but it is only in the past two that clear evidence has emerged to show that counts of links to scholarly Web spaces (universities and departments) can correlate significantly with research measures, giving some credence to their use for the investigation of scholarly communication. This paper reports on a study to investigate the factors that influence the creation of links to journal Web sites. An empirical approach is used: collecting data and testing for significant patterns. The specific questions addressed are whether site age and site content are inducers of links to a journal's Web site as measured by the ratio of link counts to Journal Impact Factors, two variables previously discovered to be related. A new methodology for data collection is also introduced that uses the Internet Archive to obtain an earliest known creation date for Web sites. The results show that both site age and site content are significant factors for the disciplines studied: library and information science, and law. Comparisons between the two fields also show disciplinary differences in Web site characteristics. Scholars and publishers should be particularly aware that richer content on a journal's Web site tends to generate links and thus the traffic to the site.
Subject area: Internet ; User studies ; Informetrics
20. Vaughan, L. ; Shaw, D.: Bibliographic and Web citations : what is the difference?
In: Journal of the American Society for Information Science and Technology. 54(2003) no.14, pp. 1313-1324.
Abstract: Vaughan and Shaw look at the relationship between traditional citation and Web citation (not hyperlinks but rather textual mentions of published papers). Using English language research journals in ISI's 2000 Journal Citation Report - Information and Library Science category - 1209 full length papers published in 1997 in 46 journals were identified. Each was searched in Social Science Citation Index and on the Web using Google phrase search by entering the title in quotation marks, and followed for distinction where necessary with sub-titles, authors' names, and journal title words. After removing obvious false drops, the number of web sites was recorded for comparison with the SSCI counts. A second sample from 1992 was also collected for examination. There were a total of 16,371 web citations to the selected papers. The top and bottom ranked four journals were then examined and every third citation to every third paper was selected and classified as to source type, domain, and country of origin. Web counts are much higher than ISI citation counts. Of the 46 journals from 1997, 26 demonstrated a significant correlation between Web and traditional citation counts, and 11 of the 15 in the 1992 sample also showed significant correlation. Journal impact factor in 1998 and 1999 correlated significantly with average Web citations per journal in the 1997 data, but at a low level. Thirty percent of web citations come from other papers posted on the web, and 30 percent from listings of web based bibliographic services, while twelve percent come from class reading lists. High web citation journals often have web accessible tables of contents.
Subject area: Informetrics ; Citation indexing ; Internet
Discipline: Library science ; Information science
Object: Journal Citation Report