Search (112 results, page 1 of 6)

Fong, A.C.M.: Mining a Web citation database for document clustering (2002) 0.07

0.07429434 = product of:
  0.19811824 = sum of:
    0.04888765 = weight(_text_:web in 3940) [ClassicSimilarity], result of:
      0.04888765 = score(doc=3940,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.50479853 = fieldWeight in 3940, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.109375 = fieldNorm(doc=3940)
    0.045895144 = weight(_text_:data in 3940) [ClassicSimilarity], result of:
      0.045895144 = score(doc=3940,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.48910472 = fieldWeight in 3940, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.109375 = fieldNorm(doc=3940)
    0.10333543 = product of:
      0.20667087 = sum of:
        0.20667087 = weight(_text_:mining in 3940) [ClassicSimilarity], result of:
          0.20667087 = score(doc=3940,freq=4.0), product of:
            0.16744171 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.029675366 = queryNorm
            1.2342855 = fieldWeight in 3940, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.109375 = fieldNorm(doc=3940)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Theme: Data Mining
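
The explain tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The `queryNorm` and `fieldNorm` values are copied from the listing rather than derived, and the function names are illustrative, not part of any Lucene API.

```python
import math

# Values copied from the explain output above (not derived here).
QUERY_NORM = 0.029675366
MAX_DOCS = 44218


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Document 3940: three of the eight query clauses matched.
web = term_score(freq=2.0, doc_freq=4597, field_norm=0.109375)
data = term_score(freq=2.0, doc_freq=5088, field_norm=0.109375)
# "mining" sits in a nested clause with its own coord(1/2).
mining = term_score(freq=4.0, doc_freq=425, field_norm=0.109375) * 0.5

doc_3940 = (web + data + mining) * (3 / 8)  # outer coord(3/8)
print(round(doc_3940, 6))
```

The result should agree with the listed 0.07429434 up to rounding, since Lucene computes in single precision while this sketch uses doubles.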

He, Y.; Hui, S.C.: Mining a web database for author cocitation analysis (2002) 0.03

0.03048921 = product of:
  0.12195684 = sum of:
    0.04888765 = weight(_text_:web in 2584) [ClassicSimilarity], result of:
      0.04888765 = score(doc=2584,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.50479853 = fieldWeight in 2584, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.109375 = fieldNorm(doc=2584)
    0.073069185 = product of:
      0.14613837 = sum of:
        0.14613837 = weight(_text_:mining in 2584) [ClassicSimilarity], result of:
          0.14613837 = score(doc=2584,freq=2.0), product of:
            0.16744171 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.029675366 = queryNorm
            0.8727716 = fieldWeight in 2584, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.109375 = fieldNorm(doc=2584)
      0.5 = coord(1/2)
  0.25 = coord(2/8)

Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.02
```
0.02468518 = product of:
  0.09874072 = sum of:
    0.052379623 = weight(_text_:web in 4279) [ClassicSimilarity], result of:
      0.052379623 = score(doc=4279,freq=18.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.5408555 = fieldWeight in 4279, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4279)
    0.046361096 = weight(_text_:data in 4279) [ClassicSimilarity], result of:
      0.046361096 = score(doc=4279,freq=16.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.49407038 = fieldWeight in 4279, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4279)
  0.25 = coord(2/8)
```
Abstract

Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
He, Y.; Hui, S.C.: PubSearch : a Web citation-based retrieval system (2001) 0.02
```
0.02447011 = product of:
  0.09788044 = sum of:
    0.038619664 = weight(_text_:wide in 4806) [ClassicSimilarity], result of:
      0.038619664 = score(doc=4806,freq=2.0), product of:
        0.13148437 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029675366 = queryNorm
        0.29372054 = fieldWeight in 4806, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.046875 = fieldNorm(doc=4806)
    0.059260778 = weight(_text_:web in 4806) [ClassicSimilarity], result of:
      0.059260778 = score(doc=4806,freq=16.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.6119082 = fieldWeight in 4806, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=4806)
  0.25 = coord(2/8)
```
Abstract

Many scientific publications are now available on the World Wide Web for researchers to share research findings. However, they tend to be poorly organised, making the search of relevant publications difficult and time-consuming. Most existing search engines are ineffective in searching these publications, as they do not index Web publications that normally appear in PDF (portable document format) or PostScript formats. Proposes a Web citation-based retrieval system, known as PubSearch, for the retrieval of Web publications. PubSearch indexes Web publications based on citation indices and stores them into a Web Citation Database. The Web Citation Database is then mined to support publication retrieval. Apart from supporting the traditional cited reference search, PubSearch also provides document clustering search and author clustering search. Document clustering groups related publications into clusters, while author clustering categorizes authors into different research areas based on author co-citation analysis.
Daquino, M.; Peroni, S.; Shotton, D.; Colavizza, G.; Ghavimi, B.; Lauscher, A.; Mayr, P.; Romanello, M.; Zumstein, P.: The OpenCitations Data Model (2020) 0.02
```
0.022082468 = product of:
  0.088329874 = sum of:
    0.03628967 = weight(_text_:web in 38) [ClassicSimilarity], result of:
      0.03628967 = score(doc=38,freq=6.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.37471575 = fieldWeight in 38, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=38)
    0.052040204 = weight(_text_:data in 38) [ClassicSimilarity], result of:
      0.052040204 = score(doc=38,freq=14.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.55459267 = fieldWeight in 38, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=38)
  0.25 = coord(2/8)
```
Abstract

A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations. This diversity, and the reuse of the same ontology terms with different nuances, generates inconsistencies in data. Adoption of a single data model would facilitate data integration tasks regardless of the data supplier or context application. In this paper we present the OpenCitations Data Model (OCDM), a generic data model for describing bibliographic entities and citations, developed using Semantic Web technologies. We also evaluate the effective reusability of OCDM according to ontology evaluation practices, mention existing users of OCDM, and discuss the use and impact of OCDM in the wider open science community.

Content

Published in: The Semantic Web - ISWC 2020, 19th International Semantic Web Conference, Athens, Greece, November 2-6, 2020, Proceedings, Part II. Cf.: DOI: 10.1007/978-3-030-62466-8_28.
Robinson-García, N.; Jiménez-Contreras, E.; Torres-Salinas, D.: Analyzing data citation practices using the data citation index : a study of backup strategies of end users (2016) 0.02
```
0.020235607 = product of:
  0.08094243 = sum of:
    0.017459875 = weight(_text_:web in 3225) [ClassicSimilarity], result of:
      0.017459875 = score(doc=3225,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.18028519 = fieldWeight in 3225, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3225)
    0.06348255 = weight(_text_:data in 3225) [ClassicSimilarity], result of:
      0.06348255 = score(doc=3225,freq=30.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.6765338 = fieldWeight in 3225, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3225)
  0.25 = coord(2/8)
```
Abstract

We present an analysis of data citation practices based on the Data Citation Index (DCI) (Thomson Reuters). This database, launched in 2012, links data sets and data studies with citations received from the other citation indexes. The DCI harvests citations to research data from papers indexed in the Web of Science. It relies on the information provided by the data repository. The findings of this study show that data citation practices are far from common in most research fields. Some differences have been reported on the way researchers cite data: Although in the areas of science and engineering & technology data sets were the most cited, in the social sciences and arts & humanities data studies play a greater role. A total of 88.1% of the records have received no citation, but some repositories show very low uncitedness rates. Although data citation practices are rare in most fields, they have expanded in disciplines such as crystallography and genomics. We conclude by emphasizing the role that the DCI could play in encouraging the consistent, standardized citation of research data, a role that would enhance their value as a means of following the research process from data collection to publication.
Vaughan, L.; Shaw, D.: Bibliographic and Web citations : what is the difference? (2003) 0.02
```
0.019835899 = product of:
  0.079343595 = sum of:
    0.062952474 = weight(_text_:web in 5176) [ClassicSimilarity], result of:
      0.062952474 = score(doc=5176,freq=26.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.65002745 = fieldWeight in 5176, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5176)
    0.016391123 = weight(_text_:data in 5176) [ClassicSimilarity], result of:
      0.016391123 = score(doc=5176,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.17468026 = fieldWeight in 5176, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5176)
  0.25 = coord(2/8)
```
Abstract

Vaughan and Shaw look at the relationship between traditional citation and Web citation (not hyperlinks but rather textual mentions of published papers). Using English language research journals in ISI's 2000 Journal Citation Report - Information and Library Science category - 1209 full length papers published in 1997 in 46 journals were identified. Each was searched in Social Science Citation Index and on the Web using Google phrase search by entering the title in quotation marks, and followed for distinction where necessary with sub-titles, authors' names, and journal title words. After removing obvious false drops, the number of web sites was recorded for comparison with the SSCI counts. A second sample from 1992 was also collected for examination. There were a total of 16,371 web citations to the selected papers. The top and bottom ranked four journals were then examined and every third citation to every third paper was selected and classified as to source type, domain, and country of origin. Web counts are much higher than ISI citation counts. Of the 46 journals from 1997, 26 demonstrated a significant correlation between Web and traditional citation counts, and 11 of the 15 in the 1992 sample also showed significant correlation. Journal impact factor in 1998 and 1999 correlated significantly with average Web citations per journal in the 1997 data, but at a low level. Thirty percent of web citations come from other papers posted on the web, and 30 percent from listings of web based bibliographic services, while twelve percent come from class reading lists. High web citation journals often have web accessible tables of contents.
Vaughan, L.; Shaw, D.: Web citation data for impact assessment : a comparison of four science disciplines (2005) 0.02
```
0.01959838 = product of:
  0.07839352 = sum of:
    0.05521297 = weight(_text_:web in 3880) [ClassicSimilarity], result of:
      0.05521297 = score(doc=3880,freq=20.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.5701118 = fieldWeight in 3880, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3880)
    0.023180548 = weight(_text_:data in 3880) [ClassicSimilarity], result of:
      0.023180548 = score(doc=3880,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.24703519 = fieldWeight in 3880, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3880)
  0.25 = coord(2/8)
```
Abstract

The number and type of Web citations to journal articles in four areas of science are examined: biology, genetics, medicine, and multidisciplinary sciences. For a sample of 5,972 articles published in 114 journals, the median Web citation counts per journal article range from 6.2 in medicine to 10.4 in genetics. About 30% of Web citations in each area indicate intellectual impact (citations from articles or class readings, in contrast to citations from bibliographic services or the author's or journal's home page). Journals receiving more Web citations also have higher percentages of citations indicating intellectual impact. There is significant correlation between the number of citations reported in the databases from the Institute for Scientific Information (ISI, now Thomson Scientific) and the number of citations retrieved using the Google search engine (Web citations). The correlation is much weaker for journals published outside the United Kingdom or United States and for multidisciplinary journals. Web citation numbers are higher than ISI citation counts, suggesting that Web searches might be conducted for an earlier or a more fine-grained assessment of an article's impact. The Web-evident impact of non-UK/USA publications might provide a balance to the geographic or cultural biases observed in ISI's data, although the stability of Web citation counts is debatable.
Ahlgren, P.; Jarneving, B.; Rousseau, R.: Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient (2003) 0.02
```
0.019248914 = product of:
  0.05133044 = sum of:
    0.0139679 = weight(_text_:web in 5171) [ClassicSimilarity], result of:
      0.0139679 = score(doc=5171,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.14422815 = fieldWeight in 5171, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=5171)
    0.029321333 = weight(_text_:data in 5171) [ClassicSimilarity], result of:
      0.029321333 = score(doc=5171,freq=10.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.31247756 = fieldWeight in 5171, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=5171)
    0.008041205 = product of:
      0.01608241 = sum of:
        0.01608241 = weight(_text_:22 in 5171) [ClassicSimilarity], result of:
          0.01608241 = score(doc=5171,freq=2.0), product of:
            0.103918076 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029675366 = queryNorm
            0.15476047 = fieldWeight in 5171, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=5171)
      0.5 = coord(1/2)
  0.375 = coord(3/8)
```
Abstract

Ahlgren, Jarneving, and Rousseau review accepted procedures for author co-citation analysis, first pointing out that since in the raw data matrix the row and column values are identical, i.e., the co-citation count of two authors, there is no clear choice for diagonal values. They suggest the number of times an author has been co-cited with himself excluding self citation rather than the common treatment as zeros or as missing values. When the matrix is converted to a similarity matrix the normal procedure is to create a matrix of Pearson's r coefficients between data vectors. Ranking by r and by co-citation frequency and by intuition can easily yield three different orders. It would seem necessary that the adding of zeros to the matrix will not affect the value or the relative order of similarity measures but it is shown that this is not the case with Pearson's r. Using 913 bibliographic descriptions from the Web of Science of articles from JASIS and Scientometrics, authors' names were extracted, edited and 12 information retrieval authors and 12 bibliometric authors each from the top 100 most cited were selected. Co-citation and r value (diagonal elements treated as missing) matrices were constructed, and then reconstructed in expanded form. Adding zeros can both change the r value and the ordering of the authors based upon that value. A chi-squared distance measure would not violate these requirements, nor would the cosine coefficient. It is also argued that co-citation data is ordinal data since there is no assurance of an absolute zero number of co-citations, and thus Pearson is not appropriate. The number of ties in co-citation data makes the use of the Spearman rank order coefficient problematic.

Date

9. 7.2006 10:22:35
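
The abstract's central claim, that appending zeros changes Pearson's r but leaves the cosine coefficient untouched, can be illustrated with a small sketch. The two co-citation vectors below are hypothetical counts invented for illustration, not data from the study, and the helper functions are plain textbook definitions.

```python
import math


def pearson(x, y):
    """Pearson's r: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def cosine(x, y):
    """Cosine coefficient: dot product divided by the product of vector lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical co-citation counts of two authors with four other authors.
x = [3, 1, 4, 2]
y = [2, 5, 1, 3]

# Expand the author set with four authors co-cited with neither x nor y.
x_expanded = x + [0, 0, 0, 0]
y_expanded = y + [0, 0, 0, 0]

r_before, r_after = pearson(x, y), pearson(x_expanded, y_expanded)
cos_before, cos_after = cosine(x, y), cosine(x_expanded, y_expanded)

print(f"Pearson r: {r_before:.3f} -> {r_after:.3f}")
print(f"Cosine:    {cos_before:.3f} -> {cos_after:.3f}")
```

The added zeros shift the means that Pearson's r subtracts, so for these vectors r even changes sign, whereas zeros contribute nothing to the dot product or the vector lengths used by the cosine.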
Zhao, D.; Strotmann, A.: Can citation analysis of Web publications better detect research fronts? (2007) 0.02
```
0.017955929 = product of:
  0.071823716 = sum of:
    0.039041467 = weight(_text_:web in 471) [ClassicSimilarity], result of:
      0.039041467 = score(doc=471,freq=10.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.40312994 = fieldWeight in 471, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=471)
    0.032782245 = weight(_text_:data in 471) [ClassicSimilarity], result of:
      0.032782245 = score(doc=471,freq=8.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.34936053 = fieldWeight in 471, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=471)
  0.25 = coord(2/8)
```
Abstract

We present evidence that in some research fields, research published in journals and reported on the Web may collectively represent different evolutionary stages of the field, with journals lagging a few years behind the Web on average, and that a "two-tier" scholarly communication system may therefore be evolving. We conclude that in such fields, (a) for detecting current research fronts, author co-citation analyses (ACA) using articles published on the Web as a data source can outperform traditional ACAs using articles published in journals as data, and that (b) as a result, it is important to use multiple data sources in citation analysis studies of scholarly communication for a complete picture of communication patterns. Our evidence stems from comparing the respective intellectual structures of the XML research field, a subfield of computer science, as revealed from three sets of ACA covering two time periods: (a) from the field's beginnings in 1996 to 2001, and (b) from 2001 to 2006. For the first time period, we analyze research articles both from journals as indexed by the Science Citation Index (SCI) and from the Web as indexed by CiteSeer. We follow up by an ACA of SCI data for the second time period. We find that most trends in the evolution of this field from the first to the second time period that we find when comparing ACA results from the SCI between the two time periods already were apparent in the ACA results from CiteSeer during the first time period.
Kousha, K.; Thelwall, M.: Google Scholar citations and Google Web/URL citations : a multi-discipline exploratory analysis (2007) 0.02
```
0.017901024 = product of:
  0.071604095 = sum of:
    0.05521297 = weight(_text_:web in 337) [ClassicSimilarity], result of:
      0.05521297 = score(doc=337,freq=20.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.5701118 = fieldWeight in 337, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=337)
    0.016391123 = weight(_text_:data in 337) [ClassicSimilarity], result of:
      0.016391123 = score(doc=337,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.17468026 = fieldWeight in 337, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=337)
  0.25 = coord(2/8)
```
Abstract

We use a new data gathering method, "Web/URL citation," and Google Scholar to compare traditional and Web-based citation patterns across multiple disciplines (biology, chemistry, physics, computing, sociology, economics, psychology, and education) based upon a sample of 1,650 articles from 108 open access (OA) journals published in 2001. A Web/URL citation of an online journal article is a Web mention of its title, URL, or both. For each discipline, except psychology, we found significant correlations between Thomson Scientific (formerly Thomson ISI, here: ISI) citations and both Google Scholar and Google Web/URL citations. Google Scholar citations correlated more highly with ISI citations than did Google Web/URL citations, indicating that the Web/URL method measures a broader type of citation phenomenon. Google Scholar citations were more numerous than ISI citations in computer science and the four social science disciplines, suggesting that Google Scholar is more comprehensive for social sciences and perhaps also when conference articles are valued and published online. We also found large disciplinary differences in the percentage overlap between ISI and Google Scholar citation sources. Finally, although we found many significant trends, there were also numerous exceptions, suggesting that replacing traditional citation sources with the Web or Google Scholar for research impact calculations would be problematic.

McVeigh, M.E.: Citation indexes and the Web of Science (2009) 0.02

0.01743009 = product of:
  0.06972036 = sum of:
    0.041903697 = weight(_text_:web in 3848) [ClassicSimilarity], result of:
      0.041903697 = score(doc=3848,freq=8.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.43268442 = fieldWeight in 3848, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=3848)
    0.027816659 = weight(_text_:data in 3848) [ClassicSimilarity], result of:
      0.027816659 = score(doc=3848,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.29644224 = fieldWeight in 3848, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=3848)
  0.25 = coord(2/8)

Abstract: The Web of Science, an online database of bibliographic information produced by Thomson Reuters, draws its real value from the scholarly citation index at its core. By indexing the cited references from each paper as a separate part of the bibliographic data, a citation index creates a pathway by which a paper can be linked backward in time to the body of work that preceded it, as well as linked forward in time to its scholarly descendants. This entry provides a brief history of the development of the citation index, its core functionalities, and the way these unique data are provided to users through the Web of Science.
Object: Web of Science

Cawkell, T.: Checking research progress on 'image retrieval by shape matching' using the Web of Science (1998) 0.02

0.01632138 = product of:
  0.06528552 = sum of:
    0.042337947 = weight(_text_:web in 3571) [ClassicSimilarity], result of:
      0.042337947 = score(doc=3571,freq=6.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.43716836 = fieldWeight in 3571, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3571)
    0.022947572 = weight(_text_:data in 3571) [ClassicSimilarity], result of:
      0.022947572 = score(doc=3571,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.24455236 = fieldWeight in 3571, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3571)
  0.25 = coord(2/8)

Abstract: Discusses the Web of Science database recently introduced by ISI, which is compiled from 8,000 journals covered in the SCI, SSCI and AHCI. Briefly compares the database with the Citation Indexes as provided by the BIDS service at the University of Bath. Explores the characteristics and usefulness of the WoS through a search of it for articles on the topic of image retrieval by shape matching. Suggests that the selection of articles of interest is much easier and far quicker using the WoS than other methods of conducting a search using ISI's data.
Object: Web of Science

Zhao, D.: Challenges of scholarly publications on the Web to the evaluation of science : a comparison of author visibility on the Web and in print journals (2005) 0.02

0.01632138 = product of:
  0.06528552 = sum of:
    0.042337947 = weight(_text_:web in 1065) [ClassicSimilarity], result of:
      0.042337947 = score(doc=1065,freq=6.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.43716836 = fieldWeight in 1065, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1065)
    0.022947572 = weight(_text_:data in 1065) [ClassicSimilarity], result of:
      0.022947572 = score(doc=1065,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.24455236 = fieldWeight in 1065, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1065)
  0.25 = coord(2/8)

Abstract: This article reveals different patterns of scholarly communication in the XML research field on the Web and in print journals in terms of author visibility, and challenges the common practice of exclusively using the ISI's databases to obtain citation counts as scientific performance indicators. Results from this study demonstrate both the importance and the feasibility of the use of multiple citation data sources in citation analysis studies of scholarly communication, and provide evidence for a developing "two tier" scholarly communication system.
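
Note on the score breakdowns: the figures in each explanation tree can be reproduced by hand. The sketch below recomputes the Zhao (2005) entry above, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which appear consistent with the numbers shown. The function names are illustrative only and are not part of any search engine API.

```python
import math

MAX_DOCS = 44218          # maxDocs as shown in the explanations
QUERY_NORM = 0.029675366  # queryNorm as shown in the explanations


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed classic formula: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq, doc_freq, field_norm):
    # weight = queryWeight * fieldWeight
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explanation for doc=1065
web = term_weight(freq=6.0, doc_freq=4597, field_norm=0.0546875)
data = term_weight(freq=2.0, doc_freq=5088, field_norm=0.0546875)

# coord(2/8): two of the eight query clauses matched this record
score = (web + data) * (2 / 8)

print(f"web={web:.9f} data={data:.9f} score={score:.9f}")
```

Small differences in the last digits are to be expected, since the listed values look like single-precision results while this sketch uses double precision.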

Thelwall, M.: Extracting macroscopic information from Web links (2001) 0.01

0.014789727 = product of:
  0.059158906 = sum of:
    0.04276778 = weight(_text_:web in 6851) [ClassicSimilarity], result of:
      0.04276778 = score(doc=6851,freq=12.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.4416067 = fieldWeight in 6851, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6851)
    0.016391123 = weight(_text_:data in 6851) [ClassicSimilarity], result of:
      0.016391123 = score(doc=6851,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.17468026 = fieldWeight in 6851, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6851)
  0.25 = coord(2/8)

Abstract: Much has been written about the potential and pitfalls of macroscopic Web-based link analysis, yet there have been no studies that have provided clear statistical evidence that any of the proposed calculations can produce results over large areas of the Web that correlate with phenomena external to the Internet. This article attempts to provide such evidence through an evaluation of Ingwersen's (1998) proposed external Web Impact Factor (WIF) for the original use of the Web: the interlinking of academic research. In particular, it studies the case of the relationship between academic hyperlinks and research activity for universities in Britain, a country chosen for its variety of institutions and the existence of an official government rating exercise for research. After reviewing the numerous reasons why link counts may be unreliable, it demonstrates that four different WIFs do, in fact, correlate with the conventional academic research measures. The WIF delivering the greatest correlation with research rankings was the ratio of Web pages with links pointing at research-based pages to faculty numbers. The scarcity of links to electronic academic papers in the data set suggests that, in contrast to citation analysis, this WIF is measuring the reputations of universities and their scholars, rather than the quality of their publications

Schwartz, F.; Fang, Y.C.: Citation data analysis on hydrogeology (2007) 0.01

0.014711436 = product of:
  0.058845744 = sum of:
    0.029321333 = weight(_text_:data in 433) [ClassicSimilarity], result of:
      0.029321333 = score(doc=433,freq=10.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.31247756 = fieldWeight in 433, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=433)
    0.02952441 = product of:
      0.05904882 = sum of:
        0.05904882 = weight(_text_:mining in 433) [ClassicSimilarity], result of:
          0.05904882 = score(doc=433,freq=4.0), product of:
            0.16744171 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.029675366 = queryNorm
            0.352653 = fieldWeight in 433, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.03125 = fieldNorm(doc=433)
      0.5 = coord(1/2)
  0.25 = coord(2/8)

Abstract: This article explores the status of research in hydrogeology using data mining techniques. First we try to explain what citation analysis is and review some of the previous work on citation analysis. The main idea in this article is to address some common issues about citation numbers and the use of these data. To validate the use of citation numbers, we compare the citation patterns for Water Resources Research papers in the 1980s with those in the 1990s. The citation growths for highly cited authors from the 1980s are used to examine whether it is possible to predict the citation patterns for highly-cited authors in the 1990s. If the citation data prove to be steady and stable, these numbers then can be used to explore the evolution of science in hydrogeology. The famous quotation, "If you are not the lead dog, the scenery never changes," attributed to Lee Iacocca, points to the importance of an entrepreneurial spirit in all forms of endeavor. In the case of hydrogeological research, impact analysis makes it clear how important it is to be a pioneer. Statistical correlation coefficients are used to retrieve papers among a collection of 2,847 papers before and after 1991 sharing the same topics with 273 papers in 1991 in Water Resources Research. The numbers of papers before and after 1991 are then plotted against various levels of citations for papers in 1991 to compare the distributions of paper population before and after that year. The similarity metrics based on word counts can ensure that the "before" papers are like ancestors and "after" papers are descendants in the same type of research. This exercise gives us an idea of how many papers are populated before and after 1991 (1991 is chosen based on balanced numbers of papers before and after that year). In addition, the impact of papers is measured in terms of citation presented as "percentile," a relative measure based on rankings in one year, in order to minimize the effect of time.
Theme: Data Mining

Aguillo, I.F.; Granadino, B.; Ortega, J.L.; Prieto, J.A.: Scientific research activity and communication measured with cybermetrics indicators (2006) 0.01

0.0143617615 = product of:
  0.057447046 = sum of:
    0.029630389 = weight(_text_:web in 5898) [ClassicSimilarity], result of:
      0.029630389 = score(doc=5898,freq=4.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.3059541 = fieldWeight in 5898, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=5898)
    0.027816659 = weight(_text_:data in 5898) [ClassicSimilarity], result of:
      0.027816659 = score(doc=5898,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.29644224 = fieldWeight in 5898, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=5898)
  0.25 = coord(2/8)

Abstract: To test feasibility of cybermetric indicators for describing and ranking university activities as shown in their Web sites, a large set of 9,330 institutions worldwide was compiled and analyzed. Using search engines' advanced features, size (number of pages), visibility (number of external inlinks), and number of rich files (pdf, ps, doc, ppt, and xls formats) were obtained for each of the institutional domains of the universities. We found a statistically significant correlation between a Web ranking built on a combination of Webometric data and other university rankings based on bibliometric and other indicators. Results show that cybermetric measures could be useful for reflecting the contribution of technologically oriented institutions, increasing the visibility of developing countries, and improving the rankings based on Science Citation Index (SCI) data with known biases.

Small, H.: Visualizing science by citation mapping (1999) 0.01

0.014224149 = product of:
  0.056896597 = sum of:
    0.024443826 = weight(_text_:web in 3920) [ClassicSimilarity], result of:
      0.024443826 = score(doc=3920,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.25239927 = fieldWeight in 3920, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3920)
    0.03245277 = weight(_text_:data in 3920) [ClassicSimilarity], result of:
      0.03245277 = score(doc=3920,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.34584928 = fieldWeight in 3920, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3920)
  0.25 = coord(2/8)

Abstract: Science mapping is discussed in the general context of information visualization. Attempts to construct maps of science using citation data are reviewed, focusing on the use of co-citation clusters. New work is reported on a dataset of about 36.000 documents using simplified methods for ordination, and nesting maps hierarchically. An overall map of the dataset shows the multidisciplinary breadth of the document sample, and submaps allow drilling down to the document level. An effort to visualize these data using advanced virtual reality software is described, and the creation of document pathways through the map is seen as a realization of Bush's associative trails.
Object: Web of Science

Leydesdorff, L.: On the normalization and visualization of author co-citation data : Salton's Cosine versus the Jaccard index (2008) 0.01

0.012192126 = product of:
  0.048768505 = sum of:
    0.020951848 = weight(_text_:web in 1341) [ClassicSimilarity], result of:
      0.020951848 = score(doc=1341,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.21634221 = fieldWeight in 1341, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=1341)
    0.027816659 = weight(_text_:data in 1341) [ClassicSimilarity], result of:
      0.027816659 = score(doc=1341,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.29644224 = fieldWeight in 1341, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=1341)
  0.25 = coord(2/8)

Abstract: The debate about which similarity measure one should use for the normalization in the case of Author Co-citation Analysis (ACA) is further complicated when one distinguishes between the symmetrical co-citation - or, more generally, co-occurrence - matrix and the underlying asymmetrical citation - occurrence - matrix. In the Web environment, the approach of retrieving original citation data is often not feasible. In that case, one should use the Jaccard index, but preferentially after adding the number of total citations (i.e., occurrences) on the main diagonal. Unlike Salton's cosine and the Pearson correlation, the Jaccard index abstracts from the shape of the distributions and focuses only on the intersection and the sum of the two sets. Since the correlations in the co-occurrence matrix may be spurious, this property of the Jaccard index can be considered as an advantage in this case.

Tay, A.: The next generation discovery citation indexes : a review of the landscape in 2020 (2020) 0.01

0.012160225 = product of:
  0.0486409 = sum of:
    0.03456879 = weight(_text_:web in 40) [ClassicSimilarity], result of:
      0.03456879 = score(doc=40,freq=4.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.35694647 = fieldWeight in 40, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=40)
    0.014072108 = product of:
      0.028144216 = sum of:
        0.028144216 = weight(_text_:22 in 40) [ClassicSimilarity], result of:
          0.028144216 = score(doc=40,freq=2.0), product of:
            0.103918076 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029675366 = queryNorm
            0.2708308 = fieldWeight in 40, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=40)
      0.5 = coord(1/2)
  0.25 = coord(2/8)

Abstract: Conclusion: There is a reason why Google Scholar and Web of Science/Scopus are kings of the hill in their various arenas. They have strong brand recognition, a head start in development and a mass of eyeballs and users that leads to an almost virtuous cycle of improvement. Competing against such well established competitors is not easy even when one has deep pockets (Microsoft) or a killer idea (scite). It will be interesting to see what the landscape will look like in 2030. Stay tuned for part II where I review each particular index.
Date: 17.11.2020 12:22:59
Object: Web of Science
