Search (2 results, page 1 of 1)

Delgado-Quirós, L.; Aguillo, I.F.; Martín-Martín, A.; López-Cózar, E.D.; Orduña-Malea, E.; Ortega, J.L.: Why are these publications missing? : uncovering the reasons behind the exclusion of documents in free-access scholarly databases (2024) 0.05
```
0.04626411 = product of:
  0.06939616 = sum of:
    0.03354964 = weight(_text_:search in 1201) [ClassicSimilarity], result of:
      0.03354964 = score(doc=1201,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 1201, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1201)
    0.03584652 = product of:
      0.07169304 = sum of:
        0.07169304 = weight(_text_:engines in 1201) [ClassicSimilarity], result of:
          0.07169304 = score(doc=1201,freq=2.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.2806784 = fieldWeight in 1201, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1201)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

This study analyses the coverage of seven free-access bibliographic databases (Crossref, Dimensions-non-subscription version, Google Scholar, Lens, Microsoft Academic, Scilit, and Semantic Scholar) to identify the potential reasons that might cause the exclusion of scholarly documents and how they could influence coverage. To do this, 116 k randomly selected bibliographic records from Crossref were used as a baseline. API endpoints and web scraping were used to query each database. The results show that coverage differences are mainly caused by the way each service builds their databases. While classic bibliographic databases ingest almost the exact same content from Crossref (Lens and Scilit miss 0.1% and 0.2% of the records, respectively), academic search engines present lower coverage (Google Scholar does not find: 9.8%, Semantic Scholar: 10%, and Microsoft Academic: 12%). Coverage differences are mainly attributed to external factors, such as web accessibility and robot exclusion policies (39.2%-46%), and internal requirements that exclude secondary content (6.5%-11.6%). In the case of Dimensions, the only classic bibliographic database with the lowest coverage (7.6%), internal selection criteria such as the indexation of full books instead of book chapters (65%) and the exclusion of secondary content (15%) are the main motives of missing publications.
Orduña-Malea, E.; Torres-Salinas, D.; López-Cózar, E.D.: Hyperlinks embedded in twitter as a proxy for total external in-links to international university websites (2015) 0.01
```
0.011183213 = product of:
  0.03354964 = sum of:
    0.03354964 = weight(_text_:search in 2043) [ClassicSimilarity], result of:
      0.03354964 = score(doc=2043,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 2043, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2043)
  0.33333334 = coord(1/3)
```
Abstract

Twitter as a potential alternative source of external links for use in webometric analysis is analyzed because of its capacity to embed hyperlinks in different tweets. Given the limitations on searching Twitter's public application programming interface (API), we used the Topsy search engine as a source for compiling tweets. To this end, we took a global sample of 200 universities and compiled all the tweets with hyperlinks to any of these institutions. Further link data was obtained from alternative sources (MajesticSEO and OpenSiteExplorer) in order to compare the results. Thereafter, various statistical tests were performed to determine the correlation between the indicators and the possibility of predicting external links from the collected tweets. The results indicate a high volume of tweets, although they are skewed by the performance of specific universities and countries. The data provided by Topsy correlated significantly with all link indicators, particularly with OpenSiteExplorer (r?=?0.769). Finally, prediction models do not provide optimum results because of high error rates. We conclude that the use of Twitter (via Topsy) as a source of hyperlinks to universities produces promising results due to its high correlation with link indicators, though limited by policies and culture regarding use and presence in social networks.

Search (2 results, page 1 of 1)

Authors

Years