Search (58 results, page 1 of 3)

Zhang, Y.; Jansen, B.J.; Spink, A.: Identification of factors predicting clickthrough in Web searching using neural network analysis (2009) 0.06

0.060201935 = product of:
  0.096323095 = sum of:
    0.025048172 = weight(_text_:retrieval in 2742) [ClassicSimilarity], result of:
      0.025048172 = score(doc=2742,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.20052543 = fieldWeight in 2742, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2742)
    0.025667597 = weight(_text_:use in 2742) [ClassicSimilarity], result of:
      0.025667597 = score(doc=2742,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 2742, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=2742)
    0.022201622 = weight(_text_:of in 2742) [ClassicSimilarity], result of:
      0.022201622 = score(doc=2742,freq=22.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34381276 = fieldWeight in 2742, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2742)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 2742) [ClassicSimilarity], result of:
          0.013242318 = score(doc=2742,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 2742, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2742)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 2742) [ClassicSimilarity], result of:
          0.033569098 = score(doc=2742,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 2742, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2742)
      0.5 = coord(1/2)
  0.625 = coord(5/8)
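
The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the formulas that the printed values imply: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names and the terms table are illustrative and not part of the listing; queryNorm, fieldNorm and the coord factors are copied from the tree rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as implied by the listing:
    # 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight for one query term in one document.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain tree for doc 2742 (assumed given, not derived).
MAX_DOCS = 44218
QUERY_NORM = 0.041294612
FIELD_NORM = 0.046875

# term -> (freq, docFreq, nested coord factor shown in the tree)
terms = {
    "retrieval": (2.0, 5836, 1.0),
    "use": (2.0, 5623, 1.0),
    "of": (22.0, 25162, 1.0),
    "on": (2.0, 13325, 0.5),   # wrapped in coord(1/2)
    "22": (2.0, 3622, 0.5),    # wrapped in coord(1/2)
}

total = sum(
    coord * term_score(freq, df, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for freq, df, coord in terms.values()
)

# Outer coord(5/8): five of the eight query clauses matched.
doc_score = total * 5 / 8
print(doc_score)
```

The result should agree with the listed 0.060201935 up to floating-point rounding, since the listing uses single-precision arithmetic.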

Abstract: In this research, we aim to identify factors that significantly affect the clickthrough of Web searchers. Our underlying goal is to determine more efficient methods to optimize the clickthrough rate. We devise a clickthrough metric for measuring customer satisfaction of search engine results using the number of links visited, number of queries a user submits, and rank of clicked links. We use a neural network to detect the significant influence of searching characteristics on future user clickthrough. Our results show that high occurrences of query reformulation, lengthy searching duration, longer query length, and the higher ranking of prior clicked links correlate positively with future clickthrough. We provide recommendations for leveraging these findings for improving the performance of search engine retrieval and result ranking, along with implications for search engine marketing.
Date: 22. 3.2009 17:49:11
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.3, S.557-570

Tonta, Y.: Scholarly communication and the use of networked information sources (1996) 0.04

0.03733675 = product of:
  0.0746735 = sum of:
    0.036299463 = weight(_text_:use in 6389) [ClassicSimilarity], result of:
      0.036299463 = score(doc=6389,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.2870708 = fieldWeight in 6389, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=6389)
    0.014968331 = weight(_text_:of in 6389) [ClassicSimilarity], result of:
      0.014968331 = score(doc=6389,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 6389, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=6389)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 6389) [ClassicSimilarity], result of:
          0.013242318 = score(doc=6389,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 6389, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=6389)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 6389) [ClassicSimilarity], result of:
          0.033569098 = score(doc=6389,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 6389, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=6389)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Examines the use of networked information sources in scholarly communication. Networked information sources are defined broadly to cover: documents and images stored on electronic network hosts; data files; newsgroups; listservs; online information services and electronic periodicals. Reports results of a survey to determine how heavily, if at all, networked information sources are cited in scholarly printed periodicals published in 1993 and 1994. 27 printed periodicals, representing a wide range of subjects and the most influential periodicals in their fields, were identified through the Science Citation Index and Social Science Citation Index Journal Citation Reports. 97 articles were selected for further review and references, footnotes and bibliographies were checked for references to networked information sources. Only 2 articles were found to contain such references. Concludes that, although networked information sources facilitate scholars' work to a great extent during the research process, scholars have yet to incorporate such sources in the bibliographies of their published articles
Source: IFLA journal. 22(1996) no.3, S.240-245

Sugimoto, C.R.; Work, S.; Larivière, V.; Haustein, S.: Scholarly use of social media and altmetrics : A review of the literature (2017) 0.04

0.035075396 = product of:
  0.093534395 = sum of:
    0.057394497 = weight(_text_:use in 3781) [ClassicSimilarity], result of:
      0.057394497 = score(doc=3781,freq=10.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.45389885 = fieldWeight in 3781, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=3781)
    0.026776163 = weight(_text_:of in 3781) [ClassicSimilarity], result of:
      0.026776163 = score(doc=3781,freq=32.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.41465378 = fieldWeight in 3781, product of:
          5.656854 = tf(freq=32.0), with freq of:
            32.0 = termFreq=32.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=3781)
    0.009363732 = product of:
      0.018727465 = sum of:
        0.018727465 = weight(_text_:on in 3781) [ClassicSimilarity], result of:
          0.018727465 = score(doc=3781,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.20619515 = fieldWeight in 3781, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=3781)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Social media has become integrated into the fabric of the scholarly communication system in fundamental ways, principally through scholarly use of social media platforms and the promotion of new indicators on the basis of interactions with these platforms. Research and scholarship in this area has accelerated since the coining and subsequent advocacy for altmetrics, that is, research indicators based on social media activity. This review provides an extensive account of the state of the art in both scholarly use of social media and altmetrics. The review consists of 2 main parts: the first examines the use of social media in academia, reviewing the various functions these platforms have in the scholarly communication process and the factors that affect this use. The second part reviews empirical studies of altmetrics, discussing the various interpretations of altmetrics, data collection and methodological limitations, and differences according to platform. The review ends with a critical discussion of the implications of this transformation in the scholarly communication system.
Source: Journal of the Association for Information Science and Technology. 68(2017) no.9, S.2037-2062

Kaminer, N.; Braunstein, Y.M.: Bibliometric analysis of the impact of Internet use on scholarly productivity (1998) 0.03

0.026982468 = product of:
  0.071953245 = sum of:
    0.03422346 = weight(_text_:use in 1151) [ClassicSimilarity], result of:
      0.03422346 = score(doc=1151,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.27065295 = fieldWeight in 1151, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0625 = fieldNorm(doc=1151)
    0.025244808 = weight(_text_:of in 1151) [ClassicSimilarity], result of:
      0.025244808 = score(doc=1151,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.39093933 = fieldWeight in 1151, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=1151)
    0.012484977 = product of:
      0.024969954 = sum of:
        0.024969954 = weight(_text_:on in 1151) [ClassicSimilarity], result of:
          0.024969954 = score(doc=1151,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.27492687 = fieldWeight in 1151, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=1151)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Variables measuring the nature and level of Internet usage by natural scientists improve the explanatory power of a traditional bibliographic model of scholarly productivity. The data used to construct these variables come from log files generated by the internal accounting modules of the UNIX operating system. The effects of Internet usage on productivity are quantifiable, and it is possible to calculate tradeoffs between Internet usage and the more traditional inputs.
Source: Journal of the American Society for Information Science. 49(1998) no.8, S.720-730

Larson, R.R.: Bibliometrics of the World Wide Web : an exploratory analysis of the intellectual structure of cyberspace (1996) 0.03

0.025885578 = product of:
  0.069028206 = sum of:
    0.029945528 = weight(_text_:use in 7334) [ClassicSimilarity], result of:
      0.029945528 = score(doc=7334,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 7334, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7334)
    0.028158326 = weight(_text_:of in 7334) [ClassicSimilarity], result of:
      0.028158326 = score(doc=7334,freq=26.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.43605784 = fieldWeight in 7334, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7334)
    0.010924355 = product of:
      0.02184871 = sum of:
        0.02184871 = weight(_text_:on in 7334) [ClassicSimilarity], result of:
          0.02184871 = score(doc=7334,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24056101 = fieldWeight in 7334, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7334)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Examines the explosive growth and the bibliometrics of the WWW based on both analysis of over 30 GBytes of WWW pages collected by the Inktomi Web Crawler and on the use of the DEC AltaVista search engine for cocitation analysis of a set of Earth Science related WWW sites. Examines the statistical characteristics of web documents and their links, and the characteristics of highly cited web documents
Source: Global complexity: information, chaos and control. Proceedings of the 59th Annual Meeting of the American Society for Information Science, ASIS'96, Baltimore, Maryland, 21-24 Oct 1996. Ed.: S. Hardin

Maharana, B.; Nayak, K.; Sahu, N.K.: Scholarly use of web resources in LIS research : a citation analysis (2006) 0.02

0.022767901 = product of:
  0.0607144 = sum of:
    0.030249555 = weight(_text_:use in 53) [ClassicSimilarity], result of:
      0.030249555 = score(doc=53,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23922569 = fieldWeight in 53, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=53)
    0.024947217 = weight(_text_:of in 53) [ClassicSimilarity], result of:
      0.024947217 = score(doc=53,freq=40.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.38633084 = fieldWeight in 53, product of:
          6.3245554 = tf(freq=40.0), with freq of:
            40.0 = termFreq=40.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=53)
    0.0055176322 = product of:
      0.0110352645 = sum of:
        0.0110352645 = weight(_text_:on in 53) [ClassicSimilarity], result of:
          0.0110352645 = score(doc=53,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.121501654 = fieldWeight in 53, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=53)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Purpose - The essential purpose of this paper is to measure the amount of web resources used for scholarly contributions in the area of library and information science (LIS) in India. It further aims to make an analysis of the nature and type of web resources and studies the various standards for web citations. Design/methodology/approach - In this study, the result of analysis of 292 web citations spread over 95 scholarly papers published in the proceedings of the National Conference of the Society for Information Science, India (SIS-2005) has been reported. All the 292 web citations were scanned and data relating to types of web domains, file formats, styles of citations, etc., were collected through a structured check list. The data thus obtained were systematically analyzed, figurative representations were made and appropriate interpretations were drawn. Findings - The study revealed that 292 (34.88 per cent) out of 837 were web citations, proving a significant correlation between the use of Internet resources and research productivity of LIS professionals in India. The highest number of web citations (35.6 per cent) was from .edu/.ac type domains. Most of the web resources (46.9 per cent) cited in the study were hypertext markup language (HTML) files. Originality/value - The paper is the result of an original analysis of web citations undertaken in order to study the dependence of LIS professionals in India on web sources for their scholarly contributions. This carries research value for web content providers, authors and researchers in LIS.

Barnett, G.A.; Fink, E.L.: Impact of the internet and scholar age distribution on academic citation age (2008) 0.02

0.022621732 = product of:
  0.060324617 = sum of:
    0.025667597 = weight(_text_:use in 1376) [ClassicSimilarity], result of:
      0.025667597 = score(doc=1376,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 1376, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=1376)
    0.023188837 = weight(_text_:of in 1376) [ClassicSimilarity], result of:
      0.023188837 = score(doc=1376,freq=24.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.3591007 = fieldWeight in 1376, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1376)
    0.011468184 = product of:
      0.022936368 = sum of:
        0.022936368 = weight(_text_:on in 1376) [ClassicSimilarity], result of:
          0.022936368 = score(doc=1376,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.25253648 = fieldWeight in 1376, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=1376)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: This article examines the impact of the Internet and the age distribution of research scholars on academic citation age with a mathematical model proposed by Barnett, Fink, and Debus (1989) and a revised model that incorporates information about the online environment and scholar age distribution. The modified model fits the data well, accounting for 99.6% of the variance for science citations and 99.8% for social science citations. The Internet's impact on the aging process of academic citations has been very small, accounting for only 0.1% for the social sciences and 0.8% for the sciences. Rather than resulting in the use of more recent citations, the Internet appears to have lengthened the average life of academic citations by 6 to 8 months. The aging of scholars seems to have a greater impact, accounting for 2.8% of the variance for the sciences and 0.9% for the social sciences. However, because the diffusion of the Internet and the aging of the professoriate are correlated over this time period, differentiating their effects is somewhat problematic.
Source: Journal of the American Society for Information Science and Technology. 59(2008) no.4, S.526-534

Amitay, E.; Carmel, D.; Herscovici, M.; Lempel, R.; Soffer, A.: Trend detection through temporal link analysis (2004) 0.02

0.02176543 = product of:
  0.058041148 = sum of:
    0.020873476 = weight(_text_:retrieval in 3092) [ClassicSimilarity], result of:
      0.020873476 = score(doc=3092,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 3092, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3092)
    0.021389665 = weight(_text_:use in 3092) [ClassicSimilarity], result of:
      0.021389665 = score(doc=3092,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 3092, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3092)
    0.015778005 = weight(_text_:of in 3092) [ClassicSimilarity], result of:
      0.015778005 = score(doc=3092,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.24433708 = fieldWeight in 3092, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3092)
  0.375 = coord(3/8)

Abstract: Although time has been recognized as an important dimension in the co-citation literature, to date it has not been incorporated into the analogous process of link analysis on the Web. In this paper, we discuss several aspects and uses of the time dimension in the context of Web information retrieval. We describe the ideal case where search engines track and store temporal data for each of the pages in their repository, assigning timestamps to the hyperlinks embedded within the pages. We introduce several applications which benefit from the availability of such timestamps. To demonstrate our claims, we use a somewhat simplistic approach, which dates links by approximating the age of the page's content. We show that by using this crude measure alone it is possible to detect and expose significant events and trends. We predict that by using more robust methods for tracking modifications in the content of pages, search engines will be able to provide results that are more timely and better reflect current real-life trends than those they provide today.
Source: Journal of the American Society for Information Science and Technology. 55(2004) no.14, S.1270-1281

Neth, M.: Citation analysis and the Web (1998) 0.02

0.019723326 = product of:
  0.052595537 = sum of:
    0.022089208 = weight(_text_:of in 108) [ClassicSimilarity], result of:
      0.022089208 = score(doc=108,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34207192 = fieldWeight in 108, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=108)
    0.010924355 = product of:
      0.02184871 = sum of:
        0.02184871 = weight(_text_:on in 108) [ClassicSimilarity], result of:
          0.02184871 = score(doc=108,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24056101 = fieldWeight in 108, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=108)
      0.5 = coord(1/2)
    0.019581974 = product of:
      0.039163947 = sum of:
        0.039163947 = weight(_text_:22 in 108) [ClassicSimilarity], result of:
          0.039163947 = score(doc=108,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.2708308 = fieldWeight in 108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=108)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Citation analysis has long been used by librarians as an important tool of collection development and the advent of Internet technology and especially the WWW adds a new facet to the role played by citation analysis. One of the reasons why librarians create WWW homepages is to provide users with further sources of interest or reference and to do this libraries include links from their own homepages to other information sources. Reports current research on the analysis of WWW pages as an introduction to an examination of the homepages of 25 art libraries to determine what sites are most often included. The types of linked sites are analyzed based on 3 criteria: location, focus and evidence that the link was evaluated before the connection was established.
Date: 10. 1.1999 16:22:37

Simkin, M.V.; Roychowdhury, V.P.: Why does attention to web articles fall with Time? (2015) 0.02

0.019639079 = product of:
  0.052370876 = sum of:
    0.025667597 = weight(_text_:use in 2163) [ClassicSimilarity], result of:
      0.025667597 = score(doc=2163,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 2163, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=2163)
    0.02008212 = weight(_text_:of in 2163) [ClassicSimilarity], result of:
      0.02008212 = score(doc=2163,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.3109903 = fieldWeight in 2163, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2163)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 2163) [ClassicSimilarity], result of:
          0.013242318 = score(doc=2163,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 2163, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2163)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: We analyze access statistics of 150 blog entries and news articles for periods of up to 3 years. Access rate falls as an inverse power of time passed since publication. The power law holds for periods of up to 1,000 days. The exponents are different for different blogs and are distributed between 0.6 and 3.2. We argue that the decay of attention to a web article is caused by the link to it first dropping down the list of links on the website's front page and then disappearing from the front page and its subsequent movement further into background. The other proposed explanations that use a decaying with time novelty factor, or some intricate theory of human dynamics, cannot explain all of the experimental observations.
Source: Journal of the Association for Information Science and Technology. 66(2015) no.9, S.1847-1856

Zhang, Y.: The impact of Internet-based electronic resources on formal scholarly communication in the area of library and information science : a citation analysis (1998) 0.02

0.01880253 = product of:
  0.05014008 = sum of:
    0.019324033 = weight(_text_:of in 2808) [ClassicSimilarity], result of:
      0.019324033 = score(doc=2808,freq=24.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2992506 = fieldWeight in 2808, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2808)
    0.0110352645 = product of:
      0.022070529 = sum of:
        0.022070529 = weight(_text_:on in 2808) [ClassicSimilarity], result of:
          0.022070529 = score(doc=2808,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24300331 = fieldWeight in 2808, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2808)
      0.5 = coord(1/2)
    0.019780781 = product of:
      0.039561562 = sum of:
        0.039561562 = weight(_text_:22 in 2808) [ClassicSimilarity], result of:
          0.039561562 = score(doc=2808,freq=4.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.27358043 = fieldWeight in 2808, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2808)
      0.5 = coord(1/2)
  0.375 = coord(3/8)
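
The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity relationships that the tree itself spells out (tf = sqrt(freq), queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, term score = queryWeight × fieldWeight). The function and variable names are illustrative and not part of the catalog output; all numbers are copied from the tree for doc 2808.

```python
import math

# Values copied from the explain tree for doc 2808.
QUERY_NORM = 0.041294612
FIELD_NORM = 0.0390625


def term_score(freq, idf, query_norm=QUERY_NORM, field_norm=FIELD_NORM):
    """Score of one term: queryWeight * fieldWeight (hypothetical helper)."""
    tf = math.sqrt(freq)                  # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm       # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight


# "of" contributes directly; "on" and "22" each sit in a nested
# clause where only 1 of 2 sub-clauses matched, hence coord(1/2).
of_score = term_score(24.0, 1.5637573)
on_score = term_score(8.0, 2.199415) * 0.5
t22_score = term_score(4.0, 3.5018296) * 0.5

# Outer coord(3/8): 3 of the 8 top-level query clauses matched.
total = (of_score + on_score + t22_score) * (3 / 8)
print(f"{total:.8f}")
```

Lucene computes these values in single precision, so a double-precision recomputation should only agree with the listed 0.01880253 up to rounding in the last digits.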

Abstract: Internet-based electronic resources are growing dramatically, but there have been no empirical studies evaluating the impact of e-sources, as a whole, on formal scholarly communication. Reports results of an investigation into how much e-sources have been used in formal scholarly communication, using a case study in the area of Library and Information Science (LIS) during the period 1994 to 1996. Four citation-based indicators were used to measure impact. Concludes that, compared with the impact of print sources, the impact of e-sources on formal scholarly communication in LIS is small, as measured by e-sources cited, and does not increase significantly by year even though there is observable growth of this impact across the years. It is found that periodical format is related to the rate of citing e-sources: electronic periodical articles are more likely to cite e-sources than are print periodical articles. However, once authors cite an electronic resource, there is no significant difference in the number of references per article by periodical format or by year. Suggests that, at this stage, citing e-sources may depend on authors rather than on the periodical format in which authors choose to publish.
Date: 30. 1.1999 17:22:22
Source: Journal of information science. 24(1998) no.4, S.241-254

Stuart, D.: Web metrics for library and information professionals (2014) 0.02
```
0.017859904 = product of:
  0.04762641 = sum of:
    0.021174688 = weight(_text_:use in 2274) [ClassicSimilarity], result of:
      0.021174688 = score(doc=2274,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.16745798 = fieldWeight in 2274, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2274)
    0.018727036 = weight(_text_:of in 2274) [ClassicSimilarity], result of:
      0.018727036 = score(doc=2274,freq=46.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.29000556 = fieldWeight in 2274, product of:
          6.78233 = tf(freq=46.0), with freq of:
            46.0 = termFreq=46.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2274)
    0.007724685 = product of:
      0.01544937 = sum of:
        0.01544937 = weight(_text_:on in 2274) [ClassicSimilarity], result of:
          0.01544937 = score(doc=2274,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.17010231 = fieldWeight in 2274, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2274)
      0.5 = coord(1/2)
  0.375 = coord(3/8)
```
Abstract

This is a practical guide to using web metrics to measure impact and demonstrate value. The web provides an opportunity to collect a host of different metrics, from those associated with social media accounts and websites to more traditional research outputs. This book is a clear guide for library and information professionals as to what web metrics are available and how to assess and use them to make informed decisions and demonstrate value. As individuals and organizations increasingly use the web in addition to traditional publishing avenues and formats, this book provides the tools to unlock web metrics and evaluate the impact of this content. The key topics covered include: bibliometrics, webometrics and web metrics; data collection tools; evaluating impact on the web; evaluating social media impact; investigating relationships between actors; exploring traditional publications in a new environment; web metrics and the web of data; the future of web metrics and the library and information professional. The book will provide a practical introduction to web metrics for a wide range of library and information professionals, from the bibliometrician wanting to demonstrate the wider impact of a researcher's work than can be demonstrated through traditional citations databases, to the reference librarian wanting to measure how successfully they are engaging with their users on Twitter. It will be a valuable tool for anyone who wants to not only understand the impact of content, but demonstrate this impact to others within the organization and beyond.

Content

1. Introduction. Metrics -- Indicators -- Web metrics and Ranganathan's laws of library science -- Web metrics for the library and information professional -- The aim of this book -- The structure of the rest of this book -- 2. Bibliometrics, webometrics and web metrics. Web metrics -- Information science metrics -- Web analytics -- Relational and evaluative metrics -- Evaluative web metrics -- Relational web metrics -- Validating the results -- 3. Data collection tools. The anatomy of a URL, web links and the structure of the web -- Search engines 1.0 -- Web crawlers -- Search engines 2.0 -- Post search engine 2.0: fragmentation -- 4. Evaluating impact on the web. Websites -- Blogs -- Wikis -- Internal metrics -- External metrics -- A systematic approach to content analysis -- 5. Evaluating social media impact. Aspects of social network sites -- Typology of social network sites -- Research and tools for specific sites and services -- Other social network sites -- URL shorteners: web analytic links on any site -- General social media impact -- Sentiment analysis -- 6. Investigating relationships between actors. Social network analysis methods -- Sources for relational network analysis -- 7. Exploring traditional publications in a new environment. More bibliographic items -- Full text analysis -- Greater context -- 8. Web metrics and the web of data. The web of data -- Building the semantic web -- Implications of the web of data for web metrics -- Investigating the web of data today -- SPARQL -- Sindice -- LDSpider: an RDF web crawler -- 9. The future of web metrics and the library and information professional. How far we have come -- The future of web metrics -- The future of the library and information professional and web metrics.
Thelwall, M.; Wilkinson, D.: Finding similar academic Web sites with links, bibliometric couplings and colinks (2004) 0.01
```
0.0148897935 = product of:
  0.059559174 = sum of:
    0.035423465 = weight(_text_:retrieval in 2571) [ClassicSimilarity], result of:
      0.035423465 = score(doc=2571,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 2571, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2571)
    0.024135707 = weight(_text_:of in 2571) [ClassicSimilarity], result of:
      0.024135707 = score(doc=2571,freq=26.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.37376386 = fieldWeight in 2571, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2571)
  0.25 = coord(2/8)
```
Abstract

A common task in both Webometrics and Web information retrieval is to identify a set of Web pages or sites that are similar in content. In this paper we assess the extent to which links, colinks and couplings can be used to identify similar Web sites. As an experiment, a random sample of 500 pairs of domains from the UK academic Web was taken and human assessments of site similarity, based upon content type, were compared against ratings for the three concepts. The results show that using a combination of all three gives the highest probability of identifying similar sites, but surprisingly this was only a marginal improvement over using links alone. Another unexpected result was that high values for either colink counts or couplings were associated with only a small increased likelihood of similarity. The principal advantage of using couplings and colinks was found to be greater coverage, in that a much larger number of pairs of sites are connected by these measures, rather than an increased probability of similarity. In information retrieval terminology, this is improved recall rather than improved precision.

Brody, T.; Harnad, S.; Carr, L.: Earlier Web usage statistics as predictors of later citation impact (2006) 0.01

0.013008684 = product of:
  0.052034736 = sum of:
    0.029945528 = weight(_text_:use in 165) [ClassicSimilarity], result of:
      0.029945528 = score(doc=165,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 165, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=165)
    0.022089208 = weight(_text_:of in 165) [ClassicSimilarity], result of:
      0.022089208 = score(doc=165,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34207192 = fieldWeight in 165, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=165)
  0.25 = coord(2/8)

Abstract: The use of citation counts to assess the impact of research articles is well established. However, the citation impact of an article can only be measured several years after it has been published. As research articles are increasingly accessed through the Web, the number of times an article is downloaded can be instantly recorded and counted. One would expect the number of times an article is read to be related both to the number of times it is cited and to how old the article is. The authors analyze how short-term Web usage impact predicts medium-term citation impact. The physics e-print archive (arXiv.org) is used to test this.
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.8, S.1060-1072

Huang, X.; Peng, F.; An, A.; Schuurmans, D.: Dynamic Web log session identification with statistical language models (2004) 0.01
```
0.012816949 = product of:
  0.051267795 = sum of:
    0.036299463 = weight(_text_:use in 3096) [ClassicSimilarity], result of:
      0.036299463 = score(doc=3096,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.2870708 = fieldWeight in 3096, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=3096)
    0.014968331 = weight(_text_:of in 3096) [ClassicSimilarity], result of:
      0.014968331 = score(doc=3096,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 3096, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=3096)
  0.25 = coord(2/8)
```
Abstract

We present a novel session identification method based on statistical language modeling. Unlike standard timeout methods, which use fixed time thresholds for session identification, we use an information theoretic approach that yields more robust results for identifying session boundaries. We evaluate our new approach by learning interesting association rules from the segmented session files. We then compare the performance of our approach to three standard session identification methods (the standard timeout method, the reference length method, and the maximal forward reference method) and find that our statistical language modeling approach generally yields superior results. However, as with every method, the performance of our technique varies with changing parameter settings. Therefore, we also analyze the influence of the two key factors in our language-modeling-based approach: the choice of smoothing technique and the language model order. We find that all standard smoothing techniques, save one, perform well, and that performance is robust to language model order.

Source

Journal of the American Society for Information Science and Technology. 55(2004) no.14, S.1290-1303
Thelwall, M.; Sud, P.: ¬A comparison of methods for collecting web citation data for academic organizations (2011) 0.01
```
0.011563662 = product of:
  0.04625465 = sum of:
    0.029519552 = weight(_text_:retrieval in 4626) [ClassicSimilarity], result of:
      0.029519552 = score(doc=4626,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.23632148 = fieldWeight in 4626, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4626)
    0.0167351 = weight(_text_:of in 4626) [ClassicSimilarity], result of:
      0.0167351 = score(doc=4626,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25915858 = fieldWeight in 4626, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4626)
  0.25 = coord(2/8)
```
Abstract

The primary webometric method for estimating the online impact of an organization is to count links to its website. Link counts have been available from commercial search engines for over a decade but this was set to end by early 2012 and so a replacement is needed. This article compares link counts to two alternative methods: URL citations and organization title mentions. New variations of these methods are also introduced. The three methods are compared against each other using Yahoo!. Two of the three methods (URL citations and organization title mentions) are also compared against each other using Bing. Evidence from a case study of 131 UK universities and 49 US Library and Information Science (LIS) departments suggests that Bing's Hit Count Estimates (HCEs) for popular title searches are not useful for webometric research but that Yahoo!'s HCEs for all three types of search and Bing's URL citation HCEs seem to be consistent. For exact URL counts the results of all three methods in Yahoo! and both methods in Bing are also consistent. Four types of accuracy factors are also introduced and defined: search engine coverage, search engine retrieval variation, search engine retrieval anomalies, and query polysemy.

Source

Journal of the American Society for Information Science and Technology. 62(2011) no.8, S.1488-1497
Vaughan, L.; Thelwall, M.: Scholarly use of the Web : what are the key inducers of links to journal Web sites? (2003) 0.01
```
0.01150689 = product of:
  0.04602756 = sum of:
    0.030249555 = weight(_text_:use in 1236) [ClassicSimilarity], result of:
      0.030249555 = score(doc=1236,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23922569 = fieldWeight in 1236, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1236)
    0.015778005 = weight(_text_:of in 1236) [ClassicSimilarity], result of:
      0.015778005 = score(doc=1236,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.24433708 = fieldWeight in 1236, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1236)
  0.25 = coord(2/8)
```
Abstract

Web links have been studied by information scientists for at least six years, but it is only in the past two that clear evidence has emerged to show that counts of links to scholarly Web spaces (universities and departments) can correlate significantly with research measures, giving some credence to their use for the investigation of scholarly communication. This paper reports on a study to investigate the factors that influence the creation of links to journal Web sites. An empirical approach is used: collecting data and testing for significant patterns. The specific questions addressed are whether site age and site content are inducers of links to a journal's Web site as measured by the ratio of link counts to Journal Impact Factors, two variables previously discovered to be related. A new methodology for data collection is also introduced that uses the Internet Archive to obtain an earliest known creation date for Web sites. The results show that both site age and site content are significant factors for the disciplines studied: library and information science, and law. Comparisons between the two fields also show disciplinary differences in Web site characteristics. Scholars and publishers should be particularly aware that richer content on a journal's Web site tends to generate links and thus traffic to the site.

Source

Journal of the American Society for Information Science and Technology. 54(2003) no.1, S.29-38
Jepsen, E.T.; Seiden, P.; Ingwersen, P.; Björneborn, L.; Borlund, P.: Characteristics of scientific Web publications : preliminary data gathering and analysis (2004) 0.01
```
0.011135121 = product of:
  0.044540484 = sum of:
    0.020873476 = weight(_text_:retrieval in 3091) [ClassicSimilarity], result of:
      0.020873476 = score(doc=3091,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 3091, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3091)
    0.023667008 = weight(_text_:of in 3091) [ClassicSimilarity], result of:
      0.023667008 = score(doc=3091,freq=36.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.36650562 = fieldWeight in 3091, product of:
          6.0 = tf(freq=36.0), with freq of:
            36.0 = termFreq=36.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3091)
  0.25 = coord(2/8)
```
Abstract

Because of the increasing presence of scientific publications on the Web, combined with the existing difficulties in easily verifying and retrieving these publications, research on techniques and methods for retrieval of scientific Web publications is called for. In this article, we report on the initial steps taken toward the construction of a test collection of scientific Web publications within the subject domain of plant biology. The steps reported are those of data gathering and data analysis aiming at identifying characteristics of scientific Web publications. The data used in this article were generated based on specifically selected domain topics that are searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both AltaVista and AllTheWeb retrieved a higher degree of accessible scientific content than Google. Because of the search engine cutoffs of accessible URLs, the feasibility of using search engine output for Web content analysis is also discussed.

Source

Journal of the American Society for Information Science and Technology. 55(2004) no.14, S.1239-1249
Thelwall, M.: Extracting macroscopic information from Web links (2001) 0.01
```
0.010565501 = product of:
  0.042262003 = sum of:
    0.021389665 = weight(_text_:use in 6851) [ClassicSimilarity], result of:
      0.021389665 = score(doc=6851,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 6851, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6851)
    0.02087234 = weight(_text_:of in 6851) [ClassicSimilarity], result of:
      0.02087234 = score(doc=6851,freq=28.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.32322758 = fieldWeight in 6851, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6851)
  0.25 = coord(2/8)
```
Abstract

Much has been written about the potential and pitfalls of macroscopic Web-based link analysis, yet there have been no studies that have provided clear statistical evidence that any of the proposed calculations can produce results over large areas of the Web that correlate with phenomena external to the Internet. This article attempts to provide such evidence through an evaluation of Ingwersen's (1998) proposed external Web Impact Factor (WIF) for the original use of the Web: the interlinking of academic research. In particular, it studies the case of the relationship between academic hyperlinks and research activity for universities in Britain, a country chosen for its variety of institutions and the existence of an official government rating exercise for research. After reviewing the numerous reasons why link counts may be unreliable, it demonstrates that four different WIFs do, in fact, correlate with the conventional academic research measures. The WIF delivering the greatest correlation with research rankings was the ratio of Web pages with links pointing at research-based pages to faculty numbers. The scarcity of links to electronic academic papers in the data set suggests that, in contrast to citation analysis, this WIF is measuring the reputations of universities and their scholars, rather than the quality of their publications.

Source

Journal of the American Society for Information Science and Technology. 52(2001) no.13, S.1157-1168
Menczer, F.: Lexical and semantic clustering by Web links (2004) 0.01
```
0.010361289 = product of:
  0.041445155 = sum of:
    0.025048172 = weight(_text_:retrieval in 3090) [ClassicSimilarity], result of:
      0.025048172 = score(doc=3090,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.20052543 = fieldWeight in 3090, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=3090)
    0.016396983 = weight(_text_:of in 3090) [ClassicSimilarity], result of:
      0.016396983 = score(doc=3090,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25392252 = fieldWeight in 3090, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=3090)
  0.25 = coord(2/8)
```
Abstract

Recent Web-searching and -mining tools are combining text and link analysis to improve ranking and crawling algorithms. The central assumption behind such approaches is that there is a correlation between the graph structure of the Web and the text and meaning of pages. Here I formalize and empirically evaluate two general conjectures drawing connections from link information to lexical and semantic Web content. The link-content conjecture states that a page is similar to the pages that link to it, and the link-cluster conjecture that pages about the same topic are clustered together. These conjectures are often simply assumed to hold, and Web search tools are built on such assumptions. The present quantitative confirmation sheds light on the connection between the success of the latest Web-mining techniques and the small world topology of the Web, with encouraging implications for the design of better crawling algorithms.

Source

Journal of the American Society for Information Science and Technology. 55(2004) no.14, S.1261-1269

Theme

Semantisches Umfeld in Indexierung u. Retrieval
