Search (415 results, page 1 of 21)

Herb, U.; Beucke, D.: Die Zukunft der Impact-Messung : Social Media, Nutzung und Zitate im World Wide Web (2013) 0.13

0.12634152 = product of:
  0.3158538 = sum of:
    0.2940668 = weight(_text_:2f in 2188) [ClassicSimilarity], result of:
      0.2940668 = score(doc=2188,freq=2.0), product of:
        0.39242527 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.04628742 = queryNorm
        0.7493574 = fieldWeight in 2188, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0625 = fieldNorm(doc=2188)
    0.021787029 = product of:
      0.043574058 = sum of:
        0.043574058 = weight(_text_:web in 2188) [ClassicSimilarity], result of:
          0.043574058 = score(doc=2188,freq=2.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.2884563 = fieldWeight in 2188, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0625 = fieldNorm(doc=2188)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
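
The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the formulas that the printed numbers imply: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function and variable names are hypothetical; only the constants are taken from the listing.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.04628742


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, as implied by the idf(...) lines."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one term in one document."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Document 2188: the terms "2f" and "web", each occurring twice.
score_2f = term_score(freq=2.0, doc_freq=24, field_norm=0.0625)
score_web = term_score(freq=2.0, doc_freq=4597, field_norm=0.0625)

# "web" sits in a nested clause scaled by coord(1/2); the outer query
# matched 2 of its 5 clauses, hence coord(2/5).
total = (score_2f + score_web * 0.5) * 0.4

print(f"{score_2f:.4f} {score_web:.4f} {total:.4f}")
```

The three values should agree with the listing (0.2940668, 0.043574058 and 0.12634152) to several decimal places; small differences are expected because the listing appears to use 32-bit floats. The same functions apply to the other records by substituting their freq, docFreq and fieldNorm values.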

Content: See: https://www.leibniz-science20.de/forschung/projekte/altmetrics-in-verschiedenen-wissenschaftsdisziplinen/.

Menczer, F.: Lexical and semantic clustering by Web links (2004) 0.09

0.087797076 = product of:
  0.14632845 = sum of:
    0.028076671 = weight(_text_:retrieval in 3090) [ClassicSimilarity], result of:
      0.028076671 = score(doc=3090,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.20052543 = fieldWeight in 3090, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=3090)
    0.075019486 = weight(_text_:semantic in 3090) [ClassicSimilarity], result of:
      0.075019486 = score(doc=3090,freq=4.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.38979942 = fieldWeight in 3090, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.046875 = fieldNorm(doc=3090)
    0.0432323 = product of:
      0.0864646 = sum of:
        0.0864646 = weight(_text_:web in 3090) [ClassicSimilarity], result of:
          0.0864646 = score(doc=3090,freq=14.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.57238775 = fieldWeight in 3090, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3090)
      0.5 = coord(1/2)
  0.6 = coord(3/5)

Abstract: Recent Web-searching and -mining tools are combining text and link analysis to improve ranking and crawling algorithms. The central assumption behind such approaches is that there is a correlation between the graph structure of the Web and the text and meaning of pages. Here I formalize and empirically evaluate two general conjectures drawing connections from link information to lexical and semantic Web content. The link-content conjecture states that a page is similar to the pages that link to it, and the link-cluster conjecture that pages about the same topic are clustered together. These conjectures are often simply assumed to hold, and Web search tools are built on such assumptions. The present quantitative confirmation sheds light on the connection between the success of the latest Web-mining techniques and the small world topology of the Web, with encouraging implications for the design of better crawling algorithms.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Zhu, Q.; Kong, X.; Hong, S.; Li, J.; He, Z.: Global ontology research progress : a bibliometric analysis (2015) 0.06
```
0.061612546 = product of:
  0.15403137 = sum of:
    0.062516235 = weight(_text_:semantic in 2590) [ClassicSimilarity], result of:
      0.062516235 = score(doc=2590,freq=4.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.32483283 = fieldWeight in 2590, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2590)
    0.09151513 = sum of:
      0.0471703 = weight(_text_:web in 2590) [ClassicSimilarity], result of:
        0.0471703 = score(doc=2590,freq=6.0), product of:
          0.15105948 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.04628742 = queryNorm
          0.3122631 = fieldWeight in 2590, product of:
            2.4494898 = tf(freq=6.0), with freq of:
              6.0 = termFreq=6.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2590)
      0.04434483 = weight(_text_:22 in 2590) [ClassicSimilarity], result of:
        0.04434483 = score(doc=2590,freq=4.0), product of:
          0.16209066 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.04628742 = queryNorm
          0.27358043 = fieldWeight in 2590, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2590)
  0.4 = coord(2/5)
```
Abstract

Purpose - The purpose of this paper is to analyse the global scientific outputs of ontology research, an important emerging discipline that has huge potential to improve information understanding, organization, and management. Design/methodology/approach - This study collected literature published during 1900-2012 from the Web of Science database. The bibliometric analysis was performed from authorial, institutional, national, spatiotemporal, and topical aspects. Basic statistical analysis, visualization of geographic distribution, co-word analysis, and a new index were applied to the selected data. Findings - Characteristics of publication outputs suggested that ontology research has entered into the soaring stage, along with increased participation and collaboration. The authors identified the leading authors, institutions, nations, and articles in ontology research. Authors were more from North America, Europe, and East Asia. The USA took the lead, while China grew fastest. Four major categories of frequently used keywords were identified: applications in Semantic Web, applications in bioinformatics, philosophy theories, and common supporting technology. Semantic Web research played a core role, and gene ontology study was well-developed. The study focus of ontology has shifted from philosophy to information science. Originality/value - This is the first study to quantify global research patterns and trends in ontology, which might provide a potential guide for the future research. The new index provides an alternative way to evaluate the multidisciplinary influence of researchers.

Date

20. 1.2015 18:30:22
17. 9.2018 18:22:23

Ding, Y.: Applying weighted PageRank to author citation networks (2011) 0.05

0.04591303 = product of:
  0.11478257 = sum of:
    0.032756116 = weight(_text_:retrieval in 4188) [ClassicSimilarity], result of:
      0.032756116 = score(doc=4188,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.23394634 = fieldWeight in 4188, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4188)
    0.08202645 = sum of:
      0.038127303 = weight(_text_:web in 4188) [ClassicSimilarity], result of:
        0.038127303 = score(doc=4188,freq=2.0), product of:
          0.15105948 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.04628742 = queryNorm
          0.25239927 = fieldWeight in 4188, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.0546875 = fieldNorm(doc=4188)
      0.043899145 = weight(_text_:22 in 4188) [ClassicSimilarity], result of:
        0.043899145 = score(doc=4188,freq=2.0), product of:
          0.16209066 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.04628742 = queryNorm
          0.2708308 = fieldWeight in 4188, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0546875 = fieldNorm(doc=4188)
  0.4 = coord(2/5)

Abstract: This article aims to identify whether different weighted PageRank algorithms can be applied to author citation networks to measure the popularity and prestige of a scholar from a citation perspective. Information retrieval (IR) was selected as a test field and data from 1956-2008 were collected from Web of Science. Weighted PageRank with citation and publication as weighted vectors were calculated on author citation networks. The results indicate that both popularity rank and prestige rank were highly correlated with the weighted PageRank. Principal component analysis was conducted to detect relationships among these different measures. For capturing prize winners within the IR field, prestige rank outperformed all the other measures.
Date: 22. 1.2011 13:02:21

Stuart, D.: Web metrics for library and information professionals (2014) 0.05
```
0.045851987 = product of:
  0.11462997 = sum of:
    0.053596504 = weight(_text_:semantic in 2274) [ClassicSimilarity], result of:
      0.053596504 = score(doc=2274,freq=6.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.27848613 = fieldWeight in 2274, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2274)
    0.061033465 = product of:
      0.12206693 = sum of:
        0.12206693 = weight(_text_:web in 2274) [ClassicSimilarity], result of:
          0.12206693 = score(doc=2274,freq=82.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.808072 = fieldWeight in 2274, product of:
              9.055386 = tf(freq=82.0), with freq of:
                82.0 = termFreq=82.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2274)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

This is a practical guide to using web metrics to measure impact and demonstrate value. The web provides an opportunity to collect a host of different metrics, from those associated with social media accounts and websites to more traditional research outputs. This book is a clear guide for library and information professionals as to what web metrics are available and how to assess and use them to make informed decisions and demonstrate value. As individuals and organizations increasingly use the web in addition to traditional publishing avenues and formats, this book provides the tools to unlock web metrics and evaluate the impact of this content. The key topics covered include: bibliometrics, webometrics and web metrics; data collection tools; evaluating impact on the web; evaluating social media impact; investigating relationships between actors; exploring traditional publications in a new environment; web metrics and the web of data; the future of web metrics and the library and information professional. The book will provide a practical introduction to web metrics for a wide range of library and information professionals, from the bibliometrician wanting to demonstrate the wider impact of a researcher's work than can be demonstrated through traditional citations databases, to the reference librarian wanting to measure how successfully they are engaging with their users on Twitter. It will be a valuable tool for anyone who wants to not only understand the impact of content, but demonstrate this impact to others within the organization and beyond.

Content

1. Introduction. Metrics -- Indicators -- Web metrics and Ranganathan's laws of library science -- Web metrics for the library and information professional -- The aim of this book -- The structure of the rest of this book -- 2. Bibliometrics, webometrics and web metrics. Web metrics -- Information science metrics -- Web analytics -- Relational and evaluative metrics -- Evaluative web metrics -- Relational web metrics -- Validating the results -- 3. Data collection tools. The anatomy of a URL, web links and the structure of the web -- Search engines 1.0 -- Web crawlers -- Search engines 2.0 -- Post search engine 2.0: fragmentation -- 4. Evaluating impact on the web. Websites -- Blogs -- Wikis -- Internal metrics -- External metrics -- A systematic approach to content analysis -- 5. Evaluating social media impact. Aspects of social network sites -- Typology of social network sites -- Research and tools for specific sites and services -- Other social network sites -- URL shorteners: web analytic links on any site -- General social media impact -- Sentiment analysis -- 6. Investigating relationships between actors. Social network analysis methods -- Sources for relational network analysis -- 7. Exploring traditional publications in a new environment. More bibliographic items -- Full text analysis -- Greater context -- 8. Web metrics and the web of data. The web of data -- Building the semantic web -- Implications of the web of data for web metrics -- Investigating the web of data today -- SPARQL -- Sindice -- LDSpider: an RDF web crawler -- 9. The future of web metrics and the library and information professional. How far we have come -- The future of web metrics -- The future of the library and information professional and web metrics.

RSWK

Bibliothek / World Wide Web / World Wide Web 2.0 / Analyse / Statistik
Bibliometrie / Semantic Web / Soziale Software

Tscherteu, G.; Langreiter, C.: Explorative Netzwerkanalyse im Living Web (2009) 0.05

0.04572124 = product of:
  0.1143031 = sum of:
    0.07072904 = weight(_text_:semantic in 4870) [ClassicSimilarity], result of:
      0.07072904 = score(doc=4870,freq=2.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.36750638 = fieldWeight in 4870, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0625 = fieldNorm(doc=4870)
    0.043574058 = product of:
      0.087148115 = sum of:
        0.087148115 = weight(_text_:web in 4870) [ClassicSimilarity], result of:
          0.087148115 = score(doc=4870,freq=8.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.5769126 = fieldWeight in 4870, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0625 = fieldNorm(doc=4870)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Object: Web 2.0
Source: Social Semantic Web: Web 2.0, was nun? Hrsg.: A. Blumauer u. T. Pellegrini

Zhang, Y.; Jansen, B.J.; Spink, A.: Identification of factors predicting clickthrough in Web searching using neural network analysis (2009) 0.04

0.044768713 = product of:
  0.11192178 = sum of:
    0.028076671 = weight(_text_:retrieval in 2742) [ClassicSimilarity], result of:
      0.028076671 = score(doc=2742,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.20052543 = fieldWeight in 2742, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2742)
    0.08384511 = sum of:
      0.046217266 = weight(_text_:web in 2742) [ClassicSimilarity], result of:
        0.046217266 = score(doc=2742,freq=4.0), product of:
          0.15105948 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.04628742 = queryNorm
          0.3059541 = fieldWeight in 2742, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.046875 = fieldNorm(doc=2742)
      0.03762784 = weight(_text_:22 in 2742) [ClassicSimilarity], result of:
        0.03762784 = score(doc=2742,freq=2.0), product of:
          0.16209066 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.04628742 = queryNorm
          0.23214069 = fieldWeight in 2742, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2742)
  0.4 = coord(2/5)

Abstract: In this research, we aim to identify factors that significantly affect the clickthrough of Web searchers. Our underlying goal is to determine more efficient methods to optimize the clickthrough rate. We devise a clickthrough metric for measuring customer satisfaction of search engine results using the number of links visited, number of queries a user submits, and rank of clicked links. We use a neural network to detect the significant influence of searching characteristics on future user clickthrough. Our results show that high occurrences of query reformulation, lengthy searching duration, longer query length, and the higher ranking of prior clicked links correlate positively with future clickthrough. We provide recommendations for leveraging these findings for improving the performance of search engine retrieval and result ranking, along with implications for search engine marketing.
Date: 22. 3.2009 17:49:11

Meireles, N.R.G.; Cendón, B.V.; Almeida, P.E.M. de: Bibliometric knowledge organization : a domain analytic method using artificial neural networks (2014) 0.03
```
0.034365386 = product of:
  0.085913464 = sum of:
    0.023397226 = weight(_text_:retrieval in 1377) [ClassicSimilarity], result of:
      0.023397226 = score(doc=1377,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.16710453 = fieldWeight in 1377, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1377)
    0.062516235 = weight(_text_:semantic in 1377) [ClassicSimilarity], result of:
      0.062516235 = score(doc=1377,freq=4.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.32483283 = fieldWeight in 1377, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1377)
  0.4 = coord(2/5)
```
Abstract

The organization of large collections of documents has become more important with the increase in the amount of digital information available. In certain constricted domains of knowledge, keywords and subject descriptors tend to be similar and therefore insufficient to differentiate documents. In this context, instead of relying only on the presence of common terms, the identification of common cited references can be useful to define semantic relationship among documents. The purpose of this work is to add another instance on the research linking information retrieval and bibliometric techniques aided by information technology. A domain analytic method was developed to generate clusters of documents, which uses self-organizing maps, in the scope of artificial neural networks, to categorize documents. The results obtained show that this approach successfully identified clusters of authors and documents through their cited references. In addition, further qualitative analysis of these clusters demonstrates the existence of semantic relationships between the documents. This study can contribute to the development of the field of knowledge organization by evaluating the use of artificial neural networks in the automatic categorization of documents in a constricted knowledge domain based on the analysis of the references cited by these documents.

Delgado-Quirós, L.; Aguillo, I.F.; Martín-Martín, A.; López-Cózar, E.D.; Orduña-Malea, E.; Ortega, J.L.: Why are these publications missing? : uncovering the reasons behind the exclusion of documents in free-access scholarly databases (2024) 0.03
```
0.03270937 = product of:
  0.08177343 = sum of:
    0.062516235 = weight(_text_:semantic in 1201) [ClassicSimilarity], result of:
      0.062516235 = score(doc=1201,freq=4.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.32483283 = fieldWeight in 1201, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1201)
    0.019257195 = product of:
      0.03851439 = sum of:
        0.03851439 = weight(_text_:web in 1201) [ClassicSimilarity], result of:
          0.03851439 = score(doc=1201,freq=4.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.25496176 = fieldWeight in 1201, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1201)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

This study analyses the coverage of seven free-access bibliographic databases (Crossref, Dimensions-non-subscription version, Google Scholar, Lens, Microsoft Academic, Scilit, and Semantic Scholar) to identify the potential reasons that might cause the exclusion of scholarly documents and how they could influence coverage. To do this, 116 k randomly selected bibliographic records from Crossref were used as a baseline. API endpoints and web scraping were used to query each database. The results show that coverage differences are mainly caused by the way each service builds their databases. While classic bibliographic databases ingest almost the exact same content from Crossref (Lens and Scilit miss 0.1% and 0.2% of the records, respectively), academic search engines present lower coverage (Google Scholar does not find: 9.8%, Semantic Scholar: 10%, and Microsoft Academic: 12%). Coverage differences are mainly attributed to external factors, such as web accessibility and robot exclusion policies (39.2%-46%), and internal requirements that exclude secondary content (6.5%-11.6%). In the case of Dimensions, the only classic bibliographic database with the lowest coverage (7.6%), internal selection criteria such as the indexation of full books instead of book chapters (65%) and the exclusion of secondary content (15%) are the main motives of missing publications.

Riechert, M.; Schmitz, J.: Qualitätssicherung von Forschungsinformationen durch visuelle Repräsentation : das Fallbeispiel des "Informationssystems Promotionsnoten" (2017) 0.03
```
0.032380622 = product of:
  0.08095156 = sum of:
    0.06188791 = weight(_text_:semantic in 3724) [ClassicSimilarity], result of:
      0.06188791 = score(doc=3724,freq=2.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.32156807 = fieldWeight in 3724, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3724)
    0.019063652 = product of:
      0.038127303 = sum of:
        0.038127303 = weight(_text_:web in 3724) [ClassicSimilarity], result of:
          0.038127303 = score(doc=3724,freq=2.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.25239927 = fieldWeight in 3724, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3724)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Source

Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber

Tunger, D.: Bibliometrie : quo vadis? (2017) 0.03

0.032380622 = product of:
  0.08095156 = sum of:
    0.06188791 = weight(_text_:semantic in 3519) [ClassicSimilarity], result of:
      0.06188791 = score(doc=3519,freq=2.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.32156807 = fieldWeight in 3519, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3519)
    0.019063652 = product of:
      0.038127303 = sum of:
        0.038127303 = weight(_text_:web in 3519) [ClassicSimilarity], result of:
          0.038127303 = score(doc=3519,freq=2.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.25239927 = fieldWeight in 3519, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3519)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Source: Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber

Möller, T.: Woher stammt das Wissen über die Halbwertzeiten des Wissens? (2017) 0.03

0.032380622 = product of:
  0.08095156 = sum of:
    0.06188791 = weight(_text_:semantic in 3520) [ClassicSimilarity], result of:
      0.06188791 = score(doc=3520,freq=2.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.32156807 = fieldWeight in 3520, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3520)
    0.019063652 = product of:
      0.038127303 = sum of:
        0.038127303 = weight(_text_:web in 3520) [ClassicSimilarity], result of:
          0.038127303 = score(doc=3520,freq=2.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.25239927 = fieldWeight in 3520, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3520)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Source: Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber

Thelwall, M.; Wilkinson, D.: Finding similar academic Web sites with links, bibliometric couplings and colinks (2004) 0.03
```
0.030497748 = product of:
  0.07624437 = sum of:
    0.03970641 = weight(_text_:retrieval in 2571) [ClassicSimilarity], result of:
      0.03970641 = score(doc=2571,freq=4.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.2835858 = fieldWeight in 2571, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2571)
    0.03653796 = product of:
      0.07307592 = sum of:
        0.07307592 = weight(_text_:web in 2571) [ClassicSimilarity], result of:
          0.07307592 = score(doc=2571,freq=10.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.48375595 = fieldWeight in 2571, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2571)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

A common task in both Webmetrics and Web information retrieval is to identify a set of Web pages or sites that are similar in content. In this paper we assess the extent to which links, colinks and couplings can be used to identify similar Web sites. As an experiment, a random sample of 500 pairs of domains from the UK academic Web were taken and human assessments of site similarity, based upon content type, were compared against ratings for the three concepts. The results show that using a combination of all three gives the highest probability of identifying similar sites, but surprisingly this was only a marginal improvement over using links alone. Another unexpected result was that high values for either colink counts or couplings were associated with only a small increased likelihood of similarity. The principal advantage of using couplings and colinks was found to be greater coverage in terms of a much larger number of pairs of sites being connected by these measures, instead of increased probability of similarity. In information retrieval terminology, this is improved recall rather than improved precision.
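The score breakdowns on this page follow the explain format of Lucene's ClassicSimilarity (TF-IDF). As a rough sketch, the arithmetic for doc 2571 above can be reproduced as shown below. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the printed values. queryNorm, fieldNorm, and the coord factors are copied from the listing rather than derived, and the function names are illustrative, not part of any Lucene API.

```python
import math

# Constants copied from the explain output for doc 2571 (not derived here).
MAX_DOCS = 44218
QUERY_NORM = 0.04628742
FIELD_NORM = 0.046875


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """score = queryWeight * fieldWeight for a single query term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


# 'retrieval' occurs 4 times in the document and in 5836 documents overall.
retrieval = term_score(4.0, 5836)
# 'web' occurs 10 times; its sub-query matched 1 of 2 clauses, hence coord(1/2).
web = term_score(10.0, 4597) * 0.5
# The outer query matched 2 of 5 clauses, hence coord(2/5) = 0.4.
total = (retrieval + web) * 0.4

print(f"retrieval={retrieval:.8f} web={web:.8f} total={total:.9f}")
```

Python computes in double precision while Lucene reports single-precision floats, so the results should agree with the listing only to about six or seven significant digits.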

Bar-Ilan, J.; Levene, M.: ¬The hw-rank : an h-index variant for ranking web pages (2015) 0.03

0.029611295 = product of:
  0.07402824 = sum of:
    0.04679445 = weight(_text_:retrieval in 1694) [ClassicSimilarity], result of:
      0.04679445 = score(doc=1694,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.33420905 = fieldWeight in 1694, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=1694)
    0.027233787 = product of:
      0.054467574 = sum of:
        0.054467574 = weight(_text_:web in 1694) [ClassicSimilarity], result of:
          0.054467574 = score(doc=1694,freq=2.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.36057037 = fieldWeight in 1694, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=1694)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Footnote: Contribution to a special issue, "Combining bibliometrics and information retrieval"

Haustein, S.; Sugimoto, C.; Larivière, V.: Social media in scholarly communication : Guest editorial (2015) 0.03
```
0.027756086 = product of:
  0.069390215 = sum of:
    0.0140383355 = weight(_text_:retrieval in 3809) [ClassicSimilarity], result of:
      0.0140383355 = score(doc=3809,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.10026272 = fieldWeight in 3809, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3809)
    0.05535188 = sum of:
      0.03653796 = weight(_text_:web in 3809) [ClassicSimilarity], result of:
        0.03653796 = score(doc=3809,freq=10.0), product of:
          0.15105948 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.04628742 = queryNorm
          0.24187797 = fieldWeight in 3809, product of:
            3.1622777 = tf(freq=10.0), with freq of:
              10.0 = termFreq=10.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.0234375 = fieldNorm(doc=3809)
      0.01881392 = weight(_text_:22 in 3809) [ClassicSimilarity], result of:
        0.01881392 = score(doc=3809,freq=2.0), product of:
          0.16209066 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.04628742 = queryNorm
          0.116070345 = fieldWeight in 3809, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0234375 = fieldNorm(doc=3809)
  0.4 = coord(2/5)
```
Abstract

One of the solutions to help scientists filter the most relevant publications and, thus, to stay current on developments in their fields during the transition from "little science" to "big science", was the introduction of citation indexing as a Wellsian "World Brain" (Garfield, 1964) of scientific information: "It is too much to expect a research worker to spend an inordinate amount of time searching for the bibliographic descendants of antecedent papers. It would not be excessive to demand that the thorough scholar check all papers that have cited or criticized such papers, if they could be located quickly. The citation index makes this check practicable" (Garfield, 1955, p. 108). In retrospect, citation indexing can be perceived as a pre-social web version of crowdsourcing, as it is based on the concept that the community of citing authors outperforms indexers in highlighting cognitive links between papers, particularly on the level of specific ideas and concepts (Garfield, 1983). Over the last 50 years, citation analysis and, more generally, bibliometric methods have developed from information retrieval tools to research evaluation metrics, where they are presumed to make scientific funding more efficient and effective (Moed, 2006). However, the dominance of bibliometric indicators in research evaluation has also led to significant goal displacement (Merton, 1957) and the oversimplification of notions of "research productivity" and "scientific quality", creating adverse effects such as salami publishing, honorary authorships, citation cartels, and misuse of indicators (Binswanger, 2015; Cronin and Sugimoto, 2014; Frey and Osterloh, 2006; Haustein and Larivière, 2015; Weingart, 2005).
Furthermore, the rise of the web, and subsequently, the social web, has challenged the quasi-monopolistic status of the journal as the main form of scholarly communication and citation indices as the primary assessment mechanisms. Scientific communication is becoming more open, transparent, and diverse: publications are increasingly open access; manuscripts, presentations, code, and data are shared online; research ideas and results are discussed and criticized openly on blogs; and new peer review experiments, with open post publication assessment by anonymous or non-anonymous referees, are underway. The diversification of scholarly production and assessment, paired with the increasing speed of the communication process, leads to an increased information overload (Bawden and Robinson, 2008), demanding new filters. The concept of altmetrics, short for alternative (to citation) metrics, was created out of an attempt to provide a filter (Priem et al., 2010) and to steer against the oversimplification of the measurement of scientific success solely on the basis of number of journal articles published and citations received, by considering a wider range of research outputs and metrics (Piwowar, 2013). Although the term altmetrics was introduced in a tweet in 2010 (Priem, 2010), the idea of capturing traces - "polymorphous mentioning" (Cronin et al., 1998, p. 1320) - of scholars and their documents on the web to measure "impact" of science in a broader manner than citations was introduced years before, largely in the context of webometrics (Almind and Ingwersen, 1997; Thelwall et al., 2005):
"There will soon be a critical mass of web-based digital objects and usage statistics on which to model scholars' communication behaviors - publishing, posting, blogging, scanning, reading, downloading, glossing, linking, citing, recommending, acknowledging - and with which to track their scholarly influence and impact, broadly conceived and broadly felt" (Cronin, 2005, p. 196). A decade after Cronin's prediction and five years after the coining of altmetrics, the time seems ripe to reflect upon the role of social media in scholarly communication. This Special Issue does so by providing an overview of current research on the indicators and metrics grouped under the umbrella term of altmetrics, on their relationships with traditional indicators of scientific activity, and on the uses that are made of the various social media platforms - on which these indicators are based - by scientists of various disciplines.

Date

20. 1.2015 18:30:22

Koulouri, X.; Ifrim, C.; Wallace, M.; Pop, F.: Making sense of citations (2017) 0.03

0.027754819 = product of:
  0.06938705 = sum of:
    0.05304678 = weight(_text_:semantic in 3486) [ClassicSimilarity], result of:
      0.05304678 = score(doc=3486,freq=2.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.2756298 = fieldWeight in 3486, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.046875 = fieldNorm(doc=3486)
    0.01634027 = product of:
      0.03268054 = sum of:
        0.03268054 = weight(_text_:web in 3486) [ClassicSimilarity], result of:
          0.03268054 = score(doc=3486,freq=2.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.21634221 = fieldWeight in 3486, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3486)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Series: Information Systems and Applications, incl. Internet/Web, and HCI; 10151
Source: Semantic keyword-based search on structured data sources: COST Action IC1302. Second International KEYSTONE Conference, IKC 2016, Cluj-Napoca, Romania, September 8-9, 2016, Revised Selected Papers. Eds.: A. Calì et al.

Wang, P.: ¬An empirical study of knowledge structures of research topics (1999) 0.03
```
0.02704115 = product of:
  0.06760287 = sum of:
    0.023397226 = weight(_text_:retrieval in 6667) [ClassicSimilarity], result of:
      0.023397226 = score(doc=6667,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.16710453 = fieldWeight in 6667, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6667)
    0.04420565 = weight(_text_:semantic in 6667) [ClassicSimilarity], result of:
      0.04420565 = score(doc=6667,freq=2.0), product of:
        0.19245663 = queryWeight, product of:
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.04628742 = queryNorm
        0.22969149 = fieldWeight in 6667, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1578603 = idf(docFreq=1879, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6667)
  0.4 = coord(2/5)
```
Abstract

How knowledge is organized in human memory is of interest to both information science and cognitive science. Current information retrieval (IR) systems can be improved if we understand which conceptual structures could facilitate users in information processing and seeking. This project examined twenty-two cognitive maps on ten research topics generated by ten experts and eleven non-experts. Experts were those who had completed a research project on the topic prior to participating in this study, while non-experts were members of the same academic department who were familiar with the topic but had not conducted any in-depth research on it. A research topic can be represented by a vocabulary and the relationships among the terms in the vocabulary. A cognitive map visualizes the vocabulary and its configuration in a plane. We observed that experts did not generate the maps much faster than non-experts. Both experts and non-experts modified the given vocabulary by either adding or dropping terms. The dominant configuration for the maps was top-down, while five maps were orientated in a left-right or radial structure (from a center). Experts tended to use a problem-oriented approach to organize the vocabulary, while non-experts often applied a discipline-oriented hierarchical structure. Despite many differences in vocabulary and structure between individuals, some terms clustered in similar ways across maps, indicating an agreed-upon semantic closeness among these terms.

Thelwall, M.: ¬A layered approach for investigating the topological structure of communities in the Web (2003) 0.03
```
0.026577247 = product of:
  0.066443115 = sum of:
    0.033088673 = weight(_text_:retrieval in 4450) [ClassicSimilarity], result of:
      0.033088673 = score(doc=4450,freq=4.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.23632148 = fieldWeight in 4450, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4450)
    0.03335444 = product of:
      0.06670888 = sum of:
        0.06670888 = weight(_text_:web in 4450) [ClassicSimilarity], result of:
          0.06670888 = score(doc=4450,freq=12.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.4416067 = fieldWeight in 4450, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4450)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

A layered approach for identifying communities in the Web is presented and explored by applying the Flake exact community identification algorithm to the UK academic Web. Although community or topic identification is a common task in information retrieval, a new perspective is developed by: the application of alternative document models, shifting the focus from individual pages to aggregated collections based upon Web directories, domains and entire sites; the removal of internal site links; and the adaptation of a new fast algorithm to allow fully-automated community identification using all possible single starting points. The overall topology of the graphs in the three least-aggregated layers was first investigated and found to include a large number of isolated points but, surprisingly, with most of the remainder being in one huge connected component, exact proportions varying by layer. The community identification process then found that the number of communities far exceeded the number of topological components, indicating that community identification is a potentially useful technique, even with random starting points. Both the number and size of communities identified were dependent on the parameter of the algorithm, with very different results being obtained in each case. In conclusion, the UK academic Web is embedded with layers of non-trivial communities and, if it is not unique in this, then there is the promise of improved results for information retrieval algorithms that can exploit this additional structure, and the application of the technique directly to partially automate Web metrics tasks such as that of finding all pages related to a given subject hosted by a single country's universities.

Ahlgren, P.; Jarneving, B.; Rousseau, R.: Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient (2003) 0.03
```
0.026236016 = product of:
  0.06559004 = sum of:
    0.01871778 = weight(_text_:retrieval in 5171) [ClassicSimilarity], result of:
      0.01871778 = score(doc=5171,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.13368362 = fieldWeight in 5171, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=5171)
    0.046872254 = sum of:
      0.021787029 = weight(_text_:web in 5171) [ClassicSimilarity], result of:
        0.021787029 = score(doc=5171,freq=2.0), product of:
          0.15105948 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.04628742 = queryNorm
          0.14422815 = fieldWeight in 5171, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.03125 = fieldNorm(doc=5171)
      0.025085226 = weight(_text_:22 in 5171) [ClassicSimilarity], result of:
        0.025085226 = score(doc=5171,freq=2.0), product of:
          0.16209066 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.04628742 = queryNorm
          0.15476047 = fieldWeight in 5171, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.03125 = fieldNorm(doc=5171)
  0.4 = coord(2/5)
```
Abstract

Ahlgren, Jarneving, and Rousseau review accepted procedures for author co-citation analysis, first pointing out that, since in the raw data matrix the row and column values are identical, i.e., the co-citation count of two authors, there is no clear choice for diagonal values. They suggest the number of times an author has been co-cited with himself, excluding self-citation, rather than the common treatment as zeros or as missing values. When the matrix is converted to a similarity matrix, the normal procedure is to create a matrix of Pearson's r coefficients between data vectors. Ranking by r, by co-citation frequency, and by intuition can easily yield three different orders. It would seem necessary that the adding of zeros to the matrix will not affect the value or the relative order of similarity measures, but it is shown that this is not the case with Pearson's r. Using 913 bibliographic descriptions from the Web of Science of articles from JASIS and Scientometrics, authors' names were extracted and edited, and 12 information retrieval authors and 12 bibliometric authors, each from the top 100 most cited, were selected. Co-citation and r value (diagonal elements treated as missing) matrices were constructed, and then reconstructed in expanded form. Adding zeros can change both the r value and the ordering of the authors based upon that value. A chi-squared distance measure would not violate these requirements, nor would the cosine coefficient. It is also argued that co-citation data is ordinal data since there is no assurance of an absolute zero number of co-citations, and thus Pearson is not appropriate. The number of ties in co-citation data makes the use of the Spearman rank order coefficient problematic.
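The abstract's central claim, that padding co-citation vectors with zeros changes Pearson's r but not the cosine coefficient, can be illustrated with a small sketch. The vectors below are hypothetical toy data, not taken from the study, and the helper functions are written out by hand purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson's r: covariance of x and y divided by the product of their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def cosine(x, y):
    """Cosine coefficient: dot product divided by the product of vector lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical co-citation counts of two authors with four other authors.
author_a = [3, 1, 4, 2]
author_b = [1, 3, 2, 4]

# The same profiles after six authors who are co-cited with neither are added.
padded_a = author_a + [0] * 6
padded_b = author_b + [0] * 6

print("Pearson original:", pearson(author_a, author_b))
print("Pearson padded:  ", pearson(padded_a, padded_b))
print("Cosine original: ", cosine(author_a, author_b))
print("Cosine padded:   ", cosine(padded_a, padded_b))
```

In this toy case the added zeros shift both means, so Pearson's r moves from negative to positive, while the cosine coefficient is unchanged because zeros contribute nothing to the dot product or to the vector lengths.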

Date

9. 7.2006 10:22:35

Jepsen, E.T.; Seiden, P.; Ingwersen, P.; Björneborn, L.; Borlund, P.: Characteristics of scientific Web publications : preliminary data gathering and analysis (2004) 0.02
```
0.024764648 = product of:
  0.061911616 = sum of:
    0.023397226 = weight(_text_:retrieval in 3091) [ClassicSimilarity], result of:
      0.023397226 = score(doc=3091,freq=2.0), product of:
        0.14001551 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.04628742 = queryNorm
        0.16710453 = fieldWeight in 3091, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3091)
    0.03851439 = product of:
      0.07702878 = sum of:
        0.07702878 = weight(_text_:web in 3091) [ClassicSimilarity], result of:
          0.07702878 = score(doc=3091,freq=16.0), product of:
            0.15105948 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.04628742 = queryNorm
            0.5099235 = fieldWeight in 3091, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3091)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

Because of the increasing presence of scientific publications on the Web, combined with the existing difficulties in easily verifying and retrieving these publications, research on techniques and methods for retrieval of scientific Web publications is called for. In this article, we report on the initial steps taken toward the construction of a test collection of scientific Web publications within the subject domain of plant biology. The steps reported are those of data gathering and data analysis aiming at identifying characteristics of scientific Web publications. The data used in this article were generated based on specifically selected domain topics that are searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both AltaVista and AllTheWeb retrieved a higher degree of accessible scientific content than Google. Because of the search engine cutoffs of accessible URLs, the feasibility of using search engine output for Web content analysis is also discussed.
