Search (68 results, page 1 of 4)

  • × theme_ss:"Informetrie"
  • × year_i:[2000 TO 2010}
  1. Thelwall, M.: Web impact factors and search engine coverage (2000) 0.13
    0.12564686 = product of:
      0.18847027 = sum of:
        0.10735885 = weight(_text_:search in 4539) [ClassicSimilarity], result of:
          0.10735885 = score(doc=4539,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.6144187 = fieldWeight in 4539, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=4539)
        0.08111142 = product of:
          0.16222285 = sum of:
            0.16222285 = weight(_text_:engines in 4539) [ClassicSimilarity], result of:
              0.16222285 = score(doc=4539,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.63510275 = fieldWeight in 4539, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4539)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
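    The explain tree above is Lucene's ClassicSimilarity (TF-IDF) scoring breakdown. As a minimal sketch of its arithmetic, assuming the standard ClassicSimilarity formula (tf = sqrt(freq); queryWeight = idf x queryNorm; fieldWeight = tf x idf x fieldNorm; each term contributes queryWeight x fieldWeight, scaled by the coord factors shown), the following reproduces the score of result 1 from the numbers in the tree:

      from math import sqrt

      def term_score(freq, idf, query_norm, field_norm):
          """One term's contribution, mirroring the explain output above."""
          query_weight = idf * query_norm               # idf(t) * queryNorm
          field_weight = sqrt(freq) * idf * field_norm  # tf(t) * idf(t) * fieldNorm
          return query_weight * field_weight

      # Values copied from the explain tree for doc 4539.
      qn = 0.05027291
      search_term  = term_score(freq=8.0, idf=3.475677, query_norm=qn, field_norm=0.0625)
      engines_term = term_score(freq=4.0, idf=5.080822, query_norm=qn, field_norm=0.0625)

      # The "engines" branch is scaled by coord(1/2) (one of two nested clauses
      # matched); the sum of both branches is scaled by coord(2/3).
      total = (search_term + engines_term * 0.5) * (2.0 / 3.0)
      print(f"{total:.8f}")  # ~0.12564685; matches the displayed 0.12564686 up to float rounding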
    
    Abstract
    Search engines index only a proportion of the web, and this proportion is not determined randomly but by following algorithms that take into account the properties that impact factors measure. A survey was conducted in order to test the coverage of search engines and to decide whether their partial coverage is indeed an obstacle to using them to calculate web impact factors. The results indicate that search engine coverage, even of large national domains, is extremely uneven and would be likely to lead to misleading calculations.
  2. Thelwall, M.: Quantitative comparisons of search engine results (2008) 0.10
    0.1025817 = product of:
      0.15387255 = sum of:
        0.08217951 = weight(_text_:search in 2350) [ClassicSimilarity], result of:
          0.08217951 = score(doc=2350,freq=12.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.47031635 = fieldWeight in 2350, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2350)
        0.07169304 = product of:
          0.14338608 = sum of:
            0.14338608 = weight(_text_:engines in 2350) [ClassicSimilarity], result of:
              0.14338608 = score(doc=2350,freq=8.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.5613568 = fieldWeight in 2350, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2350)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Search engines are normally used to find information or Web sites, but Webometric investigations use them for quantitative data such as the number of pages matching a query and the international spread of those pages. For this type of application, the accuracy of the hit count estimates and range of URLs in the full results are important. Here, we compare the applications programming interfaces of Google, Yahoo!, and Live Search for 1,587 single word searches. The hit count estimates were broadly consistent, but with Yahoo! and Google reporting 5-6 times more hits than Live Search. Yahoo! tended to return slightly more matching URLs than Google, with Live Search returning significantly fewer. Yahoo!'s result URLs included a significantly wider range of domains and sites than the other two, and there was little consistency between the three engines in the number of different domains. In contrast, the three engines were reasonably consistent in the number of different top-level domains represented in the result URLs, although Yahoo! tended to return the most. In conclusion, quantitative results from the three search engines are mostly consistent but with unexpected types of inconsistency that users should be aware of. Google is recommended for hit count estimates but Yahoo! is recommended for all other Webometric purposes.
  3. Bhavnani, S.K.: Why is it difficult to find comprehensive information? : implications of information scatter for search and design (2005) 0.09
    0.08858277 = product of:
      0.13287415 = sum of:
        0.08217951 = weight(_text_:search in 3684) [ClassicSimilarity], result of:
          0.08217951 = score(doc=3684,freq=12.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.47031635 = fieldWeight in 3684, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3684)
        0.05069464 = product of:
          0.10138928 = sum of:
            0.10138928 = weight(_text_:engines in 3684) [ClassicSimilarity], result of:
              0.10138928 = score(doc=3684,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39693922 = fieldWeight in 3684, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3684)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The rapid development of Web sites providing extensive coverage of a topic, coupled with the development of powerful search engines (designed to help users find such Web sites), suggests that users can easily find comprehensive information about a topic. In domains such as consumer healthcare, finding comprehensive information about a topic is critical as it can improve a patient's judgment in making healthcare decisions, and can encourage higher compliance with treatment. However, recent studies show that despite using powerful search engines, many healthcare information seekers have difficulty finding comprehensive information even for narrow healthcare topics because the relevant information is scattered across many Web sites. To date, no studies have analyzed how facts related to a search topic are distributed across relevant Web pages and Web sites. In this study, the distribution of facts related to five common healthcare topics across high-quality sites is analyzed, and the reasons underlying those distributions are explored. The analysis revealed the existence of few pages that had many facts, many pages that had few facts, and no single page or site that provided all the facts. While such a distribution conforms to other information-related phenomena, a deeper analysis revealed that the distributions were caused by a trade-off between depth and breadth, leading to the existence of general, specialized, and sparse pages. Furthermore, the results helped to make explicit the knowledge needed by searchers to find comprehensive healthcare information, and suggested the motivation to explore distribution-conscious approaches for the development of future search systems, search interfaces, Web page designs, and training.
  4. Jepsen, E.T.; Seiden, P.; Ingwersen, P.; Björneborn, L.; Borlund, P.: Characteristics of scientific Web publications : preliminary data gathering and analysis (2004) 0.08
    0.08380928 = product of:
      0.12571391 = sum of:
        0.07501928 = weight(_text_:search in 3091) [ClassicSimilarity], result of:
          0.07501928 = score(doc=3091,freq=10.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.4293381 = fieldWeight in 3091, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3091)
        0.05069464 = product of:
          0.10138928 = sum of:
            0.10138928 = weight(_text_:engines in 3091) [ClassicSimilarity], result of:
              0.10138928 = score(doc=3091,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39693922 = fieldWeight in 3091, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3091)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Because of the increasing presence of scientific publications on the Web, combined with the existing difficulties in easily verifying and retrieving these publications, research on techniques and methods for retrieval of scientific Web publications is called for. In this article, we report on the initial steps taken toward the construction of a test collection of scientific Web publications within the subject domain of plant biology. The steps reported are those of data gathering and data analysis aiming at identifying characteristics of scientific Web publications. The data used in this article were generated based on specifically selected domain topics that are searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both AltaVista and AllTheWeb retrieved a higher degree of accessible scientific content than Google. Because of the search engine cutoffs of accessible URLs, the feasibility of using search engine output for Web content analysis is also discussed.
  5. Thelwall, M.: Results from a web impact factor crawler (2001) 0.08
    0.0801318 = product of:
      0.12019769 = sum of:
        0.058109686 = weight(_text_:search in 4490) [ClassicSimilarity], result of:
          0.058109686 = score(doc=4490,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 4490, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4490)
        0.062088005 = product of:
          0.12417601 = sum of:
            0.12417601 = weight(_text_:engines in 4490) [ClassicSimilarity], result of:
              0.12417601 = score(doc=4490,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.4861493 = fieldWeight in 4490, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4490)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Web impact factors, the proposed web equivalent of impact factors for journals, can be calculated by using search engines. It has been found that the results are problematic because of the variable coverage of search engines as well as their ability to give significantly different results over short periods of time. The fundamental problem is that although some search engines provide a functionality that is capable of being used for impact calculations, this is not their primary task and therefore they do not give guarantees as to performance in this respect. In this paper, a bespoke web crawler designed specifically for the calculation of reliable WIFs is presented. This crawler was used to calculate WIFs for a number of UK universities, and the results of these calculations are discussed. The principal findings were that with certain restrictions, WIFs can be calculated reliably, but do not correlate with accepted research rankings owing to the variety of material hosted on university servers. Changes to the calculations to improve the fit of the results to research rankings are proposed, but there are still inherent problems undermining the reliability of the calculation. These problems still apply if the WIF scores are taken on their own as indicators of the general impact of any area of the Internet, but with care would not apply to online journals.
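    As a rough illustration of the calculation the abstract discusses: a site's Web impact factor is commonly defined (following Ingwersen) as the number of pages linking to the site divided by the number of pages in the site. A minimal sketch, with hypothetical counts such as a crawler or search engine might return:

      def web_impact_factor(inlink_pages, site_pages):
          """WIF: pages linking to a site divided by the site's own page count.
          This is the commonly cited Ingwersen-style definition; the abstract
          argues engine-reported counts are too unstable, hence the crawler."""
          if site_pages == 0:
              raise ValueError("site has no pages")
          return inlink_pages / site_pages

      # Hypothetical counts for a university domain.
      print(web_impact_factor(inlink_pages=1200, site_pages=15000))  # 0.08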
  6. Lawrence, S.: Online or Invisible? (2001) 0.07
    0.07086621 = product of:
      0.10629931 = sum of:
        0.0657436 = weight(_text_:search in 1063) [ClassicSimilarity], result of:
          0.0657436 = score(doc=1063,freq=12.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.37625307 = fieldWeight in 1063, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=1063)
        0.04055571 = product of:
          0.08111142 = sum of:
            0.08111142 = weight(_text_:engines in 1063) [ClassicSimilarity], result of:
              0.08111142 = score(doc=1063,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.31755137 = fieldWeight in 1063, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1063)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    The volume of scientific literature typically far exceeds the ability of scientists to identify and utilize all relevant information in their research. Improvements to the accessibility of scientific literature, allowing scientists to locate more relevant research within a given time, have the potential to dramatically improve communication and progress in science. With the web, scientists now have very convenient access to an increasing amount of literature that previously required trips to the library, inter-library loan delays, or substantial effort in locating the source. Evidence shows that usage increases when access is more convenient, and maximizing the usage of the scientific record benefits all of society. Although availability varies greatly by discipline, over a million research articles are freely available on the web. Some journals and conferences provide free access online, others allow authors to post articles on the web, and others allow authors to purchase the right to post their articles on the web. In this article we investigate the impact of free online availability by analyzing citation rates. We do not discuss methods of creating free online availability, such as time-delayed release or publication/membership/conference charges. Online availability of an article may not be expected to greatly improve access and impact by itself. For example, efficient means of locating articles via web search engines or specialized search services is required, and a substantial percentage of the literature needs to be indexed by these search services before it is worthwhile for many scientists to use them. Computer science is a forerunner in web availability -- a substantial percentage of the literature is online and available through search engines such as Google (google.com), or specialized services such as ResearchIndex (researchindex.org). Even so, the greatest impact of the online availability of computer science literature is likely yet to come, because comprehensive search services and more powerful search methods have only become available recently. We analyzed 119,924 conference articles in computer science and related disciplines, obtained from DBLP (dblp.uni-trier.de). In computer science, conference articles are typically formal publications and are often more prestigious than journal articles, with acceptance rates at some conferences below 10%. Citation counts and online availability were estimated using ResearchIndex. The analysis excludes self-citations, where a citation is considered to be a self-citation if one or more of the citing and cited authors match.
  7. Bar-Ilan, J.: ¬The Web as an information source on informetrics? : A content analysis (2000) 0.07
    0.066634305 = product of:
      0.09995145 = sum of:
        0.056935627 = weight(_text_:search in 4587) [ClassicSimilarity], result of:
          0.056935627 = score(doc=4587,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 4587, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4587)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 4587) [ClassicSimilarity], result of:
              0.08603165 = score(doc=4587,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 4587, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4587)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This article addresses the question of whether the Web can serve as an information source for research. Specifically, it analyzes by way of content analysis the Web pages retrieved by the major search engines on a particular date (June 7, 1998), as a result of the query 'informetrics OR informetric'. In 807 out of the 942 retrieved pages, the search terms were mentioned in the context of information science. Over 70% of the pages contained only indirect information on the topic, in the form of hypertext links and bibliographical references without annotation. The bibliographical references extracted from the Web pages were analyzed, and lists of most productive authors, most cited authors, works, and sources were compiled. The list of references obtained from the Web was also compared to data retrieved from commercial databases. For most cases, the list of references extracted from the Web outperformed the commercial bibliographic databases. The results of these comparisons indicate that valuable, freely available data is hidden in the Web waiting to be extracted from the millions of Web pages.
  8. Amitay, E.; Carmel, D.; Herscovici, M.; Lempel, R.; Soffer, A.: Trend detection through temporal link analysis (2004) 0.07
    0.06542733 = product of:
      0.098141 = sum of:
        0.04744636 = weight(_text_:search in 3092) [ClassicSimilarity], result of:
          0.04744636 = score(doc=3092,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.27153727 = fieldWeight in 3092, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3092)
        0.05069464 = product of:
          0.10138928 = sum of:
            0.10138928 = weight(_text_:engines in 3092) [ClassicSimilarity], result of:
              0.10138928 = score(doc=3092,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39693922 = fieldWeight in 3092, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3092)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Although time has been recognized as an important dimension in the co-citation literature, to date it has not been incorporated into the analogous process of link analysis on the Web. In this paper, we discuss several aspects and uses of the time dimension in the context of Web information retrieval. We describe the ideal case, where search engines track and store temporal data for each of the pages in their repository, assigning timestamps to the hyperlinks embedded within the pages. We introduce several applications which benefit from the availability of such timestamps. To demonstrate our claims, we use a somewhat simplistic approach, which dates links by approximating the age of the page's content. We show that by using this crude measure alone it is possible to detect and expose significant events and trends. We predict that by using more robust methods for tracking modifications in the content of pages, search engines will be able to provide results that are more timely and better reflect current real-life trends than those they provide today.
  9. Cothey, V.: Web-crawling reliability (2004) 0.06
    0.06476976 = product of:
      0.09715463 = sum of:
        0.0469695 = weight(_text_:search in 3089) [ClassicSimilarity], result of:
          0.0469695 = score(doc=3089,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.2688082 = fieldWeight in 3089, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3089)
        0.05018513 = product of:
          0.10037026 = sum of:
            0.10037026 = weight(_text_:engines in 3089) [ClassicSimilarity], result of:
              0.10037026 = score(doc=3089,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39294976 = fieldWeight in 3089, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3089)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In this article, I investigate the reliability, in the social science sense, of collecting informetric data about the World Wide Web by Web crawling. The investigation includes a critical examination of the practice of Web crawling and contrasts the results of content crawling with the results of link crawling. It is shown that Web crawling by search engines is intentionally biased and selective. I also report the results of a large-scale experimental simulation of Web crawling that illustrates the effects of different crawling policies on data collection. It is concluded that the reliability of Web crawling as a data collection technique is improved by fuller reporting of relevant crawling policies.
  10. H-Index auch im Web of Science (2008) 0.06
    0.060110323 = product of:
      0.09016548 = sum of:
        0.06973162 = weight(_text_:search in 590) [ClassicSimilarity], result of:
          0.06973162 = score(doc=590,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.39907667 = fieldWeight in 590, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=590)
        0.020433856 = product of:
          0.040867712 = sum of:
            0.040867712 = weight(_text_:22 in 590) [ClassicSimilarity], result of:
              0.040867712 = score(doc=590,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.23214069 = fieldWeight in 590, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=590)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    "Zur Kurzmitteilung "Latest enhancements in Scopus: ... h-Index incorporated in Scopus" in den letzten Online-Mitteilungen (Online-Mitteilungen 92, S.31) ist zu korrigieren, dass der h-Index sehr wohl bereits im Web of Science enthalten ist. Allerdings findet man/frau diese Information nicht in der "cited ref search", sondern neben der Trefferliste einer Quick Search, General Search oder einer Suche über den Author Finder in der rechten Navigationsleiste unter dem Titel "Citation Report". Der "Citation Report" bietet für die in der jeweiligen Trefferliste angezeigten Arbeiten: - Die Gesamtzahl der Zitierungen aller Arbeiten in der Trefferliste - Die mittlere Zitationshäufigkeit dieser Arbeiten - Die Anzahl der Zitierungen der einzelnen Arbeiten, aufgeschlüsselt nach Publikationsjahr der zitierenden Arbeiten - Die mittlere Zitationshäufigkeit dieser Arbeiten pro Jahr - Den h-Index (ein h-Index von x sagt aus, dass x Arbeiten der Trefferliste mehr als x-mal zitiert wurden; er ist gegenüber sehr hohen Zitierungen einzelner Arbeiten unempfindlicher als die mittlere Zitationshäufigkeit)."
    Date
    6. 4.2008 19:04:22
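    The h-index definition quoted in the notice above (an h-index of x means x works were cited more than x times) can be computed directly from a list of citation counts. A minimal sketch:

      def h_index(citation_counts):
          """Largest h such that h works were each cited more than h times,
          per the quoted definition (the usual convention is 'at least h')."""
          counts = sorted(citation_counts, reverse=True)
          h = 0
          while h < len(counts) and counts[h] > h + 1:
              h += 1
          return h

      print(h_index([10, 8, 5, 4, 3]))  # 3 -- 10, 8, 5 each exceed 3 citations; no 4 works exceed 4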
  11. Zhang, Y.; Jansen, B.J.; Spink, A.: Identification of factors predicting clickthrough in Web searching using neural network analysis (2009) 0.06
    0.060110323 = product of:
      0.09016548 = sum of:
        0.06973162 = weight(_text_:search in 2742) [ClassicSimilarity], result of:
          0.06973162 = score(doc=2742,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.39907667 = fieldWeight in 2742, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=2742)
        0.020433856 = product of:
          0.040867712 = sum of:
            0.040867712 = weight(_text_:22 in 2742) [ClassicSimilarity], result of:
              0.040867712 = score(doc=2742,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.23214069 = fieldWeight in 2742, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2742)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In this research, we aim to identify factors that significantly affect the clickthrough of Web searchers. Our underlying goal is to determine more efficient methods to optimize the clickthrough rate. We devise a clickthrough metric for measuring customer satisfaction of search engine results using the number of links visited, number of queries a user submits, and rank of clicked links. We use a neural network to detect the significant influence of searching characteristics on future user clickthrough. Our results show that high occurrences of query reformulation, lengthy searching duration, longer query length, and the higher ranking of prior clicked links correlate positively with future clickthrough. We provide recommendations for leveraging these findings for improving the performance of search engine retrieval and result ranking, along with implications for search engine marketing.
    Date
    22. 3.2009 17:49:11
  12. Prime-Claverie, C.; Beigbeder, M.; Lafouge, T.: Transposition of the cocitation method with a view to classifying Web pages (2004) 0.06
    0.05551693 = product of:
      0.08327539 = sum of:
        0.04025957 = weight(_text_:search in 3095) [ClassicSimilarity], result of:
          0.04025957 = score(doc=3095,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 3095, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=3095)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 3095) [ClassicSimilarity], result of:
              0.08603165 = score(doc=3095,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 3095, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3095)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The Web is a huge source of information, and one of the main problems facing users is finding documents which correspond to their requirements. Apart from the problem of thematic relevance, the documents retrieved by search engines do not always meet the users' expectations. The document may be too general, or conversely too specialized, or of a different type from what the user is looking for, and so forth. We think that adding metadata to pages can considerably improve the process of searching for information on the Web. This article presents a possible typology for Web sites and pages, as well as a method for propagating metadata values, based on the study of the Web graph and more specifically the method of cocitation in this graph.
  13. Aguillo, I.F.; Granadino, B.; Ortega, J.L.; Prieto, J.A.: Scientific research activity and communication measured with cybermetrics indicators (2006) 0.06
    0.05551693 = product of:
      0.08327539 = sum of:
        0.04025957 = weight(_text_:search in 5898) [ClassicSimilarity], result of:
          0.04025957 = score(doc=5898,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 5898, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=5898)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 5898) [ClassicSimilarity], result of:
              0.08603165 = score(doc=5898,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 5898, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5898)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    To test feasibility of cybermetric indicators for describing and ranking university activities as shown in their Web sites, a large set of 9,330 institutions worldwide was compiled and analyzed. Using search engines' advanced features, size (number of pages), visibility (number of external inlinks), and number of rich files (pdf, ps, doc, ppt, and xls formats) were obtained for each of the institutional domains of the universities. We found a statistically significant correlation between a Web ranking built on a combination of Webometric data and other university rankings based on bibliometric and other indicators. Results show that cybermetric measures could be useful for reflecting the contribution of technologically oriented institutions, increasing the visibility of developing countries, and improving the rankings based on Science Citation Index (SCI) data with known biases.
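    A sketch of how the three indicators named in the abstract (size, visibility, rich files) might be combined into a single ranking. Both the weights and the log scaling below are hypothetical placeholders; the abstract does not state how the authors actually combined the measures:

      import math

      def cybermetric_score(pages, inlinks, rich_files,
                            weights=(0.25, 0.5, 0.25)):
          # Log scaling keeps one very large domain from dominating the score.
          parts = (math.log1p(pages), math.log1p(inlinks), math.log1p(rich_files))
          return sum(w * p for w, p in zip(weights, parts))

      # Invented counts for two fictional institutional domains.
      domains = {
          "uni-a.example": (120000, 45000, 3200),
          "uni-b.example": (30000, 90000, 900),
      }
      ranking = sorted(domains, key=lambda d: cybermetric_score(*domains[d]),
                       reverse=True)
      print(ranking)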
  14. Thelwall, M.: Webometrics (2009) 0.06
    0.05551693 = product of:
      0.08327539 = sum of:
        0.04025957 = weight(_text_:search in 3906) [ClassicSimilarity], result of:
          0.04025957 = score(doc=3906,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 3906, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=3906)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 3906) [ClassicSimilarity], result of:
              0.08603165 = score(doc=3906,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 3906, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3906)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Webometrics is an information science field concerned with measuring aspects of the World Wide Web (WWW) for a variety of information science research goals. It came into existence about five years after the Web was formed and has since grown to become a significant aspect of information science, at least in terms of published research. Although some webometrics research has focused on the structure or evolution of the Web itself or the performance of commercial search engines, most has used data from the Web to shed light on information provision or online communication in various contexts. Most prominently, techniques have been developed to track, map, and assess Web-based informal scholarly communication, for example, in terms of the hyperlinks between academic Web sites or the online impact of digital repositories. In addition, a range of nonacademic issues and groups of Web users have also been analyzed.
  15. Thelwall, M.: Conceptualizing documentation on the Web : an evaluation of different heuristic-based models for counting links between university Web sites (2002) 0.05
    0.04626411 = product of:
      0.06939616 = sum of:
        0.03354964 = weight(_text_:search in 978) [ClassicSimilarity], result of:
          0.03354964 = score(doc=978,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.19200584 = fieldWeight in 978, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=978)
        0.03584652 = product of:
          0.07169304 = sum of:
            0.07169304 = weight(_text_:engines in 978) [ClassicSimilarity], result of:
              0.07169304 = score(doc=978,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.2806784 = fieldWeight in 978, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=978)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    All known previous Web link studies have used the Web page as the primary indivisible source document for counting purposes. Arguments are presented to explain why this is not necessarily optimal and why other alternatives have the potential to produce better results. This is despite the fact that individual Web files are often the only choice if search engines are used for raw data and are the easiest basic Web unit to identify. The central issue is that of defining the Web "document": that which should comprise the single indissoluble unit of coherent material. Three alternative heuristics are defined for the educational arena based upon the directory, the domain and the whole university site. These are then compared by implementing them on a set of 108 UK university institutional Web sites under the assumption that a more effective heuristic will tend to produce results that correlate more highly with institutional research productivity. It was discovered that the domain and directory models were able to successfully reduce the impact of anomalous linking behavior between pairs of Web sites, with the latter being the method of choice. Reasons are then given as to why a document model on its own cannot eliminate all anomalies in Web linking behavior. Finally, the results from all models give a clear confirmation of the very strong association between the research productivity of a UK university and the number of incoming links from its peers' Web sites.
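    A sketch of the counting heuristics the abstract compares, assuming each model simply truncates a URL to its counting unit before distinct source-target pairs are counted (the paper's exact normalization rules are not reproduced here):

      from urllib.parse import urlparse

      def unit(url, model):
          """Collapse a URL to its counting unit under a given document model."""
          p = urlparse(url)
          if model == "page":
              return p.netloc + p.path
          if model == "directory":
              return p.netloc + p.path.rsplit("/", 1)[0]  # drop the file name
          if model == "domain":
              return p.netloc
          if model == "site":
              # Crude site heuristic: keep the last three labels,
              # e.g. 'www.scit.wlv.ac.uk' -> 'wlv.ac.uk'.
              return ".".join(p.netloc.split(".")[-3:])
          raise ValueError(model)

      def count_links(links, model):
          """Count distinct (source unit, target unit) pairs, not raw links."""
          return len({(unit(s, model), unit(t, model)) for s, t in links})

      # Three raw page-to-page links collapse to one link under the
      # directory, domain, and site models, damping anomalous repetition.
      links = [
          ("http://www.uni-a.ac.uk/dept/a.html", "http://www.uni-b.ac.uk/x.html"),
          ("http://www.uni-a.ac.uk/dept/b.html", "http://www.uni-b.ac.uk/y.html"),
          ("http://www.uni-a.ac.uk/dept/c.html", "http://www.uni-b.ac.uk/y.html"),
      ]
      for m in ("page", "directory", "domain", "site"):
          print(m, count_links(links, m))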
  16. Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.05
    0.04626411 = product of:
      0.06939616 = sum of:
        0.03354964 = weight(_text_:search in 4279) [ClassicSimilarity], result of:
          0.03354964 = score(doc=4279,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.19200584 = fieldWeight in 4279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
        0.03584652 = product of:
          0.07169304 = sum of:
            0.07169304 = weight(_text_:engines in 4279) [ClassicSimilarity], result of:
              0.07169304 = score(doc=4279,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.2806784 = fieldWeight in 4279, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4279)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
  17. Hayer, L.: Lazarsfeld zitiert : eine bibliometrische Analyse (2008) 0.03
    0.03371857 = product of:
      0.050577857 = sum of:
        0.03354964 = weight(_text_:search in 1934) [ClassicSimilarity], result of:
          0.03354964 = score(doc=1934,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.19200584 = fieldWeight in 1934, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1934)
        0.017028214 = product of:
          0.03405643 = sum of:
            0.03405643 = weight(_text_:22 in 1934) [ClassicSimilarity], result of:
              0.03405643 = score(doc=1934,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.19345059 = fieldWeight in 1934, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1934)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    To approach an answer to the question of what significance the papers left behind by a scholar such as Paul F. Lazarsfeld (including numerous still unpublished writings) might hold for current research, one can examine how frequently that scholar is cited. An author who is cited is also being used; if he is used frequently over a long period, engaging with his papers is presumably also of use. The citations also reveal which parts of a scholar's life's work appear relevant to current research, from which the most pressing questions for working through the papers can be derived. The task for the following study was therefore: how often is Paul F. Lazarsfeld cited? Of additional interest: who cites him, and where? The study was carried out using the meta-database "ISI Web of Knowledge". Within it, the "Web of Science" was searched with the "Cited Reference Search" tool for the cited author "Lazarsfeld P*". This search yielded 1535 references; selecting all of them leads to 4839 results. The databases SCI-Expanded, SSCI, and A&HCI were used, and the publication years 1941-2008 were analyzed. Before 1956, however, only very few citations were found: five in 1946, otherwise at most three, and none at all in 1942-1944 and 1949. Moreover, the year 2008 is far from over. (Yet there were already 24 citations before the end of March!)
    Date
    22. 6.2008 12:54:12
  18. Hood, W.W.; Wilson, C.S.: ¬The scatter of documents over databases in different subject domains : how many databases are needed? (2001) 0.03
    0.025006426 = product of:
      0.07501928 = sum of:
        0.07501928 = weight(_text_:search in 6936) [ClassicSimilarity], result of:
          0.07501928 = score(doc=6936,freq=10.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.4293381 = fieldWeight in 6936, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6936)
      0.33333334 = coord(1/3)
    
    Abstract
    The distribution of bibliographic records in on-line bibliographic databases is examined using 14 different search topics. These topics were searched using the DIALOG database host, and using as many suitable databases as possible. The presence of duplicate records in the searches was taken into consideration in the analysis, and the problem with lexical ambiguity in at least one search topic is discussed. The study answers questions such as how many databases are needed in a multifile search for particular topics, and what coverage will be achieved using a certain number of databases. The distribution of the percentages of records retrieved over a number of databases for 13 of the 14 search topics roughly fell into three groups: (1) high concentration of records in one database with about 80% coverage in five to eight databases; (2) moderate concentration in one database with about 80% coverage in seven to 10 databases; and (3) low concentration in one database with about 80% coverage in 16 to 19 databases. The study does conform with earlier results, but shows that the number of databases needed for searches with varying complexities of search strategies is much more topic dependent than previous studies would indicate.
  19. Brooks, T.A.: How good are the best papers of JASIS? (2000) 0.02
    0.018978544 = product of:
      0.056935627 = sum of:
        0.056935627 = weight(_text_:search in 4593) [ClassicSimilarity], result of:
          0.056935627 = score(doc=4593,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 4593, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4593)
      0.33333334 = coord(1/3)
    
    Content
    Top by numbers of citations: (1) Saracevic, T. et al.: A study of information seeking and retrieving I-III (1988); (2) Bates, M.: Information search tactics (1979); (3) Cooper, W.S.: On selecting a measure of retrieval effectiveness (1973); (4) Marcus, R.S.: An experimental comparison of the effectiveness of computers and humans as search intermediaries (1983); (4) Fidel, R.: Online searching styles (1984)
  20. Walters, W.H.: Google Scholar coverage of a multidisciplinary field (2007) 0.02
    0.018978544 = product of:
      0.056935627 = sum of:
        0.056935627 = weight(_text_:search in 928) [ClassicSimilarity], result of:
          0.056935627 = score(doc=928,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 928, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=928)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper evaluates the content of Google Scholar and seven other databases (Academic Search Elite, AgeLine, ArticleFirst, GEOBASE, POPLINE, Social Sciences Abstracts, and Social Sciences Citation Index) within the multidisciplinary subject area of later-life migration. Each database is evaluated with reference to a set of 155 core articles selected in advance: the most important studies of later-life migration published from 1990 to 2000. Of the eight databases, Google Scholar indexes the greatest number of core articles (93%) and provides the most uniform publisher and date coverage. It covers 27% more core articles than the second-ranked database (SSCI) and 2.4 times as many as the lowest-ranked database (GEOBASE). At the same time, a substantial proportion of the citations provided by Google Scholar are incomplete (32%) or presented without abstracts (33%).
    Object
    Academic Search Elite

Languages

  • e 63
  • d 5

Types

  • a 67
  • el 2