Search (140 results, page 1 of 7)

  • theme_ss:"Informetrie"
  1. Bar-Ilan, J.: On the overlap, the precision and estimated recall of search engines : a case study of the query 'Erdös' (1998) 0.05
    0.050969247 = product of:
      0.10193849 = sum of:
        0.10193849 = product of:
          0.20387699 = sum of:
            0.20387699 = weight(_text_:engines in 3753) [ClassicSimilarity], result of:
              0.20387699 = score(doc=3753,freq=8.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.7858995 = fieldWeight in 3753, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3753)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Investigates the retrieval capabilities of six Internet search engines on a simple query. Existing work on search engine evaluation considers only the first 10 or 20 results returned by the search engine. In this work, all documents that the search engine pointed at were retrieved and thoroughly examined. Thus the precision of the whole retrieval process could be calculated, the overlap between the results of the engines studied, and an estimate of the recall of the searches given. The precision of the engines is high, recall is very low, and the overlap is minimal.
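The nested "product of / sum of" tree above is Lucene's ClassicSimilarity explain output for the term `engines` in this record: the leaf score is the queryWeight (idf × queryNorm) times the fieldWeight (tf × idf × fieldNorm), with the two coord(1/2) factors applied on top. A minimal sketch recomputing the displayed numbers, assuming Python purely as an illustration language (all input values are taken from the tree above):

```python
import math

def classic_similarity(freq, idf, field_norm, query_norm, coord=1.0):
    """Recompute a Lucene ClassicSimilarity leaf score from explain-tree values."""
    tf = math.sqrt(freq)                  # tf(freq=8.0) = 2.828427 in the tree
    query_weight = idf * query_norm       # 0.25941864 in the tree
    field_weight = tf * idf * field_norm  # 0.7858995  in the tree
    return query_weight * field_weight * coord

# Values from the 'engines' weight in doc 3753; the two coord(1/2)
# factors multiply to 0.25.
score = classic_similarity(freq=8.0, idf=5.080822,
                           field_norm=0.0546875,
                           query_norm=0.051058397, coord=0.25)
```

Rounded to two decimals this reproduces the 0.05 shown next to the title; the same arithmetic applies to every explain tree on this page.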
  2. Thelwall, M.: Web impact factors and search engine coverage (2000) 0.04
    0.041189373 = product of:
      0.082378745 = sum of:
        0.082378745 = product of:
          0.16475749 = sum of:
            0.16475749 = weight(_text_:engines in 4539) [ClassicSimilarity], result of:
              0.16475749 = score(doc=4539,freq=4.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.63510275 = fieldWeight in 4539, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4539)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Search engines index only a proportion of the web and this proportion is not determined randomly but by following algorithms that take into account the properties that impact factors measure. A survey was conducted in order to test the coverage of search engines and to decide whether their partial coverage is indeed an obstacle to using them to calculate web impact factors. The results indicate that search engine coverage, even of large national domains, is extremely uneven and would be likely to lead to misleading calculations.
  3. Thelwall, M.: Quantitative comparisons of search engine results (2008) 0.04
    0.036406603 = product of:
      0.072813205 = sum of:
        0.072813205 = product of:
          0.14562641 = sum of:
            0.14562641 = weight(_text_:engines in 2350) [ClassicSimilarity], result of:
              0.14562641 = score(doc=2350,freq=8.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.5613568 = fieldWeight in 2350, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2350)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Search engines are normally used to find information or Web sites, but Webometric investigations use them for quantitative data such as the number of pages matching a query and the international spread of those pages. For this type of application, the accuracy of the hit count estimates and range of URLs in the full results are important. Here, we compare the applications programming interfaces of Google, Yahoo!, and Live Search for 1,587 single word searches. The hit count estimates were broadly consistent, but with Yahoo! and Google reporting 5-6 times more hits than Live Search. Yahoo! tended to return slightly more matching URLs than Google, with Live Search returning significantly fewer. Yahoo!'s result URLs included a significantly wider range of domains and sites than the other two, and there was little consistency between the three engines in the number of different domains. In contrast, the three engines were reasonably consistent in the number of different top-level domains represented in the result URLs, although Yahoo! tended to return the most. In conclusion, quantitative results from the three search engines are mostly consistent but with unexpected types of inconsistency that users should be aware of. Google is recommended for hit count estimates but Yahoo! is recommended for all other Webometric purposes.
  4. Herring, S.D.: The value of interdisciplinarity : a study based on the design of Internet search engines (1999) 0.03
    0.031529047 = product of:
      0.06305809 = sum of:
        0.06305809 = product of:
          0.12611619 = sum of:
            0.12611619 = weight(_text_:engines in 3458) [ClassicSimilarity], result of:
              0.12611619 = score(doc=3458,freq=6.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.4861493 = fieldWeight in 3458, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3458)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Continued development of the Internet requires the development of efficient, easy-to-use search engines. Ideally, such development should call upon knowledge and skills from a variety of disciplines, including computer science, information science, psychology, and ergonomics. The current study is intended to determine whether search engine design shows a pattern of interdisciplinarity. Two disciplines were selected as the focus for the study: computer science, and library/information science. A citation analysis was conducted to measure levels of interdisciplinary research and publishing in Internet search engine design and development. The results show a higher level of interdisciplinarity among library and information scientists than among computer scientists or among any of those categorized as 'other'. This is reflected both in the types of journals in which the authors publish, and in the references they cite to support their work. However, almost no authors published articles or cited references in fields such as cognitive science, ergonomics, or psychology. The results of this study are analyzed in terms of the writings of Patrick Wilson, Bruno Latour, Pierre Bourdieu, Fritz Ringer, and Thomas Pinelli, focusing on cognitive authority within a profession, interaction between disciplines, and information-gathering habits of professionals. Suggestions for further research are given.
  5. Thelwall, M.: Results from a web impact factor crawler (2001) 0.03
    0.031529047 = product of:
      0.06305809 = sum of:
        0.06305809 = product of:
          0.12611619 = sum of:
            0.12611619 = weight(_text_:engines in 4490) [ClassicSimilarity], result of:
              0.12611619 = score(doc=4490,freq=6.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.4861493 = fieldWeight in 4490, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4490)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Web impact factors (WIFs), the proposed web equivalent of impact factors for journals, can be calculated by using search engines. It has been found that the results are problematic because of the variable coverage of search engines, as well as their ability to give significantly different results over short periods of time. The fundamental problem is that although some search engines provide functionality that is capable of being used for impact calculations, this is not their primary task and therefore they do not give guarantees as to performance in this respect. In this paper, a bespoke web crawler designed specifically for the calculation of reliable WIFs is presented. This crawler was used to calculate WIFs for a number of UK universities, and the results of these calculations are discussed. The principal findings were that, with certain restrictions, WIFs can be calculated reliably, but do not correlate with accepted research rankings owing to the variety of material hosted on university servers. Changes to the calculations to improve the fit of the results to research rankings are proposed, but there are still inherent problems undermining the reliability of the calculation. These problems still apply if the WIF scores are taken on their own as indicators of the general impact of any area of the Internet, but with care would not apply to online journals.
  6. Nicholls, P.T.: Empirical validation of Lotka's law (1986) 0.03
    0.027670832 = product of:
      0.055341665 = sum of:
        0.055341665 = product of:
          0.11068333 = sum of:
            0.11068333 = weight(_text_:22 in 5509) [ClassicSimilarity], result of:
              0.11068333 = score(doc=5509,freq=2.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.61904186 = fieldWeight in 5509, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=5509)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 22(1986), S.417-419
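The idf values appearing in these score trees follow ClassicSimilarity's formula idf = 1 + ln(maxDocs / (docFreq + 1)). A quick sanity check against the two terms seen on this page, the term "22" (docFreq=3622) and "engines" (docFreq=746), both over maxDocs=44218; Python is assumed here purely for illustration:

```python
import math

def classic_idf(doc_freq, max_docs):
    """Lucene ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

idf_22 = classic_idf(3622, 44218)      # ~3.5018296, as shown in the tree above
idf_engines = classic_idf(746, 44218)  # ~5.080822, as in the 'engines' trees
```

This confirms that the two queryWeight branches on this page differ only because of the terms' document frequencies; the shared queryNorm (0.051058397) is identical in every tree.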
  7. Nicolaisen, J.: Citation analysis (2007) 0.03
    0.027670832 = product of:
      0.055341665 = sum of:
        0.055341665 = product of:
          0.11068333 = sum of:
            0.11068333 = weight(_text_:22 in 6091) [ClassicSimilarity], result of:
              0.11068333 = score(doc=6091,freq=2.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.61904186 = fieldWeight in 6091, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=6091)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    13. 7.2008 19:53:22
  8. Fiala, J.: Information flood : fiction and reality (1987) 0.03
    0.027670832 = product of:
      0.055341665 = sum of:
        0.055341665 = product of:
          0.11068333 = sum of:
            0.11068333 = weight(_text_:22 in 1080) [ClassicSimilarity], result of:
              0.11068333 = score(doc=1080,freq=2.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.61904186 = fieldWeight in 1080, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=1080)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Thermochimica acta. 110(1987), S.11-22
  9. Jepsen, E.T.; Seiden, P.; Ingwersen, P.; Björneborn, L.; Borlund, P.: Characteristics of scientific Web publications : preliminary data gathering and analysis (2004) 0.03
    0.025743358 = product of:
      0.051486716 = sum of:
        0.051486716 = product of:
          0.10297343 = sum of:
            0.10297343 = weight(_text_:engines in 3091) [ClassicSimilarity], result of:
              0.10297343 = score(doc=3091,freq=4.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.39693922 = fieldWeight in 3091, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3091)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Because of the increasing presence of scientific publications on the Web, combined with the existing difficulties in easily verifying and retrieving these publications, research on techniques and methods for retrieval of scientific Web publications is called for. In this article, we report on the initial steps taken toward the construction of a test collection of scientific Web publications within the subject domain of plant biology. The steps reported are those of data gathering and data analysis aiming at identifying characteristics of scientific Web publications. The data used in this article were generated based on specifically selected domain topics that are searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both AltaVista and AllTheWeb retrieved a higher degree of accessible scientific content than Google. Because of the search engine cutoffs of accessible URLs, the feasibility of using search engine output for Web content analysis is also discussed.
  10. Amitay, E.; Carmel, D.; Herscovici, M.; Lempel, R.; Soffer, A.: Trend detection through temporal link analysis (2004) 0.03
    0.025743358 = product of:
      0.051486716 = sum of:
        0.051486716 = product of:
          0.10297343 = sum of:
            0.10297343 = weight(_text_:engines in 3092) [ClassicSimilarity], result of:
              0.10297343 = score(doc=3092,freq=4.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.39693922 = fieldWeight in 3092, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3092)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Although time has been recognized as an important dimension in the co-citation literature, to date it has not been incorporated into the analogous process of link analysis on the Web. In this paper, we discuss several aspects and uses of the time dimension in the context of Web information retrieval. We describe the ideal case, where search engines track and store temporal data for each of the pages in their repository, assigning timestamps to the hyperlinks embedded within the pages. We introduce several applications which benefit from the availability of such timestamps. To demonstrate our claims, we use a somewhat simplistic approach, which dates links by approximating the age of the page's content. We show that by using this crude measure alone it is possible to detect and expose significant events and trends. We predict that by using more robust methods for tracking modifications in the content of pages, search engines will be able to provide results that are more timely and better reflect current real-life trends than those they provide today.
  11. Bhavnani, S.K.: Why is it difficult to find comprehensive information? : implications of information scatter for search and design (2005) 0.03
    0.025743358 = product of:
      0.051486716 = sum of:
        0.051486716 = product of:
          0.10297343 = sum of:
            0.10297343 = weight(_text_:engines in 3684) [ClassicSimilarity], result of:
              0.10297343 = score(doc=3684,freq=4.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.39693922 = fieldWeight in 3684, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3684)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The rapid development of Web sites providing extensive coverage of a topic, coupled with the development of powerful search engines (designed to help users find such Web sites), suggests that users can easily find comprehensive information about a topic. In domains such as consumer healthcare, finding comprehensive information about a topic is critical as it can improve a patient's judgment in making healthcare decisions, and can encourage higher compliance with treatment. However, recent studies show that despite using powerful search engines, many healthcare information seekers have difficulty finding comprehensive information even for narrow healthcare topics because the relevant information is scattered across many Web sites. To date, no studies have analyzed how facts related to a search topic are distributed across relevant Web pages and Web sites. In this study, the distribution of facts related to five common healthcare topics across high-quality sites is analyzed, and the reasons underlying those distributions are explored. The analysis revealed the existence of few pages that had many facts, many pages that had few facts, and no single page or site that provided all the facts. While such a distribution conforms to other information-related phenomena, a deeper analysis revealed that the distributions were caused by a trade-off between depth and breadth, leading to the existence of general, specialized, and sparse pages. Furthermore, the results helped to make explicit the knowledge needed by searchers to find comprehensive healthcare information, and suggested the motivation to explore distribution-conscious approaches for the development of future search systems, search interfaces, Web page designs, and training.
  12. Thelwall, M.: A comparison of link and URL citation counting (2011) 0.03
    0.025743358 = product of:
      0.051486716 = sum of:
        0.051486716 = product of:
          0.10297343 = sum of:
            0.10297343 = weight(_text_:engines in 4533) [ClassicSimilarity], result of:
              0.10297343 = score(doc=4533,freq=4.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.39693922 = fieldWeight in 4533, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4533)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Purpose - Link analysis is an established topic within webometrics. It normally uses counts of links between sets of web sites or to sets of web sites. These link counts are derived from web crawlers or commercial search engines with the latter being the only alternative for some investigations. This paper compares link counts with URL citation counts in order to assess whether the latter could be a replacement for the former if the major search engines withdraw their advanced hyperlink search facilities. Design/methodology/approach - URL citation counts are compared with link counts for a variety of data sets used in previous webometric studies. Findings - The results show a high degree of correlation between the two but with URL citations being much less numerous, at least outside academia and business. Research limitations/implications - The results cover a small selection of 15 case studies and so the findings are only indicative. Significant differences between results indicate that the difference between link counts and URL citation counts will vary between webometric studies. Practical implications - Should link searches be withdrawn, then link analyses of less well linked non-academic, non-commercial sites would be seriously weakened, although citations based on e-mail addresses could help to make citations more numerous than links for some business and academic contexts. Originality/value - This is the first systematic study of the difference between link counts and URL citation counts in a variety of contexts and it shows that there are significant differences between the two.
  13. Cothey, V.: Web-crawling reliability (2004) 0.03
    0.025484623 = product of:
      0.050969247 = sum of:
        0.050969247 = product of:
          0.10193849 = sum of:
            0.10193849 = weight(_text_:engines in 3089) [ClassicSimilarity], result of:
              0.10193849 = score(doc=3089,freq=2.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.39294976 = fieldWeight in 3089, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3089)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this article, I investigate the reliability, in the social science sense, of collecting informetric data about the World Wide Web by Web crawling. The investigation includes a critical examination of the practice of Web crawling and contrasts the results of content crawling with the results of link crawling. It is shown that Web crawling by search engines is intentionally biased and selective. I also report the results of a large-scale experimental simulation of Web crawling that illustrates the effects of different crawling policies on data collection. It is concluded that the reliability of Web crawling as a data collection technique is improved by fuller reporting of relevant crawling policies.
  14. Su, Y.; Han, L.-F.: ¬A new literature growth model : variable exponential growth law of literature (1998) 0.02
    0.024457792 = product of:
      0.048915584 = sum of:
        0.048915584 = product of:
          0.09783117 = sum of:
            0.09783117 = weight(_text_:22 in 3690) [ClassicSimilarity], result of:
              0.09783117 = score(doc=3690,freq=4.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.54716086 = fieldWeight in 3690, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3690)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 5.1999 19:22:35
  15. Van der Veer Martens, B.: Do citation systems represent theories of truth? (2001) 0.02
    0.024457792 = product of:
      0.048915584 = sum of:
        0.048915584 = product of:
          0.09783117 = sum of:
            0.09783117 = weight(_text_:22 in 3925) [ClassicSimilarity], result of:
              0.09783117 = score(doc=3925,freq=4.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.54716086 = fieldWeight in 3925, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3925)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 15:22:28
  16. Diodato, V.: Dictionary of bibliometrics (1994) 0.02
    0.024211979 = product of:
      0.048423957 = sum of:
        0.048423957 = product of:
          0.096847914 = sum of:
            0.096847914 = weight(_text_:22 in 5666) [ClassicSimilarity], result of:
              0.096847914 = score(doc=5666,freq=2.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.5416616 = fieldWeight in 5666, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=5666)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    Rez. in: Journal of library and information science 22(1996) no.2, S.116-117 (L.C. Smith)
  17. Bookstein, A.: Informetric distributions : I. Unified overview (1990) 0.02
    0.024211979 = product of:
      0.048423957 = sum of:
        0.048423957 = product of:
          0.096847914 = sum of:
            0.096847914 = weight(_text_:22 in 6902) [ClassicSimilarity], result of:
              0.096847914 = score(doc=6902,freq=2.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.5416616 = fieldWeight in 6902, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6902)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 18:55:29
  18. Bookstein, A.: Informetric distributions : II. Resilience to ambiguity (1990) 0.02
    0.024211979 = product of:
      0.048423957 = sum of:
        0.048423957 = product of:
          0.096847914 = sum of:
            0.096847914 = weight(_text_:22 in 4689) [ClassicSimilarity], result of:
              0.096847914 = score(doc=4689,freq=2.0), product of:
                0.17879781 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051058397 = queryNorm
                0.5416616 = fieldWeight in 4689, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4689)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 18:55:55
  19. Bar-Ilan, J.: The Web as an information source on informetrics? : A content analysis (2000) 0.02
    0.021843962 = product of:
      0.043687925 = sum of:
        0.043687925 = product of:
          0.08737585 = sum of:
            0.08737585 = weight(_text_:engines in 4587) [ClassicSimilarity], result of:
              0.08737585 = score(doc=4587,freq=2.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.33681408 = fieldWeight in 4587, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4587)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article addresses the question of whether the Web can serve as an information source for research. Specifically, it analyzes by way of content analysis the Web pages retrieved by the major search engines on a particular date (June 7, 1998), as a result of the query 'informetrics OR informetric'. In 807 out of the 942 retrieved pages, the search terms were mentioned in the context of information science. Over 70% of the pages contained only indirect information on the topic, in the form of hypertext links and bibliographical references without annotation. The bibliographical references extracted from the Web pages were analyzed, and lists of most productive authors, most cited authors, works, and sources were compiled. The list of references obtained from the Web was also compared to data retrieved from commercial databases. For most cases, the list of references extracted from the Web outperformed the commercial, bibliographic databases. The results of these comparisons indicate that valuable, freely available data is hidden in the Web waiting to be extracted from the millions of Web pages.
  20. Prime-Claverie, C.; Beigbeder, M.; Lafouge, T.: Transposition of the cocitation method with a view to classifying Web pages (2004) 0.02
    0.021843962 = product of:
      0.043687925 = sum of:
        0.043687925 = product of:
          0.08737585 = sum of:
            0.08737585 = weight(_text_:engines in 3095) [ClassicSimilarity], result of:
              0.08737585 = score(doc=3095,freq=2.0), product of:
                0.25941864 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.051058397 = queryNorm
                0.33681408 = fieldWeight in 3095, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3095)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The Web is a huge source of information, and one of the main problems facing users is finding documents which correspond to their requirements. Apart from the problem of thematic relevance, the documents retrieved by search engines do not always meet the users' expectations. The document may be too general, or conversely too specialized, or of a different type from what the user is looking for, and so forth. We think that adding metadata to pages can considerably improve the process of searching for information on the Web. This article presents a possible typology for Web sites and pages, as well as a method for propagating metadata values, based on the study of the Web graph and more specifically the method of cocitation in this graph.

Languages

  • e 131
  • d 8
  • ro 1

Types

  • a 137
  • m 3
  • el 1
  • s 1