Search (752 results, page 2 of 38)

  • theme_ss:"Internet"
  1. Kennedy, S.D.: How to find subjects and subject experts (1996) 0.09
    0.088845745 = product of:
      0.13326861 = sum of:
        0.075914174 = weight(_text_:search in 4531) [ClassicSimilarity], result of:
          0.075914174 = score(doc=4531,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.43445963 = fieldWeight in 4531, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=4531)
        0.057354435 = product of:
          0.11470887 = sum of:
            0.11470887 = weight(_text_:engines in 4531) [ClassicSimilarity], result of:
              0.11470887 = score(doc=4531,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.44908544 = fieldWeight in 4531, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4531)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Discusses mailing list archives on the Internet and the use of the Listserv and Listproc mailing list management programs to access them. Describes how to search the archives of Listserv mailing lists through e-mail. Offers useful sites from which to start exploring electronic mailing lists. Covers sites for special file collections, insider lists, and shareware search engines.
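    The indented tree above each hit is Lucene ClassicSimilarity "explain" output. As a worked illustration (a sketch, not the retrieval system's own code), the following Python snippet recomputes the 0.088845745 score of this first hit from the statistics shown, using idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(termFreq), with the coord factors 1/2 and 2/3 for partially matched query clauses.

      import math

      def idf(doc_freq, max_docs):
          # ClassicSimilarity inverse document frequency
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          term_idf = idf(doc_freq, max_docs)                       # ~3.475677 for "search"
          query_weight = term_idf * query_norm                     # ~0.1747324
          field_weight = math.sqrt(freq) * term_idf * field_norm   # ~0.43445963
          return query_weight * field_weight

      QUERY_NORM, MAX_DOCS, FIELD_NORM = 0.05027291, 44218, 0.0625
      search_part = term_score(4.0, 3718, MAX_DOCS, QUERY_NORM, FIELD_NORM)
      engines_part = term_score(2.0, 746, MAX_DOCS, QUERY_NORM, FIELD_NORM)
      # "engines" matched 1 of 2 parts of its sub-clause (coord 1/2);
      # overall, 2 of 3 top-level query clauses matched (coord 2/3).
      score = (search_part + 0.5 * engines_part) * (2.0 / 3.0)
      print(score)  # ~0.0888457; Lucene shows 0.088845745 (float32 rounding differs slightly)
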
  2. Notess, G.R.: Mega-searching from the desktop (1997) 0.09
    0.088845745 = product of:
      0.13326861 = sum of:
        0.075914174 = weight(_text_:search in 433) [ClassicSimilarity], result of:
          0.075914174 = score(doc=433,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.43445963 = fieldWeight in 433, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=433)
        0.057354435 = product of:
          0.11470887 = sum of:
            0.11470887 = weight(_text_:engines in 433) [ClassicSimilarity], result of:
              0.11470887 = score(doc=433,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.44908544 = fieldWeight in 433, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=433)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Internet software vendors are now offering commercial products that not only query multiple search engines simultaneously but also allow the user to sort results, remove duplicates, and verify the availability of links. They run from the user's own computer. Evaluates the search capabilities, database coverage, post-processing, sorting, access limitations, and other problems of Internet Fast Find from Symantec, EchoSearch from Iconovex, WebCompass from Quarterdeck, and WebSeeker from the ForeFront Group.
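    The workflow these desktop tools automate can be sketched in a few lines: query several engines concurrently, merge and de-duplicate the hits by URL, and verify that each link is still reachable. In the sketch below the two engine back-ends are hypothetical stand-ins, not the products' actual interfaces.

      from concurrent.futures import ThreadPoolExecutor
      from urllib.request import Request, urlopen

      def query_engine_a(q):  # hypothetical back-end
          return [("Result 1", "http://example.com/1"), ("Result 2", "http://example.com/2")]

      def query_engine_b(q):  # hypothetical back-end
          return [("Result 2", "http://example.com/2"), ("Result 3", "http://example.com/3")]

      def link_is_alive(url, timeout=5):
          # Verify availability with a HEAD request.
          try:
              with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
                  return resp.status < 400
          except Exception:
              return False

      def metasearch(q):
          backends = [query_engine_a, query_engine_b]
          with ThreadPoolExecutor() as pool:          # query the engines simultaneously
              hits = [h for batch in pool.map(lambda f: f(q), backends) for h in batch]
          seen, merged = set(), []
          for title, url in hits:                     # remove duplicates by URL
              if url not in seen:
                  seen.add(url)
                  merged.append((title, url))
          return [(t, u) for t, u in sorted(merged) if link_is_alive(u)]
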
  3. Ardito, S.C.: ¬The Internet : beginning or end of organized information? (1998) 0.09
    0.088845745 = product of:
      0.13326861 = sum of:
        0.075914174 = weight(_text_:search in 1664) [ClassicSimilarity], result of:
          0.075914174 = score(doc=1664,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.43445963 = fieldWeight in 1664, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=1664)
        0.057354435 = product of:
          0.11470887 = sum of:
            0.11470887 = weight(_text_:engines in 1664) [ClassicSimilarity], result of:
              0.11470887 = score(doc=1664,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.44908544 = fieldWeight in 1664, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1664)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Many information professionals still seem loath to conduct searches on the Internet, preferring instead to continue using commercial, proprietary systems. Compares the characteristics and advantages of search strategies for traditional databases with those for the Internet. Discusses future developments in Internet search engines and concludes that the merger of commercial database expertise with Internet technology and accessibility will enrich and simplify the end user's experience.
  4. Chowdhury, G.G.: ¬The Internet and information retrieval research : a brief review (1999) 0.09
    0.088845745 = product of:
      0.13326861 = sum of:
        0.075914174 = weight(_text_:search in 3424) [ClassicSimilarity], result of:
          0.075914174 = score(doc=3424,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.43445963 = fieldWeight in 3424, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=3424)
        0.057354435 = product of:
          0.11470887 = sum of:
            0.11470887 = weight(_text_:engines in 3424) [ClassicSimilarity], result of:
              0.11470887 = score(doc=3424,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.44908544 = fieldWeight in 3424, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3424)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The Internet and related information services attract increasing interest from information retrieval researchers. A survey of recent publications shows that frequent topics are the effectiveness of search engines, information validation and quality, user studies, design of user interfaces, data structures and metadata, classification and vocabulary-based aids, and indexing and search agents. Current research in these areas is briefly discussed. The changing balance between CD-ROM sources and traditional online searching is also noted as an important development.
  5. Internet searching and indexing : the subject approach (2000) 0.09
    0.088845745 = product of:
      0.13326861 = sum of:
        0.075914174 = weight(_text_:search in 1468) [ClassicSimilarity], result of:
          0.075914174 = score(doc=1468,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.43445963 = fieldWeight in 1468, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=1468)
        0.057354435 = product of:
          0.11470887 = sum of:
            0.11470887 = weight(_text_:engines in 1468) [ClassicSimilarity], result of:
              0.11470887 = score(doc=1468,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.44908544 = fieldWeight in 1468, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1468)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This comprehensive volume offers usable information for people at all levels of Internet savvy. It can teach librarians, students, and patrons how to search the Internet more systematically. It also helps information professionals design more efficient, effective search engines and Web pages.
  6. Notess, G.R.: Searching the hidden Internet (1997) 0.09
    0.08769246 = product of:
      0.13153869 = sum of:
        0.08135357 = weight(_text_:search in 4802) [ClassicSimilarity], result of:
          0.08135357 = score(doc=4802,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.46558946 = fieldWeight in 4802, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4802)
        0.05018513 = product of:
          0.10037026 = sum of:
            0.10037026 = weight(_text_:engines in 4802) [ClassicSimilarity], result of:
              0.10037026 = score(doc=4802,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39294976 = fieldWeight in 4802, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4802)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    WWW search engines are not comprehensive in their searches. They do not search Adobe PDF files or other formatted files, registration files, and data sets. Basic search strategies can give access to some of the hidden content. Two databases are also available to provide access to the hidden information: Excite's News Tracker searches a database of selected online publications, and ATI databases from PLS, Inc. provide access to a variety of Internet-accessible databases that may require membership or the payment of a registration fee.
  7. Pu, H.-T.; Chuang, S.-L.; Yang, C.: Subject categorization of query terms for exploring Web users' search interests (2002) 0.09
    0.0871595 = product of:
      0.13073924 = sum of:
        0.09489272 = weight(_text_:search in 587) [ClassicSimilarity], result of:
          0.09489272 = score(doc=587,freq=16.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.54307455 = fieldWeight in 587, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=587)
        0.03584652 = product of:
          0.07169304 = sum of:
            0.07169304 = weight(_text_:engines in 587) [ClassicSimilarity], result of:
              0.07169304 = score(doc=587,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.2806784 = fieldWeight in 587, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=587)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Subject content analysis of Web query terms is essential to understand Web searching interests. Such analysis includes exploring search topics and observing changes in their frequency distributions with time. To provide a basis for in-depth analysis of users' search interests on a larger scale, this article presents a query categorization approach to automatically classifying Web query terms into broad subject categories. Because a query is short and simple in structure, its intended subjects of search are difficult to judge. Our approach, therefore, combines the search processes of real-world search engines to obtain highly ranked Web documents based on each unknown query term. These documents are used to extract cooccurring terms and to create a feature set. An effective ranking function has also been developed to find the most appropriate categories. Three search engine logs in Taiwan were collected and tested. They contained over 5 million queries from different periods of time. The achieved performance is quite encouraging compared with that of human categorization. The experimental results demonstrate that the approach is efficient in dealing with large numbers of queries and adaptable to the dynamic Web environment. Through good integration of human and machine efforts, the frequency distributions of subject categories in response to changes in users' search interests can be systematically observed in real time. The approach has also shown potential for use in various information retrieval applications, and provides a basis for further Web searching studies.
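    The core idea lends itself to a compact sketch: represent the unknown query by the terms co-occurring in highly ranked documents retrieved for it, then rank broad subject categories against that feature set. Everything below is an invented stand-in (fetch_top_snippets, the category profiles, and plain cosine ranking instead of the paper's own ranking function).

      import math
      from collections import Counter

      def fetch_top_snippets(query):
          # Stand-in for retrieving highly ranked documents from real engines.
          return ["flight hotel booking travel fares", "cheap travel flight deals"]

      CATEGORY_PROFILES = {  # hypothetical broad-subject term profiles
          "Travel":    Counter({"flight": 5, "hotel": 4, "travel": 6, "fares": 2}),
          "Computing": Counter({"software": 6, "linux": 3, "code": 4}),
      }

      def cosine(a, b):
          dot = sum(a[t] * b[t] for t in a)
          na = math.sqrt(sum(v * v for v in a.values()))
          nb = math.sqrt(sum(v * v for v in b.values()))
          return dot / (na * nb) if na and nb else 0.0

      def categorize(query):
          # Feature set: terms co-occurring in the retrieved documents.
          features = Counter(w for s in fetch_top_snippets(query) for w in s.split())
          return max(CATEGORY_PROFILES, key=lambda c: cosine(features, CATEGORY_PROFILES[c]))

      print(categorize("cheap flights"))  # -> Travel
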
  8. Lawrence, S.; Giles, C.L.: Accessibility and distribution of information on the Web (1999) 0.09
    0.087043464 = product of:
      0.1305652 = sum of:
        0.06973162 = weight(_text_:search in 4952) [ClassicSimilarity], result of:
          0.06973162 = score(doc=4952,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.39907667 = fieldWeight in 4952, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4952)
        0.060833566 = product of:
          0.12166713 = sum of:
            0.12166713 = weight(_text_:engines in 4952) [ClassicSimilarity], result of:
              0.12166713 = score(doc=4952,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.47632706 = fieldWeight in 4952, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4952)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Search engine coverage relative to the estimated size of the publicly indexable web has decreased substantially since December 1997, with no engine indexing more than about 16% of the estimated size of the publicly indexable web. (Note that many queries can be satisfied with a relatively small database.) Search engines are typically more likely to index sites that have more links to them (more 'popular' sites). They are also typically more likely to index US sites than non-US sites (AltaVista is an exception), and more likely to index commercial sites than educational sites. Indexing of new or modified pages by just one of the major search engines can take months. 83% of sites contain commercial content and 6% contain scientific or educational content. Only 1.5% of sites contain pornographic content. The publicly indexable web contains an estimated 800 million pages as of February 1999, encompassing about 15 terabytes of information, or about 6 terabytes of text after removing HTML tags, comments, and extra whitespace. The simple HTML "keywords" and "description" metatags are used on the homepages of only 34% of sites. Only 0.3% of sites use the Dublin Core metadata standard.
  9. Jepsen, E.T.; Seiden, P.; Ingwersen, P.; Björneborn, L.; Borlund, P.: Characteristics of scientific Web publications : preliminary data gathering and analysis (2004) 0.08
    0.08380928 = product of:
      0.12571391 = sum of:
        0.07501928 = weight(_text_:search in 3091) [ClassicSimilarity], result of:
          0.07501928 = score(doc=3091,freq=10.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.4293381 = fieldWeight in 3091, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3091)
        0.05069464 = product of:
          0.10138928 = sum of:
            0.10138928 = weight(_text_:engines in 3091) [ClassicSimilarity], result of:
              0.10138928 = score(doc=3091,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39693922 = fieldWeight in 3091, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3091)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Because of the increasing presence of scientific publications on the Web, combined with the existing difficulties in easily verifying and retrieving these publications, research on techniques and methods for retrieval of scientific Web publications is called for. In this article, we report on the initial steps taken toward the construction of a test collection of scientific Web publications within the subject domain of plant biology. The steps reported are those of data gathering and data analysis aiming at identifying characteristics of scientific Web publications. The data used in this article were generated based on specifically selected domain topics that were searched for in three publicly accessible search engines (Google, AllTheWeb, and AltaVista). A sample of the retrieved hits was analyzed with regard to how various publication attributes correlated with the scientific quality of the content and whether this information could be employed to harvest, filter, and rank Web publications. The attributes analyzed were inlinks, outlinks, bibliographic references, file format, language, search engine overlap, structural position (according to site structure), and the occurrence of various types of metadata. As could be expected, the ranked output differs between the three search engines. Apparently, this is caused by differences in ranking algorithms rather than by the databases themselves. In fact, because scientific Web content in this subject domain receives few inlinks, both AltaVista and AllTheWeb retrieved a higher degree of accessible scientific content than Google. Because of the search engine cutoffs of accessible URLs, the feasibility of using search engine output for Web content analysis is also discussed.
  10. Lee, L.-H.; Chen, H.-H.: Mining search intents for collaborative cyberporn filtering (2012) 0.08
    0.08307369 = product of:
      0.12461053 = sum of:
        0.08876401 = weight(_text_:search in 4988) [ClassicSimilarity], result of:
          0.08876401 = score(doc=4988,freq=14.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.5079997 = fieldWeight in 4988, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4988)
        0.03584652 = product of:
          0.07169304 = sum of:
            0.07169304 = weight(_text_:engines in 4988) [ClassicSimilarity], result of:
              0.07169304 = score(doc=4988,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.2806784 = fieldWeight in 4988, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4988)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This article presents a search-intent-based method to generate pornographic blacklists for collaborative cyberporn filtering. A novel porn-detection framework that can find newly appearing pornographic web pages by mining search query logs is proposed. First, suspected queries are identified along with their clicked URLs by an automatically constructed lexicon. Then, a candidate URL is determined if the number of clicks satisfies majority voting rules. Finally, a candidate whose URL contains at least one categorical keyword will be included in a blacklist. Several experiments are conducted on an MSN search porn dataset to demonstrate the effectiveness of our method. The resulting blacklist generated by our search-intent-based method achieves high precision (0.701) while maintaining a favorably low false-positive rate (0.086). The experiments of a real-life filtering simulation reveal that our proposed method with its accumulative update strategy can achieve a macro-averaging blocking rate of 44.15% when the update frequency is set to 1 day. In addition, the overblocking rates remain below 9% over time, owing to the strong advantages of our search-intent-based method. This user-behavior-oriented method can be easily applied to search engines to incorporate implicit collective intelligence from query logs without other efforts. In practice, it is complementary to intelligent content analysis for keeping up with the changing trails of objectionable websites from users' perspectives.
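    The three stages translate directly into a toy pipeline: keep only queries matching a lexicon, promote clicked URLs that pass a majority-voting threshold, and blacklist those whose URL contains a categorical keyword. The lexicon, keywords, log records, and the exact voting rule below are invented for illustration; the paper's own rules are not reproduced.

      from collections import Counter

      PORN_LEXICON = {"xxx", "porn"}            # stage 1: suspected-query lexicon
      CATEGORICAL_KEYWORDS = ("adult", "xxx")   # stage 3: URL keyword check

      LOG = [  # (query, clicked_url) pairs mined from a search query log
          ("free xxx videos", "http://site-a.example/adult/1"),
          ("free xxx videos", "http://site-a.example/adult/1"),
          ("free xxx videos", "http://news.example/article"),
          ("weather today",   "http://weather.example"),
      ]

      def build_blacklist(log, min_click_share=0.5):
          suspected = [(q, u) for q, u in log if PORN_LEXICON & set(q.split())]
          clicks = Counter(u for _, u in suspected)
          total = sum(clicks.values())
          blacklist = set()
          for url, n in clicks.items():
              # stage 2: majority voting on clicks; stage 3: categorical keyword
              if n / total >= min_click_share and any(k in url for k in CATEGORICAL_KEYWORDS):
                  blacklist.add(url)
          return blacklist

      print(build_blacklist(LOG))  # {'http://site-a.example/adult/1'}
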
  11. Feldman, S.: ¬The Internet search-off (1998) 0.08
    0.08235665 = product of:
      0.12353496 = sum of:
        0.08051914 = weight(_text_:search in 1832) [ClassicSimilarity], result of:
          0.08051914 = score(doc=1832,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.460814 = fieldWeight in 1832, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=1832)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 1832) [ClassicSimilarity], result of:
              0.08603165 = score(doc=1832,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 1832, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1832)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Reports the results of a study that sought indications of the effectiveness of searching for information on the WWW versus the traditional online vendors, comparisons of how long it took to search, and comments on the relative value of the results retrieved. Professional searchers were invited to search for answers to the same questions on both the WWW and on either DIALOG or Dow Jones Interactive. Overall, DIALOG and Dow Jones Interactive retrieved more relevant documents, and total searching time was more than double when finding information on the Web. Summarizes the searchers' comments on: the searching assumptions of professional searchers; the subjects and types of searches for which DIALOG and Dow Jones Interactive showed superiority; areas where the Web has advantages over traditional online services; and uses for which traditional online vendors and the Web can supplement each other. Lists nine searching principles for maximizing success when using Web search engines.
  12. Lewandowski, D.; Mayr, P.: Exploring the academic invisible Web (2006) 0.08
    0.0801318 = product of:
      0.12019769 = sum of:
        0.058109686 = weight(_text_:search in 3752) [ClassicSimilarity], result of:
          0.058109686 = score(doc=3752,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 3752, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3752)
        0.062088005 = product of:
          0.12417601 = sum of:
            0.12417601 = weight(_text_:engines in 3752) [ClassicSimilarity], result of:
              0.12417601 = score(doc=3752,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.4861493 = fieldWeight in 3752, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3752)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Purpose: To provide a critical review of Bergman's 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far.
    Design/methodology/approach: Discussion of measures and calculations, estimation based on informetric laws. Literature review on approaches for uncovering information from the Invisible Web.
    Findings: Bergman's size estimate of the Invisible Web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimate is given.
    Research limitations/implications: The precision of our estimate is limited due to a small sample size and a lack of reliable data.
    Practical implications: We can show that no single library alone will be able to index the Academic Invisible Web. We suggest collaboration to accomplish this task.
    Originality/value: Provides library managers and those interested in developing academic search engines with data on the size and attributes of the Academic Invisible Web.
  13. Thelwall, M.: Results from a web impact factor crawler (2001) 0.08
    0.0801318 = product of:
      0.12019769 = sum of:
        0.058109686 = weight(_text_:search in 4490) [ClassicSimilarity], result of:
          0.058109686 = score(doc=4490,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 4490, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4490)
        0.062088005 = product of:
          0.12417601 = sum of:
            0.12417601 = weight(_text_:engines in 4490) [ClassicSimilarity], result of:
              0.12417601 = score(doc=4490,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.4861493 = fieldWeight in 4490, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4490)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Web impact factors, the proposed web equivalent of impact factors for journals, can be calculated by using search engines. It has been found that the results are problematic because of the variable coverage of search engines as well as their tendency to give significantly different results over short periods of time. The fundamental problem is that although some search engines provide a functionality that is capable of being used for impact calculations, this is not their primary task and they therefore give no guarantees as to performance in this respect. In this paper, a bespoke web crawler designed specifically for the calculation of reliable WIFs is presented. This crawler was used to calculate WIFs for a number of UK universities, and the results of these calculations are discussed. The principal findings were that, with certain restrictions, WIFs can be calculated reliably, but they do not correlate with accepted research rankings owing to the variety of material hosted on university servers. Changes to the calculations to improve the fit of the results to research rankings are proposed, but there are still inherent problems undermining the reliability of the calculation. These problems still apply if the WIF scores are taken on their own as indicators of the general impact of any area of the Internet, but with care would not apply to online journals.
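    For orientation, a web impact factor in the usual sense is the number of pages linking into a site divided by the number of pages the site hosts. A minimal worked example with invented counts (a real calculation would take both counts from the crawler, or, less reliably, from search engine hit counts):

      def web_impact_factor(inlinking_pages, site_pages):
          # WIF = pages linking into the site / pages hosted by the site
          if site_pages == 0:
              raise ValueError("site has no crawled pages")
          return inlinking_pages / site_pages

      # e.g. 12,400 external pages link into a university site of 31,000 pages
      print(round(web_impact_factor(12_400, 31_000), 3))  # 0.4
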
  14. Espadas, J.; Calero, C.; Piattini, M.: Web site visibility evaluation (2008) 0.08
    0.0801318 = product of:
      0.12019769 = sum of:
        0.058109686 = weight(_text_:search in 2353) [ClassicSimilarity], result of:
          0.058109686 = score(doc=2353,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 2353, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2353)
        0.062088005 = product of:
          0.12417601 = sum of:
            0.12417601 = weight(_text_:engines in 2353) [ClassicSimilarity], result of:
              0.12417601 = score(doc=2353,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.4861493 = fieldWeight in 2353, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2353)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In recent years, the Internet has experienced a boom as an information source, and the use of search engines is the most common way of finding this information. This means that content that is less visible to search engines is increasingly difficult, or even almost impossible, to find. Thus, Web users are forced to accept alternative services or contents only because they are visible and offered to them by search engines. If a company's Web site is not visible, that company is losing clients. It is therefore fundamental to ensure that one's Web site will be indexed and, consequently, visible to as many Web users as possible. To quantitatively evaluate the visibility of a Web site, this article introduces a method that Web administrators may use. The method consists of four activities and several tasks. Most of the tasks are accompanied by a set of defined measures that can help the Web administrator determine where the Web design is failing (from the positioning point of view). Some tools that can be used to determine the measure values are also referenced in the description of the method. The method is furthermore accompanied by examples to help in understanding how to apply it.
  15. Sherman, C.; Price, G.: ¬The invisible Web : uncovering information sources search engines can't see (2001) 0.08
    0.07852928 = product of:
      0.11779392 = sum of:
        0.06709928 = weight(_text_:search in 62) [ClassicSimilarity], result of:
          0.06709928 = score(doc=62,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3840117 = fieldWeight in 62, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=62)
        0.05069464 = product of:
          0.10138928 = sum of:
            0.10138928 = weight(_text_:engines in 62) [ClassicSimilarity], result of:
              0.10138928 = score(doc=62,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39693922 = fieldWeight in 62, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=62)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Enormous expanses of the Internet are unreachable with standard Web search engines. This book provides the key to finding these hidden resources by showing how to uncover and use invisible Web resources. Mapping the invisible Web, when and how to use it, assessing the validity of the information, and the future of Web searching are topics covered in detail. Only 16 percent of Net-based information can be located using a general search engine. The other 84 percent is what is referred to as the invisible Web, made up of information stored in databases. Unlike pages on the visible Web, information in databases is generally inaccessible to the software spiders and crawlers that compile search engine indexes. As Web technology improves, more and more information is being stored in databases that feed into dynamically generated Web pages. The tips provided in this resource will ensure that those databases are exposed and that Net-based research is conducted in the most thorough and effective manner. Discusses the use of online information resources and problems caused by dynamically generated Web pages, paying special attention to information mapping, assessing the validity of information, and the future of Web searching.
  16. Chau, M.; Shiu, B.; Chan, M.; Chen, H.: Redips: backlink search and analysis on the Web for business intelligence analysis (2007) 0.08
    0.07852928 = product of:
      0.11779392 = sum of:
        0.06709928 = weight(_text_:search in 142) [ClassicSimilarity], result of:
          0.06709928 = score(doc=142,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3840117 = fieldWeight in 142, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=142)
        0.05069464 = product of:
          0.10138928 = sum of:
            0.10138928 = weight(_text_:engines in 142) [ClassicSimilarity], result of:
              0.10138928 = score(doc=142,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39693922 = fieldWeight in 142, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=142)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The World Wide Web presents significant opportunities for business intelligence analysis, as it can provide information about a company's external environment and its stakeholders. Traditional business intelligence analysis on the Web has focused on simple keyword searching. Recently, it has been suggested that the incoming links, or backlinks, of a company's Web site (i.e., other Web pages that have a hyperlink pointing to the company of interest) can provide important insights about the company's "online communities." Although analysis of these communities can provide useful signals for a company and information about its stakeholder groups, the manual analysis process can be very time-consuming for business analysts and consultants. In this article, we present a tool called Redips that automatically integrates backlink meta-searching and text-mining techniques to help users perform such business intelligence analysis on the Web. The architectural design and implementation of the tool are presented in the article. To evaluate the effectiveness, efficiency, and user satisfaction of Redips, an experiment was conducted to compare the tool with two popular business intelligence analysis methods: using backlink search engines, and manual browsing. The experiment results showed that Redips was statistically more effective than both benchmark methods (in terms of recall and F-measure) but required more time in search tasks. In terms of user satisfaction, Redips scored statistically higher than backlink search engines in all five measures used, and also statistically higher than manual browsing in three measures.
  17. Sperber, W.; Dalitz, W.: Portale, Search Engines and Math-Net (2000) 0.08
    0.0785128 = product of:
      0.1177692 = sum of:
        0.056935627 = weight(_text_:search in 5237) [ClassicSimilarity], result of:
          0.056935627 = score(doc=5237,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 5237, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=5237)
        0.060833566 = product of:
          0.12166713 = sum of:
            0.12166713 = weight(_text_:engines in 5237) [ClassicSimilarity], result of:
              0.12166713 = score(doc=5237,freq=4.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.47632706 = fieldWeight in 5237, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5237)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In Math-Net, individuals and institutions make their mathematically relevant information available on their own web servers, but the information is to be indexed in a uniform way. To this end there are recommendations for structuring both the servers and the documents. The local information is gathered, evaluated, and indexed by automatic procedures. These indexes are the basis for the Math-Net services: search engines and portals that offer qualified and efficient access to the information in Math-Net. Unlike the universal search engines, these services cover only the part of the Web that is relevant to mathematics. Math-Net is also an information and communication system as well as a publication medium for mathematics. The development of Math-Net is carried by a broad consensus among mathematicians to ease and improve access to mathematically relevant information.
  18. Gibson, P.: Navigating the Internet road to riches (1998) 0.08
    0.07774002 = product of:
      0.11661003 = sum of:
        0.0664249 = weight(_text_:search in 3521) [ClassicSimilarity], result of:
          0.0664249 = score(doc=3521,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.38015217 = fieldWeight in 3521, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3521)
        0.05018513 = product of:
          0.10037026 = sum of:
            0.10037026 = weight(_text_:engines in 3521) [ClassicSimilarity], result of:
              0.10037026 = score(doc=3521,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39294976 = fieldWeight in 3521, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3521)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In the light of InfoSeek's extremely lucrative contract with Disney, analyses how it has made such a success of developing an Internet business through advertising revenues, and the importance of powerful search engines. Reports two other deals by Internet search companies: NBC's purchase of a minority stake in the portal Snap!, owned by the Computer Network, Inc., and Yahoo's acquisition of Internet Mall and Viaweb, which enables it to launch Yahoo Store, hosting commerce sites on behalf of other companies and allowing them to sell over the Internet. Outlines the possible consequences for users of these developments and of the possibility of Internet startups selling up and quitting the scene.
  19. Warnick, W.L.; Leberman, A.; Scott, R.L.; Spence, K.J.; Johnsom, L.A.; Allen, V.S.: Searching the deep Web : directed query engine applications at the Department of Energy (2001) 0.08
    0.07651012 = product of:
      0.114765175 = sum of:
        0.04025957 = weight(_text_:search in 1215) [ClassicSimilarity], result of:
          0.04025957 = score(doc=1215,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 1215, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=1215)
        0.074505605 = product of:
          0.14901121 = sum of:
            0.14901121 = weight(_text_:engines in 1215) [ClassicSimilarity], result of:
              0.14901121 = score(doc=1215,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.58337915 = fieldWeight in 1215, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1215)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Directed Query Engines, an emerging class of search engine specifically designed to access distributed resources on the deep web, offer the opportunity to create inexpensive digital libraries. Already, one such engine, Distributed Explorer, has been used to select and assemble high quality information resources and incorporate them into publicly available systems for the physical sciences. By nesting Directed Query Engines so that one query launches several other engines in a cascading fashion, enormous virtual collections may soon be assembled to form a comprehensive information infrastructure for the physical sciences. Once a Directed Query Engine has been configured for a set of information resources, distributed alerts tools can provide patrons with personalized, profile-based notices of recent additions to any of the selected resources. Due to the potentially enormous size and scope of Directed Query Engine applications, consideration must be given to issues surrounding the representation of large quantities of information from multiple, heterogeneous sources.
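    The nesting idea can be sketched by modelling an engine as any callable that takes a query, so that a parent engine launches several child engines in a cascading fashion. The engine names below are invented, and Distributed Explorer's real interface is not described in the source.

      def leaf_engine(name):
          def search(query):
              return [f"{name}: hit for '{query}'"]   # stand-in for one deep-web resource
          return search

      def cascade(children):
          def search(query):
              results = []
              for child in children:                  # one query launches several engines
                  results.extend(child(query))
              return results
          return search

      physics = cascade([leaf_engine("preprints"), leaf_engine("lab-reports")])
      sciences = cascade([physics, leaf_engine("chemistry-db")])  # nested cascade
      print(sciences("superconductivity"))
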
  20. Müller, T.: Wort-Schnüffler : Kochrezepte kostenlos: in den USA erlaubt Amazon online das Recherchieren in Büchern (2004) 0.08
    0.07646762 = product of:
      0.11470142 = sum of:
        0.040676784 = weight(_text_:search in 4826) [ClassicSimilarity], result of:
          0.040676784 = score(doc=4826,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.23279473 = fieldWeight in 4826, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4826)
        0.07402463 = sum of:
          0.05018513 = weight(_text_:engines in 4826) [ClassicSimilarity], result of:
            0.05018513 = score(doc=4826,freq=2.0), product of:
              0.25542772 = queryWeight, product of:
                5.080822 = idf(docFreq=746, maxDocs=44218)
                0.05027291 = queryNorm
              0.19647488 = fieldWeight in 4826, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.080822 = idf(docFreq=746, maxDocs=44218)
                0.02734375 = fieldNorm(doc=4826)
          0.0238395 = weight(_text_:22 in 4826) [ClassicSimilarity], result of:
            0.0238395 = score(doc=4826,freq=2.0), product of:
              0.17604718 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05027291 = queryNorm
              0.1354154 = fieldWeight in 4826, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.02734375 = fieldNorm(doc=4826)
      0.6666667 = coord(2/3)
    
    Content
    "Hobbyköche sind begeistert, Teenager, die sonst nur am Computerbildschirm kleben, interessieren sich plötzlich für Bücher, und Autoren werden nervös: Mit einer neuartigen Internet-Suchmaschine sorgt der Onlinebuchhändler Amazon.com in den USA für Furore und mischt den umkämpften Markt der "Search Engines" auf. Die im Oktober eingeführte Suchmaschine "Search Inside the Bock" ("Suche innerhalb des Buches!", englische Informationen unter http://www.amazon.com/exec/ obidos/tg/browse/-/10197041/002-3913532 0581613) stößt in eine neue Dimension vor. Während die meisten Suchmaschinen bisher bei einem gesuchten Titel Halt machten, blättert Amazons Suchmaschine das Buch förmlich auf und erlaubt das digitale Durchforsten ganzer Werke - zumindest von denen, die von Amazon eingescannt wurden - und das sind immerhin schon 120 000 Bücher mit 33 Millionen Seiten. Ist als Suchbegriff etwa" Oliver Twist", eingegeben, tauchen die Seiten auf, auf denen der Held des Romans von Charles Dickens erscheint. Von diesen Seiten aus können mit einem Passwort registrierte Kunden dann sogar weiter blättern und so bis zu 20 Prozent eines Buchs am Bildschirm durchschmökern. Ein neuer Kaufanreiz? Ob und wann die Suchmaschine auf dem deutschen Markt eingeführt wird, lässt Amazon offen. "Darüber spekulieren wir nicht", sagte eine Sprecherin des Unternehmens in Seattle. Amazon erhofft sich von dem neuen Service vor allem einen Kaufanreiz. Erste Zahlen scheinen dem Unternehmen von Jeff Bezos Recht zu geben. Bücher, die von der Suchmaschine erfasst wurden, verkauften sich zumindest in den ersten Tagen nach der Markteinführung deutlich besser als die anderen Werke. Bisher hat Amazon Verträge mit 190 Verlagen getroffen und deren Werke elektronisch abrufbar gemacht. Nur wenige Unternehmen sperrten sich aus Sorge vor Verkaufseinbußen oder einer möglichen Verletzung von Urheberrechten gegen das Einscannen ihrer Bücher. 15 Autoren forderten den Online-Riesen allerdings auf, ihre Bücher von der Suchmaschine auszunehmen. Besondere Sorge bereitet Amazons Erfindung einigen Sachbuchverlagen. So nutzen in den USA unter anderem Hobbyköche die neue Suchmaschine mit Begeisterung. Denn sie können den oft teuren Kochbüchern ihre Lieblingsrezepte entnehmen und dann auf den Kauf verzichten. "Kochbücher werden oft für ein bestimmtes Rezept gekauft", erklärte Nach Waxman, der Besitzer eines Kochbuchladens in New York der "Washington Post". Wenn sie das Rezept aber schon haben, dann könnten sie leicht sagen, "ich habe alles, was ich brauche", stellt Waxman besorgt fest. Auch für Lexika und andere teure Sachbücher, die etwa von Schülern oder College-Studenten für ihre Arbeiten durchsucht werden, könnte dies zutreffen.
    Meanwhile, the book retailer has taken some precautions. Among other things, an electronic barrier has been put in place, so that the pages can now only be copied or printed out by skilled computer users. With "Search Inside the Book" the online giant has taken its first step into the hotly contested search engine market. According to American media reports, Amazon is already planning a further search engine to make electronic shopping easier for customers. The search engine, known under the code name A9, is intended among other things to compare product prices and find the cheapest offer. With it, Amazon is pushing into a market that in the USA has so far been dominated by the online portal Yahoo and the super search engine Google. Google has already mounted a counterattack: according to the trade magazine "Publishers Weekly", the company is already negotiating with publishers in order to push into the new dimension of book content as well."
    Date
    3. 5.1997 8:44:22

Types

  • a 653
  • m 63
  • s 27
  • el 22
  • r 5
  • b 2
  • x 2
