Document (#24106)

Author
Munson, K.I.
Title
Internet search engines : understanding their design to improve information retrieval
Source
Journal of Internet cataloging. 2(2000) nos.3/4, pp.47-60
Year
2000
Abstract
The relationship between the methods currently used for indexing the World Wide Web and the programs, languages, and protocols on which the World Wide Web is based is examined. Two methods for indexing the Web are described: directories are discussed briefly, while search engines are considered in detail. The automated approach used to create these tools is examined, with special emphasis on the parts of a document used in indexing. Shortcomings of the approach are described. Suggestions for effective use of Web search engines are given.
Theme
Search engines

Similar documents (content)

  1. Hock, R.: Search engines (2009) 0.28
    0.27922204 = sum of:
      0.27922204 = product of:
        1.1634252 = sum of:
          0.054763034 = weight(abstract_txt:special in 3876) [ClassicSimilarity], result of:
            0.054763034 = score(doc=3876,freq=1.0), product of:
              0.11740399 = queryWeight, product of:
                4.9754615 = idf(docFreq=829, maxDocs=44218)
                0.023596603 = queryNorm
              0.4664495 = fieldWeight in 3876, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9754615 = idf(docFreq=829, maxDocs=44218)
                0.09375 = fieldNorm(doc=3876)
          0.08812248 = weight(abstract_txt:briefly in 3876) [ClassicSimilarity], result of:
            0.08812248 = score(doc=3876,freq=1.0), product of:
              0.16121879 = queryWeight, product of:
                1.1718348 = boost
                5.830419 = idf(docFreq=352, maxDocs=44218)
                0.023596603 = queryNorm
              0.5466018 = fieldWeight in 3876, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.830419 = idf(docFreq=352, maxDocs=44218)
                0.09375 = fieldNorm(doc=3876)
          0.15455656 = weight(abstract_txt:described in 3876) [ClassicSimilarity], result of:
            0.15455656 = score(doc=3876,freq=2.0), product of:
              0.2344676 = queryWeight, product of:
                1.9985498 = boost
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.023596603 = queryNorm
              0.6591809 = fieldWeight in 3876, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.09375 = fieldNorm(doc=3876)
          0.15993154 = weight(abstract_txt:search in 3876) [ClassicSimilarity], result of:
            0.15993154 = score(doc=3876,freq=6.0), product of:
              0.19038698 = queryWeight, product of:
                2.2056563 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.023596603 = queryNorm
              0.840034 = fieldWeight in 3876, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.09375 = fieldNorm(doc=3876)
          0.1097622 = weight(abstract_txt:indexing in 3876) [ClassicSimilarity], result of:
            0.1097622 = score(doc=3876,freq=1.0), product of:
              0.26917422 = queryWeight, product of:
                2.6226234 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.023596603 = queryNorm
              0.40777382 = fieldWeight in 3876, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.09375 = fieldNorm(doc=3876)
          0.59628934 = weight(abstract_txt:engines in 3876) [ClassicSimilarity], result of:
            0.59628934 = score(doc=3876,freq=9.0), product of:
              0.39990374 = queryWeight, product of:
                3.1966636 = boost
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.023596603 = queryNorm
              1.4910822 = fieldWeight in 3876, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.09375 = fieldNorm(doc=3876)
        0.24 = coord(6/25)
    
  2. Rudich, J.: Internet search engines on CD-ROM (1996) 0.27
    0.26515302 = sum of:
      0.26515302 = product of:
        1.3257651 = sum of:
          0.27884743 = weight(abstract_txt:directories in 5654) [ClassicSimilarity], result of:
            0.27884743 = score(doc=5654,freq=1.0), product of:
              0.24719316 = queryWeight, product of:
                1.4510313 = boost
                7.2195506 = idf(docFreq=87, maxDocs=44218)
                0.023596603 = queryNorm
              1.1280547 = fieldWeight in 5654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2195506 = idf(docFreq=87, maxDocs=44218)
                0.15625 = fieldNorm(doc=5654)
          0.18239835 = weight(abstract_txt:world in 5654) [ClassicSimilarity], result of:
            0.18239835 = score(doc=5654,freq=2.0), product of:
              0.18626845 = queryWeight, product of:
                1.7813252 = boost
                4.4314575 = idf(docFreq=1429, maxDocs=44218)
                0.023596603 = queryNorm
              0.979223 = fieldWeight in 5654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4314575 = idf(docFreq=1429, maxDocs=44218)
                0.15625 = fieldNorm(doc=5654)
          0.24213608 = weight(abstract_txt:wide in 5654) [ClassicSimilarity], result of:
            0.24213608 = score(doc=5654,freq=2.0), product of:
              0.22499093 = queryWeight, product of:
                1.9577447 = boost
                4.870342 = idf(docFreq=921, maxDocs=44218)
                0.023596603 = queryNorm
              1.0762037 = fieldWeight in 5654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.870342 = idf(docFreq=921, maxDocs=44218)
                0.15625 = fieldNorm(doc=5654)
          0.15389419 = weight(abstract_txt:search in 5654) [ClassicSimilarity], result of:
            0.15389419 = score(doc=5654,freq=2.0), product of:
              0.19038698 = queryWeight, product of:
                2.2056563 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.023596603 = queryNorm
              0.808323 = fieldWeight in 5654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.15625 = fieldNorm(doc=5654)
          0.46848917 = weight(abstract_txt:engines in 5654) [ClassicSimilarity], result of:
            0.46848917 = score(doc=5654,freq=2.0), product of:
              0.39990374 = queryWeight, product of:
                3.1966636 = boost
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.023596603 = queryNorm
              1.1715049 = fieldWeight in 5654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.15625 = fieldNorm(doc=5654)
        0.2 = coord(5/25)
    
  3. McMurdo, G.: How the Internet was indexed (1995) 0.21
    0.21489955 = sum of:
      0.21489955 = product of:
        0.6715611 = sum of:
          0.046902217 = weight(abstract_txt:considered in 2411) [ClassicSimilarity], result of:
            0.046902217 = score(doc=2411,freq=1.0), product of:
              0.11956597 = queryWeight, product of:
                1.0091654 = boost
                5.021064 = idf(docFreq=792, maxDocs=44218)
                0.023596603 = queryNorm
              0.39227062 = fieldWeight in 2411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.021064 = idf(docFreq=792, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
          0.059623286 = weight(abstract_txt:currently in 2411) [ClassicSimilarity], result of:
            0.059623286 = score(doc=2411,freq=1.0), product of:
              0.14031021 = queryWeight, product of:
                1.093209 = boost
                5.4392195 = idf(docFreq=521, maxDocs=44218)
                0.023596603 = queryNorm
              0.42493904 = fieldWeight in 2411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4392195 = idf(docFreq=521, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
          0.06518412 = weight(abstract_txt:automated in 2411) [ClassicSimilarity], result of:
            0.06518412 = score(doc=2411,freq=1.0), product of:
              0.14890406 = queryWeight, product of:
                1.1261904 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.023596603 = queryNorm
              0.43775916 = fieldWeight in 2411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
          0.07472662 = weight(abstract_txt:methods in 2411) [ClassicSimilarity], result of:
            0.07472662 = score(doc=2411,freq=2.0), product of:
              0.1631031 = queryWeight, product of:
                1.6668813 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.023596603 = queryNorm
              0.4581557 = fieldWeight in 2411, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
          0.09107333 = weight(abstract_txt:described in 2411) [ClassicSimilarity], result of:
            0.09107333 = score(doc=2411,freq=1.0), product of:
              0.2344676 = queryWeight, product of:
                1.9985498 = boost
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.023596603 = queryNorm
              0.38842607 = fieldWeight in 2411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
          0.07694709 = weight(abstract_txt:search in 2411) [ClassicSimilarity], result of:
            0.07694709 = score(doc=2411,freq=2.0), product of:
              0.19038698 = queryWeight, product of:
                2.2056563 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.023596603 = queryNorm
              0.4041615 = fieldWeight in 2411, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
          0.0914685 = weight(abstract_txt:indexing in 2411) [ClassicSimilarity], result of:
            0.0914685 = score(doc=2411,freq=1.0), product of:
              0.26917422 = queryWeight, product of:
                2.6226234 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.023596603 = queryNorm
              0.3398115 = fieldWeight in 2411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
          0.16563594 = weight(abstract_txt:engines in 2411) [ClassicSimilarity], result of:
            0.16563594 = score(doc=2411,freq=1.0), product of:
              0.39990374 = queryWeight, product of:
                3.1966636 = boost
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.023596603 = queryNorm
              0.41418952 = fieldWeight in 2411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.078125 = fieldNorm(doc=2411)
        0.32 = coord(8/25)
    
  4. Köhler, J.; Philippi, S.; Specht, M.; Rüegg, A.: Ontology based text indexing and querying for the semantic web (2006) 0.19
    0.18795697 = sum of:
      0.18795697 = product of:
        0.58736557 = sum of:
          0.05214729 = weight(abstract_txt:automated in 3280) [ClassicSimilarity], result of:
            0.05214729 = score(doc=3280,freq=1.0), product of:
              0.14890406 = queryWeight, product of:
                1.1261904 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.023596603 = queryNorm
              0.35020733 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.044046428 = weight(abstract_txt:approach in 3280) [ClassicSimilarity], result of:
            0.044046428 = score(doc=3280,freq=2.0), product of:
              0.13305336 = queryWeight, product of:
                1.5055199 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.023596603 = queryNorm
              0.33104333 = fieldWeight in 3280, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.07321683 = weight(abstract_txt:methods in 3280) [ClassicSimilarity], result of:
            0.07321683 = score(doc=3280,freq=3.0), product of:
              0.1631031 = queryWeight, product of:
                1.6668813 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.023596603 = queryNorm
              0.44889906 = fieldWeight in 3280, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.07285866 = weight(abstract_txt:described in 3280) [ClassicSimilarity], result of:
            0.07285866 = score(doc=3280,freq=1.0), product of:
              0.2344676 = queryWeight, product of:
                1.9985498 = boost
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.023596603 = queryNorm
              0.31074086 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.033710368 = weight(abstract_txt:used in 3280) [ClassicSimilarity], result of:
            0.033710368 = score(doc=3280,freq=1.0), product of:
              0.16055906 = queryWeight, product of:
                2.0255203 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.023596603 = queryNorm
              0.2099562 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.07539245 = weight(abstract_txt:search in 3280) [ClassicSimilarity], result of:
            0.07539245 = score(doc=3280,freq=3.0), product of:
              0.19038698 = queryWeight, product of:
                2.2056563 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.023596603 = queryNorm
              0.3959958 = fieldWeight in 3280, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.103484794 = weight(abstract_txt:indexing in 3280) [ClassicSimilarity], result of:
            0.103484794 = score(doc=3280,freq=2.0), product of:
              0.26917422 = queryWeight, product of:
                2.6226234 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.023596603 = queryNorm
              0.38445285 = fieldWeight in 3280, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
          0.13250875 = weight(abstract_txt:engines in 3280) [ClassicSimilarity], result of:
            0.13250875 = score(doc=3280,freq=1.0), product of:
              0.39990374 = queryWeight, product of:
                3.1966636 = boost
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.023596603 = queryNorm
              0.3313516 = fieldWeight in 3280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.0625 = fieldNorm(doc=3280)
        0.32 = coord(8/25)
    
  5. Frakes, W.B.: Stemming algorithms (1992) 0.18
    0.18127161 = sum of:
      0.18127161 = product of:
        0.7552984 = sum of:
          0.12199367 = weight(abstract_txt:detail in 3503) [ClassicSimilarity], result of:
            0.12199367 = score(doc=3503,freq=1.0), product of:
              0.16530661 = queryWeight, product of:
                1.1865982 = boost
                5.9038734 = idf(docFreq=327, maxDocs=44218)
                0.023596603 = queryNorm
              0.7379842 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9038734 = idf(docFreq=327, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          0.12614141 = weight(abstract_txt:programs in 3503) [ClassicSimilarity], result of:
            0.12614141 = score(doc=3503,freq=1.0), product of:
              0.1690326 = queryWeight, product of:
                1.1998966 = boost
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.023596603 = queryNorm
              0.7462549 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          0.14571732 = weight(abstract_txt:described in 3503) [ClassicSimilarity], result of:
            0.14571732 = score(doc=3503,freq=1.0), product of:
              0.2344676 = queryWeight, product of:
                1.9985498 = boost
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.023596603 = queryNorm
              0.6214817 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9718537 = idf(docFreq=832, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          0.067420736 = weight(abstract_txt:used in 3503) [ClassicSimilarity], result of:
            0.067420736 = score(doc=3503,freq=1.0), product of:
              0.16055906 = queryWeight, product of:
                2.0255203 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.023596603 = queryNorm
              0.4199124 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          0.0870557 = weight(abstract_txt:search in 3503) [ClassicSimilarity], result of:
            0.0870557 = score(doc=3503,freq=1.0), product of:
              0.19038698 = queryWeight, product of:
                2.2056563 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.023596603 = queryNorm
              0.45725656 = fieldWeight in 3503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
          0.20696959 = weight(abstract_txt:indexing in 3503) [ClassicSimilarity], result of:
            0.20696959 = score(doc=3503,freq=2.0), product of:
              0.26917422 = queryWeight, product of:
                2.6226234 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.023596603 = queryNorm
              0.7689057 = fieldWeight in 3503, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.125 = fieldNorm(doc=3503)
        0.24 = coord(6/25)
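
The score trees above follow the classic TF-IDF pattern that the labels in the listing spell out: tf is the square root of the term frequency, queryWeight is boost × idf × queryNorm, fieldWeight is tf × idf × fieldNorm, and the sum over matching terms is scaled by coord (matched terms / query terms). The sketch below recomputes the score of entry 5 (doc 3503) from the values printed in its tree. The function names are illustrative, not part of any library, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption taken from Lucene's ClassicSimilarity; it does reproduce the idf values shown in the listing.

```python
import math

MAX_DOCS = 44218          # maxDocs, as printed in the explain output
QUERY_NORM = 0.023596603  # queryNorm, as printed in the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed ClassicSimilarity formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, field_norm, total_query_terms):
    """Sum the per-term scores, then scale by coord = matched / total."""
    matched_sum = sum(
        term_score(freq, doc_freq, field_norm, boost)
        for freq, doc_freq, boost in terms
    )
    coord = len(terms) / total_query_terms
    return matched_sum * coord


# (freq, docFreq, boost) for the six terms matched in doc 3503,
# copied from the explain tree of entry 5.
FRAKES_TERMS = [
    (1.0, 327, 1.1865982),   # detail
    (1.0, 306, 1.1998966),   # programs
    (1.0, 832, 1.9985498),   # described
    (1.0, 4177, 2.0255203),  # used
    (1.0, 3098, 2.2056563),  # search
    (2.0, 1551, 2.6226234),  # indexing
]

score = document_score(FRAKES_TERMS, field_norm=0.125, total_query_terms=25)
print(round(score, 4))  # close to the listed 0.18127161
```

The same function should reproduce the other entries when given their own term lists and fieldNorm values. Small differences in the last digits are expected, since the listing was presumably computed in single precision while Python uses double precision.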