Document (#30272)

Author
Avrahami, T.T.
Yau, L.
Si, L.
Callan, J.P.
Title
The FedLemur project : Federated search in the real world
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.3, pp. 347-358
Year
2006
Abstract
Federated search and distributed information retrieval systems provide a single user interface for searching multiple full-text search engines. They have been an active area of research for more than a decade, but in spite of their success as a research topic, they are still rare in operational environments. This article discusses a prototype federated search system developed for the U.S. government's FedStats Web portal, and the issues addressed in adapting research solutions to this operational environment. A series of experiments explore how well prior research results, parameter settings, and heuristics apply in the FedStats environment. The article concludes with a set of lessons learned from this technology transfer effort, including observations about search engine quality in the real world.
Theme
Distributed bibliographic databases

Similar documents (author)

  1. Callan, J.: Distributed information retrieval (2000) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:callan in 31) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 31, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=31)
    
  2. Robertson, S.; Callan, J.: Routing and filtering (2005) 4.75
    4.749831 = sum of:
      4.749831 = weight(author_txt:callan in 4688) [ClassicSimilarity], result of:
        4.749831 = fieldWeight in 4688, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.5 = fieldNorm(doc=4688)
    
  3. Collins-Thompson, K.; Callan, J.: Predicting reading difficulty with statistical language models (2005) 4.16
    4.156102 = sum of:
      4.156102 = weight(author_txt:callan in 4579) [ClassicSimilarity], result of:
        4.156102 = fieldWeight in 4579, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.4375 = fieldNorm(doc=4579)
    
  4. Callan, J.; Croft, W.B.; Broglio, J.: TREC and TIPSTER experiments with INQUERY (1995) 3.56
    3.5623734 = sum of:
      3.5623734 = weight(author_txt:callan in 1944) [ClassicSimilarity], result of:
        3.5623734 = fieldWeight in 1944, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.375 = fieldNorm(doc=1944)
    
  5. Allan, J.; Croft, W.B.; Callan, J.: The University of Massachusetts and a dozen TRECs (2005) 3.56
    3.5623734 = sum of:
      3.5623734 = weight(author_txt:callan in 5086) [ClassicSimilarity], result of:
        3.5623734 = fieldWeight in 5086, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.375 = fieldNorm(doc=5086)
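
The author scores above all follow the same pattern: one matching term, so the score equals the fieldWeight, which is tf * idf * fieldNorm. The following is a minimal sketch of that arithmetic, not the catalog's actual code. The function names are hypothetical, and the tf and idf formulas are the ones that reproduce the values printed in this listing (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the fieldNorm is taken from the listing as given.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency in the form that matches the listing,
    # e.g. idf(docFreq=8, maxDocs=44218) is about 9.4997.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in each explain tree above.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# First entry (doc=31): freq=1.0, docFreq=8, maxDocs=44218, fieldNorm=0.625.
score = field_weight(freq=1.0, doc_freq=8, max_docs=44218, field_norm=0.625)
print(round(score, 4))
```

Because tf and idf are identical for all five entries, the ranking in this list is decided by fieldNorm alone, which is larger for shorter author fields.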
    

Similar documents (content)

  1. Peckham, J.; MacKellar, B.; Vorback, J.: A unified approach to the design and generation of complex database schemata (1997) 0.17
    0.16709213 = sum of:
      0.16709213 = product of:
        0.5967576 = sum of:
          0.11932896 = weight(abstract_txt:active in 1259) [ClassicSimilarity], result of:
            0.11932896 = score(doc=1259,freq=2.0), product of:
              0.12185591 = queryWeight, product of:
                1.0317307 = boost
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.018655807 = queryNorm
              0.9792628 = fieldWeight in 1259, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.09417398 = weight(abstract_txt:learned in 1259) [ClassicSimilarity], result of:
            0.09417398 = score(doc=1259,freq=1.0), product of:
              0.13111326 = queryWeight, product of:
                1.0702034 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.018655807 = queryNorm
              0.71826434 = fieldWeight in 1259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.11257371 = weight(abstract_txt:lessons in 1259) [ClassicSimilarity], result of:
            0.11257371 = score(doc=1259,freq=1.0), product of:
              0.14767851 = queryWeight, product of:
                1.1357995 = boost
                6.9694996 = idf(docFreq=112, maxDocs=44218)
                0.018655807 = queryNorm
              0.76228905 = fieldWeight in 1259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9694996 = idf(docFreq=112, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.035128918 = weight(abstract_txt:they in 1259) [ClassicSimilarity], result of:
            0.035128918 = score(doc=1259,freq=1.0), product of:
              0.08560107 = queryWeight, product of:
                1.2229187 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.018655807 = queryNorm
              0.41037944 = fieldWeight in 1259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.093687534 = weight(abstract_txt:environment in 1259) [ClassicSimilarity], result of:
            0.093687534 = score(doc=1259,freq=2.0), product of:
              0.13066137 = queryWeight, product of:
                1.5108857 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.018655807 = queryNorm
              0.7170255 = fieldWeight in 1259, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.099479795 = weight(abstract_txt:real in 1259) [ClassicSimilarity], result of:
            0.099479795 = score(doc=1259,freq=1.0), product of:
              0.1713402 = queryWeight, product of:
                1.7301655 = boost
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.018655807 = queryNorm
              0.5805981 = fieldWeight in 1259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.042384677 = weight(abstract_txt:research in 1259) [ClassicSimilarity], result of:
            0.042384677 = score(doc=1259,freq=1.0), product of:
              0.122232094 = queryWeight, product of:
                2.066644 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.018655807 = queryNorm
              0.3467557 = fieldWeight in 1259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
        0.28 = coord(7/25)
    
  2. Mitchell, A.M.; Thompson, J.M.; Wu, A.: Agile cataloging : staffing and skills for a bibliographic future (2010) 0.15
    0.14502127 = sum of:
      0.14502127 = product of:
        0.6042553 = sum of:
          0.01415865 = weight(abstract_txt:this in 4159) [ClassicSimilarity], result of:
            0.01415865 = score(doc=4159,freq=2.0), product of:
              0.05310756 = queryWeight, product of:
                1.1797279 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018655807 = queryNorm
              0.2666033 = fieldWeight in 4159, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=4159)
          0.026150305 = weight(abstract_txt:article in 4159) [ClassicSimilarity], result of:
            0.026150305 = score(doc=4159,freq=1.0), product of:
              0.08799119 = queryWeight, product of:
                1.2398742 = boost
                3.8040617 = idf(docFreq=2677, maxDocs=44218)
                0.018655807 = queryNorm
              0.2971923 = fieldWeight in 4159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8040617 = idf(docFreq=2677, maxDocs=44218)
                0.078125 = fieldNorm(doc=4159)
          0.047319353 = weight(abstract_txt:environment in 4159) [ClassicSimilarity], result of:
            0.047319353 = score(doc=4159,freq=1.0), product of:
              0.13066137 = queryWeight, product of:
                1.5108857 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.018655807 = queryNorm
              0.36215258 = fieldWeight in 4159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.078125 = fieldNorm(doc=4159)
          0.03027477 = weight(abstract_txt:research in 4159) [ClassicSimilarity], result of:
            0.03027477 = score(doc=4159,freq=1.0), product of:
              0.122232094 = queryWeight, product of:
                2.066644 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.018655807 = queryNorm
              0.24768265 = fieldWeight in 4159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.078125 = fieldNorm(doc=4159)
          0.05813316 = weight(abstract_txt:search in 4159) [ClassicSimilarity], result of:
            0.05813316 = score(doc=4159,freq=1.0), product of:
              0.20341547 = queryWeight, product of:
                2.980712 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018655807 = queryNorm
              0.28578535 = fieldWeight in 4159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=4159)
          0.42821908 = weight(abstract_txt:federated in 4159) [ClassicSimilarity], result of:
            0.42821908 = score(doc=4159,freq=1.0), product of:
              0.64952487 = queryWeight, product of:
                4.125737 = boost
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.018655807 = queryNorm
              0.6592805 = fieldWeight in 4159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.078125 = fieldNorm(doc=4159)
        0.24 = coord(6/25)
    
  3. Taylor, M.: Using the Google search appliance for federated searching : a case study (2005) 0.12
    0.12243701 = sum of:
      0.12243701 = product of:
        0.7652313 = sum of:
          0.014016349 = weight(abstract_txt:this in 355) [ClassicSimilarity], result of:
            0.014016349 = score(doc=355,freq=1.0), product of:
              0.05310756 = queryWeight, product of:
                1.1797279 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018655807 = queryNorm
              0.2639238 = fieldWeight in 355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.109375 = fieldNorm(doc=355)
          0.03661043 = weight(abstract_txt:article in 355) [ClassicSimilarity], result of:
            0.03661043 = score(doc=355,freq=1.0), product of:
              0.08799119 = queryWeight, product of:
                1.2398742 = boost
                3.8040617 = idf(docFreq=2677, maxDocs=44218)
                0.018655807 = queryNorm
              0.41606924 = fieldWeight in 355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8040617 = idf(docFreq=2677, maxDocs=44218)
                0.109375 = fieldNorm(doc=355)
          0.11509778 = weight(abstract_txt:search in 355) [ClassicSimilarity], result of:
            0.11509778 = score(doc=355,freq=2.0), product of:
              0.20341547 = queryWeight, product of:
                2.980712 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018655807 = queryNorm
              0.5658261 = fieldWeight in 355, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.109375 = fieldNorm(doc=355)
          0.59950674 = weight(abstract_txt:federated in 355) [ClassicSimilarity], result of:
            0.59950674 = score(doc=355,freq=1.0), product of:
              0.64952487 = queryWeight, product of:
                4.125737 = boost
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.018655807 = queryNorm
              0.9229927 = fieldWeight in 355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.109375 = fieldNorm(doc=355)
        0.16 = coord(4/25)
    
  4. Joint, N.: The one-stop shop search engine : a transformational library technology? ANTAEUS (2010) 0.12
    0.12135138 = sum of:
      0.12135138 = product of:
        0.6067569 = sum of:
          0.012138513 = weight(abstract_txt:this in 4201) [ClassicSimilarity], result of:
            0.012138513 = score(doc=4201,freq=3.0), product of:
              0.05310756 = queryWeight, product of:
                1.1797279 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018655807 = queryNorm
              0.2285647 = fieldWeight in 4201, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4201)
          0.049739897 = weight(abstract_txt:real in 4201) [ClassicSimilarity], result of:
            0.049739897 = score(doc=4201,freq=1.0), product of:
              0.1713402 = queryWeight, product of:
                1.7301655 = boost
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.018655807 = queryNorm
              0.29029906 = fieldWeight in 4201, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4201)
          0.029970491 = weight(abstract_txt:research in 4201) [ClassicSimilarity], result of:
            0.029970491 = score(doc=4201,freq=2.0), product of:
              0.122232094 = queryWeight, product of:
                2.066644 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.018655807 = queryNorm
              0.2451933 = fieldWeight in 4201, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4201)
          0.09099279 = weight(abstract_txt:search in 4201) [ClassicSimilarity], result of:
            0.09099279 = score(doc=4201,freq=5.0), product of:
              0.20341547 = queryWeight, product of:
                2.980712 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018655807 = queryNorm
              0.44732484 = fieldWeight in 4201, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4201)
          0.42391527 = weight(abstract_txt:federated in 4201) [ClassicSimilarity], result of:
            0.42391527 = score(doc=4201,freq=2.0), product of:
              0.64952487 = queryWeight, product of:
                4.125737 = boost
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.018655807 = queryNorm
              0.6526544 = fieldWeight in 4201, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4201)
        0.2 = coord(5/25)
    
  5. Shechtman, N.; Chung, M.; Roschelle, J.: Supporting member collaboration in the Math Tools digital library : a formative user study (2004) 0.12
    0.11766742 = sum of:
      0.11766742 = product of:
        0.7354214 = sum of:
          0.017340733 = weight(abstract_txt:this in 1163) [ClassicSimilarity], result of:
            0.017340733 = score(doc=1163,freq=3.0), product of:
              0.05310756 = queryWeight, product of:
                1.1797279 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018655807 = queryNorm
              0.32652098 = fieldWeight in 1163, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=1163)
          0.03027477 = weight(abstract_txt:research in 1163) [ClassicSimilarity], result of:
            0.03027477 = score(doc=1163,freq=1.0), product of:
              0.122232094 = queryWeight, product of:
                2.066644 = boost
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.018655807 = queryNorm
              0.24768265 = fieldWeight in 1163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.170338 = idf(docFreq=5046, maxDocs=44218)
                0.078125 = fieldNorm(doc=1163)
          0.0822127 = weight(abstract_txt:search in 1163) [ClassicSimilarity], result of:
            0.0822127 = score(doc=1163,freq=2.0), product of:
              0.20341547 = queryWeight, product of:
                2.980712 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.018655807 = queryNorm
              0.4041615 = fieldWeight in 1163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=1163)
          0.6055932 = weight(abstract_txt:federated in 1163) [ClassicSimilarity], result of:
            0.6055932 = score(doc=1163,freq=2.0), product of:
              0.64952487 = queryWeight, product of:
                4.125737 = boost
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.018655807 = queryNorm
              0.9323634 = fieldWeight in 1163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.078125 = fieldNorm(doc=1163)
        0.16 = coord(4/25)
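
The content scores above combine several terms: each term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by coord (matched query terms divided by total query terms). The sketch below recomputes the third entry (doc=355) from the values printed in its explain tree. It is an illustration under stated assumptions, not the catalog's implementation: the function names are hypothetical, and the tf and idf formulas are the ones that reproduce the printed numbers.

```python
import math

MAX_DOCS = 44218          # maxDocs from the listing
QUERY_NORM = 0.018655807  # queryNorm from the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # idf in the form that matches the printed values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


def document_score(terms, matched: int, total: int, field_norm: float) -> float:
    # coord rewards documents that match more of the query terms.
    coord = matched / total
    return coord * sum(term_score(f, df, b, field_norm) for f, df, b in terms)


# (freq, docFreq, boost) for the four terms matched in doc=355.
taylor_terms = [
    (1.0, 10762, 1.1797279),  # this
    (1.0, 2677, 1.2398742),   # article
    (2.0, 3098, 2.980712),    # search
    (1.0, 25, 4.125737),      # federated
]

total = document_score(taylor_terms, matched=4, total=25, field_norm=0.109375)
print(round(total, 4))
```

The rare term "federated" (docFreq=25) supplies most of the sum, which is why entries matching it rank alongside the first entry even with a lower coord factor.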