Search (4985 results, page 2 of 250)

  • × language_ss:"e"
  • × year_i:[2000 TO 2010}
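  The two active filters above use Lucene/Solr field-query syntax: language_ss:"e" restricts the result set to English-language records, and year_i:[2000 TO 2010} is a half-open range covering publication years from 2000 (inclusive) up to, but not including, 2010. A minimal sketch of how such filters are usually passed to a Solr-style search handler is shown below; the HTTP endpoint, core name, and paging parameters are illustrative assumptions, not details taken from this page.

      import requests  # assumes a Solr-style HTTP search API is reachable

      params = {
          "q": "*:*",  # placeholder base query; the actual query behind this page is not shown
          "fq": ['language_ss:"e"', "year_i:[2000 TO 2010}"],  # the two active filters
          "rows": 20,   # 20 hits per page, as displayed here
          "start": 20,  # page 2 of the result list, i.e. skip the first 20 hits
          "wt": "json",
      }
      response = requests.get("http://localhost:8983/solr/demo/select", params=params)
      print(response.json()["response"]["numFound"])  # e.g. 4985 for this result set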
  1. Koch, T.: Quality-controlled subject gateways : definitions, typologies, empirical overview (2000) 0.07
    0.07478131 = product of:
      0.11217197 = sum of:
        0.017470727 = weight(_text_:information in 631) [ClassicSimilarity], result of:
          0.017470727 = score(doc=631,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.1920054 = fieldWeight in 631, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=631)
        0.094701245 = sum of:
          0.045543127 = weight(_text_:management in 631) [ClassicSimilarity], result of:
            0.045543127 = score(doc=631,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.2606825 = fieldWeight in 631, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=631)
          0.04915812 = weight(_text_:22 in 631) [ClassicSimilarity], result of:
            0.04915812 = score(doc=631,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.2708308 = fieldWeight in 631, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=631)
      0.6666667 = coord(2/3)
    
    Abstract
    'Quality-controlled subject gateways' are Internet services which apply a rich set of quality measures to support systematic resource discovery. Considerable manual effort is used to secure a selection of resources which meet quality criteria and to display a rich description of these resources with standards-based metadata. Regular checking and updating ensure good collection management. A main goal is to provide a high quality of subject access through indexing resources using controlled vocabularies and by offering a deep classification structure for advanced searching and browsing. This article provides an initial empirical overview of existing services of this kind, their approaches and technologies, based on proposed working definitions and typologies of subject gateways.
    Date
    22. 6.2002 19:37:55
    Source
    Online information review. 24(2000) no.1, S.24-34
    Theme
    Information Gateway
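  The indented breakdown above (and the analogous breakdowns under the following entries) is Lucene ClassicSimilarity "explain" output. Each term contribution is queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = tf × idf × fieldNorm with tf = sqrt(termFreq); the summed clause scores are then scaled by the coord factor (here 2/3, because two of three query clauses matched). A small sketch that recomputes the "information" contribution and the final document score from the numbers shown above (the helper function is ours for illustration, not a Lucene API):

      from math import isclose, sqrt

      def term_score(freq, idf, query_norm, field_norm):
          """Recompute one term's contribution from a ClassicSimilarity explain tree."""
          tf = sqrt(freq)                       # explain line: 2.0 = tf(freq=4.0)
          query_weight = idf * query_norm       # 1.7554779 * 0.0518325 = 0.09099081
          field_weight = tf * idf * field_norm  # 2.0 * 1.7554779 * 0.0546875 = 0.1920054
          return query_weight * field_weight

      info = term_score(freq=4.0, idf=1.7554779, query_norm=0.0518325, field_norm=0.0546875)
      assert isclose(info, 0.017470727, rel_tol=1e-5)

      # Document score = (sum of matching clause scores) * coord(2/3)
      assert isclose((info + 0.094701245) * (2.0 / 3.0), 0.07478131, rel_tol=1e-5)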
  2. Tsakonas, G.; Papatheodorou, C.: Exploring usefulness and usability in the evaluation of open access digital libraries (2008) 0.07
    0.07478131 = product of:
      0.11217197 = sum of:
        0.017470727 = weight(_text_:information in 2090) [ClassicSimilarity], result of:
          0.017470727 = score(doc=2090,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.1920054 = fieldWeight in 2090, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2090)
        0.094701245 = sum of:
          0.045543127 = weight(_text_:management in 2090) [ClassicSimilarity], result of:
            0.045543127 = score(doc=2090,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.2606825 = fieldWeight in 2090, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2090)
          0.04915812 = weight(_text_:22 in 2090) [ClassicSimilarity], result of:
            0.04915812 = score(doc=2090,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.2708308 = fieldWeight in 2090, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2090)
      0.6666667 = coord(2/3)
    
    Abstract
    Advances in the publishing world have given rise to new models of digital library development. Open access publishing modes are expanding their presence and realize the digital library idea in various ways. While user-centered evaluation of digital libraries has drawn considerable attention in recent years, these systems are currently viewed from the publishing, economic and scientometric perspectives. The present study explores the concepts of usefulness and usability in the evaluation of an e-print archive. The results demonstrate that several attributes of usefulness, such as the level and relevance of information, and of usability, such as ease of use and learnability, as well as functionalities commonly met in these systems, affect user interaction and satisfaction.
    Date
    1. 8.2008 11:49:22
    Source
    Information processing and management. 44(2008) no.3, S.1234-1250
  3. LaBarre, K.: Discovery and access systems for Websites and cultural heritage sites : reconsidering the practical application of facets (2008) 0.07
    0.07478131 = product of:
      0.11217197 = sum of:
        0.017470727 = weight(_text_:information in 2247) [ClassicSimilarity], result of:
          0.017470727 = score(doc=2247,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.1920054 = fieldWeight in 2247, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2247)
        0.094701245 = sum of:
          0.045543127 = weight(_text_:management in 2247) [ClassicSimilarity], result of:
            0.045543127 = score(doc=2247,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.2606825 = fieldWeight in 2247, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2247)
          0.04915812 = weight(_text_:22 in 2247) [ClassicSimilarity], result of:
            0.04915812 = score(doc=2247,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.2708308 = fieldWeight in 2247, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2247)
      0.6666667 = coord(2/3)
    
    Content
    Facets are an increasingly common feature of contemporary access and discovery systems. These intuitively adaptable structures seem well suited for application in information architecture and the practice of knowledge management (La Barre, 2006). As browsing and searching devices, facets function equally well on e-commerce sites, digital museum portals, and online library catalogs. This paper argues that clearly articulated principles for facets and facet analysis must draw examples from current practice while building upon heritage principles in order to scaffold the development of robust and fully faceted information infrastructures.
    Date
    27.12.2008 9:50:22
  4. Genereux, C.: Building connections : a review of the serials literature 2004 through 2005 (2007) 0.07
    0.071954 = product of:
      0.107930996 = sum of:
        0.01058886 = weight(_text_:information in 2548) [ClassicSimilarity], result of:
          0.01058886 = score(doc=2548,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.116372846 = fieldWeight in 2548, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2548)
        0.09734213 = sum of:
          0.055206608 = weight(_text_:management in 2548) [ClassicSimilarity], result of:
            0.055206608 = score(doc=2548,freq=4.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.31599492 = fieldWeight in 2548, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=2548)
          0.04213553 = weight(_text_:22 in 2548) [ClassicSimilarity], result of:
            0.04213553 = score(doc=2548,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 2548, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2548)
      0.6666667 = coord(2/3)
    
    Abstract
    This review of 2004 and 2005 serials literature covers the themes of cost, management, and access. Interwoven through the serials literature of these two years are the importance of collaboration, communication, and linkages between scholars, publishers, subscription agents and other intermediaries, and librarians. The emphasis in the literature is on electronic serials and their impact on publishing, libraries, and vendors. In response to the crisis of escalating journal prices and libraries' dissatisfaction with the Big Deal licensing agreements, Open Access journals and publishing models were promoted. Libraries subscribed to or licensed increasing numbers of electronic serials. As a result, libraries sought ways to better manage licensing and subscription data (not handled by traditional integrated library systems) by implementing electronic resources management systems. In order to provide users with better, faster, and more current information on and access to electronic serials, libraries implemented tools and services to provide A-Z title lists, title by title coverage data, MARC records, and OpenURL link resolvers.
    Date
    10. 9.2000 17:38:22
  5. Hawking, D.; Robertson, S.: On collection size and retrieval effectiveness (2003) 0.07
    0.07179232 = product of:
      0.10768847 = sum of:
        0.02823696 = weight(_text_:information in 4109) [ClassicSimilarity], result of:
          0.02823696 = score(doc=4109,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.3103276 = fieldWeight in 4109, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=4109)
        0.079451516 = product of:
          0.15890303 = sum of:
            0.15890303 = weight(_text_:22 in 4109) [ClassicSimilarity], result of:
              0.15890303 = score(doc=4109,freq=4.0), product of:
                0.18150859 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0518325 = queryNorm
                0.8754574 = fieldWeight in 4109, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=4109)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    14. 8.2005 14:22:22
    Source
    Information retrieval. 6(2003) no.1, S.99-150
  6. Dextre Clarke, S.G.: Thesaural relationships (2001) 0.07
    0.071369946 = product of:
      0.10705492 = sum of:
        0.01235367 = weight(_text_:information in 1149) [ClassicSimilarity], result of:
          0.01235367 = score(doc=1149,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.13576832 = fieldWeight in 1149, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1149)
        0.094701245 = sum of:
          0.045543127 = weight(_text_:management in 1149) [ClassicSimilarity], result of:
            0.045543127 = score(doc=1149,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.2606825 = fieldWeight in 1149, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1149)
          0.04915812 = weight(_text_:22 in 1149) [ClassicSimilarity], result of:
            0.04915812 = score(doc=1149,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.2708308 = fieldWeight in 1149, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1149)
      0.6666667 = coord(2/3)
    
    Date
    22. 9.2007 15:45:57
    Series
    Information science and knowledge management; vol.2
  7. Copeland, A.; Hamburger, S.; Hamilton, J.; Robinson, K.J.: Cataloging and digitizing ephemera : one team's experience with Pennsylvania German broadsides and fraktur (2006) 0.07
    0.071369946 = product of:
      0.10705492 = sum of:
        0.01235367 = weight(_text_:information in 768) [ClassicSimilarity], result of:
          0.01235367 = score(doc=768,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.13576832 = fieldWeight in 768, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=768)
        0.094701245 = sum of:
          0.045543127 = weight(_text_:management in 768) [ClassicSimilarity], result of:
            0.045543127 = score(doc=768,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.2606825 = fieldWeight in 768, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=768)
          0.04915812 = weight(_text_:22 in 768) [ClassicSimilarity], result of:
            0.04915812 = score(doc=768,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.2708308 = fieldWeight in 768, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=768)
      0.6666667 = coord(2/3)
    
    Abstract
    The growing interest in ephemera collections within libraries will necessitate the bibliographic control of materials that do not easily fall into traditional categories. This paper discusses the many challenges confronting catalogers when approaching a mixed collection of unique materials of an ephemeral nature. Based on their experience cataloging a collection of Pennsylvania German broadsides and Fraktur at the Pennsylvania State University, the authors describe the process of deciphering handwriting, preserving genealogical information, deciding on cataloging approaches at the format and field level, and furthering access to the materials through digitization and the Encoded Archival Description finding aid. Observations are made on expanding the skills of traditional book catalogers to include manuscript cataloging, and on project management.
    Date
    10. 9.2000 17:38:22
  8. Schäffler, H.: How to organise the digital library : reengineering and change management in the Bayerische Staatsbibliothek, Munich (2004) 0.07
    0.071369946 = product of:
      0.10705492 = sum of:
        0.01235367 = weight(_text_:information in 2577) [ClassicSimilarity], result of:
          0.01235367 = score(doc=2577,freq=2.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.13576832 = fieldWeight in 2577, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2577)
        0.094701245 = sum of:
          0.045543127 = weight(_text_:management in 2577) [ClassicSimilarity], result of:
            0.045543127 = score(doc=2577,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.2606825 = fieldWeight in 2577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2577)
          0.04915812 = weight(_text_:22 in 2577) [ClassicSimilarity], result of:
            0.04915812 = score(doc=2577,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.2708308 = fieldWeight in 2577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2577)
      0.6666667 = coord(2/3)
    
    Abstract
    The introduction of digital resources has not only had considerable impact on the role of libraries in the information society, but it has also had a remarkable effect on back office procedures, i.e. on the way the library is organised. This article presents a case study of a reengineering process at the Bayerische Staatsbibliothek (Bavarian State Library) in Munich, Germany, the central regional library of the State of Bavaria and one of the largest academic research libraries in Europe with local, regional and supraregional responsibilities. Due to the multiple roles of this library, it was particularly important not only to bridge the gap between traditional and new material, but also to create a flexible organisational platform for the various tasks at the different levels indicated.
    Source
    Library hi tech. 22(2004) no.4, S.340-346
  9. Fallis, D.: Social epistemology and information science (2006) 0.07
    0.070059046 = product of:
      0.10508856 = sum of:
        0.04890785 = weight(_text_:information in 4368) [ClassicSimilarity], result of:
          0.04890785 = score(doc=4368,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.5375032 = fieldWeight in 4368, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=4368)
        0.056180708 = product of:
          0.112361416 = sum of:
            0.112361416 = weight(_text_:22 in 4368) [ClassicSimilarity], result of:
              0.112361416 = score(doc=4368,freq=2.0), product of:
                0.18150859 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0518325 = queryNorm
                0.61904186 = fieldWeight in 4368, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=4368)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    13. 7.2008 19:22:28
    Source
    Annual review of information science and technology. 40(2006), S.xxx-xxx
    Theme
    Information
  10. Winget, M.A.: Annotations on musical scores by performing musicians : collaborative models, interactive methods, and music digital library tool development (2008) 0.07
    0.06989994 = product of:
      0.104849905 = sum of:
        0.02367741 = weight(_text_:information in 2368) [ClassicSimilarity], result of:
          0.02367741 = score(doc=2368,freq=10.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.2602176 = fieldWeight in 2368, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2368)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 2368) [ClassicSimilarity], result of:
            0.039036963 = score(doc=2368,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 2368, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=2368)
          0.04213553 = weight(_text_:22 in 2368) [ClassicSimilarity], result of:
            0.04213553 = score(doc=2368,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 2368, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2368)
      0.6666667 = coord(2/3)
    
    Abstract
    Although there have been a number of fairly recent studies in which researchers have explored the information-seeking and management behaviors of people interacting with musical retrieval systems, there have been very few published studies of the interaction and use behaviors of musicians interacting with their primary information object, the musical score. The ethnographic research reported here seeks to correct this deficiency in the literature. In addition to observing rehearsals and conducting 22 in-depth musician interviews, this research provides in-depth analysis of 25,000 annotations representing 250 parts from 13 complete musical works, made by musicians of all skill levels and performance modes. In addition to producing specific and practical recommendations for digital-library development, this research also provides an augmented annotation framework that will enable more specific study of human-information interaction, both with musical scores, and with more general notational/instructional information objects.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.12, S.1878-1897
  11. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.07
    0.068927675 = product of:
      0.10339151 = sum of:
        0.08232375 = product of:
          0.24697125 = sum of:
            0.24697125 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.24697125 = score(doc=562,freq=2.0), product of:
                0.43943653 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0518325 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.021067765 = product of:
          0.04213553 = sum of:
            0.04213553 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.04213553 = score(doc=562,freq=2.0), product of:
                0.18150859 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0518325 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  12. Egghe, L.: ¬A universal method of information retrieval evaluation : the "missing" link M and the universal IR surface (2004) 0.07
    0.06634197 = product of:
      0.09951294 = sum of:
        0.018340444 = weight(_text_:information in 2558) [ClassicSimilarity], result of:
          0.018340444 = score(doc=2558,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.20156369 = fieldWeight in 2558, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2558)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 2558) [ClassicSimilarity], result of:
            0.039036963 = score(doc=2558,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 2558, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=2558)
          0.04213553 = weight(_text_:22 in 2558) [ClassicSimilarity], result of:
            0.04213553 = score(doc=2558,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 2558, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2558)
      0.6666667 = coord(2/3)
    
    Abstract
    The paper shows that the present evaluation methods in information retrieval (basically recall R and precision P and in some cases fallout F) lack universal comparability in the sense that their values depend on the generality of the IR problem. A solution is given by using all "parts" of the database, including the non-relevant documents and also the not-retrieved documents. It turns out that the solution is given by introducing the measure M, the fraction of the not-retrieved documents that are relevant (hence the "miss" measure). We prove that - independent of the IR problem or of the IR action - the quadruple (P,R,F,M) belongs to a universal IR surface, being the same for all IR-activities. This universality is then exploited by defining a new measure for evaluation in IR allowing for unbiased comparisons of all IR results. We also show that only using one, two or even three measures from the set {P,R,F,M} necessarily leads to evaluation measures that are non-universal and hence not capable of comparing different IR situations.
    Date
    14. 8.2004 19:17:22
    Source
    Information processing and management. 40(2004) no.1, S.21-30
  13. Aringhieri, R.; Damiani, E.; De Capitani di Vimercati, S.; Paraboschi, S.; Samarati, P.: Fuzzy techniques for trust and reputation management in anonymous peer-to-peer systems (2006) 0.07
    0.06634197 = product of:
      0.09951294 = sum of:
        0.018340444 = weight(_text_:information in 5279) [ClassicSimilarity], result of:
          0.018340444 = score(doc=5279,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.20156369 = fieldWeight in 5279, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5279)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 5279) [ClassicSimilarity], result of:
            0.039036963 = score(doc=5279,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 5279, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=5279)
          0.04213553 = weight(_text_:22 in 5279) [ClassicSimilarity], result of:
            0.04213553 = score(doc=5279,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 5279, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=5279)
      0.6666667 = coord(2/3)
    
    Date
    22. 7.2006 17:06:18
    Footnote
    Beitrag in einer Special Topic Section on Soft Approaches to Information Retrieval and Information Access on the Web
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.4, S.528-537
  14. Seo, H.-C.; Kim, S.-B.; Rim, H.-C.; Myaeng, S.-H.: Improving query translation in English-Korean cross-language information retrieval (2005) 0.07
    0.06634197 = product of:
      0.09951294 = sum of:
        0.018340444 = weight(_text_:information in 1023) [ClassicSimilarity], result of:
          0.018340444 = score(doc=1023,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.20156369 = fieldWeight in 1023, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1023)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 1023) [ClassicSimilarity], result of:
            0.039036963 = score(doc=1023,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 1023, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=1023)
          0.04213553 = weight(_text_:22 in 1023) [ClassicSimilarity], result of:
            0.04213553 = score(doc=1023,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 1023, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1023)
      0.6666667 = coord(2/3)
    
    Abstract
    Query translation is a viable method for cross-language information retrieval (CLIR), but it suffers from translation ambiguities caused by multiple translations of individual query terms. Previous research has employed various methods for disambiguation, including the method of selecting an individual target query term from multiple candidates by comparing their statistical associations with the candidate translations of other query terms. This paper proposes a new method where we examine all combinations of target query term translations corresponding to the source query terms, instead of looking at the candidates for each query term and selecting the best one at a time. The goodness value for a combination of target query terms is computed based on the association value between each pair of the terms in the combination. We tested our method using the NTCIR-3 English-Korean CLIR test collection. The results show some improvements regardless of the association measures we used.
    Date
    26.12.2007 20:22:38
    Source
    Information processing and management. 41(2005) no.3, S.507-522
  15. Morrison, P.J.: Tagging and searching : search retrieval effectiveness of folksonomies on the World Wide Web (2008) 0.07
    0.06634197 = product of:
      0.09951294 = sum of:
        0.018340444 = weight(_text_:information in 2109) [ClassicSimilarity], result of:
          0.018340444 = score(doc=2109,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.20156369 = fieldWeight in 2109, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2109)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 2109) [ClassicSimilarity], result of:
            0.039036963 = score(doc=2109,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 2109, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=2109)
          0.04213553 = weight(_text_:22 in 2109) [ClassicSimilarity], result of:
            0.04213553 = score(doc=2109,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 2109, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2109)
      0.6666667 = coord(2/3)
    
    Abstract
    Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen little empirical research. This study compared the search information retrieval (IR) performance of folksonomies from social bookmarking Web sites against search engines and subject directories. Thirty-four participants created 103 queries for various information needs. Results from each IR system were collected and participants judged relevance. Folksonomy search results overlapped with those from the other systems, and documents found by both search engines and folksonomies were significantly more likely to be judged relevant than those returned by any single IR system type. The search engines in the study had the highest precision and recall, but the folksonomies fared surprisingly well. Del.icio.us was statistically indistinguishable from the directories in many cases. Overall the directories were more precise than the folksonomies but they had similar recall scores. Better query handling may enhance folksonomy IR performance further. The folksonomies studied were promising, and may be able to improve Web search performance.
    Date
    1. 8.2008 12:39:22
    Source
    Information processing and management. 44(2008) no.4, S.1562-1579
  16. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.07
    0.06634197 = product of:
      0.09951294 = sum of:
        0.018340444 = weight(_text_:information in 2760) [ClassicSimilarity], result of:
          0.018340444 = score(doc=2760,freq=6.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.20156369 = fieldWeight in 2760, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2760)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 2760) [ClassicSimilarity], result of:
            0.039036963 = score(doc=2760,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 2760, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=2760)
          0.04213553 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
            0.04213553 = score(doc=2760,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 2760, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2760)
      0.6666667 = coord(2/3)
    
    Abstract
    Information is often organized as a text hierarchy. A hierarchical text-classification system is thus essential for the management, sharing, and dissemination of information. It aims to automatically classify each incoming document into zero, one, or several categories in the text hierarchy. In this paper, we present a technique called CRHTC (context recognition for hierarchical text classification) that performs hierarchical text classification by recognizing the context of discussion (COD) of each category. A category's COD is governed by its ancestor categories, whose contents indicate contextual backgrounds of the category. A document may be classified into a category only if its content matches the category's COD. CRHTC does not require any trials to manually set parameters, and hence is more portable and easier to implement than other methods. It is empirically evaluated under various conditions. The results show that CRHTC achieves both better and more stable performance than several hierarchical and nonhierarchical text-classification methodologies.
    Date
    22. 3.2009 19:11:54
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.4, S.803-813
  17. Detlor, B.: Information management (2009) 0.07
    0.06620909 = product of:
      0.09931363 = sum of:
        0.039065734 = weight(_text_:information in 3793) [ClassicSimilarity], result of:
          0.039065734 = score(doc=3793,freq=20.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.42933714 = fieldWeight in 3793, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3793)
        0.060247894 = product of:
          0.12049579 = sum of:
            0.12049579 = weight(_text_:management in 3793) [ClassicSimilarity], result of:
              0.12049579 = score(doc=3793,freq=14.0), product of:
                0.17470726 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0518325 = queryNorm
                0.6897011 = fieldWeight in 3793, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3793)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Information management concerns the control over how information is created, acquired, organized, stored, distributed, and used as a means of promoting efficient and effective information access, processing, and use by people and organizations. Various perspectives of information management exist. For this entry, three are presented: the organizational, library, and personal perspectives. Each deals with the management of some or all of the processes involved in the information life cycle. Each concerns itself with the management of different types of information resources. The purpose of this entry is to clearly describe what "information management" is and to clarify how information management differs from closely related terms.
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  18. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.06
    0.06409827 = product of:
      0.0961474 = sum of:
        0.014974909 = weight(_text_:information in 948) [ClassicSimilarity], result of:
          0.014974909 = score(doc=948,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.16457605 = fieldWeight in 948, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=948)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 948) [ClassicSimilarity], result of:
            0.039036963 = score(doc=948,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 948, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=948)
          0.04213553 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
            0.04213553 = score(doc=948,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 948, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=948)
      0.6666667 = coord(2/3)
    
    Abstract
    In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
    Source
    Information processing and management. 43(2007) no.6, S.1606-1618
  19. Shankar, K.: Ambiguity and legitimate peripheral participation in the creation of scientific documents (2009) 0.06
    0.06409827 = product of:
      0.0961474 = sum of:
        0.014974909 = weight(_text_:information in 1727) [ClassicSimilarity], result of:
          0.014974909 = score(doc=1727,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.16457605 = fieldWeight in 1727, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1727)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 1727) [ClassicSimilarity], result of:
            0.039036963 = score(doc=1727,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 1727, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=1727)
          0.04213553 = weight(_text_:22 in 1727) [ClassicSimilarity], result of:
            0.04213553 = score(doc=1727,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 1727, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1727)
      0.6666667 = coord(2/3)
    
    Abstract
    Purpose - The purpose of this paper is to report on a qualitative study of data management and recordkeeping in the research sciences and their roles in information creation and professional identity formation. Design/methodology/approach - The study uses ethnographic fieldwork data in an academic laboratory to examine documentation practices as a part of the trajectory of scientific professionalization. The article examines ethnographic fieldnotes and medical records as cognate areas that provide insight into the topic. Findings - The paper argues that scientific recordkeeping is essential for learning to balance professional standards and personal knowledge and for establishing comfort with ambiguity, and that it can be a process marked by ritual, anxiety, and affect. The article does this by discussing the creation of records from data, tacit knowledge as part of that process, and the process of legitimate peripheral participation (LPP). Research limitations/implications - The qualitative nature of the study suggests the need for similar studies in other environments. Originality/value - The article emphasizes recordkeeping as a part of documentation studies by taking an interdisciplinary, ethnographic approach that is still emergent in information studies. The article is written primarily for fellow researchers.
    Date
    23. 2.2009 17:22:14
  20. Witschel, H.F.: Global term weights in distributed environments (2008) 0.06
    0.06409827 = product of:
      0.0961474 = sum of:
        0.014974909 = weight(_text_:information in 2096) [ClassicSimilarity], result of:
          0.014974909 = score(doc=2096,freq=4.0), product of:
            0.09099081 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0518325 = queryNorm
            0.16457605 = fieldWeight in 2096, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2096)
        0.081172496 = sum of:
          0.039036963 = weight(_text_:management in 2096) [ClassicSimilarity], result of:
            0.039036963 = score(doc=2096,freq=2.0), product of:
              0.17470726 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0518325 = queryNorm
              0.22344214 = fieldWeight in 2096, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046875 = fieldNorm(doc=2096)
          0.04213553 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
            0.04213553 = score(doc=2096,freq=2.0), product of:
              0.18150859 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0518325 = queryNorm
              0.23214069 = fieldWeight in 2096, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2096)
      0.6666667 = coord(2/3)
    
    Abstract
    This paper examines the estimation of global term weights (such as IDF) in information retrieval scenarios where a global view on the collection is not available. In particular, the two options of either sampling documents or of using a reference corpus independent of the target retrieval collection are compared using standard IR test collections. In addition, the possibility of pruning term lists based on frequency is evaluated. The results show that very good retrieval performance can be reached when just the most frequent terms of a collection - an "extended stop word list" - are known and all terms which are not in that list are treated equally. However, the list cannot always be fully estimated from a general-purpose reference corpus, but some "domain-specific stop words" need to be added. A good solution for achieving this is to mix estimates from small samples of the target retrieval collection with ones derived from a reference corpus.
    Date
    1. 8.2008 9:44:22
    Source
    Information processing and management. 44(2008) no.3, S.1049-1061

Types

  • a 4339
  • m 425
  • el 298
  • s 169
  • b 37
  • i 17
  • r 16
  • x 16
  • n 14
