Search (364 results, page 19 of 19)

  • theme_ss:"Retrievalstudien"
  • type_ss:"a"
  1. Ding, C.H.Q.: A probabilistic model for Latent Semantic Indexing (2005) 0.00
    1.8014197E-4 = product of:
      0.0030624135 = sum of:
        0.0030624135 = weight(_text_:in in 3459) [ClassicSimilarity], result of:
          0.0030624135 = score(doc=3459,freq=2.0), product of:
            0.033961542 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.024967048 = queryNorm
            0.09017298 = fieldWeight in 3459, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=3459)
      0.05882353 = coord(1/17)
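The explanation tree above is Lucene's ClassicSimilarity breakdown for the query term `in` in document 3459. As a hedged illustration (all constants copied from the tree, not recomputed from the index), the displayed score can be reproduced from the classic TF-IDF formula, where `coord(1/17)` reflects one of seventeen query clauses matching:

```python
import math

# Values taken from the explanation tree above (doc 3459, term "in")
freq = 2.0             # termFreq: the term occurs twice in the field
idf = 1.3602545        # idf(docFreq=30841, maxDocs=44218)
query_norm = 0.024967048
field_norm = 0.046875  # field length normalization for this document
coord = 1.0 / 17.0     # coord(1/17): 1 of 17 query clauses matched

tf = math.sqrt(freq)                  # 1.4142135...
query_weight = idf * query_norm       # 0.033961542...
field_weight = tf * idf * field_norm  # 0.09017298...
score = field_weight * query_weight * coord

print(score)  # ~1.8014e-4, matching the displayed score
```

The other hits on this page follow the same computation with a different `freq` and `fieldNorm`, which is why their score trees are structurally identical.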
    
    Abstract
    Latent Semantic Indexing (LSI), when applied to a semantic space built on text collections, improves information retrieval, information filtering, and word sense disambiguation. A new dual probability model based on similarity concepts is introduced to provide a deeper understanding of LSI. Semantic associations can be quantitatively characterized by their statistical significance, the likelihood. Semantic dimensions containing redundant and noisy information can be separated out and should be ignored because of their negative contribution to the overall statistical significance. LSI is the optimal solution of the model. The peak in the likelihood curve indicates the existence of an intrinsic semantic dimension. The importance of LSI dimensions follows the Zipf distribution, indicating that LSI dimensions represent latent concepts. The document frequency of words follows the Zipf distribution, and the number of distinct words follows a log-normal distribution. Experiments on five standard document collections confirm and illustrate the analysis.
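LSI as described in this abstract is conventionally computed as a truncated SVD of the term-document matrix; the retained dimensions are the "semantic dimensions" whose significance the dual probability model quantifies. A minimal sketch, with a toy matrix and the cutoff k=2 as illustrative assumptions (not values from the paper):

```python
import numpy as np

# Toy term-document count matrix (terms x documents); illustrative only
X = np.array([
    [2, 1, 0, 0],   # term "retrieval"
    [1, 2, 0, 0],   # term "indexing"
    [0, 0, 2, 1],   # term "disambiguation"
    [0, 0, 1, 2],   # term "sense"
], dtype=float)

U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2                                # assumed intrinsic semantic dimension
X_k = U[:, :k] * s[:k] @ Vt[:k, :]   # rank-k approximation; the dropped
                                     # dimensions carry redundant/noisy info

# Documents are then compared in the reduced k-dimensional concept space
doc_vecs = (np.diag(s[:k]) @ Vt[:k, :]).T
```

Dropping the trailing singular values is the mechanical counterpart of ignoring the dimensions that lower the overall statistical significance.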
  2. Hirsh, S.G.: Children's relevance criteria and information seeking on electronic resources (1999) 0.00
    
    Abstract
    This study explores the relevance criteria and search strategies elementary school children applied when searching for information related to a class assignment in a school library setting. Students were interviewed on 2 occasions at different stages of the research process; field observations involved students thinking aloud to explain their search processes and shadowing as students moved around the school library. Students performed searches on an online catalog, an electronic encyclopedia, an electronic magazine index, and the WWW. Results are presented for children selecting the topic, conducting the search, examining the results, and extracting relevant results. A total of 254 mentions of relevance criteria were identified, including 197 references to textual relevance criteria that were coded into 9 categories and 57 references to graphical relevance criteria that were coded into 5 categories. Students exhibited little concern for the authority of the textual and graphical information they found, based the majority of their relevance decisions for textual material on topicality, and identified information they found interesting. Students devoted a large portion of their research time to finding pictures. Understanding the ways that children use electronic resources and the relevance criteria they apply has implications for information literacy training and for systems design.
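The tallying step in this kind of study (coded mentions aggregated into category counts) is mechanically simple. A sketch with wholly hypothetical coded mentions; the category labels are placeholders, not Hirsh's coding scheme:

```python
from collections import Counter

# Hypothetical coded mentions: (modality, category) pairs as a coder
# might record them while reviewing transcripts; labels are illustrative.
mentions = [
    ("textual", "topicality"),
    ("textual", "topicality"),
    ("textual", "interest"),
    ("graphical", "interest"),
    ("textual", "authority"),
]

by_modality = Counter(modality for modality, _ in mentions)
by_category = Counter(mentions)

print(by_modality)  # tallies per modality, e.g. textual vs. graphical
```

In the actual study the same aggregation yielded 197 textual mentions across 9 categories and 57 graphical mentions across 5 categories.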
  3. Debole, F.; Sebastiani, F.: An analysis of the relative hardness of Reuters-21578 subsets (2005) 0.00
    
    Abstract
    The existence, public availability, and widespread acceptance of a standard benchmark for a given information retrieval (IR) task are beneficial to research on this task, because they allow different researchers to experimentally compare their own systems by comparing the results they have obtained on this benchmark. The Reuters-21578 test collection, together with its earlier variants, has been such a standard benchmark for the text categorization (TC) task throughout the last 10 years. However, the benefits that this has brought about have somehow been limited by the fact that different researchers have "carved" different subsets out of this collection and tested their systems on one of these subsets only; systems that have been tested on different Reuters-21578 subsets are thus not readily comparable. In this article, we present a systematic, comparative experimental study of the three subsets of Reuters-21578 that have been most popular among TC researchers. The results we obtain allow us to determine the relative hardness of these subsets, thus establishing an indirect means for comparing TC systems that have been, or will be, tested on these different subsets.
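The comparability problem described above comes down to a score such as micro-averaged F1 not transferring across differently carved subsets. A hedged sketch of that measurement for multi-label TC; the per-document category sets below are invented stand-ins for Reuters-21578 labels, not data from the paper:

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over multi-label decisions, the measure
    commonly reported for Reuters-21578 text categorization runs."""
    tp = sum(len(g & p) for g, p in zip(gold, predicted))
    fp = sum(len(p - g) for g, p in zip(gold, predicted))
    fn = sum(len(g - p) for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Invented gold labels and predictions for one hypothetical subset;
# the same classifier would score differently on a harder carve-out.
gold = [{"earn"}, {"acq", "grain"}]
pred = [{"earn"}, {"acq"}]
print(micro_f1(gold, pred))  # 0.8
```

Because the denominator terms depend on which documents and categories a subset retains, an F1 reported on one subset says little about another, which is the gap the relative-hardness analysis bridges.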
  4. Hider, P.: The search value added by professional indexing to a bibliographic database (2017) 0.00
    
    Content
    Paper presented at: NASKO 2017: Visualizing Knowledge Organization: Bringing Focus to Abstract Realities. The sixth North American Symposium on Knowledge Organization (NASKO 2017), June 15-16, 2017, Champaign, IL, USA.
