Search (319 results, page 1 of 16)

  • theme_ss:"Retrievalalgorithmen"
  1. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02
    0.016243609 = product of:
      0.048730824 = sum of:
        0.020946784 = weight(_text_:information in 402) [ClassicSimilarity], result of:
          0.020946784 = score(doc=402,freq=2.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.3103276 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.02778404 = product of:
          0.08335212 = sum of:
            0.08335212 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.08335212 = score(doc=402,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
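
The nested breakdown above is Lucene's ClassicSimilarity explain output, and the same structure repeats for every entry in this list: each per-term weight is queryWeight × fieldWeight, with queryWeight = idf × queryNorm and fieldWeight = tf × idf × fieldNorm, and the entry score is the sum of matching clauses scaled by coord(matched clauses / total clauses). A minimal Python sketch, assuming the classic Lucene definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), reproduces the numbers for the term "information" in entry 1:

```python
import math

# Lucene ClassicSimilarity building blocks, as displayed in the explain trees.
def tf(freq):
    return math.sqrt(freq)                     # 1.4142135 for freq=2.0

def idf(doc_freq, max_docs):
    return 1.0 + math.log(max_docs / (doc_freq + 1))

query_norm = 0.03845047                        # from the explain output
field_norm = 0.125                             # per-field length normalization

i = idf(20772, 44218)                          # -> 1.7554779
query_weight = i * query_norm                  # -> 0.067498945
field_weight = tf(2.0) * i * field_norm        # -> 0.3103276
print(query_weight * field_weight)             # -> 0.020946784, the per-term weight
```

Multiplying the summed clause weights by coord(2/6) then gives 0.048730824 × 1/3 ≈ 0.0162, the displayed entry score.
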
  2. Cole, C.: Intelligent information retrieval: diagnosing information need : Part II: uncertainty expansion in a prototype of a diagnostic IR tool (1998) 0.02
    0.016079284 = product of:
      0.048237853 = sum of:
        0.027210671 = weight(_text_:information in 6432) [ClassicSimilarity], result of:
          0.027210671 = score(doc=6432,freq=6.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.40312737 = fieldWeight in 6432, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6432)
        0.021027183 = product of:
          0.06308155 = sum of:
            0.06308155 = weight(_text_:29 in 6432) [ClassicSimilarity], result of:
              0.06308155 = score(doc=6432,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.46638384 = fieldWeight in 6432, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6432)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    11. 8.2001 14:48:29
    Source
    Information processing and management. 34(1998) no.6, S.721-731
  3. Crestani, F.: Combination of similarity measures for effective spoken document retrieval (2003) 0.01
    0.014286717 = product of:
      0.04286015 = sum of:
        0.018328438 = weight(_text_:information in 4690) [ClassicSimilarity], result of:
          0.018328438 = score(doc=4690,freq=2.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.27153665 = fieldWeight in 4690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.109375 = fieldNorm(doc=4690)
        0.024531713 = product of:
          0.07359514 = sum of:
            0.07359514 = weight(_text_:29 in 4690) [ClassicSimilarity], result of:
              0.07359514 = score(doc=4690,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.5441145 = fieldWeight in 4690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4690)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Source
    Journal of information science. 29(2003) no.2, S.87-96
  4. Back, J.: An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.01
    0.014213158 = product of:
      0.04263947 = sum of:
        0.018328438 = weight(_text_:information in 3445) [ClassicSimilarity], result of:
          0.018328438 = score(doc=3445,freq=2.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.27153665 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.024311034 = product of:
          0.0729331 = sum of:
            0.0729331 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.0729331 = score(doc=3445,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    25. 8.2005 17:42:22
    Source
    Library and information research news. 24(2000) no.77, S.30-34
  5. Thompson, P.: Looking back: on relevance, probabilistic indexing and information retrieval (2008) 0.01
    0.013224198 = product of:
      0.03967259 = sum of:
        0.025654467 = weight(_text_:information in 2074) [ClassicSimilarity], result of:
          0.025654467 = score(doc=2074,freq=12.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.38007212 = fieldWeight in 2074, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=2074)
        0.014018122 = product of:
          0.042054366 = sum of:
            0.042054366 = weight(_text_:29 in 2074) [ClassicSimilarity], result of:
              0.042054366 = score(doc=2074,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.31092256 = fieldWeight in 2074, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2074)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Forty-eight years ago Maron and Kuhns published their paper, "On Relevance, Probabilistic Indexing and Information Retrieval" (1960). This was the first paper to present a probabilistic approach to information retrieval, and perhaps the first paper on ranked retrieval. Although it is one of the most widely cited papers in the field of information retrieval, many researchers today may not be familiar with its influence. This paper describes the Maron and Kuhns article and the influence that it has had on the field of information retrieval.
    Date
    31. 7.2008 19:58:29
    Source
    Information processing and management. 44(2008) no.2, S.963-970
  6. Okada, M.; Ando, K.; Lee, S.S.; Hayashi, Y.; Aoe, J.I.: An efficient substring search method by using delayed keyword extraction (2001) 0.01
    0.0122457575 = product of:
      0.03673727 = sum of:
        0.01571009 = weight(_text_:information in 6415) [ClassicSimilarity], result of:
          0.01571009 = score(doc=6415,freq=2.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.23274569 = fieldWeight in 6415, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6415)
        0.021027183 = product of:
          0.06308155 = sum of:
            0.06308155 = weight(_text_:29 in 6415) [ClassicSimilarity], result of:
              0.06308155 = score(doc=6415,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.46638384 = fieldWeight in 6415, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6415)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    29. 3.2002 17:24:03
    Source
    Information processing and management. 37(2001) no.5, S.741-761
  7. Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.01
    0.011612935 = product of:
      0.034838803 = sum of:
        0.020946784 = weight(_text_:information in 1422) [ClassicSimilarity], result of:
          0.020946784 = score(doc=1422,freq=8.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.3103276 = fieldWeight in 1422, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
        0.01389202 = product of:
          0.04167606 = sum of:
            0.04167606 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
              0.04167606 = score(doc=1422,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.30952093 = fieldWeight in 1422, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations, along with the use of such classical notions, is a promising combination for IR systems. The approach proposed here has been efficiently implemented, and experiments on test collections are presented.
    Date
    22. 3.2003 19:27:23
    Footnote
    Contribution to a special issue: Mathematical, logical, and formal methods in information retrieval
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.4, S.285-301
  8. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.01
    0.010161318 = product of:
      0.030483954 = sum of:
        0.018328438 = weight(_text_:information in 1319) [ClassicSimilarity], result of:
          0.018328438 = score(doc=1319,freq=8.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.27153665 = fieldWeight in 1319, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.012155517 = product of:
          0.03646655 = sum of:
            0.03646655 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
              0.03646655 = score(doc=1319,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.2708308 = fieldWeight in 1319, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1319)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Keyword-based querying has been an immediate and efficient way to specify and retrieve the information a user is seeking. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes integrating two existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web.
    Date
    1. 8.1996 22:08:06
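
Entry 8 combines query expansion with relevance feedback. The classic formulation of such feedback is the Rocchio update; the sketch below is a generic illustration of that family, not the authors' algorithm, and the weights alpha/beta/gamma are conventional defaults rather than values from the paper:

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio relevance feedback: move the query vector toward
    relevant documents and away from non-relevant ones."""
    q = alpha * query
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return np.clip(q, 0.0, None)               # keep term weights non-negative

# Toy 4-term vocabulary: [web, search, java, island]
query = np.array([1.0, 1.0, 0.0, 0.0])
rel = np.array([[0.9, 0.8, 0.7, 0.0]])         # user marked this doc relevant
nonrel = np.array([[0.1, 0.0, 0.0, 0.9]])      # ...and this one non-relevant
print(rocchio(query, rel, nonrel))             # "java" gains weight, "island" stays at 0
```
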
  9. Song, D.; Bruza, P.D.: Towards context sensitive information inference (2003) 0.01
    0.010130903 = product of:
      0.030392706 = sum of:
        0.021710195 = weight(_text_:information in 1428) [ClassicSimilarity], result of:
          0.021710195 = score(doc=1428,freq=22.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.32163754 = fieldWeight in 1428, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1428)
        0.008682513 = product of:
          0.026047537 = sum of:
            0.026047537 = weight(_text_:22 in 1428) [ClassicSimilarity], result of:
              0.026047537 = score(doc=1428,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.19345059 = fieldWeight in 1428, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1428)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Humans can make hasty, but generally robust, judgements about what a text fragment is, or is not, about. Such judgements are termed information inference. This article furnishes an account of information inference from a psychologistic stance. By drawing on theories from nonclassical logic and applied cognition, an information inference mechanism is proposed that makes inferences via computations of information flow through an approximation of a conceptual space. Within a conceptual space, information is represented geometrically. In this article, geometric representations of words are realized as vectors in a high-dimensional semantic space, which is automatically constructed from a text corpus. Two approaches are presented for priming vector representations according to context. The first approach uses a concept combination heuristic to adjust the vector representation of a concept in the light of the representation of another concept. The second approach computes a prototypical concept on the basis of exemplar trace texts and moves it in the dimensional space according to the context. Information inference is evaluated by measuring the effectiveness of query models derived by information flow computations. Results show that information flow contributes significantly to query model effectiveness, particularly with respect to precision. Moreover, retrieval effectiveness compares favorably with two probabilistic query models, and another based on semantic association. More generally, this article can be seen as a contribution towards realizing operational systems that mimic text-based human reasoning.
    Date
    22. 3.2003 19:35:46
    Footnote
    Contribution to a special issue: Mathematical, logical, and formal methods in information retrieval
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.4, S.321-334
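
Entry 9's information-flow computation operates on word vectors built from corpus co-occurrence. As a rough, assumption-laden sketch (a proxy for, not a reproduction of, the authors' conceptual-space model), one can build HAL-style vectors and score how much of one word's co-occurrence mass is shared by another:

```python
from collections import defaultdict
import numpy as np

def hal_vectors(tokens, window=3):
    """HAL-style co-occurrence vectors: each word is represented by how
    often other words appear within a sliding window around it."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    vecs = defaultdict(lambda: np.zeros(len(vocab)))
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vecs[w][index[tokens[j]]] += 1.0
    return vecs

def information_flow(u, v):
    """Crude degree-of-inclusion proxy: the share of u's co-occurrence
    mass that falls on dimensions v also activates (an assumption, not
    the paper's exact computation)."""
    shared = u[(u > 0) & (v > 0)].sum()
    total = u[u > 0].sum()
    return shared / total if total else 0.0

tokens = ("information retrieval systems rank documents and "
          "information flow can prime query models").split()
vecs = hal_vectors(tokens)
print(information_flow(vecs["information"], vecs["query"]))
```
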
  10. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.01
    0.009886622 = product of:
      0.029659865 = sum of:
        0.01924085 = weight(_text_:information in 1451) [ClassicSimilarity], result of:
          0.01924085 = score(doc=1451,freq=12.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.2850541 = fieldWeight in 1451, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
        0.010419015 = product of:
          0.031257045 = sum of:
            0.031257045 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
              0.031257045 = score(doc=1451,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.23214069 = fieldWeight in 1451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
    Date
    22. 3.2003 19:27:36
    Footnote
    Introduction to the contributions of a special issue: Mathematical, logical, and formal methods in information retrieval
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.4, S.281-284
  11. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.009886622 = product of:
      0.029659865 = sum of:
        0.01924085 = weight(_text_:information in 2419) [ClassicSimilarity], result of:
          0.01924085 = score(doc=2419,freq=12.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.2850541 = fieldWeight in 2419, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
        0.010419015 = product of:
          0.031257045 = sum of:
            0.031257045 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.031257045 = score(doc=2419,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection through information seeking to the representation, organisation and reuse of information. By embedding high-level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil and followed by a qualitative evaluation. The evaluation was conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
  12. Vechtomova, O.; Karamuftuoglu, M.: Lexical cohesion and term proximity in document ranking (2008) 0.01
    0.009609912 = product of:
      0.028829735 = sum of:
        0.014811614 = weight(_text_:information in 2101) [ClassicSimilarity], result of:
          0.014811614 = score(doc=2101,freq=4.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.21943474 = fieldWeight in 2101, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=2101)
        0.014018122 = product of:
          0.042054366 = sum of:
            0.042054366 = weight(_text_:29 in 2101) [ClassicSimilarity], result of:
              0.042054366 = score(doc=2101,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.31092256 = fieldWeight in 2101, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2101)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    We demonstrate effective new methods of document ranking based on lexical cohesive relationships between query terms. The proposed methods rely solely on the lexical relationships between original query terms, and do not involve query expansion or relevance feedback. Two types of lexical cohesive relationship information between query terms are used in document ranking: short-distance collocation relationship between query terms, and long-distance relationship, determined by the collocation of query terms with other words. The methods are evaluated on TREC corpora, and show improvements over baseline systems.
    Date
    1. 8.2008 12:29:05
    Source
    Information processing and management. 44(2008) no.4, S.1485-1502
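
Entry 12 ranks by short-distance collocation of query terms. A minimal proximity feature in that spirit (a hypothetical scorer, not the authors' method) is the inverse of the smallest token window covering one occurrence of every query term:

```python
import itertools

def min_span(positions_per_term):
    """Smallest window (in tokens) covering one occurrence of every
    query term; smaller spans indicate tighter collocation."""
    best = float("inf")
    for combo in itertools.product(*positions_per_term):
        best = min(best, max(combo) - min(combo) + 1)
    return best

def proximity_score(doc_tokens, query_terms):
    positions = [[i for i, t in enumerate(doc_tokens) if t == q]
                 for q in query_terms]
    if any(not p for p in positions):
        return 0.0                              # some query term is missing
    return 1.0 / min_span(positions)            # higher when terms occur close together

doc = "lexical cohesion between query terms improves document ranking".split()
print(proximity_score(doc, ["query", "ranking"]))   # window of 5 tokens -> 0.2
```
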
  13. Qi, Q.; Hessen, D.J.; Heijden, P.G.M. van der: Improving information retrieval through correspondence analysis instead of latent semantic analysis (2023) 0.01
    0.0093593355 = product of:
      0.028078005 = sum of:
        0.017564414 = weight(_text_:information in 1045) [ClassicSimilarity], result of:
          0.017564414 = score(doc=1045,freq=10.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.2602176 = fieldWeight in 1045, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1045)
        0.010513592 = product of:
          0.031540774 = sum of:
            0.031540774 = weight(_text_:29 in 1045) [ClassicSimilarity], result of:
              0.031540774 = score(doc=1045,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.23319192 = fieldWeight in 1045, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1045)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    The initial dimensions extracted by latent semantic analysis (LSA) of a document-term matrix have been shown to mainly display marginal effects, which are irrelevant for information retrieval. To improve the performance of LSA, usually the elements of the raw document-term matrix are weighted and the weighting exponent of singular values can be adjusted. An alternative information retrieval technique that ignores the marginal effects is correspondence analysis (CA). In this paper, the information retrieval performance of LSA and CA is empirically compared. Moreover, it is explored whether the two weightings also improve the performance of CA. The results for four empirical datasets show that CA always performs better than LSA. Weighting the elements of the raw data matrix can improve CA; however, it is data dependent and the improvement is small. Adjusting the singular value weighting exponent often improves the performance of CA; however, the extent of the improvement depends on the dataset and the number of dimensions.
    Date
    15. 9.2023 12:28:29
    Source
    Journal of intelligent information systems [https://doi.org/10.1007/s10844-023-00815-y]
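
Entry 13 contrasts LSA (truncated SVD of the raw document-term matrix) with correspondence analysis (SVD of standardized residuals, which removes the marginal row/column effects the abstract mentions). A small numpy sketch of both embeddings on a made-up count matrix:

```python
import numpy as np

def lsa_embed(M, k):
    """Truncated SVD of the raw document-term matrix (rows = documents)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] * s[:k]

def ca_embed(M, k):
    """Correspondence analysis row coordinates: SVD of the standardized
    residuals, so row/column marginals no longer dominate the dimensions."""
    P = M / M.sum()
    r = P.sum(axis=1, keepdims=True)            # row masses
    c = P.sum(axis=0, keepdims=True)            # column masses
    S = (P - r @ c) / np.sqrt(r @ c)            # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    return (U[:, :k] * s[:k]) / np.sqrt(r)      # principal row coordinates

M = np.array([[4., 0., 1.],                     # toy document-term counts
              [3., 1., 0.],
              [0., 5., 2.]])
print(lsa_embed(M, 2))
print(ca_embed(M, 2))
```
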
  14. Kwok, K.L.: A network approach to probabilistic information retrieval (1995) 0.01
    0.008741228 = product of:
      0.026223682 = sum of:
        0.01571009 = weight(_text_:information in 5696) [ClassicSimilarity], result of:
          0.01571009 = score(doc=5696,freq=8.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.23274569 = fieldWeight in 5696, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5696)
        0.010513592 = product of:
          0.031540774 = sum of:
            0.031540774 = weight(_text_:29 in 5696) [ClassicSimilarity], result of:
              0.031540774 = score(doc=5696,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.23319192 = fieldWeight in 5696, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5696)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Shows how probabilistic information retrieval based on document components may be implemented as a feedforward (feedbackward) artificial neural network. The network supports adaptation of connection weights as well as the growing of new edges between queries and terms based on user relevance feedback data for training, and it reflects query modification and expansion in information retrieval. A learning rule is applied that can also be viewed as supporting sequential learning using a harmonic sequence learning rate. Experimental results with 4 standard small collections and a large Wall Street Journal collection show that small query expansion levels of about 30 terms can achieve most of the gains at the low-recall high-precision region, while larger expansion levels continue to provide gains at the high-recall low-precision region of a precision recall curve
    Date
    29. 1.1996 18:42:14
    Source
    ACM transactions on information systems. 13(1995) no.3, S.324-353
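
Entry 14 casts probabilistic retrieval as a feedforward network whose edge weights adapt to relevance feedback. A toy sketch in that spirit (layer sizes, weights, and the learning rule are assumptions, not Kwok's exact formulation):

```python
import numpy as np

# Toy three-layer IR network: the query activates terms, terms activate
# documents. W_td could hold normalized term frequencies in practice.
W_qt = np.array([[1.0, 0.5, 0.0]])              # 1 query x 3 terms
W_td = np.array([[0.8, 0.1],                    # 3 terms x 2 documents
                 [0.2, 0.9],
                 [0.0, 0.4]])

def rank(W_qt, W_td):
    return W_qt @ W_td                          # feedforward activation = doc scores

def feedback(W_td, term_activation, relevant_doc, lr=0.1):
    """Strengthen edges into a document the user marked relevant."""
    W = W_td.copy()
    W[:, relevant_doc] += lr * term_activation
    return W

print("before:", rank(W_qt, W_td))
W_td = feedback(W_td, W_qt.ravel(), relevant_doc=1)
print("after: ", rank(W_qt, W_td))              # doc 1 now scores higher
```
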
  15. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.01
    0.008371893 = product of:
      0.02511568 = sum of:
        0.012960162 = weight(_text_:information in 3276) [ClassicSimilarity], result of:
          0.012960162 = score(doc=3276,freq=4.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.1920054 = fieldWeight in 3276, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
        0.012155517 = product of:
          0.03646655 = sum of:
            0.03646655 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.03646655 = score(doc=3276,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Within classical information retrieval, various methods have been developed for ranking and for searching a homogeneous, unstructured document collection. The success of the Google search engine has shown that searching an inhomogeneous but interlinked document collection such as the Internet can be very effective when the links between documents are taken into account. Among the concepts realized by Google is a method for ranking search results (PageRank), which is briefly explained in this article. The article also covers the concepts of a system called CiteSeer, which automatically indexes bibliographic references (Autonomous Citation Indexing, ACI). The latter turns a set of unlinked scientific documents into an interlinked collection and enables the use of ranking methods based on those employed by Google.
    Date
    20. 3.2005 16:23:22
    Source
    Information - Wissenschaft und Praxis. 56(2005) H.2, S.87-92
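
Entry 15 explains PageRank. Its standard power-iteration form, on a hypothetical three-page link graph:

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=50):
    """Power iteration; adj[i, j] = 1 if page i links to page j."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    out[out == 0] = 1.0                          # guard against dangling pages
    M = adj / out                                # row-stochastic transition matrix
    pr = np.full(n, 1.0 / n)
    for _ in range(iters):
        pr = (1 - damping) / n + damping * (pr @ M)
    return pr

links = np.array([[0, 1, 1],                     # page 0 links to 1 and 2, etc.
                  [1, 0, 0],
                  [1, 1, 0]], dtype=float)
print(pagerank(links))                           # page 0 collects the most rank
```
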
  16. Dominich, S.: Mathematical foundations of information retrieval (2001) 0.01
    0.008238852 = product of:
      0.024716556 = sum of:
        0.016034042 = weight(_text_:information in 1753) [ClassicSimilarity], result of:
          0.016034042 = score(doc=1753,freq=12.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.23754507 = fieldWeight in 1753, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1753)
        0.008682513 = product of:
          0.026047537 = sum of:
            0.026047537 = weight(_text_:22 in 1753) [ClassicSimilarity], result of:
              0.026047537 = score(doc=1753,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.19345059 = fieldWeight in 1753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1753)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    This book offers a comprehensive and consistent mathematical approach to information retrieval (IR) without which no implementation is possible, and sheds an entirely new light upon the structure of IR models. It contains the descriptions of all IR models in a unified formal style and language, along with examples for each, thus offering a comprehensive overview of them. The book also creates mathematical foundations and a consistent mathematical theory (including all mathematical results achieved so far) of IR as a stand-alone mathematical discipline, which thus can be read and taught independently. Also, the book contains all necessary mathematical knowledge on which IR relies, to help the reader avoid searching different sources. The book will be of interest to computer or information scientists, librarians, mathematicians, undergraduate students and researchers whose work involves information retrieval.
    Date
    22. 3.2008 12:26:32
    LCSH
    Information storage and retrieval
    Subject
    Information storage and retrieval
  17. Uratani, N.; Takeda, M.: A fast string-searching algorithm for multiple patterns (1993) 0.01
    0.008163839 = product of:
      0.024491515 = sum of:
        0.010473392 = weight(_text_:information in 6275) [ClassicSimilarity], result of:
          0.010473392 = score(doc=6275,freq=2.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.1551638 = fieldWeight in 6275, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=6275)
        0.014018122 = product of:
          0.042054366 = sum of:
            0.042054366 = weight(_text_:29 in 6275) [ClassicSimilarity], result of:
              0.042054366 = score(doc=6275,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.31092256 = fieldWeight in 6275, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6275)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Source
    Information processing and management. 29(1993) no.6, S.775-791
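
Entry 17 describes a fast multi-pattern string-search machine. The baseline such automaton is Aho-Corasick (a trie over all patterns plus breadth-first failure links); the sketch below shows that baseline, not the paper's delayed-keyword-extraction refinement:

```python
from collections import deque

def build_automaton(patterns):
    """Aho-Corasick: goto trie, failure links, and per-state output sets."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        node = 0
        for ch in p:
            if ch not in goto[node]:
                goto.append({}); fail.append(0); out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(p)
    queue = deque(goto[0].values())              # BFS to set failure links
    while queue:
        node = queue.popleft()
        for ch, nxt in goto[node].items():
            queue.append(nxt)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] += out[fail[nxt]]           # inherit matches ending here
    return goto, fail, out

def search(text, patterns):
    goto, fail, out = build_automaton(patterns)
    node, hits = 0, []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for p in out[node]:
            hits.append((i - len(p) + 1, p))
    return hits

print(search("ushers", ["he", "she", "his", "hers"]))
# [(1, 'she'), (2, 'he'), (2, 'hers')] -- all patterns found in one pass
```
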
  18. Faloutsos, C.: Signature files (1992) 0.01
    0.008121804 = product of:
      0.024365412 = sum of:
        0.010473392 = weight(_text_:information in 3499) [ClassicSimilarity], result of:
          0.010473392 = score(doc=3499,freq=2.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.1551638 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.01389202 = product of:
          0.04167606 = sum of:
            0.04167606 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.04167606 = score(doc=3499,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    7. 5.1999 15:22:48
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
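
Entry 18 covers signature files: superimposed coding hashes each word of a text block to a few bit positions and ORs them into a fixed-width signature, which then serves as a quick pre-filter that admits false positives but never false negatives. A minimal sketch (signature width and hash choice are arbitrary assumptions):

```python
import hashlib

def signature(words, bits=64, k=3):
    """Superimposed coding: set k hashed bit positions per word, OR all."""
    sig = 0
    for w in words:
        for i in range(k):
            h = int(hashlib.sha256(f"{w}:{i}".encode()).hexdigest(), 16)
            sig |= 1 << (h % bits)
    return sig

def may_contain(block_sig, query_words, bits=64, k=3):
    """All query bits set -> possible match (verify afterwards);
    any bit missing -> the block can be skipped for certain."""
    q = signature(query_words, bits, k)
    return block_sig & q == q

docs = [["signature", "files", "filter"], ["inverted", "index", "search"]]
sigs = [signature(d) for d in docs]
print([may_contain(s, ["signature"]) for s in sigs])   # [True, False] expected
```
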
  19. Bornmann, L.; Mutz, R.: From P100 to P100' : a new citation-rank approach (2014) 0.01
    0.008121804 = product of:
      0.024365412 = sum of:
        0.010473392 = weight(_text_:information in 1431) [ClassicSimilarity], result of:
          0.010473392 = score(doc=1431,freq=2.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.1551638 = fieldWeight in 1431, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=1431)
        0.01389202 = product of:
          0.04167606 = sum of:
            0.04167606 = weight(_text_:22 in 1431) [ClassicSimilarity], result of:
              0.04167606 = score(doc=1431,freq=2.0), product of:
                0.13464698 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03845047 = queryNorm
                0.30952093 = fieldWeight in 1431, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1431)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    22. 8.2014 17:05:18
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.9, S.1939-1943
  20. Drucker, H.; Shahrary, B.; Gibbon, D.C.: Support vector machines : relevance feedback and information retrieval (2002) 0.01
    0.008039642 = product of:
      0.024118926 = sum of:
        0.013605336 = weight(_text_:information in 2581) [ClassicSimilarity], result of:
          0.013605336 = score(doc=2581,freq=6.0), product of:
            0.067498945 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03845047 = queryNorm
            0.20156369 = fieldWeight in 2581, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2581)
        0.010513592 = product of:
          0.031540774 = sum of:
            0.031540774 = weight(_text_:29 in 2581) [ClassicSimilarity], result of:
              0.031540774 = score(doc=2581,freq=2.0), product of:
                0.13525672 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03845047 = queryNorm
                0.23319192 = fieldWeight in 2581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2581)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    We compare support vector machines (SVMs) to the Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevance feedback. It is assumed that a preliminary search finds a set of documents that the user marks as relevant or not, and then feedback iterations commence. Particular attention is paid to IR searches where the number of relevant documents in the database is low and the preliminary set of documents used to start the search has few relevant documents. Experiments show that if inverse document frequency (IDF) weighting is not used, because one is unwilling to pay the time penalty needed to obtain these features, then SVMs are better whether using term-frequency (TF) or binary weighting. SVM performance is marginally better than Ide dec-hi if TF-IDF weighting is used and a reasonable number of relevant documents is found in the preliminary search. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred.
    Date
    15. 8.2004 18:55:29
    Source
    Information processing and management. 38(2002) no.3, S.305-323
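
Entry 20 ranks documents by SVM margin after training on user-marked feedback documents. A sketch of one such feedback iteration using scikit-learn (the mini-collection and parameters are invented for illustration; the paper's TF/binary/TF-IDF weighting comparisons are not reproduced):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

docs = ["support vector machines for text retrieval",
        "relevance feedback with rocchio queries",
        "cooking pasta with fresh tomatoes",
        "margin based rankers for document search",
        "gardening tips for tomato plants"]
marked = {0: 1, 2: 0}                            # doc index -> user feedback (1 = relevant)

X = TfidfVectorizer().fit_transform(docs)
train_idx = list(marked)
clf = LinearSVC(C=1.0).fit(X[train_idx], [marked[i] for i in train_idx])

rest = [i for i in range(len(docs)) if i not in marked]
scores = clf.decision_function(X[rest])          # signed margin = relevance score
for i in np.argsort(-scores):                    # re-rank unseen docs by margin
    print(rest[i], round(scores[i], 3), docs[rest[i]])
```
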

Types

  • a 296
  • m 12
  • el 6
  • s 5
  • r 3
  • p 2
  • x 2