Search (7 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Retrievalalgorithmen"
  • year_i:[2020 TO 2030}
  1. Liu, J.; Liu, C.: Personalization in text information retrieval : a survey (2020) 0.00
    4.4124527E-4 = product of:
      0.0066186786 = sum of:
        0.0066186786 = product of:
          0.013237357 = sum of:
            0.013237357 = weight(_text_:information in 5761) [ClassicSimilarity], result of:
              0.013237357 = score(doc=5761,freq=10.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.2602176 = fieldWeight in 5761, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5761)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    Personalization of information retrieval (PIR) is aimed at tailoring a search toward individual users and user groups by taking account of additional information about users besides their queries. In the past two decades or so, PIR has received extensive attention in both academia and industry. This article surveys the literature of personalization in text retrieval, following a framework for aspects or factors that can be used for personalization. The framework consists of additional information about users that can be explicitly obtained by asking users for their preferences, or implicitly inferred from users' search behaviors. Users' characteristics and contextual factors such as tasks, time, location, etc., can be helpful for personalization. This article also addresses various issues including when to personalize, the evaluation of PIR, privacy, usability, etc. Based on the extensive review, challenges are discussed and directions for future effort are suggested.
    Source
    Journal of the Association for Information Science and Technology. 71(2020) no.3, S.349-369
  2. Qi, Q.; Hessen, D.J.; Heijden, P.G.M. van der: Improving information retrieval through correspondence analysis instead of latent semantic analysis (2023) 0.00
    4.4124527E-4 = product of:
      0.0066186786 = sum of:
        0.0066186786 = product of:
          0.013237357 = sum of:
            0.013237357 = weight(_text_:information in 1045) [ClassicSimilarity], result of:
              0.013237357 = score(doc=1045,freq=10.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.2602176 = fieldWeight in 1045, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1045)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    The initial dimensions extracted by latent semantic analysis (LSA) of a document-term matrix have been shown to mainly display marginal effects, which are irrelevant for information retrieval. To improve the performance of LSA, usually the elements of the raw document-term matrix are weighted and the weighting exponent of singular values can be adjusted. An alternative information retrieval technique that ignores the marginal effects is correspondence analysis (CA). In this paper, the information retrieval performance of LSA and CA is empirically compared. Moreover, it is explored whether the two weightings also improve the performance of CA. The results for four empirical datasets show that CA always performs better than LSA. Weighting the elements of the raw data matrix can improve CA; however, it is data dependent and the improvement is small. Adjusting the singular value weighting exponent often improves the performance of CA; however, the extent of the improvement depends on the dataset and the number of dimensions.
    Source
    Journal of intelligent information systems [https://doi.org/10.1007/s10844-023-00815-y]
  3. Wiggers, G.; Verberne, S.; Loon, W. van; Zwenne, G.-J.: Bibliometric-enhanced legal information retrieval : combining usage and citations as flavors of impact relevance (2023) 0.00
    3.2888478E-4 = product of:
      0.0049332716 = sum of:
        0.0049332716 = product of:
          0.009866543 = sum of:
            0.009866543 = weight(_text_:information in 1022) [ClassicSimilarity], result of:
              0.009866543 = score(doc=1022,freq=8.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.19395474 = fieldWeight in 1022, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1022)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    Bibliometric-enhanced information retrieval uses bibliometrics (e.g., citations) to improve ranking algorithms. Using a data-driven approach, this article describes the development of a bibliometric-enhanced ranking algorithm for legal information retrieval, and the evaluation thereof. We statistically analyze the correlation between usage of documents and citations over time, using data from a commercial legal search engine. We then propose a bibliometric boost function that combines usage of documents with citation counts. The core of this function is an impact variable based on usage and citations that increases in influence as citations and usage counts become more reliable over time. We evaluate our ranking function by comparing search sessions before and after the introduction of the new ranking in the search engine. Using a cost model applied to 129,571 sessions before and 143,864 sessions after the intervention, we show that our bibliometric-enhanced ranking algorithm reduces the time of a search session of legal professionals by 2 to 3% on average for use cases other than known-item retrieval or updating behavior. Given the high hourly tariff of legal professionals and the limited time they can spend on research, this is expected to lead to increased efficiency, especially for users with extremely long search sessions.
    Source
    Journal of the Association for Information Science and Technology. 74(2023) no.8, S.1010-1025
  4. Pan, M.; Huang, J.X.; He, T.; Mao, Z.; Ying, Z.; Tu, X.: ¬A simple kernel co-occurrence-based enhancement for pseudo-relevance feedback (2020) 0.00
    2.848226E-4 = product of:
      0.004272339 = sum of:
        0.004272339 = product of:
          0.008544678 = sum of:
            0.008544678 = weight(_text_:information in 5678) [ClassicSimilarity], result of:
              0.008544678 = score(doc=5678,freq=6.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.16796975 = fieldWeight in 5678, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5678)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    Pseudo-relevance feedback is a well-studied query expansion technique in which it is assumed that the top-ranked documents in an initial set of retrieval results are relevant and expansion terms are then extracted from those documents. When selecting expansion terms, most traditional models do not simultaneously consider term frequency and the co-occurrence relationships between candidate terms and query terms. Intuitively, however, a term that has a higher co-occurrence with a query term is more likely to be related to the query topic. In this article, we propose a kernel co-occurrence-based framework to enhance retrieval performance by integrating term co-occurrence information into the Rocchio model and a relevance language model (RM3). Specifically, a kernel co-occurrence-based Rocchio method (KRoc) and a kernel co-occurrence-based RM3 method (KRM3) are proposed. In our framework, co-occurrence information is incorporated into both the factor of the term discrimination power and the factor of the within-document term weight to boost retrieval performance. The results of a series of experiments show that our proposed methods significantly outperform the corresponding strong baselines over all data sets in terms of the mean average precision and over most data sets in terms of P@10. A direct comparison of standard Text Retrieval Conference data sets indicates that our proposed methods are at least comparable to state-of-the-art approaches.
    Source
    Journal of the Association for Information Science and Technology. 71(2020) no.3, S.264-281
  5. Hammache, A.; Boughanem, M.: Term position-based language model for information retrieval (2021) 0.00
    2.3255666E-4 = product of:
      0.0034883497 = sum of:
        0.0034883497 = product of:
          0.0069766995 = sum of:
            0.0069766995 = weight(_text_:information in 216) [ClassicSimilarity], result of:
              0.0069766995 = score(doc=216,freq=4.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.13714671 = fieldWeight in 216, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=216)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Source
    Journal of the Association for Information Science and Technology. 72(2021) no.5, S.627-642
  6. Dang, E.K.F.; Luk, R.W.P.; Allan, J.: ¬A retrieval model family based on the probability ranking principle for ad hoc retrieval (2022) 0.00
    2.3021935E-4 = product of:
      0.00345329 = sum of:
        0.00345329 = product of:
          0.00690658 = sum of:
            0.00690658 = weight(_text_:information in 638) [ClassicSimilarity], result of:
              0.00690658 = score(doc=638,freq=2.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.13576832 = fieldWeight in 638, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=638)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Source
    Journal of the Association for Information Science and Technology. 73(2022) no.8, S.1140-1154
  7. Purpura, A.; Silvello, G.; Susto, G.A.: Learning to rank from relevance judgments distributions (2022) 0.00
    1.6444239E-4 = product of:
      0.0024666358 = sum of:
        0.0024666358 = product of:
          0.0049332716 = sum of:
            0.0049332716 = weight(_text_:information in 645) [ClassicSimilarity], result of:
              0.0049332716 = score(doc=645,freq=2.0), product of:
                0.050870337 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.028978055 = queryNorm
                0.09697737 = fieldWeight in 645, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=645)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Source
    Journal of the Association for Information Science and Technology. 73(2022) no.9, S.1236-1252
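
The score explanations listed under each hit all follow the same TF-IDF breakdown. The following is a minimal Python sketch of that arithmetic, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative only and are not part of any Lucene or Solr API; queryNorm, fieldNorm and the two coord factors are copied from the listing rather than derived.

```python
import math

# Constants copied from the explain output in the listing above.
QUERY_NORM = 0.028978055   # queryNorm
DOC_FREQ = 20772           # docFreq of the term "information"
MAX_DOCS = 44218           # maxDocs in the index


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(freq: float, field_norm: float,
                  coords: tuple = (1 / 2, 1 / 15)) -> float:
    """Rebuild the final score from term frequency and field norm.

    score = queryWeight * fieldWeight * coord(1/2) * coord(1/15)
    queryWeight = idf * queryNorm
    fieldWeight = sqrt(freq) * idf * fieldNorm
    """
    idf = classic_idf(DOC_FREQ, MAX_DOCS)
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    score = query_weight * field_weight
    for coord in coords:
        score *= coord
    return score


if __name__ == "__main__":
    # (doc id, termFreq, fieldNorm) taken from hits 1 and 6 in the listing.
    for doc, freq, norm in [(5761, 10.0, 0.046875), (638, 2.0, 0.0546875)]:
        print(doc, explain_score(freq, norm))
```

Because idf and queryNorm are identical for every hit, the ranking here is driven entirely by sqrt(freq) * fieldNorm. The trailing 0.00 after each title appears to be the same score rounded to two decimals.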