Search (4 results, page 1 of 1)

  • author_ss:"Ahlgren, P."
  1. Ahlgren, P.; Kekäläinen, J.: Indexing strategies for Swedish full text retrieval under different user scenarios (2007) 0.00
    0.0018075579 = product of:
      0.014460463 = sum of:
        0.014460463 = product of:
          0.04338139 = sum of:
            0.04338139 = weight(_text_:problem in 896) [ClassicSimilarity], result of:
              0.04338139 = score(doc=896,freq=4.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.33160037 = fieldWeight in 896, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=896)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
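
The breakdown above is standard Lucene ClassicSimilarity (tf-idf) explain output. As a sanity check, here is a minimal Python sketch that recomputes this result's final score from the values shown; in ClassicSimilarity, idf = ln(maxDocs / (docFreq + 1)) + 1 and tf = sqrt(freq). The same arithmetic applies to the explain trees of the other results below.

```python
import math

# Values copied from the explain tree for doc 896 above.
doc_freq, max_docs = 1723, 44218
query_norm = 0.030822188
field_norm = 0.0390625
freq       = 4.0

idf = math.log(max_docs / (doc_freq + 1)) + 1   # 4.244485
tf  = math.sqrt(freq)                           # 2.0

query_weight = idf * query_norm                 # 0.13082431
field_weight = tf * idf * field_norm            # 0.33160037
score        = query_weight * field_weight      # 0.04338139

# coord(1/3) and coord(1/8) penalize query clauses that did not match.
print(score * (1 / 3) * (1 / 8))                # ~0.0018075579
```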
    
    Abstract
     This paper deals with Swedish full text retrieval and the problem of morphological variation of query terms in the document database. The effects of combining indexing strategies with query terms on retrieval effectiveness were studied. Three of the five tested combinations involved indexing strategies that used conflation, in the form of normalization; two of these three also employed compound splitting. Normalization and compound splitting were performed by SWETWOL, a morphological analyzer for Swedish. A fourth combination grouped related terms by right-hand truncation of query terms. The four combinations were compared with each other and with a baseline combination, in which no attempt was made to counteract morphological variation of query terms. The five combinations were evaluated under six user scenarios, each simulating a certain user type. All four alternative combinations outperformed the baseline in every user scenario, and the truncation combination performed best in each. The main conclusion of the paper is that normalization and right-hand truncation (performed by a search expert) enhanced retrieval effectiveness relative to the baseline, while the three normalization-based combinations performed only slightly below the truncation combination.
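
A minimal sketch of the two conflation ideas the abstract compares: normalization (performed by SWETWOL in the study) and right-hand truncation of query terms. The toy lemma table is a hypothetical stand-in for a real morphological analyzer.

```python
# Hypothetical lemma table standing in for SWETWOL's normalization.
TOY_LEMMAS = {"bilar": "bil", "bilarna": "bil", "bilen": "bil", "bil": "bil"}

def normalize(term: str) -> str:
    """Conflate inflected forms to a base form at indexing and query time."""
    return TOY_LEMMAS.get(term, term)

def truncation_match(prefix: str, index_term: str) -> bool:
    """Right-hand truncation: the query 'bil*' matches any term with that prefix."""
    return index_term.startswith(prefix)

doc_terms = ["bilarna", "bilen"]                             # inflected forms of 'bil' (car)
print({normalize(t) for t in doc_terms})                     # {'bil'}
print([t for t in doc_terms if truncation_match("bil", t)])  # ['bilarna', 'bilen']
```
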
  2. Ahlgren, P.; Grönqvist, L.: Evaluation of retrieval effectiveness with incomplete relevance data : theoretical and experimental comparison of three measures (2008) 0.00
    0.001789391 = product of:
      0.014315128 = sum of:
        0.014315128 = product of:
          0.042945385 = sum of:
            0.042945385 = weight(_text_:problem in 2032) [ClassicSimilarity], result of:
              0.042945385 = score(doc=2032,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.3282676 = fieldWeight in 2032, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2032)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
     This paper investigates two relatively new measures of retrieval effectiveness in relation to the problem of incomplete relevance data. The measures, Bpref and RankEff, which ignore documents that have not been judged for relevance, are compared theoretically and experimentally. The experimental comparisons involve a third measure, the well-known mean uninterpolated average precision. The results indicate that RankEff is the most stable of the three measures when the amount of relevance data is reduced, both with respect to system ranking and with respect to absolute values. In addition, RankEff has the lowest error rate.
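
RankEff is defined in the paper itself and is not reproduced here, but Bpref has a commonly cited formulation (Buckley & Voorhees, 2004). A minimal sketch assuming that formulation (the paper's exact variant may differ in details); note how unjudged documents are simply ignored, which is the property these measures exploit:

```python
def bpref(run, qrels):
    """Binary preference (bpref) for a single topic.

    run   -- ranked list of doc ids, best first
    qrels -- dict doc_id -> 1 (relevant) or 0 (nonrelevant);
             docs absent from qrels are unjudged and skipped.
    """
    R = sum(1 for j in qrels.values() if j == 1)  # judged relevant
    N = sum(1 for j in qrels.values() if j == 0)  # judged nonrelevant
    if R == 0:
        return 0.0
    nonrel_above, total = 0, 0.0
    for doc in run:
        j = qrels.get(doc)            # None -> unjudged, ignored
        if j == 0:
            nonrel_above += 1
        elif j == 1:
            if N == 0:
                total += 1.0
            else:
                # penalize by the judged nonrelevant docs ranked above this one
                total += 1.0 - min(nonrel_above, R) / min(R, N)
    return total / R

# The unjudged doc d3 has no effect on the score:
print(bpref(["d1", "d3", "d2", "d4"], {"d1": 1, "d2": 0, "d4": 1}))  # 0.5
```
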
  3. Sjögårde, P.; Ahlgren, P.; Waltman, L.: Algorithmic labeling in hierarchical classifications of publications : evaluation of bibliographic fields and term weighting approaches (2021) 0.00
    0.0012781365 = product of:
      0.010225092 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 261) [ClassicSimilarity], result of:
              0.030675275 = score(doc=261,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=261)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
     Algorithmic classifications of research publications can be used to study many different aspects of the science system, such as the organization of science into fields, the growth of fields, interdisciplinarity, and emerging topics. How to label the classes in these classifications is a problem that has not been thoroughly addressed in the literature. In this study, we evaluate different approaches to labeling the classes in algorithmically constructed classifications of research publications. We focus on two important choices: (a) the choice of bibliographic fields and (b) the approach used to weight the relevance of terms. To evaluate the different choices, we created two baselines: one based on the Medical Subject Headings in MEDLINE and another based on the Science-Metrix journal classification. We tested to what extent the different approaches yield the desired labels for the classes in the two baselines. Based on our results, we recommend extracting terms from titles and keywords to label classes at high levels of granularity (e.g., topics). At low levels of granularity (e.g., disciplines) we recommend extracting terms from journal names and author addresses. We also recommend a new approach, the term frequency to specificity ratio, for calculating the relevance of terms.
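
A minimal sketch of the labeling setup the abstract describes: terms are extracted from each class's publications (e.g., titles and keywords) and ranked by a relevance weight. The paper's own term frequency to specificity ratio is not reproduced here; a plain tf-idf-style weight over classes stands in for it, and all names below are illustrative.

```python
import math
from collections import Counter

def label_classes(class_docs, top_k=3):
    """class_docs: dict class_id -> list of strings (titles and keywords)."""
    # Term frequencies per class (stopword filtering omitted for brevity).
    tfs = {c: Counter(w for doc in docs for w in doc.lower().split())
           for c, docs in class_docs.items()}
    df = Counter(w for tf in tfs.values() for w in tf)  # classes containing each term
    n = len(class_docs)
    labels = {}
    for c, tf in tfs.items():
        # tf-idf-style weight over classes, standing in for the paper's measure.
        scored = sorted(tf, key=lambda w: -tf[w] * math.log(n / df[w]))
        labels[c] = scored[:top_k]
    return labels

print(label_classes({
    "c1": ["deep learning for retrieval", "neural ranking"],
    "c2": ["citation analysis of journals", "journal impact"],
}))
```
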
  4. Ahlgren, P.; Jarneving, B.; Rousseau, R.: Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient (2003) 0.00
    6.9599686E-4 = product of:
      0.005567975 = sum of:
        0.005567975 = product of:
          0.016703924 = sum of:
            0.016703924 = weight(_text_:22 in 5171) [ClassicSimilarity], result of:
              0.016703924 = score(doc=5171,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.15476047 = fieldWeight in 5171, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5171)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
     9.7.2006 10:22:35
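
No abstract is shown for this entry, but the title concerns requirements for cocitation similarity measures, with special reference to Pearson's correlation coefficient. A minimal sketch computing Pearson's r between the cocitation profiles of two authors; the count vectors are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

# Hypothetical cocitation counts of two authors against five third authors.
a = [12, 5, 0, 3, 8]
b = [10, 6, 1, 2, 7]
print(round(pearson_r(a, b), 3))  # ~0.972
```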