Search (3 results, page 1 of 1)

  • author_ss:"Ahlgren, P."
  1. Ahlgren, P.; Jarneving, B.; Rousseau, R.: Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient (2003) 0.02
    
    Abstract
    Ahlgren, Jarneving, and Rousseau review accepted procedures for author co-citation analysis, first pointing out that since in the raw data matrix the row and column values are identical, i.e., the co-citation count of two authors, there is no clear choice for the diagonal values. They suggest using the number of times an author has been co-cited with himself, excluding self-citations, rather than the common treatment as zeros or as missing values. When the matrix is converted to a similarity matrix, the normal procedure is to create a matrix of Pearson's r coefficients between the data vectors. Ranking by r, by co-citation frequency, and by intuition can easily yield three different orders. It would seem a necessary requirement that adding zeros to the matrix should not affect the value or the relative order of similarity measures, but it is shown that this is not the case with Pearson's r. Using 913 bibliographic descriptions from the Web of Science of articles from JASIS and Scientometrics, authors' names were extracted and edited, and 12 information retrieval authors and 12 bibliometric authors, each from the top 100 most cited, were selected. Co-citation and r-value matrices (diagonal elements treated as missing) were constructed, and then reconstructed in expanded form. Adding zeros can change both the r value and the ordering of the authors based upon that value. A chi-squared distance measure would not violate these requirements, nor would the cosine coefficient. It is also argued that co-citation data are ordinal, since there is no assurance of an absolute zero number of co-citations, and thus Pearson's r is not appropriate. The number of ties in co-citation data makes the use of the Spearman rank-order coefficient problematic.
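    The central claim, that padding two co-citation profiles with matching zeros shifts Pearson's r while leaving the cosine coefficient untouched, can be checked directly. The sketch below uses small made-up profile vectors (not data from the paper) purely to illustrate the effect:

```python
from math import sqrt

def pearson(u, v):
    # Pearson's r between two co-citation profile vectors
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = sqrt(sum((x - mu) ** 2 for x in u))
    sv = sqrt(sum((y - mv) ** 2 for y in v))
    return cov / (su * sv)

def cosine(u, v):
    # Cosine similarity depends only on the nonzero coordinates,
    # so appending matching zeros cannot change it
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))

# Hypothetical co-citation profiles for two authors
a = [4.0, 2.0, 7.0, 1.0]
b = [3.0, 5.0, 6.0, 2.0]

# The same profiles padded with zeros (authors co-cited with neither)
a_z = a + [0.0] * 4
b_z = b + [0.0] * 4

print(pearson(a, b), pearson(a_z, b_z))  # r changes when zeros are added
print(cosine(a, b), cosine(a_z, b_z))    # cosine is unchanged
```

    The zeros shift both vector means, and Pearson's r is computed around those means; cosine is mean-free, which is exactly why the authors find it acceptable where r is not.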
    Date
    9.7.2006 10:22:35
  2. Ahlgren, P.; Kekäläinen, J.: Indexing strategies for Swedish full text retrieval under different user scenarios (2007) 0.01
    
    Abstract
    This paper deals with Swedish full text retrieval and the problem of morphological variation of query terms in the document database. The effects of combining indexing strategies with query terms on retrieval effectiveness were studied. Three of five tested combinations involved indexing strategies that used conflation, in the form of normalization. Further, two of these three combinations used indexing strategies that employed compound splitting. Normalization and compound splitting were performed by SWETWOL, a morphological analyzer for the Swedish language. A fourth combination attempted to group related terms by right-hand truncation of query terms. The four combinations were compared to each other and to a baseline combination, where no attempt was made to counteract the problem of morphological variation of query terms in the document database. The five combinations were evaluated under six different user scenarios, where each scenario simulated a certain user type. The four alternative combinations outperformed the baseline for each user scenario. The truncation combination had the best performance under each user scenario. The main conclusion of the paper is that normalization and right-hand truncation (performed by a search expert) enhanced retrieval effectiveness in comparison to the baseline. The performance of the three combinations of indexing strategies with query terms based on normalization was not far below the performance of the truncation combination.
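    The right-hand truncation strategy can be pictured as simple prefix matching: a truncated query stem matches every index term that begins with it, thereby grouping morphological variants. The sketch below uses invented Swedish terms, not the paper's actual test collection:

```python
def truncation_match(query_stem, index_terms):
    # Right-hand truncation: the stem matches any index term
    # that starts with it, collecting inflected and derived forms
    return [t for t in index_terms if t.startswith(query_stem)]

# Hypothetical index terms showing Swedish morphological variation
terms = ["katalog", "katalogen", "kataloger", "katalogisering", "bok"]
print(truncation_match("katalog", terms))
# → ['katalog', 'katalogen', 'kataloger', 'katalogisering']
```

    Truncation over-generates compared with SWETWOL-style normalization (it also pulls in derivations like "katalogisering"), which is one reason the paper stresses that it was performed by a search expert.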
  3. Ahlgren, P.; Grönqvist, L.: Evaluation of retrieval effectiveness with incomplete relevance data : theoretical and experimental comparison of three measures (2008) 0.01
    
    Abstract
    This paper investigates two relatively new measures of retrieval effectiveness in relation to the problem of incomplete relevance data. The measures, Bpref and RankEff, which do not take into account documents that have not been relevance judged, are compared theoretically and experimentally. The experimental comparisons involve a third measure, the well-known mean uninterpolated average precision. The results indicate that RankEff is the most stable of the three measures when the amount of relevance data is reduced, with respect to system ranking and absolute values. In addition, RankEff has the lowest error-rate.
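    The defining property of measures like Bpref is that unjudged documents are skipped rather than assumed nonrelevant. A minimal sketch of one common formulation of Bpref (following Buckley and Voorhees; the ranking and judgment sets below are invented for illustration, and RankEff is not shown):

```python
def bpref(ranking, relevant, nonrelevant):
    # Bpref: each retrieved relevant document is penalized by the
    # fraction of judged nonrelevant documents ranked above it;
    # unjudged documents are simply ignored.
    R = len(relevant)
    if R == 0:
        return 0.0
    score = 0.0
    nonrel_seen = 0
    for doc in ranking:
        if doc in nonrelevant:
            nonrel_seen += 1
        elif doc in relevant:
            score += 1.0 - min(nonrel_seen, R) / R
        # documents with no relevance judgment contribute nothing
    return score / R

# Hypothetical ranked list with judged and unjudged documents
ranking = ["d1", "d2", "d3", "d4", "d5"]   # d3 is unjudged
relevant = {"d1", "d4"}
nonrelevant = {"d2", "d5"}
print(bpref(ranking, relevant, nonrelevant))  # → 0.75
```

    Because d3 is unjudged it neither penalizes d4 nor earns credit, which is the behavior that makes such measures more stable than mean average precision when relevance data are incomplete.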