Search (59 results, page 1 of 3)

Egghe, L.; Rousseau, R.: Introduction to informetrics : quantitative methods in library, documentation and information science (1990) 0.01

0.011245865 = product of:
  0.049481805 = sum of:
    0.011049435 = weight(_text_:und in 1515) [ClassicSimilarity], result of:
      0.011049435 = score(doc=1515,freq=4.0), product of:
        0.04558063 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.02056547 = queryNorm
        0.24241515 = fieldWeight in 1515, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1515)
    0.011049435 = weight(_text_:und in 1515) [ClassicSimilarity], result of:
      0.011049435 = score(doc=1515,freq=4.0), product of:
        0.04558063 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.02056547 = queryNorm
        0.24241515 = fieldWeight in 1515, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1515)
    0.009840704 = product of:
      0.019681407 = sum of:
        0.019681407 = weight(_text_:29 in 1515) [ClassicSimilarity], result of:
          0.019681407 = score(doc=1515,freq=2.0), product of:
            0.072342895 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.02056547 = queryNorm
            0.27205724 = fieldWeight in 1515, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1515)
      0.5 = coord(1/2)
    0.0029429442 = weight(_text_:in in 1515) [ClassicSimilarity], result of:
      0.0029429442 = score(doc=1515,freq=2.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.10520181 = fieldWeight in 1515, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1515)
    0.014599286 = product of:
      0.029198572 = sum of:
        0.029198572 = weight(_text_:science in 1515) [ClassicSimilarity], result of:
          0.029198572 = score(doc=1515,freq=14.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.5389985 = fieldWeight in 1515, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1515)
      0.5 = coord(1/2)
  0.22727273 = coord(5/22)
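
The explanation tree above follows Lucene's ClassicSimilarity (TF-IDF) arithmetic. As a minimal sketch, the Python below recomputes the `in` clause and the final score for doc 1515 from the values printed in the tree. The function names are illustrative, not part of Lucene, and the results agree with the tree only up to float32 rounding.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)                     # tf = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm     # "queryWeight" node in the tree
    field_weight = tf * term_idf * field_norm  # "fieldWeight" node in the tree
    return query_weight * field_weight

# Values copied from the tree for doc 1515.
QUERY_NORM = 0.02056547
MAX_DOCS = 44218

# Clause for the term "in": freq=2, docFreq=30841, fieldNorm=0.0546875.
in_score = term_score(2.0, 30841, MAX_DOCS, QUERY_NORM, 0.0546875)

# The top-level score is the sum of the five matching clauses times coord(5/22).
clause_scores = [0.011049435, 0.011049435, 0.009840704, 0.0029429442, 0.014599286]
total = sum(clause_scores) * (5 / 22)

print(f"in clause: {in_score:.10f}")
print(f"total:     {total:.9f}")
```

The same helper applies to every `weight(...)` node on this page; only `freq`, `docFreq`, and `fieldNorm` change from record to record.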

Classification: AN 70400 Allgemeines / Buch- und Bibliothekswesen, Informationswissenschaft / Bibliothekswesen / Bibliotheksverwaltung / Bibliotheksanalyse, -statistik
COMPASS: Information science / Statistical mathematics
Date: 29. 2.2008 19:02:46
LCSH: Information science / Statistical methods
Library science / Statistical methods
RVK: AN 70400 Allgemeines / Buch- und Bibliothekswesen, Informationswissenschaft / Bibliothekswesen / Bibliotheksverwaltung / Bibliotheksanalyse, -statistik
Subject: Information science / Statistical mathematics
Information science / Statistical methods
Library science / Statistical methods

Egghe, L.; Guns, R.; Rousseau, R.; Leuven, K.U.: Erratum (2012) 0.00

0.0043484843 = product of:
  0.047833327 = sum of:
    0.0042042066 = weight(_text_:in in 4992) [ClassicSimilarity], result of:
      0.0042042066 = score(doc=4992,freq=2.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.15028831 = fieldWeight in 4992, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.078125 = fieldNorm(doc=4992)
    0.04362912 = sum of:
      0.015765747 = weight(_text_:science in 4992) [ClassicSimilarity], result of:
        0.015765747 = score(doc=4992,freq=2.0), product of:
          0.0541719 = queryWeight, product of:
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.02056547 = queryNorm
          0.2910318 = fieldWeight in 4992, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.078125 = fieldNorm(doc=4992)
      0.027863374 = weight(_text_:22 in 4992) [ClassicSimilarity], result of:
        0.027863374 = score(doc=4992,freq=2.0), product of:
          0.072016776 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.02056547 = queryNorm
          0.38690117 = fieldWeight in 4992, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.078125 = fieldNorm(doc=4992)
  0.09090909 = coord(2/22)

Date: 14. 2.2012 12:53:22
Footnote: This article corrects: Thoughts on uncitedness: Nobel laureates and Fields medalists as case studies in: JASIST 62(2011) no.8, S.1637-1644.
Source: Journal of the American Society for Information Science and Technology. 63(2012) no.2, S.429

Egghe, L.; Rousseau, R.: Averaging and globalising quotients of informetric and scientometric data (1996) 0.00

0.0027040783 = product of:
  0.02974486 = sum of:
    0.0035673876 = weight(_text_:in in 7659) [ClassicSimilarity], result of:
      0.0035673876 = score(doc=7659,freq=4.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.12752387 = fieldWeight in 7659, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=7659)
    0.026177472 = sum of:
      0.009459447 = weight(_text_:science in 7659) [ClassicSimilarity], result of:
        0.009459447 = score(doc=7659,freq=2.0), product of:
          0.0541719 = queryWeight, product of:
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.02056547 = queryNorm
          0.17461908 = fieldWeight in 7659, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.046875 = fieldNorm(doc=7659)
      0.016718024 = weight(_text_:22 in 7659) [ClassicSimilarity], result of:
        0.016718024 = score(doc=7659,freq=2.0), product of:
          0.072016776 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.02056547 = queryNorm
          0.23214069 = fieldWeight in 7659, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=7659)
  0.09090909 = coord(2/22)

Abstract: It is possible, using ISI's Journal Citation Reports (JCR), to calculate average impact factors (AIF) for JCR's subject categories, but it can be more useful to know the global impact factor (GIF) of a subject category and compare the 2 values. Reports results of a study to compare the relationships between AIFs and GIFs of subjects, based on the particular case of the average impact factor of a subfield versus the impact factor of this subfield as a whole, the difference being studied between an average of quotients, denoted as AQ, and a global average, obtained as a quotient of averages, and denoted as GQ. In the case of impact factors, AQ becomes the average impact factor of a field, and GQ becomes its global impact factor. Discusses a number of applications of this technique in the context of informetrics and scientometrics.
Source: Journal of information science. 22(1996) no.3, S.165-170
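
The distinction between AQ and GQ in the abstract above can be illustrated with a minimal sketch. The journal figures are hypothetical and the function names are illustrative; the only thing taken from the abstract is that AQ is an average of quotients and GQ is a quotient of averages (equivalently, of totals).

```python
from statistics import mean

def average_of_quotients(pairs):
    """AQ: mean of the per-journal quotients citations / publications."""
    return mean(c / p for c, p in pairs)

def global_quotient(pairs):
    """GQ: quotient of averages, i.e. total citations / total publications."""
    total_citations = sum(c for c, _ in pairs)
    total_publications = sum(p for _, p in pairs)
    return total_citations / total_publications

# Hypothetical (citations, publications) pairs for three journals in one category.
journals = [(10, 5), (30, 10), (300, 50)]

aq = average_of_quotients(journals)  # (2 + 3 + 6) / 3
gq = global_quotient(journals)       # 340 / 65

print(f"AQ = {aq:.4f}, GQ = {gq:.4f}")
```

The two values differ whenever journals of different sizes have different quotients, because GQ weights each journal by its publication count while AQ weights all journals equally.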

Egghe, L.: ¬A noninformetric analysis of the relationship between citation age and journal productivity (2001) 0.00

0.0025643385 = product of:
  0.018805148 = sum of:
    0.008434889 = product of:
      0.016869778 = sum of:
        0.016869778 = weight(_text_:29 in 5685) [ClassicSimilarity], result of:
          0.016869778 = score(doc=5685,freq=2.0), product of:
            0.072342895 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.02056547 = queryNorm
            0.23319192 = fieldWeight in 5685, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=5685)
      0.5 = coord(1/2)
    0.005640535 = weight(_text_:in in 5685) [ClassicSimilarity], result of:
      0.005640535 = score(doc=5685,freq=10.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.20163295 = fieldWeight in 5685, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=5685)
    0.0047297236 = product of:
      0.009459447 = sum of:
        0.009459447 = weight(_text_:science in 5685) [ClassicSimilarity], result of:
          0.009459447 = score(doc=5685,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.17461908 = fieldWeight in 5685, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=5685)
      0.5 = coord(1/2)
  0.13636364 = coord(3/22)

Abstract: A problem raised by Wallace (JASIS, 37, 136-145, 1986) on the relation between a journal's median citation age and its number of articles is studied. Leaving open the problem as such, we give a statistical explanation of this relationship when "median" is replaced by "mean" in Wallace's problem. The cloud of points found by Wallace is explained in the sense that the points are scattered over the area in the first quadrant limited by a curve of the form y=1 + E/x**2, where E is a constant. This curve is obtained by using the Central Limit Theorem in statistics and, hence, has no intrinsic informetric foundation. The article closes with some reflections on explanations of regularities in informetrics, based on statistical, probabilistic or informetric results, or on a combination thereof.
Date: 29. 9.2001 13:59:34
Source: Journal of the American Society for Information Science and Technology. 52(2001) no.5, S.371-377

Egghe, L.: Influence of adding or deleting items and sources on the h-index (2010) 0.00

0.0025643385 = product of:
  0.018805148 = sum of:
    0.008434889 = product of:
      0.016869778 = sum of:
        0.016869778 = weight(_text_:29 in 3336) [ClassicSimilarity], result of:
          0.016869778 = score(doc=3336,freq=2.0), product of:
            0.072342895 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.02056547 = queryNorm
            0.23319192 = fieldWeight in 3336, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=3336)
      0.5 = coord(1/2)
    0.005640535 = weight(_text_:in in 3336) [ClassicSimilarity], result of:
      0.005640535 = score(doc=3336,freq=10.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.20163295 = fieldWeight in 3336, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3336)
    0.0047297236 = product of:
      0.009459447 = sum of:
        0.009459447 = weight(_text_:science in 3336) [ClassicSimilarity], result of:
          0.009459447 = score(doc=3336,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.17461908 = fieldWeight in 3336, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=3336)
      0.5 = coord(1/2)
  0.13636364 = coord(3/22)

Abstract: Adding or deleting items such as self-citations has an influence on the h-index of an author. This influence will be proved mathematically in this article. We hereby prove the experimental finding in E. Gianoli and M.A. Molina-Montenegro ([2009]) that the influence of adding or deleting self-citations on the h-index is greater for low values of the h-index. A simple theoretical example also shows why this is logical. Adding or deleting sources, such as minor contributions of an author, also has an influence on the h-index of this author; this influence is modeled in this article. This model explains some practical examples found in X. Hu, R. Rousseau, and J. Chen (in press).
Date: 31. 5.2010 15:02:29
Source: Journal of the American Society for Information Science and Technology. 61(2010) no.2, S.370-373

Egghe, L.: Untangling Herdan's law and Heaps' law : mathematical and informetric arguments (2007) 0.00

0.0024466908 = product of:
  0.017942399 = sum of:
    0.0070290747 = product of:
      0.014058149 = sum of:
        0.014058149 = weight(_text_:29 in 271) [ClassicSimilarity], result of:
          0.014058149 = score(doc=271,freq=2.0), product of:
            0.072342895 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.02056547 = queryNorm
            0.19432661 = fieldWeight in 271, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=271)
      0.5 = coord(1/2)
    0.006971888 = weight(_text_:in in 271) [ClassicSimilarity], result of:
      0.006971888 = score(doc=271,freq=22.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.24922498 = fieldWeight in 271, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=271)
    0.0039414368 = product of:
      0.0078828735 = sum of:
        0.0078828735 = weight(_text_:science in 271) [ClassicSimilarity], result of:
          0.0078828735 = score(doc=271,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.1455159 = fieldWeight in 271, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=271)
      0.5 = coord(1/2)
  0.13636364 = coord(3/22)

Abstract: Herdan's law in linguistics and Heaps' law in information retrieval are different formulations of the same phenomenon. Stated briefly and in linguistic terms, they state that vocabularies' sizes are concave increasing power laws of texts' sizes. This study investigates these laws from a purely mathematical and informetric point of view. A general informetric argument shows that the problem of proving these laws is, in fact, ill-posed. Using the more general terminology of sources and items, the author shows by presenting exact formulas from Lotkaian informetrics that the total number T of sources is not only a function of the total number A of items, but is also a function of several parameters (e.g., the parameters occurring in Lotka's law). Consequently, it is shown that a fixed T (or A) value can lead to different possible A (respectively, T) values. Limiting the T(A)-variability to increasing samples (e.g., in a text as done in linguistics), the author then shows, in a purely mathematical way, that for large sample sizes T ~ A**phi, where phi is a constant, phi < 1 but close to 1; hence, roughly, Heaps' or Herdan's law can be proved without using any linguistic or informetric argument. The author also shows that for smaller samples, phi is not a constant but essentially decreases, as confirmed by practical examples. Finally, an exact informetric argument on random sampling in the items shows that, in most cases, T = T(A) is a concavely increasing function, in accordance with practical examples.
Date: 29. 4.2007 19:51:08
Source: Journal of the American Society for Information Science and Technology. 58(2007) no.5, S.702-709

Egghe, L.: Properties of the n-overlap vector and n-overlap similarity theory (2006) 0.00

0.0022543848 = product of:
  0.016532155 = sum of:
    0.0070290747 = product of:
      0.014058149 = sum of:
        0.014058149 = weight(_text_:29 in 194) [ClassicSimilarity], result of:
          0.014058149 = score(doc=194,freq=2.0), product of:
            0.072342895 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.02056547 = queryNorm
            0.19432661 = fieldWeight in 194, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=194)
      0.5 = coord(1/2)
    0.005561643 = weight(_text_:in in 194) [ClassicSimilarity], result of:
      0.005561643 = score(doc=194,freq=14.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.19881277 = fieldWeight in 194, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=194)
    0.0039414368 = product of:
      0.0078828735 = sum of:
        0.0078828735 = weight(_text_:science in 194) [ClassicSimilarity], result of:
          0.0078828735 = score(doc=194,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.1455159 = fieldWeight in 194, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=194)
      0.5 = coord(1/2)
  0.13636364 = coord(3/22)

Abstract: In the first part of this article the author defines the n-overlap vector whose coordinates consist of the fraction of the objects (e.g., books, N-grams, etc.) that belong to 1, 2, ..., n sets (more generally: families) (e.g., libraries, databases, etc.). With the aid of the Lorenz concentration theory, a theory of n-overlap similarity is conceived together with corresponding measures, such as the generalized Jaccard index (generalizing the well-known Jaccard index in case n = 2). Next, the distributional form of the n-overlap vector is determined assuming certain distributions of the object's and of the set (family) sizes. In this section the decreasing power law and decreasing exponential distributions are explained for the n-overlap vector. Both item (token) n-overlap and source (type) n-overlap are studied. The n-overlap properties of objects indexed by a hierarchical system (e.g., books indexed by numbers from a UDC or Dewey system or by N-grams) are presented in the final section. The author shows how the results given in the previous section can be applied as well as how the Lorenz order of the n-overlap vector is respected by an increase or a decrease of the level of refinement in the hierarchical system (e.g., the value N in N-grams).
Date: 3. 1.2007 14:26:29
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.9, S.1165-1177
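
As a rough illustration of the n-overlap vector described in the abstract above, the sketch below counts, for each object in the union of n sets, how many of the sets contain it. It assumes that coordinate k is the fraction of objects belonging to exactly k sets; the abstract does not spell this out, so treat that reading, the function names, and the sample data as assumptions.

```python
from collections import Counter

def overlap_vector(sets):
    """Fractions of objects (in the union) that belong to exactly 1, 2, ..., n sets."""
    n = len(sets)
    # For every object, count the number of sets that contain it.
    membership = Counter(obj for s in sets for obj in set(s))
    total = len(membership)
    if total == 0:
        raise ValueError("the sets contain no objects")
    return [
        sum(1 for count in membership.values() if count == k) / total
        for k in range(1, n + 1)
    ]

def jaccard(a, b):
    """Classical Jaccard index for two sets: |A & B| / |A | B|."""
    return len(a & b) / len(a | b)

# Hypothetical holdings of three libraries.
libraries = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "e"}]
print(overlap_vector(libraries))        # fractions for k = 1, 2, 3

# For n = 2 the last coordinate coincides with the classical Jaccard index.
two = [{"a", "b", "c"}, {"b", "c", "d"}]
print(overlap_vector(two)[-1], jaccard(*two))
```

Under this reading, the case n = 2 recovers the ordinary Jaccard index as the second coordinate, which matches the abstract's remark that the generalized index extends the n = 2 case.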

Egghe, L.: Empirical and combinatorial study of country occurrences in multi-authored papers (2006) 0.00

0.0018662467 = product of:
  0.0136858085 = sum of:
    0.004464646 = weight(_text_:und in 81) [ClassicSimilarity], result of:
      0.004464646 = score(doc=81,freq=2.0), product of:
        0.04558063 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.02056547 = queryNorm
        0.09795051 = fieldWeight in 81, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.03125 = fieldNorm(doc=81)
    0.004464646 = weight(_text_:und in 81) [ClassicSimilarity], result of:
      0.004464646 = score(doc=81,freq=2.0), product of:
        0.04558063 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.02056547 = queryNorm
        0.09795051 = fieldWeight in 81, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.03125 = fieldNorm(doc=81)
    0.004756517 = weight(_text_:in in 81) [ClassicSimilarity], result of:
      0.004756517 = score(doc=81,freq=16.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.17003182 = fieldWeight in 81, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=81)
  0.13636364 = coord(3/22)

Abstract: Papers written by several authors can be classified according to the countries of the author affiliations. The empirical part of this paper consists of two datasets. One dataset consists of 1,035 papers retrieved via the search "pedagog*" in the years 2004 and 2005 (up to October) in Academic Search Elite, which is a case where phi(m) = the number of papers with m = 1, 2, 3, ... authors is decreasing; hence most of the papers have a low number of authors. Here we find that #j,m = the number of times a country occurs j times in an m-authored paper, j = 1, ..., m-1, is decreasing and that #m,m is much higher than all the other #j,m values. The other dataset consists of 3,271 papers retrieved via the search "enzyme" in the year 2005 (up to October) in the same database, which is a case of a non-decreasing phi(m): most papers have 3 or 4 authors and we even find many papers with a much higher number of authors. In this case we show again that #m,m is much higher than the other #j,m values but that #j,m is not decreasing anymore in j = 1, ..., m-1, although #1,m is (apart from #m,m) the largest number amongst the #j,m. The combinatorial part gives a proof of the fact that #j,m decreases for j = 1, ..., m-1, supposing that all cases are equally possible. This shows that the first dataset conforms more closely to this model than the second dataset. Explanations for these findings are given. From the data we also find the (we think: new) distribution of number of papers with n = 1, 2, 3, ... countries (i.e. where there are n different countries involved amongst the m (>= n) authors of a paper): a fast decreasing function, e.g. a power law with a very large Lotka exponent.
Source: Information - Wissenschaft und Praxis. 57(2006) H.8, S.427-432

Egghe, L.: ¬A universal method of information retrieval evaluation : the "missing" link M and the universal IR surface (2004) 0.00

0.001218551 = product of:
  0.01340406 = sum of:
    0.0050450475 = weight(_text_:in in 2558) [ClassicSimilarity], result of:
      0.0050450475 = score(doc=2558,freq=8.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.18034597 = fieldWeight in 2558, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2558)
    0.008359012 = product of:
      0.016718024 = sum of:
        0.016718024 = weight(_text_:22 in 2558) [ClassicSimilarity], result of:
          0.016718024 = score(doc=2558,freq=2.0), product of:
            0.072016776 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.02056547 = queryNorm
            0.23214069 = fieldWeight in 2558, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2558)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: The paper shows that the present evaluation methods in information retrieval (basically recall R and precision P and in some cases fallout F) lack universal comparability in the sense that their values depend on the generality of the IR problem. A solution is given by using all "parts" of the database, including the non-relevant documents and also the not-retrieved documents. It turns out that the solution is given by introducing the measure M being the fraction of the not-retrieved documents that are relevant (hence the "miss" measure). We prove that - independent of the IR problem or of the IR action - the quadruple (P,R,F,M) belongs to a universal IR surface, being the same for all IR-activities. This universality is then exploited by defining a new measure for evaluation in IR allowing for unbiased comparisons of all IR results. We also show that only using one, two or even three measures from the set {P,R,F,M} necessarily leads to evaluation measures that are non-universal and hence not capable of comparing different IR situations.
Date: 14. 8.2004 19:17:22
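
The four measures named in the abstract above can be computed from the relevant set, the retrieved set, and the whole database. The following is a minimal sketch, not the paper's own procedure: M follows the abstract's wording (fraction of the not-retrieved documents that are relevant), while P, R, and F use the standard textbook definitions, which is an assumption here. The function name and sample data are hypothetical, and all four denominators are assumed to be non-empty.

```python
def ir_measures(relevant: set, retrieved: set, database: set) -> dict:
    """Precision P, recall R, fallout F, and miss M for one retrieval result."""
    not_retrieved = database - retrieved
    non_relevant = database - relevant
    hits = relevant & retrieved
    return {
        "P": len(hits) / len(retrieved),                        # retrieved docs that are relevant
        "R": len(hits) / len(relevant),                         # relevant docs that are retrieved
        "F": len(retrieved & non_relevant) / len(non_relevant),  # non-relevant docs that are retrieved
        "M": len(relevant & not_retrieved) / len(not_retrieved),  # not-retrieved docs that are relevant
    }

# Hypothetical database of 100 documents, 10 relevant, 20 retrieved, 5 in common.
database = set(range(100))
relevant = set(range(10))
retrieved = set(range(5, 25))

measures = ir_measures(relevant, retrieved, database)
print(measures)
```

Using all four cells of the relevant/retrieved contingency table is what distinguishes this quadruple from the usual precision and recall pair.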

Egghe, L.: On the law of Zipf-Mandelbrot for multi-word phrases (1999) 0.00

0.0011028926 = product of:
  0.012131818 = sum of:
    0.0058255196 = weight(_text_:in in 3058) [ClassicSimilarity], result of:
      0.0058255196 = score(doc=3058,freq=6.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.2082456 = fieldWeight in 3058, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=3058)
    0.0063062985 = product of:
      0.012612597 = sum of:
        0.012612597 = weight(_text_:science in 3058) [ClassicSimilarity], result of:
          0.012612597 = score(doc=3058,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.23282544 = fieldWeight in 3058, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0625 = fieldNorm(doc=3058)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: This article studies the probabilities of the occurrence of multi-word (m-word) phrases (m=2,3,...) in relation to the probabilities of occurrence of the single words. It is well known that, in the latter case, the law of Zipf is valid (i.e., a power law). We prove that in the case of m-word phrases (m>=2), this is not the case. We present 2 independent proofs of this.
Source: Journal of the American Society for Information Science. 50(1999) no.3, S.233-241

Egghe, L.; Liang, L.; Rousseau, R.: ¬A relation between h-index and impact factor in the power-law model (2009) 0.00

0.0011028926 = product of:
  0.012131818 = sum of:
    0.0058255196 = weight(_text_:in in 6759) [ClassicSimilarity], result of:
      0.0058255196 = score(doc=6759,freq=6.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.2082456 = fieldWeight in 6759, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=6759)
    0.0063062985 = product of:
      0.012612597 = sum of:
        0.012612597 = weight(_text_:science in 6759) [ClassicSimilarity], result of:
          0.012612597 = score(doc=6759,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.23282544 = fieldWeight in 6759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0625 = fieldNorm(doc=6759)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: Using a power-law model, the two best-known topics in citation analysis, namely the impact factor and the Hirsch index, are unified into one relation (not a function). The validity of our model is, at least in a qualitative way, confirmed by real data.
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.11, S.2362-2365

Egghe, L.: Dynamic h-index : the Hirsch index in function of time (2007) 0.00

0.0011028926 = product of:
  0.012131818 = sum of:
    0.0058255196 = weight(_text_:in in 147) [ClassicSimilarity], result of:
      0.0058255196 = score(doc=147,freq=6.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.2082456 = fieldWeight in 147, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=147)
    0.0063062985 = product of:
      0.012612597 = sum of:
        0.012612597 = weight(_text_:science in 147) [ClassicSimilarity], result of:
          0.012612597 = score(doc=147,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.23282544 = fieldWeight in 147, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0625 = fieldNorm(doc=147)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: When there is a group of articles and the present time is fixed, we can determine the unique number h, being the number of articles that received h or more citations while the other articles received a number of citations which is not larger than h. In this article, the time dependence of the h-index is determined. This is important to describe the expected career evolution of a scientist's work or of a journal's production in a fixed year.
Source: Journal of the American Society for Information Science and Technology. 58(2007) no.3, S.452-454
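
The definition of h in the abstract above translates directly into code. This is a minimal sketch of the static h-index only; the yearly snapshots are hypothetical data used to show how h can be tracked over time, and they do not reproduce the paper's time-dependence model.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation counts for the same group of articles in successive years.
snapshots = {
    2005: [3, 1, 0, 0, 0],
    2006: [6, 3, 2, 1, 0],
    2007: [10, 8, 5, 4, 3],
}

h_over_time = {year: h_index(counts) for year, counts in snapshots.items()}
print(h_over_time)
```

Recomputing h on each snapshot gives a simple empirical view of how the index evolves as citations accumulate.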

Egghe, L.; Rousseau, R.; Rousseau, S.: TOP-curves (2007) 0.00

0.001099876 = product of:
  0.012098636 = sum of:
    0.0065806243 = weight(_text_:in in 50) [ClassicSimilarity], result of:
      0.0065806243 = score(doc=50,freq=10.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.23523843 = fieldWeight in 50, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=50)
    0.0055180113 = product of:
      0.011036023 = sum of:
        0.011036023 = weight(_text_:science in 50) [ClassicSimilarity], result of:
          0.011036023 = score(doc=50,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.20372227 = fieldWeight in 50, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=50)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: Several characteristics of classical Lorenz curves make them unsuitable for the study of a group of top performers. TOP-curves, defined as a kind of mirror image of TIP-curves used in poverty studies, are shown to possess the properties necessary for adequate empirical ranking of various data arrays, based on the properties of the highest performers (i.e., the core). TOP-curves and essential TOP-curves, also introduced in this article, simultaneously represent the incidence, intensity, and inequality among the top. It is shown that TOP-dominance partial order, introduced in this article, is stronger than Lorenz dominance order. In this way, this article contributes to the study of cores, a central issue in applied informetrics.
Source: Journal of the American Society for Information Science and Technology. 58(2007) no.6, S.777-785

Egghe, L.: Sampling and concentration values of incomplete bibliographies (2002) 0.00

0.0010367181 = product of:
  0.0114039 = sum of:
    0.0058858884 = weight(_text_:in in 450) [ClassicSimilarity], result of:
      0.0058858884 = score(doc=450,freq=8.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.21040362 = fieldWeight in 450, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=450)
    0.0055180113 = product of:
      0.011036023 = sum of:
        0.011036023 = weight(_text_:science in 450) [ClassicSimilarity], result of:
          0.011036023 = score(doc=450,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.20372227 = fieldWeight in 450, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=450)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: This article studies concentration aspects of bibliographies. More particularly, we study the impact of incompleteness of such a bibliography on its concentration values (i.e., its degree of inequality of production of its sources). Incompleteness is modeled by sampling in the complete bibliography. The model is general enough to comprise truncation of a bibliography as well as a systematic sample on sources or items. In all cases we prove that the sampled bibliography (or incomplete one) has a higher concentration value than the complete one. These models, hence, shed some light on the measurement of production inequality in incomplete bibliographies.
Source: Journal of the American Society for Information Science and Technology. 53(2002) no.4, S.271-281

Egghe, L.: A rationale for the Hirsch-index rank-order distribution and a comparison with the impact factor rank-order distribution (2009) 0.00

0.0010367181 = product of:
  0.0114039 = sum of:
    0.0058858884 = weight(_text_:in in 3124) [ClassicSimilarity], result of:
      0.0058858884 = score(doc=3124,freq=8.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.21040362 = fieldWeight in 3124, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3124)
    0.0055180113 = product of:
      0.011036023 = sum of:
        0.011036023 = weight(_text_:science in 3124) [ClassicSimilarity], result of:
          0.011036023 = score(doc=3124,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.20372227 = fieldWeight in 3124, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3124)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: We present a rationale for the Hirsch-index rank-order distribution and prove that it is a power law (hence a straight line in the log-log scale). This is confirmed by experimental data of Pyykkö and by data produced in this article on 206 mathematics journals. This distribution is of a completely different nature than the impact factor (IF) rank-order distribution which (as proved in a previous article) is S-shaped. This is also confirmed by our example. Only in the log-log scale of the h-index distribution do we notice a concave deviation of the straight line for higher ranks. This phenomenon is discussed.
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.10, S.2142-2144
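
A power law y = C * r^(-beta) becomes the straight line ln y = ln C - beta * ln r in log-log scale, which is why the abstract above speaks of a straight line. The sketch below fits such a line by ordinary least squares. The function name is hypothetical, and the data are synthetic values generated from an exact power law, not the journal data of the article.

```python
import math


def loglog_fit(ranks, values):
    """Least-squares fit of ln(value) on ln(rank); returns (slope, intercept)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic rank-order data: C = 40 and beta = 0.6 are arbitrary choices.
ranks = list(range(1, 51))
values = [40.0 * r ** -0.6 for r in ranks]
slope, intercept = loglog_fit(ranks, values)
print(slope, math.exp(intercept))
```

On real data, a concave deviation at high ranks of the kind the abstract mentions would appear as points falling below the fitted line at the right-hand end.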

Egghe, L.: A new short proof of Naranan's theorem, explaining Lotka's law and Zipf's law (2010) 0.00

0.0010367181 = product of:
  0.0114039 = sum of:
    0.0058858884 = weight(_text_:in in 3432) [ClassicSimilarity], result of:
      0.0058858884 = score(doc=3432,freq=8.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.21040362 = fieldWeight in 3432, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3432)
    0.0055180113 = product of:
      0.011036023 = sum of:
        0.011036023 = weight(_text_:science in 3432) [ClassicSimilarity], result of:
          0.011036023 = score(doc=3432,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.20372227 = fieldWeight in 3432, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3432)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: Naranan's important theorem, published in Nature in 1970, states that if the number of journals grows exponentially and if the number of articles in each journal grows exponentially (at the same rate for each journal), then the system satisfies Lotka's law, and a formula for Lotka's exponent is given as a function of the growth rates of the journals and the articles. This brief communication re-proves this result by showing that the system satisfies Zipf's law, which is equivalent to Lotka's law. The proof is short and algebraic and does not use infinitesimal arguments.
Source: Journal of the American Society for Information Science and Technology. 61(2010) no.12, S.2581-2583
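
The statement in the abstract above can be made plausible with a short heuristic argument. The notation is assumed for this sketch and is not taken from the article: a_1 > 1 is the growth rate of the number of journals, a_2 > 1 the growth rate of the number of articles per journal, t the present time, and journals are ranked by age (oldest first).

```latex
\begin{align*}
\text{number of journals existing at time } s &: \quad a_1^{\,s} \\
\text{rank } r \text{ of the journal born at time } s &: \quad r = a_1^{\,s}
  \;\Longrightarrow\; s = \frac{\ln r}{\ln a_1} \\
\text{size of that journal at time } t &: \quad
  g(r) = a_2^{\,t-s} = a_2^{\,t}\, r^{-\ln a_2 / \ln a_1}
\end{align*}
```

This is a rank-size law of Zipf type, g(r) = B / r^beta with beta = ln a_2 / ln a_1. Using the usual correspondence alpha = 1 + 1/beta between Zipf's and Lotka's laws, the Lotka exponent comes out as alpha = 1 + ln a_1 / ln a_2.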

Egghe, L.: Mathematical theory of the h- and g-index in case of fractional counting of authorship (2008) 0.00

0.0010366995 = product of:
  0.011403695 = sum of:
    0.006673971 = weight(_text_:in in 2004) [ClassicSimilarity], result of:
      0.006673971 = score(doc=2004,freq=14.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.23857531 = fieldWeight in 2004, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2004)
    0.0047297236 = product of:
      0.009459447 = sum of:
        0.009459447 = weight(_text_:science in 2004) [ClassicSimilarity], result of:
          0.009459447 = score(doc=2004,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.17461908 = fieldWeight in 2004, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=2004)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: This article studies the h-index (Hirsch index) and the g-index of authors, in case one counts authorship of the cited articles in a fractional way. There are two ways to do this: One counts the citations to these papers in a fractional way or one counts the ranks of the papers in a fractional way as credit for an author. In both cases, we define the fractional h- and g-indexes, and we present inequalities (both upper and lower bounds) between these fractional h- and g-indexes and their corresponding unweighted values (also involving, of course, the coauthorship distribution). Wherever applicable, examples and counterexamples are provided. In a concrete example (the publication citation list of the present author), we make explicit calculations of these fractional h- and g-indexes and show that they are not very different from the unweighted ones.
Source: Journal of the American Society for Information Science and Technology. 59(2008) no.10, S.1608-1616
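
The two fractional counting schemes named in the abstract above might be sketched as follows. This is one plausible reading of the abstract, not the article's formal definitions: in the first variant each paper's citations are divided by its number of authors, and in the second each paper advances the rank by 1/(number of authors) instead of 1. Function names and the sample data are hypothetical.

```python
def fractional_h_by_citations(papers):
    """papers: list of (citations, n_authors); citations are shared among authors."""
    shares = sorted((c / a for c, a in papers), reverse=True)
    h = 0
    for rank, share in enumerate(shares, start=1):
        if share >= rank:
            h = rank
        else:
            break
    return h


def fractional_h_by_ranks(papers):
    """Each paper advances the rank by 1/n_authors; returns the last rank still covered."""
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    rank = 0.0
    h = 0.0
    for citations, n_authors in ranked:
        rank += 1.0 / n_authors
        if citations >= rank:
            h = rank
        else:
            break
    return h


# Made-up publication list: (citations, number of authors).
papers = [(12, 2), (9, 3), (6, 1), (4, 4), (2, 2)]
print(fractional_h_by_citations(papers), fractional_h_by_ranks(papers))
```

For this sample the unweighted h-index is 4, so both fractional variants come out lower, although nothing general is claimed here about the size of the difference.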

Egghe, L.: The influence of transformations on the h-index and the g-index (2008) 0.00

9.6503104E-4 = product of:
  0.010615341 = sum of:
    0.0050973296 = weight(_text_:in in 1881) [ClassicSimilarity], result of:
      0.0050973296 = score(doc=1881,freq=6.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.1822149 = fieldWeight in 1881, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1881)
    0.0055180113 = product of:
      0.011036023 = sum of:
        0.011036023 = weight(_text_:science in 1881) [ClassicSimilarity], result of:
          0.011036023 = score(doc=1881,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.20372227 = fieldWeight in 1881, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1881)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: In a previous article, we introduced a general transformation on sources and one on items in an arbitrary information production process (IPP). In this article, we investigate the influence of these transformations on the h-index and on the g-index. General formulae that describe this influence are presented. These are applied to the case that the size-frequency function is Lotkaian (i.e., is a decreasing power function). We further show that the h-index of the transformed IPP belongs to the interval bounded by the two transformations of the h-index of the original IPP, and we also show that this property is not true for the g-index.
Source: Journal of the American Society for Information Science and Technology. 59(2008) no.8, S.1304-1312
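
For readers who want the g-index mentioned above in concrete form, here is a minimal sketch of its usual definition: the largest g such that the g most cited articles together received at least g squared citations. The transformations studied in the article are not reproduced; the function name and sample data are illustrative.

```python
def g_index(citations):
    """Largest g such that the top g articles together have at least g**2 citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


print(g_index([10, 8, 5, 4, 3]))
```

In this version g cannot exceed the number of articles, so for the sample list it is capped at 5, while the h-index of the same list is 4.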

Egghe, L.; Guns, R.; Rousseau, R.: Thoughts on uncitedness : Nobel laureates and Fields medalists as case studies (2011) 0.00

9.323844E-4 = product of:
  0.010256228 = sum of:
    0.0035673876 = weight(_text_:in in 4994) [ClassicSimilarity], result of:
      0.0035673876 = score(doc=4994,freq=4.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.12752387 = fieldWeight in 4994, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=4994)
    0.00668884 = product of:
      0.01337768 = sum of:
        0.01337768 = weight(_text_:science in 4994) [ClassicSimilarity], result of:
          0.01337768 = score(doc=4994,freq=4.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.24694869 = fieldWeight in 4994, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=4994)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: Contrary to what one might expect, Nobel laureates and Fields medalists have a rather large fraction (10% or more) of uncited publications. This is the case for (in total) 75 examined researchers from the fields of mathematics (Fields medalists), physics, chemistry, and physiology or medicine (Nobel laureates). We study several indicators for these researchers, including the h-index, total number of publications, average number of citations per publication, the number (and fraction) of uncited publications, and their interrelations. The most remarkable result is a positive correlation between the h-index and the number of uncited articles. We also present a Lotkaian model, which partially explains the empirically found regularities.
Footnote: Cf.: Erratum. In: Journal of the American Society for Information Science and Technology. 63(2012) no.2, S.429.
Source: Journal of the American Society for Information Science and Technology. 62(2011) no.8, S.1637-1644

Egghe, L.: Type/Token-Taken informetrics (2003) 0.00

8.988257E-4 = product of:
  0.0098870825 = sum of:
    0.0059456457 = weight(_text_:in in 1608) [ClassicSimilarity], result of:
      0.0059456457 = score(doc=1608,freq=16.0), product of:
        0.027974274 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02056547 = queryNorm
        0.21253976 = fieldWeight in 1608, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1608)
    0.0039414368 = product of:
      0.0078828735 = sum of:
        0.0078828735 = weight(_text_:science in 1608) [ClassicSimilarity], result of:
          0.0078828735 = score(doc=1608,freq=2.0), product of:
            0.0541719 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.02056547 = queryNorm
            0.1455159 = fieldWeight in 1608, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1608)
      0.5 = coord(1/2)
  0.09090909 = coord(2/22)

Abstract: Type/Token-Taken informetrics is a new part of informetrics that studies the use of items rather than the items themselves. Here, items are the objects that are produced by the sources (e.g., journals producing articles, authors producing papers, etc.). In linguistics a source is also called a type (e.g., a word), and an item a token (e.g., the use of words in texts). In informetrics, types that occur often, for example, in a database will also be requested often, for example, in information retrieval. The relative use of these occurrences will be higher than their relative occurrences themselves; hence, the name Type/Token-Taken informetrics. This article studies the frequency distribution of Type/Token-Taken informetrics, starting from the one of Type/Token informetrics (i.e., source-item relationships). We also study the average number μ* of item uses in Type/Token-Taken informetrics and compare it with the classical average number μ in Type/Token informetrics. We show that μ* >= μ always, and that μ* is an increasing function of μ. A method is presented to actually calculate μ* from μ and a given α, which is the exponent in Lotka's frequency distribution of Type/Token informetrics. We leave open the problem of developing non-Lotkaian Type/Token-Taken informetrics.
Source: Journal of the American Society for Information Science and Technology. 54(2003) no.7, S.603-610
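
The comparison of the two averages in the abstract above (written mu and mu* here) can be illustrated numerically. The sketch assumes one reading of "use proportional to occurrence": a source with n items is weighted by n, so the taken distribution is n * f(n) normalized. That reading, the function names, and the frequency table are assumptions made for the example, not definitions quoted from the article.

```python
def mean_items(freq):
    """mu: average number of items per source; freq maps n -> number of sources with n items."""
    sources = sum(freq.values())
    items = sum(n * count for n, count in freq.items())
    return items / sources


def mean_items_taken(freq):
    """mu*: the same average after weighting every source by its own size n."""
    items = sum(n * count for n, count in freq.items())
    weighted = sum(n * n * count for n, count in freq.items())
    return weighted / items


# Made-up size-frequency table: 60 sources with 1 item, 15 with 2 items, and so on.
freq = {1: 60, 2: 15, 3: 7, 4: 4, 10: 1}
mu = mean_items(freq)
mu_star = mean_items_taken(freq)
print(mu, mu_star)
```

Under this reading, mu* - mu equals the variance of the source sizes divided by mu, which is never negative, in line with the inequality mu* >= mu stated in the abstract.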
