Search (22 results, page 1 of 2)

  • × author_ss:"Egghe, L."
  1. Egghe, L.; Guns, R.; Rousseau, R.; Leuven, K.U.: Erratum (2012) 0.07
    0.06794351 = product of:
      0.18118271 = sum of:
        0.08462543 = weight(_text_:case in 4992) [ClassicSimilarity], result of:
          0.08462543 = score(doc=4992,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.48573974 = fieldWeight in 4992, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.078125 = fieldNorm(doc=4992)
        0.06971227 = weight(_text_:studies in 4992) [ClassicSimilarity], result of:
          0.06971227 = score(doc=4992,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.44086722 = fieldWeight in 4992, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.078125 = fieldNorm(doc=4992)
        0.026845016 = product of:
          0.05369003 = sum of:
            0.05369003 = weight(_text_:22 in 4992) [ClassicSimilarity], result of:
              0.05369003 = score(doc=4992,freq=2.0), product of:
                0.13876937 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03962768 = queryNorm
                0.38690117 = fieldWeight in 4992, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4992)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
    
    Date
    14.2.2012 12:53:22
    Footnote
    This article corrects: Thoughts on uncitedness: Nobel laureates and Fields medalists as case studies in: JASIST 62(2011) no.8, S.1637-1644.
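The explain tree above is standard Lucene ClassicSimilarity (TF-IDF) output. As a minimal sketch, its arithmetic for the term "case" in doc 4992 can be reproduced from the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1))); the helper function is my own, not part of Lucene:

```python
import math

# Reproduces the ClassicSimilarity arithmetic from the explain tree above.
# Component names follow Lucene's explain output; the helper is a sketch.
def classic_similarity(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)                              # tf(freq) = sqrt(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0)) # ClassicSimilarity idf
    query_weight = idf * query_norm                   # idf * queryNorm
    field_weight = tf * idf * field_norm              # tf * idf * fieldNorm
    return query_weight * field_weight

# weight(_text_:case in 4992): freq=2.0, docFreq=1480, maxDocs=44218
score = classic_similarity(2.0, 1480, 44218, 0.03962768, 0.078125)
print(score)  # ≈ 0.08462543, the value shown in the tree
```

The per-term scores are then summed and multiplied by the coord factor (here 3/8), giving the document's total.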
  2. Egghe, L.: On the law of Zipf-Mandelbrot for multi-word phrases (1999) 0.04
    0.043257564 = product of:
      0.17303026 = sum of:
        0.11726044 = weight(_text_:case in 3058) [ClassicSimilarity], result of:
          0.11726044 = score(doc=3058,freq=6.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.6730608 = fieldWeight in 3058, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0625 = fieldNorm(doc=3058)
        0.055769812 = weight(_text_:studies in 3058) [ClassicSimilarity], result of:
          0.055769812 = score(doc=3058,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.35269377 = fieldWeight in 3058, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0625 = fieldNorm(doc=3058)
      0.25 = coord(2/8)
    
    Abstract
    This article studies the probabilities of the occurrence of multi-word (m-word) phrases (m=2,3,...) in relation to the probabilities of occurrence of the single words. It is well known that, in the latter case, the law of Zipf is valid (i.e., a power law). We prove that in the case of m-word phrases (m>=2), this is not the case. We present two independent proofs of this.
  3. Egghe, L.: Mathematical theory of the h- and g-index in case of fractional counting of authorship (2008) 0.03
    0.028408606 = product of:
      0.11363442 = sum of:
        0.071807064 = weight(_text_:case in 2004) [ClassicSimilarity], result of:
          0.071807064 = score(doc=2004,freq=4.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.41216385 = fieldWeight in 2004, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=2004)
        0.04182736 = weight(_text_:studies in 2004) [ClassicSimilarity], result of:
          0.04182736 = score(doc=2004,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.26452032 = fieldWeight in 2004, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.046875 = fieldNorm(doc=2004)
      0.25 = coord(2/8)
    
    Abstract
    This article studies the h-index (Hirsch index) and the g-index of authors, in case one counts authorship of the cited articles in a fractional way. There are two ways to do this: One counts the citations to these papers in a fractional way or one counts the ranks of the papers in a fractional way as credit for an author. In both cases, we define the fractional h- and g-indexes, and we present inequalities (both upper and lower bounds) between these fractional h- and g-indexes and their corresponding unweighted values (also involving, of course, the coauthorship distribution). Wherever applicable, examples and counterexamples are provided. In a concrete example (the publication citation list of the present author), we make explicit calculations of these fractional h- and g-indexes and show that they are not very different from the unweighted ones.
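As a toy illustration of the citation-fractional variant described above (one of the two ways the abstract mentions): each paper's citation count is divided by its number of authors before computing h. Function names and the sample data are my own, not Egghe's:

```python
# Papers are (citations, n_authors) pairs; a hedged sketch, not the
# paper's full treatment (which also covers rank-fractional counting
# and the g-index).
def h_index(citations):
    # largest h such that at least h papers have >= h citations
    cs = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(cs, start=1) if c >= rank)

def fractional_h_citations(papers):
    # variant 1: credit each paper with citations / author count
    return h_index([c / a for c, a in papers])

papers = [(20, 2), (12, 3), (9, 1), (4, 4), (1, 1)]
print(h_index([c for c, _ in papers]))  # → 4 (unweighted h)
print(fractional_h_citations(papers))   # → 3 (fractional h <= unweighted h)
```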
  4. Egghe, L.; Guns, R.; Rousseau, R.: Thoughts on uncitedness : Nobel laureates and Fields medalists as case studies (2011) 0.03
    0.028408606 = product of:
      0.11363442 = sum of:
        0.071807064 = weight(_text_:case in 4994) [ClassicSimilarity], result of:
          0.071807064 = score(doc=4994,freq=4.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.41216385 = fieldWeight in 4994, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=4994)
        0.04182736 = weight(_text_:studies in 4994) [ClassicSimilarity], result of:
          0.04182736 = score(doc=4994,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.26452032 = fieldWeight in 4994, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.046875 = fieldNorm(doc=4994)
      0.25 = coord(2/8)
    
    Abstract
    Contrary to what one might expect, Nobel laureates and Fields medalists have a rather large fraction (10% or more) of uncited publications. This is the case for (in total) 75 examined researchers from the fields of mathematics (Fields medalists), physics, chemistry, and physiology or medicine (Nobel laureates). We study several indicators for these researchers, including the h-index, total number of publications, average number of citations per publication, the number (and fraction) of uncited publications, and their interrelations. The most remarkable result is a positive correlation between the h-index and the number of uncited articles. We also present a Lotkaian model, which partially explains the empirically found regularities.
  5. Egghe, L.; Rousseau, R.: Averaging and globalising quotients of informetric and scientometric data (1996) 0.02
    0.021978518 = product of:
      0.08791407 = sum of:
        0.071807064 = weight(_text_:case in 7659) [ClassicSimilarity], result of:
          0.071807064 = score(doc=7659,freq=4.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.41216385 = fieldWeight in 7659, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=7659)
        0.01610701 = product of:
          0.03221402 = sum of:
            0.03221402 = weight(_text_:22 in 7659) [ClassicSimilarity], result of:
              0.03221402 = score(doc=7659,freq=2.0), product of:
                0.13876937 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03962768 = queryNorm
                0.23214069 = fieldWeight in 7659, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7659)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    It is possible, using ISI's Journal Citation Reports (JCR), to calculate average impact factors (AIF) for JCR's subject categories, but it can be more useful to know the global impact factor (GIF) of a subject category and compare the 2 values. Reports results of a study to compare the relationships between AIFs and GIFs of subjects, based on the particular case of the average impact factor of a subfield versus the impact factor of this subfield as a whole, the difference being studied between an average of quotients, denoted as AQ, and a global average, obtained as a quotient of averages, and denoted as GQ. In the case of impact factors, AQ becomes the average impact factor of a field, and GQ becomes its global impact factor. Discusses a number of applications of this technique in the context of informetrics and scientometrics
    Source
    Journal of information science. 22(1996) no.3, S.165-170
  6. Egghe, L.: Properties of the n-overlap vector and n-overlap similarity theory (2006) 0.02
    0.016484251 = product of:
      0.065937005 = sum of:
        0.023624292 = weight(_text_:libraries in 194) [ClassicSimilarity], result of:
          0.023624292 = score(doc=194,freq=2.0), product of:
            0.13017908 = queryWeight, product of:
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.03962768 = queryNorm
            0.18147534 = fieldWeight in 194, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2850544 = idf(docFreq=4499, maxDocs=44218)
              0.0390625 = fieldNorm(doc=194)
        0.042312715 = weight(_text_:case in 194) [ClassicSimilarity], result of:
          0.042312715 = score(doc=194,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.24286987 = fieldWeight in 194, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=194)
      0.25 = coord(2/8)
    
    Abstract
    In the first part of this article the author defines the n-overlap vector whose coordinates consist of the fraction of the objects (e.g., books, N-grams, etc.) that belong to 1, 2, ..., n sets (more generally: families) (e.g., libraries, databases, etc.). With the aid of the Lorenz concentration theory, a theory of n-overlap similarity is conceived together with corresponding measures, such as the generalized Jaccard index (generalizing the well-known Jaccard index in case n = 2). Next, the distributional form of the n-overlap vector is determined, assuming certain distributions of the object and of the set (family) sizes. In this section the decreasing power law and decreasing exponential distribution are explained for the n-overlap vector. Both item (token) n-overlap and source (type) n-overlap are studied. The n-overlap properties of objects indexed by a hierarchical system (e.g., books indexed by numbers from a UDC or Dewey system or by N-grams) are presented in the final section. The author shows how the results given in the previous section can be applied, as well as how the Lorenz order of the n-overlap vector is respected by an increase or a decrease of the level of refinement in the hierarchical system (e.g., the value N in N-grams).
  7. Egghe, L.: New relations between similarity measures for vectors based on vector norms (2009) 0.01
    0.008975883 = product of:
      0.071807064 = sum of:
        0.071807064 = weight(_text_:case in 2708) [ClassicSimilarity], result of:
          0.071807064 = score(doc=2708,freq=4.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.41216385 = fieldWeight in 2708, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.046875 = fieldNorm(doc=2708)
      0.125 = coord(1/8)
    
    Abstract
    The well-known similarity measures Jaccard, Salton's cosine, Dice, and several related overlap measures for vectors are compared. While general relations are not possible to prove, we study these measures on the trajectories of the form ||X|| = a||Y||, where a > 0 is a constant and ||·|| denotes the Euclidean norm of a vector. In this case, direct functional relations between these measures are proved. For Jaccard, we prove that it is a convexly increasing function of Salton's cosine measure, but always smaller than or equal to the latter, hereby explaining a curve, experimentally found by Leydesdorff. All the other measures have a linear relation with Salton's cosine, reducing even to equality in case a = 1. Hence, for equally normed vectors (e.g., for normalized vectors) we, essentially, only have Jaccard's measure and Salton's cosine measure, since all the other measures are equal to the latter.
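A numeric spot-check of these relations, using the usual vector forms of the measures; the test vectors and the closed-form relations for Dice and Jaccard on the trajectory ||X|| = a||Y|| are my own construction from the standard definitions, not quoted from the paper:

```python
import numpy as np

def cosine(x, y):
    return (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

def dice(x, y):
    return 2 * (x @ y) / (x @ x + y @ y)

def jaccard(x, y):
    # generalized (vector) Jaccard index
    return (x @ y) / (x @ x + y @ y - x @ y)

y = np.array([1.0, 2.0, 2.0])      # ||y|| = 3
a = 2.0
x = a * np.array([2.0, 1.0, 2.0])  # ||x|| = a * ||y||

c = cosine(x, y)
# Dice is linear in the cosine, with equality iff a = 1:
assert np.isclose(dice(x, y), c * 2 * a / (a**2 + 1))
# Jaccard is a (convexly increasing) function of the cosine, always <= cosine:
assert np.isclose(jaccard(x, y), c / ((a**2 + 1) / a - c))
assert jaccard(x, y) <= c
```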
  8. Egghe, L.: ¬The influence of transformations on the h-index and the g-index (2008) 0.01
    0.0074047255 = product of:
      0.059237804 = sum of:
        0.059237804 = weight(_text_:case in 1881) [ClassicSimilarity], result of:
          0.059237804 = score(doc=1881,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.34001783 = fieldWeight in 1881, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1881)
      0.125 = coord(1/8)
    
    Abstract
    In a previous article, we introduced a general transformation on sources and one on items in an arbitrary information production process (IPP). In this article, we investigate the influence of these transformations on the h-index and on the g-index. General formulae that describe this influence are presented. These are applied to the case that the size-frequency function is Lotkaian (i.e., is a decreasing power function). We further show that the h-index of the transformed IPP belongs to the interval bounded by the two transformations of the h-index of the original IPP, and we also show that this property is not true for the g-index.
  9. Egghe, L.: Remarks on the paper by A. De Visscher, "what does the g-index really measure?" (2012) 0.01
    0.0074047255 = product of:
      0.059237804 = sum of:
        0.059237804 = weight(_text_:case in 463) [ClassicSimilarity], result of:
          0.059237804 = score(doc=463,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.34001783 = fieldWeight in 463, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0546875 = fieldNorm(doc=463)
      0.125 = coord(1/8)
    
    Abstract
    The author presents a different view on properties of impact measures than given in the paper of De Visscher (2011). He argues that a good impact measure works better when citations are concentrated rather than spread out over articles. The author also presents theoretical evidence that the g-index and the R-index can be close to the square root of the total number of citations, whereas this is not the case for the A-index. Here the author confirms an assertion of De Visscher.
  10. Egghe, L.: Theory of the topical coverage of multiple databases (2013) 0.01
    0.0074047255 = product of:
      0.059237804 = sum of:
        0.059237804 = weight(_text_:case in 526) [ClassicSimilarity], result of:
          0.059237804 = score(doc=526,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.34001783 = fieldWeight in 526, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0546875 = fieldNorm(doc=526)
      0.125 = coord(1/8)
    
    Abstract
    We present a model that describes which fraction of the literature on a certain topic we will find when we use n (n = 1, 2, ...) databases. It is a generalization of the theory of discovering usability problems. We prove that, in all practical cases, this fraction is a concave function of n, the number of used databases, thereby explaining some graphs that exist in the literature. We also study limiting features of this fraction for n very high and we characterize the case that we find all literature on a certain topic for n high enough.
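The simplest special case behind such a model is the classical problem-discovery formula: if each database independently covered a fraction p of the literature, n databases together would cover 1 - (1 - p)^n, an increasing and concave function of n. This independence assumption is a baseline sketch, not the paper's general model:

```python
# Fraction of the literature found with n databases, under the hypothetical
# assumption of independent coverage p per database.
def coverage(p, n):
    return 1 - (1 - p) ** n

fractions = [coverage(0.4, n) for n in range(1, 6)]
print([round(f, 3) for f in fractions])  # → [0.4, 0.64, 0.784, 0.87, 0.922]

# concavity in n: each additional database contributes less than the previous
gains = [b - a for a, b in zip(fractions, fractions[1:])]
assert all(later < earlier for earlier, later in zip(gains, gains[1:]))
```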
  11. Egghe, L.; Rousseau, R.; Hooydonk, G. van: Methods for accrediting publications to authors or countries : consequences for evaluation studies (2000) 0.01
    0.0073941024 = product of:
      0.05915282 = sum of:
        0.05915282 = weight(_text_:studies in 4384) [ClassicSimilarity], result of:
          0.05915282 = score(doc=4384,freq=4.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.37408823 = fieldWeight in 4384, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.046875 = fieldNorm(doc=4384)
      0.125 = coord(1/8)
    
    Abstract
    One aim of science evaluation studies is to determine quantitatively the contribution of different players (authors, departments, countries) to the whole system. This information is then used to study the evolution of the system, for instance to gauge the results of special national or international programs. Taking articles as our basic data, we want to determine the exact relative contribution of each coauthor or each country. These numbers are brought together to obtain country scores, or department scores, etc. It turns out, as we will show in this article, that different scoring methods can yield totally different rankings. Consequently, a ranking between countries, universities, research groups or authors, based on one particular accrediting method, does not contain an absolute truth about their relative importance
  12. Egghe, L.: Empirical and combinatorial study of country occurrences in multi-authored papers (2006) 0.01
    0.0073287776 = product of:
      0.05863022 = sum of:
        0.05863022 = weight(_text_:case in 81) [ClassicSimilarity], result of:
          0.05863022 = score(doc=81,freq=6.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.3365304 = fieldWeight in 81, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03125 = fieldNorm(doc=81)
      0.125 = coord(1/8)
    
    Abstract
    Papers written by several authors can be classified according to the countries of the author affiliations. The empirical part of this paper consists of two datasets. One dataset consists of 1,035 papers retrieved via the search "pedagog*" in the years 2004 and 2005 (up to October) in Academic Search Elite, which is a case where phi(m) = the number of papers with m = 1, 2, 3, ... authors is decreasing; hence most of the papers have a low number of authors. Here we find that #j,m = the number of times a country occurs j times in a m-authored paper, j = 1, ..., m-1, is decreasing and that #m,m is much higher than all the other #j,m values. The other dataset consists of 3,271 papers retrieved via the search "enzyme" in the year 2005 (up to October) in the same database, which is a case of a non-decreasing phi(m): most papers have 3 or 4 authors and we even find many papers with a much higher number of authors. In this case we show again that #m,m is much higher than the other #j,m values but that #j,m is not decreasing anymore in j = 1, ..., m-1, although #1,m is (apart from #m,m) the largest number amongst the #j,m. The combinatorial part gives a proof of the fact that #j,m decreases for j = 1, ..., m-1, supposing that all cases are equally possible. This shows that the first dataset conforms more with this model than the second dataset. Explanations for these findings are given. From the data we also find the (we think: new) distribution of the number of papers with n = 1, 2, 3, ... countries (i.e., where there are n different countries involved amongst the m (>= n) authors of a paper): a fast decreasing function, e.g., as a power law with a very large Lotka exponent.
  13. Egghe, L.: ¬The amount of actions needed for shelving and reshelving (1996) 0.01
    0.0069712265 = product of:
      0.055769812 = sum of:
        0.055769812 = weight(_text_:studies in 4394) [ClassicSimilarity], result of:
          0.055769812 = score(doc=4394,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.35269377 = fieldWeight in 4394, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0625 = fieldNorm(doc=4394)
      0.125 = coord(1/8)
    
    Abstract
    Discusses the number of actions (or time) needed to organize library shelves. Studies two types of problems: organizing a library shelf out of an unordered pile of books, and putting an existing shelf of books in the right order. Uses results from information theory as well as from rank order statistics (runs). Draws conclusions about the advised frequency with which these actions should be undertaken
  14. Egghe, L.: Type/Token-Taken informetrics (2003) 0.01
    0.0061617517 = product of:
      0.049294014 = sum of:
        0.049294014 = weight(_text_:studies in 1608) [ClassicSimilarity], result of:
          0.049294014 = score(doc=1608,freq=4.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.3117402 = fieldWeight in 1608, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1608)
      0.125 = coord(1/8)
    
    Abstract
    Type/Token-Taken informetrics is a new part of informetrics that studies the use of items rather than the items themselves. Here, items are the objects that are produced by the sources (e.g., journals producing articles, authors producing papers, etc.). In linguistics a source is also called a type (e.g., a word), and an item a token (e.g., the use of words in texts). In informetrics, types that occur often, for example, in a database will also be requested often, for example, in information retrieval. The relative use of these occurrences will be higher than their relative occurrences themselves; hence the name Type/Token-Taken informetrics. This article studies the frequency distribution of Type/Token-Taken informetrics, starting from the one of Type/Token informetrics (i.e., source-item relationships). We also study the average number µ* of item uses in Type/Token-Taken informetrics and compare this with the classical average number µ in Type/Token informetrics. We show that µ* >= µ always, and that µ* is an increasing function of µ. A method is presented to actually calculate µ* from µ and a given α, which is the exponent in Lotka's frequency distribution of Type/Token informetrics. We leave open the problem of developing non-Lotkaian Type/Token-Taken informetrics.
  15. Egghe, L.: Mathematical study of h-index sequences (2009) 0.01
    0.0061617517 = product of:
      0.049294014 = sum of:
        0.049294014 = weight(_text_:studies in 4217) [ClassicSimilarity], result of:
          0.049294014 = score(doc=4217,freq=4.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.3117402 = fieldWeight in 4217, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4217)
      0.125 = coord(1/8)
    
    Abstract
    This paper studies mathematical properties of h-index sequences as developed by Liang [Liang, L. (2006). h-Index sequence and h-index matrix: Constructions and applications. Scientometrics, 69(1), 153-159]. For practical reasons, Liang studies such sequences where the time goes backwards, while it is more logical to use the time going forward (real career periods). Both types of h-index sequences are studied here and their interrelations are revealed. We show cases where these sequences are convex, linear and concave. We also show that, when one of the sequences is convex then the other one is concave, showing that the reverse-time sequence, in general, cannot be used to derive similar properties of the (difficult to obtain) forward time sequence. We show that both sequences are the same if and only if the author produces the same number of papers per year. If the author produces an increasing number of papers per year, then Liang's h-sequences are above the "normal" ones. All these results are also valid for g- and R-sequences. The results are confirmed by the h-, g- and R-sequences (forward and reverse time) of the author.
  16. Egghe, L.: Sampling and concentration values of incomplete bibliographies (2002) 0.01
    0.006099823 = product of:
      0.048798583 = sum of:
        0.048798583 = weight(_text_:studies in 450) [ClassicSimilarity], result of:
          0.048798583 = score(doc=450,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.30860704 = fieldWeight in 450, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0546875 = fieldNorm(doc=450)
      0.125 = coord(1/8)
    
    Abstract
    This article studies concentration aspects of bibliographies. In particular, we study the impact of incompleteness of such a bibliography on its concentration values (i.e., its degree of inequality of production of its sources). Incompleteness is modeled by sampling in the complete bibliography. The model is general enough to comprise truncation of a bibliography as well as a systematic sample on sources or items. In all cases we prove that the sampled bibliography (or incomplete one) has a higher concentration value than the complete one. These models, hence, shed some light on the measurement of production inequality in incomplete bibliographies.
  17. Egghe, L.; Rousseau, R.; Rousseau, S.: TOP-curves (2007) 0.01
    0.006099823 = product of:
      0.048798583 = sum of:
        0.048798583 = weight(_text_:studies in 50) [ClassicSimilarity], result of:
          0.048798583 = score(doc=50,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.30860704 = fieldWeight in 50, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0546875 = fieldNorm(doc=50)
      0.125 = coord(1/8)
    
    Abstract
    Several characteristics of classical Lorenz curves make them unsuitable for the study of a group of top performers. TOP-curves, defined as a kind of mirror image of TIP-curves used in poverty studies, are shown to possess the properties necessary for adequate empirical ranking of various data arrays, based on the properties of the highest performers (i.e., the core). TOP-curves and essential TOP-curves, also introduced in this article, simultaneously represent the incidence, intensity, and inequality among the top. It is shown that TOP-dominance partial order, introduced in this article, is stronger than Lorenz dominance order. In this way, this article contributes to the study of cores, a central issue in applied informetrics.
  18. Egghe, L.; Rousseau, R.: ¬A measure for the cohesion of weighted networks (2003) 0.01
    0.0052890894 = product of:
      0.042312715 = sum of:
        0.042312715 = weight(_text_:case in 5157) [ClassicSimilarity], result of:
          0.042312715 = score(doc=5157,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.24286987 = fieldWeight in 5157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5157)
      0.125 = coord(1/8)
    
    Abstract
    Measurement of the degree of interconnectedness in graph-like networks of hyperlinks or citations can indicate the existence of research fields and assist in comparative evaluation of research efforts. In this issue we begin with Egghe and Rousseau who review compactness measures and investigate the compactness of a network as a weighted graph with dissimilarity values characterizing the arcs between nodes. They make use of a generalization of the Botafogo, Rivlin, Shneiderman (BRS) compaction measure which treats the distance between unreachable nodes not as infinity but rather as the number of nodes in the network. The dissimilarity values are determined by summing the reciprocals of the weights of the arcs in the shortest chain between two nodes where no weight is smaller than one. The BRS measure is then the maximum value for the sum of the dissimilarity measures less the actual sum divided by the difference between the maximum and minimum. The Wiener index, the sum of all elements in the dissimilarity matrix divided by two, is then computed for Small's particle physics co-citation data as well as the BRS measure, the dissimilarity values and shortest paths. The compactness measure for the weighted network is smaller than for the un-weighted. When the bibliographic coupling network is utilized it is shown to be less compact than the co-citation network, which indicates that the new measure produces results that conform to an obvious case.
  19. Egghe, L.: ¬The measures precision, recall, fallout and miss as a function of the number of retrieved documents and their mutual interrelations (2008) 0.01
    0.0052890894 = product of:
      0.042312715 = sum of:
        0.042312715 = weight(_text_:case in 2067) [ClassicSimilarity], result of:
          0.042312715 = score(doc=2067,freq=2.0), product of:
            0.1742197 = queryWeight, product of:
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.03962768 = queryNorm
            0.24286987 = fieldWeight in 2067, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.3964143 = idf(docFreq=1480, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2067)
      0.125 = coord(1/8)
    
    Abstract
    In this paper, for the first time, we present global curves for the measures precision, recall, fallout and miss as a function of the number of retrieved documents. Different curves apply to different retrieval systems, for which we give exact definitions in terms of a retrieval density function: perverse retrieval, perfect retrieval, random retrieval, normal retrieval, hereby extending results of Buckland and Gey and of Egghe in the following sense: mathematically more advanced methods yield a better insight into these curves, more types of retrieval are considered and, very importantly, the theory is developed for the "complete" set of measures: precision, recall, fallout and miss. Next we study the interrelationships between precision, recall, fallout and miss in these different types of retrieval, hereby again extending results of Buckland and Gey (incl. a correction) and of Egghe. In the case of normal retrieval we prove that precision as a function of recall and recall as a function of miss are concavely decreasing relationships, while recall as a function of fallout is a concavely increasing relationship. We also show, by producing examples, that the relationships between fallout and precision, miss and precision, and miss and fallout are not always convex or concave.
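    The four measures can be illustrated with a short sketch (not from the paper, which works with continuous retrieval density functions rather than ranked lists): for each cutoff t, the standard contingency table gives precision P = a/(a+b), recall R = a/(a+c), fallout F = b/(b+d) and miss M = c/(c+d), with a = retrieved relevant, b = retrieved non-relevant, c = non-retrieved relevant, d = non-retrieved non-relevant.

    ```python
    def prfm_curves(relevance):
        """relevance: 0/1 judgements in ranked order (1 = relevant).

        Returns a list of tuples (t, P, R, F, M), one per cutoff t.
        Assumes at least one relevant and one non-relevant document.
        """
        total = len(relevance)
        rel = sum(relevance)
        nonrel = total - rel
        curves, a = [], 0
        for t, r in enumerate(relevance, start=1):
            a += r                  # retrieved relevant
            b = t - a               # retrieved non-relevant
            c = rel - a             # non-retrieved relevant
            d = nonrel - b          # non-retrieved non-relevant
            P = a / t
            R = a / rel
            F = b / nonrel
            M = c / (c + d) if t < total else 0.0   # 0/0 at t = total
            curves.append((t, P, R, F, M))
        return curves
    ```

    Running this on any ranked list shows the qualitative behaviour the paper formalizes: P and M decrease while R and F increase toward 1 as t grows.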
  20. Egghe, L.; Ravichandra Rao, I.K.: ¬The influence of the broadness of a query of a topic on its h-index : models and examples of the h-index of n-grams (2008) 0.00
    0.0043570166 = product of:
      0.034856133 = sum of:
        0.034856133 = weight(_text_:studies in 2009) [ClassicSimilarity], result of:
          0.034856133 = score(doc=2009,freq=2.0), product of:
            0.15812531 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.03962768 = queryNorm
            0.22043361 = fieldWeight in 2009, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2009)
      0.125 = coord(1/8)
    
    Abstract
    The article studies the influence of the query formulation of a topic on its h-index. In order to generate pure random sets of documents, we used N-grams (N variable) to measure this influence: strings of zeros, truncated at the end. The databases used are WoS and Scopus. The formula h = T**(1/alpha), proved in Egghe and Rousseau (2006), where T is the number of retrieved documents and alpha is Lotka's exponent, is confirmed to be a concavely increasing function of T. We also give a formula for the relation between h and N, the length of the N-gram: h = D*10**(-N/alpha), where D is a constant; this is a convexly decreasing function, which is found in our experiments. Nonlinear regression on h = T**(1/alpha) gives an estimation of alpha, which can then be used to estimate the h-index of the entire database (Web of Science [WoS] and Scopus): h = S**(1/alpha), where S is the total number of documents in the database.
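    The two models reduce to one-line functions; a sketch with illustrative parameter values (D and alpha here are placeholders, not the fitted values from the paper):

    ```python
    def h_from_T(T, alpha):
        """h = T**(1/alpha): h-index of T retrieved documents in a
        Lotkaian system with exponent alpha (concavely increasing in T)."""
        return T ** (1.0 / alpha)

    def h_from_ngram(N, D, alpha):
        """h = D * 10**(-N/alpha): h-index as a function of the N-gram
        length N (convexly decreasing in N); D is a fitted constant."""
        return D * 10 ** (-N / alpha)
    ```

    For example, with alpha = 2 (Lotka's classical value), a query retrieving 10,000 documents gives h = 100, and equal increments in T add ever smaller increments to h, matching the concavity the article reports.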