Search (44 results, page 1 of 3)

Egghe, L.; Rousseau, R.: Averaging and globalising quotients of informetric and scientometric data (1996) 0.02

0.017282655 = product of:
  0.04320664 = sum of:
    0.0262271 = weight(_text_:of in 7659) [ClassicSimilarity], result of:
      0.0262271 = score(doc=7659,freq=30.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.4014868 = fieldWeight in 7659, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=7659)
    0.016979538 = product of:
      0.033959076 = sum of:
        0.033959076 = weight(_text_:22 in 7659) [ClassicSimilarity], result of:
          0.033959076 = score(doc=7659,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.23214069 = fieldWeight in 7659, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=7659)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
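
The arithmetic in the explain tree above can be re-derived by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and are not part of the search system. All numeric inputs are copied from the tree for doc 7659.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one term in one document."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm          # idf * queryNorm
    field_weight = tf * term_idf * field_norm     # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain tree for doc 7659.
QUERY_NORM = 0.04177434
MAX_DOCS = 44218
FIELD_NORM = 0.046875

score_of = term_score(30.0, 25162, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 clauses matched.
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 2)
# 2 of the 5 top-level clauses matched, hence coord(2/5).
total = (score_of + score_22) * (2 / 5)

print(round(total, 4))  # close to the 0.017282655 reported above
```

Small differences in the last digits are expected, because the search engine computes in single precision while this sketch uses Python doubles.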

Abstract: It is possible, using ISI's Journal Citation Report (JCR), to calculate average impact factors (AIF) for JCR's subject categories but it can be more useful to know the global Impact Factor (GIF) of a subject category and compare the 2 values. Reports results of a study to compare the relationships between AIFs and GIFs of subjects, based on the particular case of the average impact factor of a subfield versus the impact factor of this subfield as a whole, the difference being studied between an average of quotients, denoted as AQ, and a global average, obtained as a quotient of averages, and denoted as GQ. In the case of impact factors, AQ becomes the average impact factor of a field, and GQ becomes its global impact factor. Discusses a number of applications of this technique in the context of informetrics and scientometrics
Source: Journal of information science. 22(1996) no.3, S.165-170
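
The distinction between AQ and GQ drawn in the abstract above can be illustrated with a small calculation. This is only a sketch with made-up numbers: the three (citations, publications) pairs are hypothetical and do not come from the article.

```python
from statistics import mean

# Hypothetical (citations, publications) pairs for three journals
# in one subject category; not data from the article.
journals = [(200, 100), (30, 60), (10, 40)]

citations = [c for c, _ in journals]
publications = [p for _, p in journals]

# AQ: average of quotients (average impact factor of the category).
aq = mean([c / p for c, p in journals])

# GQ: quotient of averages (global impact factor of the category).
gq = mean(citations) / mean(publications)

print(f"AQ = {aq:.4f}, GQ = {gq:.4f}")
```

With these numbers the large journal dominates GQ, while AQ weights every journal equally, so the two values differ.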

Egghe, L.: ¬A noninformetric analysis of the relationship between citation age and journal productivity (2001) 0.01

0.014079607 = product of:
  0.035199016 = sum of:
    0.017282499 = product of:
      0.08641249 = sum of:
        0.08641249 = weight(_text_:problem in 5685) [ClassicSimilarity], result of:
          0.08641249 = score(doc=5685,freq=6.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.48735106 = fieldWeight in 5685, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.046875 = fieldNorm(doc=5685)
      0.2 = coord(1/5)
    0.01791652 = weight(_text_:of in 5685) [ClassicSimilarity], result of:
      0.01791652 = score(doc=5685,freq=14.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2742677 = fieldWeight in 5685, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=5685)
  0.4 = coord(2/5)

Abstract: A problem, raised by Wallace (JASIS, 37,136-145,1986), on the relation between a journal's median citation age and its number of articles is studied. Leaving open the problem as such, we give a statistical explanation of this relationship, when replacing "median" by "mean" in Wallace's problem. The cloud of points, found by Wallace, is explained in the sense that the points are scattered over the area in the first quadrant, limited by a curve of the form y=1 + E/x**2 where E is a constant. This curve is obtained by using the Central Limit Theorem in statistics and, hence, has no intrinsic informetric foundation. The article closes with some reflections on explanations of regularities in informetrics, based on statistical, probabilistic or informetric results, or on a combination thereof
Source: Journal of the American Society for Information Science and Technology. 52(2001) no.5, S.371-377

Egghe, L.: Type/Token-Taken informetrics (2003) 0.01
```
0.010812533 = product of:
  0.027031332 = sum of:
    0.008315044 = product of:
      0.041575223 = sum of:
        0.041575223 = weight(_text_:problem in 1608) [ClassicSimilarity], result of:
          0.041575223 = score(doc=1608,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.23447686 = fieldWeight in 1608, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1608)
      0.2 = coord(1/5)
    0.018716287 = weight(_text_:of in 1608) [ClassicSimilarity], result of:
      0.018716287 = score(doc=1608,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.28651062 = fieldWeight in 1608, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1608)
  0.4 = coord(2/5)
```
Abstract

Type/Token-Taken informetrics is a new part of informetrics that studies the use of items rather than the items themselves. Here, items are the objects that are produced by the sources (e.g., journals producing articles, authors producing papers, etc.). In linguistics a source is also called a type (e.g., a word), and an item a token (e.g., the use of words in texts). In informetrics, types that occur often, for example, in a database will also be requested often, for example, in information retrieval. The relative use of these occurrences will be higher than their relative occurrences themselves; hence, the name Type/Token-Taken informetrics. This article studies the frequency distribution of Type/Token-Taken informetrics, starting from the one of Type/Token informetrics (i.e., source-item relationships). We also study the average number mu* of item uses in Type/Token-Taken informetrics and compare this with the classical average number mu in Type/Token informetrics. We show that mu* >= mu always, and that mu* is an increasing function of mu. A method is presented to actually calculate mu* from mu and a given alpha, which is the exponent in Lotka's frequency distribution of Type/Token informetrics. We leave open the problem of developing non-Lotkaian Type/Token-Taken informetrics.

Source

Journal of the American Society for Information Science and Technology. 54(2003) no.7, S.603-610
Egghe, L.: Untangling Herdan's law and Heaps' law : mathematical and informetric arguments (2007) 0.01
```
0.010464129 = product of:
  0.026160322 = sum of:
    0.008315044 = product of:
      0.041575223 = sum of:
        0.041575223 = weight(_text_:problem in 271) [ClassicSimilarity], result of:
          0.041575223 = score(doc=271,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.23447686 = fieldWeight in 271, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=271)
      0.2 = coord(1/5)
    0.017845279 = weight(_text_:of in 271) [ClassicSimilarity], result of:
      0.017845279 = score(doc=271,freq=20.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.27317715 = fieldWeight in 271, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=271)
  0.4 = coord(2/5)
```
Abstract

Herdan's law in linguistics and Heaps' law in information retrieval are different formulations of the same phenomenon. Stated briefly and in linguistic terms they state that vocabularies' sizes are concave increasing power laws of texts' sizes. This study investigates these laws from a purely mathematical and informetric point of view. A general informetric argument shows that the problem of proving these laws is, in fact, ill-posed. Using the more general terminology of sources and items, the author shows by presenting exact formulas from Lotkaian informetrics that the total number T of sources is not only a function of the total number A of items, but is also a function of several parameters (e.g., the parameters occurring in Lotka's law). Consequently, it is shown that a fixed T (or A) value can lead to different possible A (respectively, T) values. Limiting the T(A)-variability to increasing samples (e.g., in a text as done in linguistics) the author then shows, in a purely mathematical way, that for large sample sizes T ~ A**phi, where phi is a constant, phi < 1 but close to 1, hence roughly, Heaps' or Herdan's law can be proved without using any linguistic or informetric argument. The author also shows that for smaller samples, phi is not a constant but essentially decreases as confirmed by practical examples. Finally, an exact informetric argument on random sampling in the items shows that, in most cases, T = T(A) is a concavely increasing function, in accordance with practical examples.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.5, S.702-709

Egghe, L.: On the law of Zipf-Mandelbrot for multi-word phrases (1999) 0.01

0.0054174457 = product of:
  0.027087228 = sum of:
    0.027087228 = weight(_text_:of in 3058) [ClassicSimilarity], result of:
      0.027087228 = score(doc=3058,freq=18.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.41465375 = fieldWeight in 3058, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=3058)
  0.2 = coord(1/5)

Abstract: This article studies the probabilities of the occurrence of multi-word (m-word) phrases (m=2,3,...) in relation to the probabilities of occurrence of the single words. It is well known that, in the latter case, the law of Zipf is valid (i.e., a power law). We prove that in the case of m-word phrases (m>=2), this is not the case. We present 2 independent proofs of this
Source: Journal of the American Society for Information Science. 50(1999) no.3, S.233-241

Egghe, L.: Mathematical theories of citation (1998) 0.01

0.0054174457 = product of:
  0.027087228 = sum of:
    0.027087228 = weight(_text_:of in 5125) [ClassicSimilarity], result of:
      0.027087228 = score(doc=5125,freq=18.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.41465375 = fieldWeight in 5125, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=5125)
  0.2 = coord(1/5)

Abstract: Focuses on possible mathematical theories of citation and on the intrinsic problems related to it. Sheds light on aspects of mathematical complexity as encountered in, for example, fractal theory and Mandelbrot's law. Also discusses dynamical aspects of citation theory as reflected in evolutions of journal rankings, centres of gravity or of the set of source journals. Makes some comments in this connection on growth and obsolescence
Footnote: Contribution to a thematic issue devoted to 'Theories of citation?'

Egghe, L.: ¬A model for the size-frequency function of coauthor pairs (2008) 0.01
```
0.00524542 = product of:
  0.0262271 = sum of:
    0.0262271 = weight(_text_:of in 2366) [ClassicSimilarity], result of:
      0.0262271 = score(doc=2366,freq=30.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.4014868 = fieldWeight in 2366, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2366)
  0.2 = coord(1/5)
```
Abstract

Lotka's law was formulated to describe the number of authors with a certain number of publications. Empirical results (Morris & Goldstein, 2007) indicate that Lotka's law is also valid if one counts the number of publications of coauthor pairs. This article gives a simple model proving this to be true, with the same Lotka exponent, if the number of coauthored papers is proportional to the number of papers of the individual coauthors. Under the assumption that this number of coauthored papers is more than proportional to the number of papers of the individual authors (to be explained in the article), we can prove that the size-frequency function of coauthor pairs is Lotkaian with an exponent that is higher than that of the Lotka function of individual authors, a fact that is confirmed in experimental results.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.13, S.2133-2137
Egghe, L.: Dynamic h-index : the Hirsch index in function of time (2007) 0.01
```
0.005107617 = product of:
  0.025538085 = sum of:
    0.025538085 = weight(_text_:of in 147) [ClassicSimilarity], result of:
      0.025538085 = score(doc=147,freq=16.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.39093933 = fieldWeight in 147, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=147)
  0.2 = coord(1/5)
```
Abstract

When there is a group of articles and the present time is fixed, we can determine the unique number h, being the number of articles that received h or more citations while the other articles received a number of citations which is not larger than h. In this article, the time dependence of the h-index is determined. This is important to describe the expected career evolution of a scientist's work or of a journal's production in a fixed year.
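
The definition of h given in the abstract above can be sketched in a few lines. This is an illustrative implementation of the usual h-index definition only, not the time-dependent model of the article; the yearly citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h articles each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical cumulative citation counts of five articles,
# observed at three successive points in time.
snapshots = {
    2005: [2, 1, 0, 0, 0],
    2006: [6, 3, 2, 1, 0],
    2007: [10, 8, 5, 4, 3],
}

for year, counts in snapshots.items():
    print(year, h_index(counts))
```

Because cumulative citation counts never decrease, the h-index computed this way can only stay the same or grow from one snapshot to the next.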

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.3, S.452-454
Egghe, L.: Zipfian and Lotkaian continuous concentration theory (2005) 0.01
```
0.0050675566 = product of:
  0.025337784 = sum of:
    0.025337784 = weight(_text_:of in 3678) [ClassicSimilarity], result of:
      0.025337784 = score(doc=3678,freq=28.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.38787308 = fieldWeight in 3678, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=3678)
  0.2 = coord(1/5)
```
Abstract

In this article concentration (i.e., inequality) aspects of the functions of Zipf and of Lotka are studied. Since both functions are power laws (i.e., they are mathematically the same) it suffices to develop one concentration theory for power laws and apply it twice for the different interpretations of the laws of Zipf and Lotka. After a brief repetition of the functional relationships between Zipf's law and Lotka's law, we prove that Price's law of concentration is equivalent to Zipf's law. A major part of this article is devoted to the development of continuous concentration theory, based on Lorenz curves. The Lorenz curve for power functions is calculated and, based on this, some important concentration measures such as the ones of Gini, Theil, and the variation coefficient. Using Lorenz curves, it is shown that the concentration of a power law increases with its exponent and this result is interpreted in terms of the functions of Zipf and Lotka.

Source

Journal of the American Society for Information Science and Technology. 56(2005) no.9, S.935-945
Egghe, L.: Sampling and concentration values of incomplete bibliographies (2002) 0.00
```
0.0049966783 = product of:
  0.024983391 = sum of:
    0.024983391 = weight(_text_:of in 450) [ClassicSimilarity], result of:
      0.024983391 = score(doc=450,freq=20.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.38244802 = fieldWeight in 450, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=450)
  0.2 = coord(1/5)
```
Abstract

This article studies concentration aspects of bibliographies. More particularly, we study the impact of incompleteness of such a bibliography on its concentration values (i.e., its degree of inequality of production of its sources). Incompleteness is modeled by sampling in the complete bibliography. The model is general enough to comprise truncation of a bibliography as well as a systematic sample on sources or items. In all cases we prove that the sampled bibliography (or incomplete one) has a higher concentration value than the complete one. These models, hence, shed some light on the measurement of production inequality in incomplete bibliographies.

Source

Journal of the American Society for Information Science and Technology. 53(2002) no.4, S.271-281
Egghe, L.; Rousseau, R.; Rousseau, S.: TOP-curves (2007) 0.00
```
0.004740265 = product of:
  0.023701325 = sum of:
    0.023701325 = weight(_text_:of in 50) [ClassicSimilarity], result of:
      0.023701325 = score(doc=50,freq=18.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.36282203 = fieldWeight in 50, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=50)
  0.2 = coord(1/5)
```
Abstract

Several characteristics of classical Lorenz curves make them unsuitable for the study of a group of top performers. TOP-curves, defined as a kind of mirror image of TIP-curves used in poverty studies, are shown to possess the properties necessary for adequate empirical ranking of various data arrays, based on the properties of the highest performers (i.e., the core). TOP-curves and essential TOP-curves, also introduced in this article, simultaneously represent the incidence, intensity, and inequality among the top. It is shown that the TOP-dominance partial order, introduced in this article, is stronger than the Lorenz dominance order. In this way, this article contributes to the study of cores, a central issue in applied informetrics.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.6, S.777-785
Egghe, L.: Theory of the topical coverage of multiple databases (2013) 0.00
```
0.004740265 = product of:
  0.023701325 = sum of:
    0.023701325 = weight(_text_:of in 526) [ClassicSimilarity], result of:
      0.023701325 = score(doc=526,freq=18.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.36282203 = fieldWeight in 526, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=526)
  0.2 = coord(1/5)
```
Abstract

We present a model that describes which fraction of the literature on a certain topic we will find when we use n (n = 1, 2, ...) databases. It is a generalization of the theory of discovering usability problems. We prove that, in all practical cases, this fraction is a concave function of n, the number of used databases, thereby explaining some graphs that exist in the literature. We also study limiting features of this fraction for n very high and we characterize the case that we find all literature on a certain topic for n high enough.

Source

Journal of the American Society for Information Science and Technology. 64(2013) no.1, S.126-131

Egghe, L.: Special features of the author - publication relationship and a new explanation of Lotka's law based on convolution theory (1994) 0.00

0.004691646 = product of:
  0.02345823 = sum of:
    0.02345823 = weight(_text_:of in 5068) [ClassicSimilarity], result of:
      0.02345823 = score(doc=5068,freq=6.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.3591007 = fieldWeight in 5068, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=5068)
  0.2 = coord(1/5)

Source: Journal of the American Society for Information Science. 45(1994) no.6, S.422-427

Egghe, L.; Ravichandra Rao, I.K.: ¬The influence of the broadness of a query of a topic on its h-index : models and examples of the h-index of n-grams (2008) 0.00
```
0.0046534794 = product of:
  0.023267398 = sum of:
    0.023267398 = weight(_text_:of in 2009) [ClassicSimilarity], result of:
      0.023267398 = score(doc=2009,freq=34.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.35617945 = fieldWeight in 2009, product of:
          5.8309517 = tf(freq=34.0), with freq of:
            34.0 = termFreq=34.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2009)
  0.2 = coord(1/5)
```
Abstract

The article studies the influence of the query formulation of a topic on its h-index. In order to generate pure random sets of documents, we used N-grams (N variable) to measure this influence: strings of zeros, truncated at the end. The used databases are WoS and Scopus. The formula h=T**(1/alpha), proved in Egghe and Rousseau (2006), where T is the number of retrieved documents and alpha is Lotka's exponent, is confirmed, being a concavely increasing function of T. We also give a formula for the relation between h and N, the length of the N-gram: h=D*10**(-N/alpha), where D is a constant, a convexly decreasing function, which is found in our experiments. Nonlinear regression on h=T**(1/alpha) gives an estimation of alpha, which can then be used to estimate the h-index of the entire database (Web of Science [WoS] and Scopus): h=S**(1/alpha), where S is the total number of documents in the database.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.10, S.1688-1693
Egghe, L.; Ravichandra Rao, I.K.: Study of different h-indices for groups of authors (2008) 0.00
```
0.0044919094 = product of:
  0.022459546 = sum of:
    0.022459546 = weight(_text_:of in 1878) [ClassicSimilarity], result of:
      0.022459546 = score(doc=1878,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.34381276 = fieldWeight in 1878, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1878)
  0.2 = coord(1/5)
```
Abstract

In this article, for any group of authors, we define three different h-indices. First, there is the successive h-index h2 based on the ranked list of authors and their h-indices h1 as defined by Schubert (2007). Next, there is the h-index hP based on the ranked list of authors and their number of publications. Finally, there is the h-index hC based on the ranked list of authors and their number of citations. We present formulae for these three indices in Lotkaian informetrics from which it also follows that h2 < hP < hC. We give a concrete example of a group of 167 authors on the topic optical flow estimation. Besides these three h-indices, we also calculate the two-by-two Spearman rank correlation coefficient and prove that these rankings are significantly related.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.8, S.1276-1281
Egghe, L.; Guns, R.: Applications of the generalized law of Benford to informetric data (2012) 0.00
```
0.0044919094 = product of:
  0.022459546 = sum of:
    0.022459546 = weight(_text_:of in 376) [ClassicSimilarity], result of:
      0.022459546 = score(doc=376,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.34381276 = fieldWeight in 376, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=376)
  0.2 = coord(1/5)
```
Abstract

In a previous work (Egghe, 2011), the first author showed that Benford's law (describing the logarithmic distribution of the numbers 1, 2, ... , 9 as first digits of data in decimal form) is related to the classical law of Zipf with exponent 1. The work of Campanario and Coslado (2011), however, shows that Benford's law does not always fit practical data in a statistical sense. In this article, we use a generalization of Benford's law related to the general law of Zipf with exponent beta > 0. Using data from Campanario and Coslado, we apply nonlinear least squares to determine the optimal beta and show that this generalized law of Benford fits the data better than the classical law of Benford.

Source

Journal of the American Society for Information Science and Technology. 63(2012) no.8, S.1662-1665
Egghe, L.; Ravichandra Rao, I.K.: Duality revisited : construction of fractional frequency distributions based on two dual Lotka laws (2002) 0.00
```
0.004282867 = product of:
  0.021414334 = sum of:
    0.021414334 = weight(_text_:of in 1006) [ClassicSimilarity], result of:
      0.021414334 = score(doc=1006,freq=20.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.32781258 = fieldWeight in 1006, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1006)
  0.2 = coord(1/5)
```
Abstract

Fractional frequency distributions of, for example, authors with a certain (fractional) number of papers are very irregular and, therefore, not easy to model or to explain. This article makes a first attempt at this by assuming two simple Lotka laws (with exponent 2): one for the number of authors with n papers (total count here) and one for the number of papers with n authors, n ∈ N. Based on an earlier convolution model of Egghe, interpreted and reworked now for discrete scores, we are able to produce theoretical fractional frequency distributions with only one parameter, which are in very close agreement with the practical ones as found in a large dataset produced earlier by Rao. The article also shows that (irregular) fractional frequency distributions are a consequence of Lotka's law, and are not examples of breakdowns of this famous historical law.

Source

Journal of the American Society for Information Science and Technology. 53(2002) no.10, S.789-801
Egghe, L.: Mathematical theory of the h- and g-index in case of fractional counting of authorship (2008) 0.00
```
0.004282867 = product of:
  0.021414334 = sum of:
    0.021414334 = weight(_text_:of in 2004) [ClassicSimilarity], result of:
      0.021414334 = score(doc=2004,freq=20.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.32781258 = fieldWeight in 2004, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2004)
  0.2 = coord(1/5)
```
Abstract

This article studies the h-index (Hirsch index) and the g-index of authors, in case one counts authorship of the cited articles in a fractional way. There are two ways to do this: One counts the citations to these papers in a fractional way or one counts the ranks of the papers in a fractional way as credit for an author. In both cases, we define the fractional h- and g-indexes, and we present inequalities (both upper and lower bounds) between these fractional h- and g-indexes and their corresponding unweighted values (also involving, of course, the coauthorship distribution). Wherever applicable, examples and counterexamples are provided. In a concrete example (the publication citation list of the present author), we make explicit calculations of these fractional h- and g-indexes and show that they are not very different from the unweighted ones.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.10, S.1608-1616

Egghe, L.: Informetric explanation of some Leiden Ranking graphs (2014) 0.00

0.004037926 = product of:
  0.02018963 = sum of:
    0.02018963 = weight(_text_:of in 1236) [ClassicSimilarity], result of:
      0.02018963 = score(doc=1236,freq=10.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.3090647 = fieldWeight in 1236, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=1236)
  0.2 = coord(1/5)

Abstract: The S-shaped functional relation between the mean citation score and the proportion of top 10% publications for the 500 Leiden Ranking universities is explained using results of the shifted Lotka function. Also the concave or convex relation between the proportion of top 100*theta% publications, for different fractions theta, is explained using the obtained new informetric model.
Source: Journal of the Association for Information Science and Technology. 65(2014) no.4, S.737-741

Egghe, L.: ¬The influence of transformations on the h-index and the g-index (2008) 0.00
```
0.0038704101 = product of:
  0.01935205 = sum of:
    0.01935205 = weight(_text_:of in 1881) [ClassicSimilarity], result of:
      0.01935205 = score(doc=1881,freq=12.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.29624295 = fieldWeight in 1881, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1881)
  0.2 = coord(1/5)
```
Abstract

In a previous article, we introduced a general transformation on sources and one on items in an arbitrary information production process (IPP). In this article, we investigate the influence of these transformations on the h-index and on the g-index. General formulae that describe this influence are presented. These are applied to the case that the size-frequency function is Lotkaian (i.e., is a decreasing power function). We further show that the h-index of the transformed IPP belongs to the interval bounded by the two transformations of the h-index of the original IPP, and we also show that this property is not true for the g-index.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.8, S.1304-1312
