Search (34 results, page 1 of 2)

  • Filter: year_i:[2000 TO 2010}
  • Filter: author_ss:"Egghe, L."
  1. Egghe, L.: A universal method of information retrieval evaluation : the "missing" link M and the universal IR surface (2004) 0.02
    Abstract
    The paper shows that the present evaluation methods in information retrieval (basically recall R and precision P and in some cases fallout F) lack universal comparability in the sense that their values depend on the generality of the IR problem. A solution is given by using all "parts" of the database, including the non-relevant documents and also the not-retrieved documents. It turns out that the solution is given by introducing the measure M, being the fraction of the not-retrieved documents that are relevant (hence the "miss" measure). We prove that, independent of the IR problem or of the IR action, the quadruple (P,R,F,M) belongs to a universal IR surface, being the same for all IR activities. This universality is then exploited by defining a new measure for evaluation in IR allowing for unbiased comparisons of all IR results. We also show that using only one, two or even three measures from the set {P,R,F,M} necessarily leads to evaluation measures that are non-universal and hence not capable of comparing different IR situations.
    Date
    14. 8.2004 19:17:22
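    A minimal Python check of the surface relation described above. One algebraic form of such a surface follows directly from the contingency-table definitions of P, R, F and M; the paper's own formulation may differ, and the table entries a, b, c, d below are invented for illustration:

        # a = relevant and retrieved        b = non-relevant and retrieved
        # c = relevant and not retrieved    d = non-relevant and not retrieved
        a, b, c, d = 40, 10, 20, 30

        P = a / (a + b)   # precision
        R = a / (a + c)   # recall
        F = b / (b + d)   # fallout
        M = c / (c + d)   # miss

        # Eliminating a, b, c, d from the four definitions leaves one relation,
        # satisfied by every attainable quadruple (P, R, F, M):
        assert abs((1 - P) * (1 - F) * R * M - (1 - R) * (1 - M) * P * F) < 1e-12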
  2. Egghe, L.: A model for the size-frequency function of coauthor pairs (2008) 0.00
    Abstract
    Lotka's law was formulated to describe the number of authors with a certain number of publications. Empirical results (Morris & Goldstein, 2007) indicate that Lotka's law is also valid if one counts the number of publications of coauthor pairs. This article gives a simple model proving this to be true, with the same Lotka exponent, if the number of coauthored papers is proportional to the number of papers of the individual coauthors. Under the assumption that this number of coauthored papers is more than proportional to the number of papers of the individual authors (to be explained in the article), we can prove that the size-frequency function of coauthor pairs is Lotkaian with an exponent that is higher than that of the Lotka function of individual authors, a fact that is confirmed in experimental results.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.13, S.2133-2137
  3. Egghe, L.: Dynamic h-index : the Hirsch index in function of time (2007) 0.00
    Abstract
    Given a group of articles and a fixed present time, we can determine the unique number h of articles that each received h or more citations, while the remaining articles each received no more than h citations. In this article, the time dependence of the h-index is determined. This is important for describing the expected career evolution of a scientist's work or of a journal's production in a fixed year.
    Source
    Journal of the American Society for Information Science and Technology. 58(2007) no.3, S.452-454
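    A minimal Python sketch of the h-index computation underlying the abstract (the citation counts are invented):

        def h_index(citations):
            """Largest h such that h articles have at least h citations each."""
            ranked = sorted(citations, reverse=True)
            h = 0
            for rank, cites in enumerate(ranked, start=1):
                if cites >= rank:
                    h = rank
            return h

        print(h_index([10, 8, 5, 4, 3, 0]))   # 4: four articles have >= 4 citations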
  4. Egghe, L.: Zipfian and Lotkaian continuous concentration theory (2005) 0.00
    Abstract
    In this article concentration (i.e., inequality) aspects of the functions of Zipf and of Lotka are studied. Since both functions are power laws (i.e., they are mathematically the same) it suffices to develop one concentration theory for power laws and apply it twice for the different interpretations of the laws of Zipf and Lotka. After a brief review of the functional relationships between Zipf's law and Lotka's law, we prove that Price's law of concentration is equivalent to Zipf's law. A major part of this article is devoted to the development of continuous concentration theory, based on Lorenz curves. The Lorenz curve for power functions is calculated and, based on this, so are some important concentration measures such as those of Gini and Theil and the variation coefficient. Using Lorenz curves, it is shown that the concentration of a power law increases with its exponent; this result is interpreted in terms of the functions of Zipf and Lotka.
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.9, S.935-945
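    A short Python illustration of the claim that concentration increases with the exponent, using the Gini index computed from a discrete Lorenz curve of Zipfian rank-frequency data (the exponents and array length are arbitrary):

        def gini(values):
            """Gini index from the discrete Lorenz curve (trapezoidal area)."""
            xs = sorted(values)              # ascending order, as for a Lorenz curve
            n, total = len(xs), sum(xs)
            cum, area = 0.0, 0.0
            for x in xs:
                prev = cum
                cum += x / total
                area += (prev + cum) / (2 * n)
            return 1 - 2 * area

        for beta in (0.5, 1.5):
            print(beta, round(gini([r ** -beta for r in range(1, 101)]), 3))
        # The larger exponent yields the larger Gini value, i.e. a more
        # concentrated (less equal) distribution.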
  5. Egghe, L.: Sampling and concentration values of incomplete bibliographies (2002) 0.00
    Abstract
    This article studies concentration aspects of bibliographies. In particular, we study the impact of the incompleteness of such a bibliography on its concentration values (i.e., its degree of inequality of production of its sources). Incompleteness is modeled by sampling from the complete bibliography. The model is general enough to comprise truncation of a bibliography as well as a systematic sample on sources or items. In all cases we prove that the sampled (incomplete) bibliography has a higher concentration value than the complete one. These models hence shed some light on the measurement of production inequality in incomplete bibliographies.
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.4, S.271-281
  6. Egghe, L.; Rousseau, R.; Rousseau, S.: TOP-curves (2007) 0.00
    Abstract
    Several characteristics of classical Lorenz curves make them unsuitable for the study of a group of top performers. TOP-curves, defined as a kind of mirror image of TIP-curves used in poverty studies, are shown to possess the properties necessary for adequate empirical ranking of various data arrays, based on the properties of the highest performers (i.e., the core). TOP-curves and essential TOP-curves, also introduced in this article, simultaneously represent the incidence, intensity, and inequality among the top. It is shown that the TOP-dominance partial order, introduced in this article, is stronger than the Lorenz dominance order. In this way, this article contributes to the study of cores, a central issue in applied informetrics.
    Source
    Journal of the American Society for Information Science and Technology. 58(2007) no.6, S.777-785
  7. Egghe, L.; Ravichandra Rao, I.K.: The influence of the broadness of a query of a topic on its h-index : models and examples of the h-index of n-grams (2008) 0.00
    Abstract
    The article studies the influence of the query formulation of a topic on its h-index. In order to generate pure random sets of documents, we used N-grams (N variable) to measure this influence: strings of zeros, truncated at the end. The databases used are WoS and Scopus. The formula h = T^(1/alpha), proved in Egghe and Rousseau (2006), where T is the number of retrieved documents and alpha is Lotka's exponent, is confirmed to be a concavely increasing function of T. We also give a formula for the relation between h and N, the length of the N-gram: h = D*10^(-N/alpha), where D is a constant; this is a convexly decreasing function, which is found in our experiments. Nonlinear regression on h = T^(1/alpha) gives an estimate of alpha, which can then be used to estimate the h-index of the entire database (Web of Science [WoS] and Scopus): h = S^(1/alpha), where S is the total number of documents in the database.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.10, S.1688-1693
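    A worked numerical reading of the two formulas in the abstract (alpha, T and D are illustrative values, not the paper's data):

        alpha = 2.0                  # Lotka exponent
        T = 10_000                   # number of retrieved documents
        print(T ** (1 / alpha))      # h = T**(1/alpha) = 100.0, concavely increasing in T

        D = 500.0                    # constant in the N-gram relation
        for N in range(1, 5):        # h = D * 10**(-N/alpha), convexly decreasing in N
            print(N, round(D * 10 ** (-N / alpha), 1))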
  8. Egghe, L.: The measures precision, recall, fallout and miss as a function of the number of retrieved documents and their mutual interrelations (2008) 0.00
    Abstract
    In this paper, for the first time, we present global curves for the measures precision, recall, fallout and miss as a function of the number of retrieved documents. Different curves apply to different retrieval systems, for which we give exact definitions in terms of a retrieval density function: perverse retrieval, perfect retrieval, random retrieval, normal retrieval. We hereby extend results of Buckland and Gey and of Egghe in the following sense: mathematically more advanced methods yield a better insight into these curves, more types of retrieval are considered and, very importantly, the theory is developed for the "complete" set of measures: precision, recall, fallout and miss. Next we study the interrelationships between precision, recall, fallout and miss in these different types of retrieval, hereby again extending results of Buckland and Gey (including a correction) and of Egghe. In the case of normal retrieval we prove that precision as a function of recall, and recall as a function of miss, are concavely decreasing relationships, while recall as a function of fallout is a concavely increasing relationship. We also show, by producing examples, that the relationships between fallout and precision, miss and precision, and miss and fallout are not always convex or concave.
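    A small Python sketch of the four measures as functions of the number t of retrieved documents, for the random-retrieval case named in the abstract; the database size N and generality g are invented, and under random retrieval the expected number of relevant retrieved documents is g*t:

        N = 1_000                 # documents in the database
        g = 0.1                   # generality: fraction of relevant documents
        rel = g * N

        for t in (100, 500, 900):
            a = g * t                       # expected relevant-and-retrieved
            print(t,
                  a / t,                    # precision: stays at g
                  a / rel,                  # recall:    grows as t/N
                  (t - a) / (N - rel),      # fallout:   grows as t/N
                  (rel - a) / (N - t))      # miss:      stays at g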
  9. Egghe, L.; Ravichandra Rao, I.K.: Study of different h-indices for groups of authors (2008) 0.00
    Abstract
    In this article, for any group of authors, we define three different h-indices. First, there is the successive h-index h2, based on the ranked list of authors and their h-indices h1, as defined by Schubert (2007). Next, there is the h-index hP, based on the ranked list of authors and their number of publications. Finally, there is the h-index hC, based on the ranked list of authors and their number of citations. We present formulae for these three indices in Lotkaian informetrics, from which it also follows that h2 < hP < hC. We give a concrete example of a group of 167 authors on the topic of optical flow estimation. Besides these three h-indices, we also calculate the pairwise Spearman rank correlation coefficients and prove that these rankings are significantly related.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.8, S.1276-1281
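    A Python sketch of the three group h-indices defined in the abstract, using one generic h-index routine (the author data are invented):

        def h_index(scores):
            """Largest h such that h of the scores are >= h."""
            ranked = sorted(scores, reverse=True)
            return max([0] + [r for r, s in enumerate(ranked, 1) if s >= r])

        # (h1, publications, citations) per author - invented numbers
        authors = [(12, 40, 900), (9, 35, 700), (7, 20, 300),
                   (5, 15, 120), (3, 8, 40), (2, 5, 10)]

        h2 = h_index([a[0] for a in authors])   # successive h-index over the h1 values
        hP = h_index([a[1] for a in authors])   # over publication counts
        hC = h_index([a[2] for a in authors])   # over citation counts
        print(h2, hP, hC)   # 4 5 6 here, consistent with h2 < hP < hC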
  10. Egghe, L.; Rousseau, R.: A measure for the cohesion of weighted networks (2003) 0.00
    Abstract
    Measurement of the degree of interconnectedness in graph-like networks of hyperlinks or citations can indicate the existence of research fields and assist in comparative evaluation of research efforts. In this issue we begin with Egghe and Rousseau, who review compactness measures and investigate the compactness of a network as a weighted graph with dissimilarity values characterizing the arcs between nodes. They make use of a generalization of the Botafogo, Rivlin, Shneiderman (BRS) compactness measure, which treats the distance between unreachable nodes not as infinity but rather as the number of nodes in the network. The dissimilarity values are determined by summing the reciprocals of the weights of the arcs in the shortest chain between two nodes, where no weight is smaller than one. The BRS measure is then the maximum value for the sum of the dissimilarity measures, less the actual sum, divided by the difference between the maximum and minimum. The Wiener index, the sum of all elements in the dissimilarity matrix divided by two, is then computed for Small's particle physics co-citation data, as well as the BRS measure, the dissimilarity values and the shortest paths. The compactness measure for the weighted network is smaller than for the unweighted one. When the bibliographic coupling network is utilized, it is shown to be less compact than the co-citation network, which indicates that the new measure produces results that conform to an obvious case.
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.3, S.193-202
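    A Python sketch of the generalized BRS compactness measure as this abstract describes it (one reading: arc dissimilarity = reciprocal of the arc weight, node distance = cheapest chain, unreachable pairs count as n, and the total is normalized between its maximum and minimum); the small symmetric example graph is invented:

        n = 4
        INF = float("inf")
        weights = [[0, 2, 0, 0],      # weights[i][j] > 0: weight of arc i -> j
                   [2, 0, 1, 0],      # (all weights >= 1, as the abstract requires)
                   [0, 1, 0, 4],
                   [0, 0, 4, 0]]

        # Pairwise distances via Floyd-Warshall on the reciprocal weights.
        d = [[0 if i == j else (1 / weights[i][j] if weights[i][j] else INF)
              for j in range(n)] for i in range(n)]
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    d[i][j] = min(d[i][j], d[i][k] + d[k][j])

        total = sum(n if d[i][j] == INF else d[i][j]
                    for i in range(n) for j in range(n) if i != j)
        mx = (n * n - n) * n          # all pairs unreachable (distance n each)
        mn = (n * n - n) * 1          # all pairs at the minimal distance 1
        print((mx - total) / (mx - mn))   # compactness, here about 0.99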
  11. Egghe, L.; Ravichandra Rao, I.K.: Duality revisited : construction of fractional frequency distributions based on two dual Lotka laws (2002) 0.00
    Abstract
    Fractional frequency distributions of, for example, authors with a certain (fractional) number of papers are very irregular and, therefore, not easy to model or to explain. This article makes a first attempt at this by assuming two simple Lotka laws (with exponent 2): one for the number of authors with n papers (total count here) and one for the number of papers with n authors, n ∈ N. Based on an earlier convolution model of Egghe, interpreted and reworked now for discrete scores, we are able to produce theoretical fractional frequency distributions with only one parameter, which are in very close agreement with the practical ones found in a large dataset produced earlier by Rao. The article also shows that (irregular) fractional frequency distributions are a consequence of Lotka's law and are not examples of breakdowns of this famous historical law.
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.10, S.789-801
  12. Egghe, L.: Mathematical theory of the h- and g-index in case of fractional counting of authorship (2008) 0.00
    Abstract
    This article studies the h-index (Hirsch index) and the g-index of authors, in case one counts authorship of the cited articles in a fractional way. There are two ways to do this: One counts the citations to these papers in a fractional way or one counts the ranks of the papers in a fractional way as credit for an author. In both cases, we define the fractional h- and g-indexes, and we present inequalities (both upper and lower bounds) between these fractional h- and g-indexes and their corresponding unweighted values (also involving, of course, the coauthorship distribution). Wherever applicable, examples and counterexamples are provided. In a concrete example (the publication citation list of the present author), we make explicit calculations of these fractional h- and g-indexes and show that they are not very different from the unweighted ones.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.10, S.1608-1616
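    A Python sketch of one of the two fractional variants the abstract mentions - dividing each paper's citations by its number of authors before computing h and g (the paper data are invented; the g-index here is restricted to the available papers):

        def h_index(scores):
            ranked = sorted(scores, reverse=True)
            return max([0] + [r for r, s in enumerate(ranked, 1) if s >= r])

        def g_index(scores):
            """Largest g such that the g most-cited papers have >= g**2 citations in all."""
            ranked = sorted(scores, reverse=True)
            cum, g = 0.0, 0
            for r, s in enumerate(ranked, 1):
                cum += s
                if cum >= r * r:
                    g = r
            return g

        papers = [(10, 2), (7, 1), (5, 3), (1, 1), (0, 2), (0, 1)]   # (citations, authors)
        plain = [c for c, _ in papers]
        frac = [c / a for c, a in papers]
        print(h_index(plain), g_index(plain))   # 3 4 (unweighted)
        print(h_index(frac), g_index(frac))     # 2 3 (fractional counts are lower here)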
  13. Egghe, L.: Properties of the n-overlap vector and n-overlap similarity theory (2006) 0.00
    Abstract
    In the first part of this article the author defines the n-overlap vector whose coordinates consist of the fractions of the objects (e.g., books, N-grams, etc.) that belong to 1, 2, ..., n sets (more generally: families) (e.g., libraries, databases, etc.). With the aid of the Lorenz concentration theory, a theory of n-overlap similarity is conceived together with corresponding measures, such as the generalized Jaccard index (generalizing the well-known Jaccard index for the case n = 2). Next, the distributional form of the n-overlap vector is determined, assuming certain distributions of the objects' and of the sets' (families') sizes. In this section the decreasing power law and the decreasing exponential distribution are explained for the n-overlap vector. Both item (token) n-overlap and source (type) n-overlap are studied. The n-overlap properties of objects indexed by a hierarchical system (e.g., books indexed by numbers from a UDC or Dewey system or by N-grams) are presented in the final section. The author shows how the results of the previous section can be applied, as well as how the Lorenz order of the n-overlap vector is respected by an increase or a decrease of the level of refinement in the hierarchical system (e.g., the value N in N-grams).
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.9, S.1165-1177
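    A Python sketch of the n-overlap vector for concrete sets, assuming (one natural reading of the abstract) that coordinate k holds the fraction of all objects belonging to exactly k of the n families; the three families are invented:

        from collections import Counter

        families = [set("abcdef"), set("bcdg"), set("cdefh")]
        n = len(families)
        universe = set().union(*families)

        counts = Counter(sum(obj in fam for fam in families) for obj in universe)
        print([counts[k] / len(universe) for k in range(1, n + 1)])
        # [0.375, 0.375, 0.25]

        # For n = 2 the generalized Jaccard index mentioned above reduces to the
        # classical one:
        A, B = families[0], families[1]
        print(len(A & B) / len(A | B))   # 3/7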
  14. Egghe, L.: ¬The influence of transformations on the h-index and the g-index (2008) 0.00
    Abstract
    In a previous article, we introduced a general transformation on sources and one on items in an arbitrary information production process (IPP). In this article, we investigate the influence of these transformations on the h-index and on the g-index. General formulae that describe this influence are presented. These are applied to the case that the size-frequency function is Lotkaian (i.e., is a decreasing power function). We further show that the h-index of the transformed IPP belongs to the interval bounded by the two transformations of the h-index of the original IPP, and we also show that this property is not true for the g-index.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.8, S.1304-1312
  15. Egghe, L.; Liang, L.; Rousseau, R.: Fundamental properties of rhythm sequences (2008) 0.00
    Abstract
    Fundamental mathematical properties of rhythm sequences are studied. In particular, a set of three axioms for valid rhythm indicators is proposed, and it is shown that one of the indicators considered satisfies only two of the three axioms, while the R-indicator satisfies all three. This fills a critical, logical gap in the study of these indicator sequences. Matrices leading to a constant R-sequence are called baseline matrices. They are characterized as matrices with constant w-year diachronous impact factors. The relation with classical impact factors is clarified. Using regression analysis, matrices whose rhythm sequence is on average equal to 1 (respectively smaller than 1 or larger than 1) are characterized.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.9, S.1469-1478
  16. Egghe, L.: Expansion of the field of informetrics : the second special issue (2006) 0.00
  17. Egghe, L.: Expansion of the field of informetrics : origins and consequences (2005) 0.00
  18. Egghe, L.: Relations between the continuous and the discrete Lotka power function (2005) 0.00
    Abstract
    The discrete Lotka power function describes the number of sources (e.g., authors) with n = 1, 2, 3, ... items (e.g., publications). As in econometrics, informetrics theory requires functions of a continuous variable j, replacing the discrete variable n. Now j represents item densities instead of numbers of items. The continuous Lotka power function describes the density of sources with item density j. The discrete Lotka function is the one obtained empirically, from data; the continuous Lotka function is the one needed when one wants to apply Lotkaian informetrics, i.e., to determine properties that can be derived from the (continuous) model. It is, hence, important to know the relations between the two models. We show that the exponents of the discrete Lotka function (if not too high, i.e., within limits encountered in practice) and of the continuous Lotka function are approximately the same. This is important to know when applying theoretical results (from the continuous model) derived from practical data.
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.7, S.664-668
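    A quick Python check of the claim, assuming a pure continuous Lotka density proportional to j**(-alpha): binning it into a discrete function f(n) and fitting a power law to f(n) by log-log least squares recovers nearly the continuous exponent (alpha = 2 is the classical value; the range of n is arbitrary):

        import math

        alpha = 2.0
        ns = range(1, 101)
        # f(n) = integral from n to n+1 of j**(-alpha) dj (the constant cancels)
        f = [(n ** (1 - alpha) - (n + 1) ** (1 - alpha)) / (alpha - 1) for n in ns]

        X = [math.log(n) for n in ns]
        Y = [math.log(v) for v in f]
        mX, mY = sum(X) / len(X), sum(Y) / len(Y)
        slope = (sum((x - mX) * (y - mY) for x, y in zip(X, Y))
                 / sum((x - mX) ** 2 for x in X))
        print(-slope)   # roughly 1.9: approximately the continuous exponent 2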
  19. Egghe, L.; Leydesdorff, L.: The relation between Pearson's correlation coefficient r and Salton's cosine measure (2009) 0.00
    Abstract
    The relation between Pearson's correlation coefficient and Salton's cosine measure is revealed based on the different possible values of the ratio of the L1-norm to the L2-norm of a vector. These different values yield a sheaf of increasingly straight lines which together form a cloud of points; this cloud is the investigated relation. The theoretical results are tested against the author co-citation relations among 24 informetricians, for whom two matrices can be constructed based on co-citations: the asymmetric occurrence matrix and the symmetric co-citation matrix. Both examples completely confirm the theoretical results. The results enable us to specify an algorithm that provides a threshold value for the cosine above which none of the corresponding Pearson correlations would be negative. Using this threshold value can be expected to optimize the visualization of the vector space.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.5, S.1027-1036
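    A Python sketch making the norm connection concrete: for vectors with nonnegative coordinates, Pearson's r can be rewritten in terms of Salton's cosine and the quantities ||.||_1 / (sqrt(n) * ||.||_2); this rearrangement is standard algebra consistent with the abstract, not a quotation of the paper's formulas:

        import math, random

        def ratio(v):                       # ||v||_1 / (sqrt(n) * ||v||_2), v >= 0
            return sum(v) / (math.sqrt(len(v)) * math.sqrt(sum(x * x for x in v)))

        def cosine(x, y):
            dot = sum(a * b for a, b in zip(x, y))
            return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

        def pearson(x, y):
            n = len(x)
            mx, my = sum(x) / n, sum(y) / n
            cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
            return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                                   * sum((b - my) ** 2 for b in y))

        random.seed(1)
        x = [random.random() for _ in range(24)]    # nonnegative, like co-citation counts
        y = [random.random() for _ in range(24)]
        a, b = ratio(x), ratio(y)
        r = (cosine(x, y) - a * b) / math.sqrt((1 - a * a) * (1 - b * b))
        print(abs(r - pearson(x, y)) < 1e-12)       # True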
  20. Egghe, L.: Type/Token-Taken informetrics (2003) 0.00
    Abstract
    Type/Token-Taken informetrics is a new part of informetrics that studies the use of items rather than the items themselves. Here, items are the objects that are produced by the sources (e.g., journals producing articles, authors producing papers, etc.). In linguistics a source is also called a type (e.g., a word), and an item a token (e.g., the use of words in texts). In informetrics, types that occur often, for example, in a database will also be requested often, for example, in information retrieval. The relative use of these occurrences will be higher than the relative occurrences themselves; hence the name Type/Token-Taken informetrics. This article studies the frequency distribution of Type/Token-Taken informetrics, starting from that of Type/Token informetrics (i.e., source-item relationships). We also study the average number µ* of item uses in Type/Token-Taken informetrics and compare it with the classical average number µ in Type/Token informetrics. We show that µ* >= µ always, and that µ* is an increasing function of µ. A method is presented to actually calculate µ* from µ and a given α, the exponent in Lotka's frequency distribution of Type/Token informetrics. We leave open the problem of developing non-Lotkaian Type/Token-Taken informetrics.
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.7, S.603-610
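    A small Python sketch of the µ* >= µ claim under one concrete reading: if items are requested in proportion to how often their source produced them, µ* is the use-weighted (size-biased) average, and the inequality follows from the variance inequality E[n**2] >= E[n]**2 (the Lotka exponent and cutoff are arbitrary; alpha > 3 keeps both averages finite):

        alpha = 3.5
        f = {n: n ** -alpha for n in range(1, 10_001)}   # Lotka size-frequency

        sources = sum(f.values())                        # number of types
        items = sum(n * w for n, w in f.items())         # number of tokens
        uses = sum(n * n * w for n, w in f.items())      # token-weighted item count

        mu = items / sources        # classical Type/Token average
        mu_star = uses / items      # Type/Token-Taken average
        print(round(mu, 3), round(mu_star, 3))           # mu_star >= mu, always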