Search (59 results, page 1 of 3)

Egghe, L.; Bornmann, L.: Fallout and miss in journal peer review (2013) 0.05

0.04823032 = product of:
  0.09646064 = sum of:
    0.041327372 = weight(_text_:retrieval in 1759) [ClassicSimilarity], result of:
      0.041327372 = score(doc=1759,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33085006 = fieldWeight in 1759, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1759)
    0.029945528 = weight(_text_:use in 1759) [ClassicSimilarity], result of:
      0.029945528 = score(doc=1759,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 1759, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1759)
    0.017463053 = weight(_text_:of in 1759) [ClassicSimilarity], result of:
      0.017463053 = score(doc=1759,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2704316 = fieldWeight in 1759, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1759)
    0.007724685 = product of:
      0.01544937 = sum of:
        0.01544937 = weight(_text_:on in 1759) [ClassicSimilarity], result of:
          0.01544937 = score(doc=1759,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.17010231 = fieldWeight in 1759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1759)
      0.5 = coord(1/2)
  0.5 = coord(4/8)
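
Note: the score breakdown above can be reproduced by hand. The following is a minimal sketch, not the search engine's own code. It assumes that tf is the square root of the term frequency and that idf is 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the tf and idf values printed in the breakdown. The function and variable names are illustrative only.

```python
import math

# Values copied from the explain output for doc 1759.
QUERY_NORM = 0.041294612
MAX_DOCS = 44218
FIELD_NORM = 0.0546875


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf form; it matches the idf values shown in the breakdown.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm=FIELD_NORM, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight, as in each "weight(...)" node.
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


term_scores = [
    term_score(4.0, 5836),          # retrieval
    term_score(2.0, 5623),          # use
    term_score(10.0, 25162),        # of
    0.5 * term_score(2.0, 13325),   # on, inside a nested coord(1/2)
]

# The outer coord(4/8) scales the sum of the matching clauses.
total = sum(term_scores) * (4 / 8)
print(round(total, 6))
```

Small differences in the last digits are expected, because the engine works in single precision while this sketch uses double precision.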

Abstract: Purpose - The authors exploit the analogy between journal peer review and information retrieval in order to quantify some imperfections of journal peer review. Design/methodology/approach - The authors define fallout rate and missing rate in order to describe quantitatively the weak papers that were accepted and the strong papers that were missed, respectively. To assess the quality of manuscripts the authors use bibliometric measures. Findings - Fallout rate and missing rate are put in relation with the hitting rate and success rate. Conclusions are drawn on what fraction of weak papers will be accepted in order to have a certain fraction of strong accepted papers. Originality/value - The paper illustrates that these curves are new in peer review research when interpreted in the information retrieval terminology.
Source: Journal of documentation. 69(2013) no.3, S.411-416

Egghe, L.: ¬A rationale for the Hirsch-index rank-order distribution and a comparison with the impact factor rank-order distribution (2009) 0.04

0.043737486 = product of:
  0.17494994 = sum of:
    0.017463053 = weight(_text_:of in 3124) [ClassicSimilarity], result of:
      0.017463053 = score(doc=3124,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2704316 = fieldWeight in 3124, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3124)
    0.15748689 = sum of:
      0.01544937 = weight(_text_:on in 3124) [ClassicSimilarity], result of:
        0.01544937 = score(doc=3124,freq=2.0), product of:
          0.090823986 = queryWeight, product of:
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.041294612 = queryNorm
          0.17010231 = fieldWeight in 3124, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.0546875 = fieldNorm(doc=3124)
      0.14203751 = weight(_text_:line in 3124) [ClassicSimilarity], result of:
        0.14203751 = score(doc=3124,freq=4.0), product of:
          0.23157367 = queryWeight, product of:
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.041294612 = queryNorm
          0.6133578 = fieldWeight in 3124, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.0546875 = fieldNorm(doc=3124)
  0.25 = coord(2/8)

Abstract: We present a rationale for the Hirsch-index rank-order distribution and prove that it is a power law (hence a straight line in the log-log scale). This is confirmed by experimental data of Pyykkö and by data produced in this article on 206 mathematics journals. This distribution is of a completely different nature than the impact factor (IF) rank-order distribution which (as proved in a previous article) is S-shaped. This is also confirmed by our example. Only in the log-log scale of the h-index distribution do we notice a concave deviation of the straight line for higher ranks. This phenomenon is discussed.
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.10, S.2142-2144

Egghe, L.: ¬A universal method of information retrieval evaluation : the "missing" link M and the universal IR surface (2004) 0.04

0.03888139 = product of:
  0.07776278 = sum of:
    0.035423465 = weight(_text_:retrieval in 2558) [ClassicSimilarity], result of:
      0.035423465 = score(doc=2558,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 2558, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2558)
    0.018933605 = weight(_text_:of in 2558) [ClassicSimilarity], result of:
      0.018933605 = score(doc=2558,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2932045 = fieldWeight in 2558, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2558)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 2558) [ClassicSimilarity], result of:
          0.013242318 = score(doc=2558,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 2558, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2558)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 2558) [ClassicSimilarity], result of:
          0.033569098 = score(doc=2558,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 2558, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2558)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: The paper shows that the present evaluation methods in information retrieval (basically recall R and precision P and in some cases fallout F) lack universal comparability in the sense that their values depend on the generality of the IR problem. A solution is given by using all "parts" of the database, including the non-relevant documents and also the not-retrieved documents. It turns out that the solution is given by introducing the measure M being the fraction of the not-retrieved documents that are relevant (hence the "miss" measure). We prove that - independent of the IR problem or of the IR action - the quadruple (P,R,F,M) belongs to a universal IR surface, being the same for all IR-activities. This universality is then exploited by defining a new measure for evaluation in IR allowing for unbiased comparisons of all IR results. We also show that only using one, two or even three measures from the set {P,R,F,M} necessarily leads to evaluation measures that are non-universal and hence not capable of comparing different IR situations.
Date: 14. 8.2004 19:17:22
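
Note: the four measures named in the abstract above can be computed from a 2x2 table of relevant/non-relevant versus retrieved/not-retrieved documents. The sketch below is only an illustration. M follows the definition given in the abstract, while P, R and F use their standard definitions. The counts are made-up example numbers, and the final identity is one that follows algebraically from the four definitions; whether it is the exact form of the "universal IR surface" used in the paper is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    rel_ret: int        # relevant and retrieved
    nonrel_ret: int     # non-relevant but retrieved
    rel_notret: int     # relevant but not retrieved
    nonrel_notret: int  # non-relevant and not retrieved


def measures(c: Counts):
    precision = c.rel_ret / (c.rel_ret + c.nonrel_ret)
    recall = c.rel_ret / (c.rel_ret + c.rel_notret)
    fallout = c.nonrel_ret / (c.nonrel_ret + c.nonrel_notret)
    # Miss: fraction of the not-retrieved documents that are relevant.
    miss = c.rel_notret / (c.rel_notret + c.nonrel_notret)
    return precision, recall, fallout, miss


# Hypothetical example counts, not taken from any paper.
p, r, f, m = measures(Counts(rel_ret=30, nonrel_ret=20,
                             rel_notret=10, nonrel_notret=940))

# Each odds ratio reduces to a quotient of two cells, so the product is 1.
surface = (p / (1 - p)) * ((1 - r) / r) * (f / (1 - f)) * ((1 - m) / m)
print(p, r, f, m, surface)
```

The identity holds for any table with non-zero cells, which illustrates why the four values cannot vary independently of one another.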

Egghe, L.; Rousseau, R.: Topological aspects of information retrieval (1998) 0.03

0.03394952 = product of:
  0.09053206 = sum of:
    0.06534432 = weight(_text_:retrieval in 2157) [ClassicSimilarity], result of:
      0.06534432 = score(doc=2157,freq=10.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.5231199 = fieldWeight in 2157, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2157)
    0.017463053 = weight(_text_:of in 2157) [ClassicSimilarity], result of:
      0.017463053 = score(doc=2157,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2704316 = fieldWeight in 2157, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2157)
    0.007724685 = product of:
      0.01544937 = sum of:
        0.01544937 = weight(_text_:on in 2157) [ClassicSimilarity], result of:
          0.01544937 = score(doc=2157,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.17010231 = fieldWeight in 2157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2157)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Let (DS, QS, sim) be a retrieval system consisting of a document space DS, a query space QS, and a function sim, expressing the similarity between a document and a query. Following D.M. Everett and S.C. Cater (1992), we introduce topologies on the document space. These topologies are generated by the similarity function sim and the query space QS. 3 topologies will be studied: the retrieval topology, the similarity topology and the (pseudo-)metric one. It is shown that the retrieval topology is the coarsest of the three, while the (pseudo-)metric is the strongest. These 3 topologies are generally different, reflecting distinct topological aspects of information retrieval. We present necessary and sufficient conditions for these topological aspects to be equal.
Source: Journal of the American Society for Information Science. 49(1998) no.13, S.1144-1160

Egghe, L.: Type/Token-Taken informetrics (2003) 0.03

0.028658554 = product of:
  0.07642281 = sum of:
    0.020873476 = weight(_text_:retrieval in 1608) [ClassicSimilarity], result of:
      0.020873476 = score(doc=1608,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 1608, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1608)
    0.037047986 = weight(_text_:use in 1608) [ClassicSimilarity], result of:
      0.037047986 = score(doc=1608,freq=6.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.29299045 = fieldWeight in 1608, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1608)
    0.01850135 = weight(_text_:of in 1608) [ClassicSimilarity], result of:
      0.01850135 = score(doc=1608,freq=22.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.28651062 = fieldWeight in 1608, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1608)
  0.375 = coord(3/8)

Abstract: Type/Token-Taken informetrics is a new part of informetrics that studies the use of items rather than the items themselves. Here, items are the objects that are produced by the sources (e.g., journals producing articles, authors producing papers, etc.). In linguistics a source is also called a type (e.g., a word), and an item a token (e.g., the use of words in texts). In informetrics, types that occur often, for example, in a database will also be requested often, for example, in information retrieval. The relative use of these occurrences will be higher than their relative occurrences themselves; hence, the name Type/Token-Taken informetrics. This article studies the frequency distribution of Type/Token-Taken informetrics, starting from the one of Type/Token informetrics (i.e., source-item relationships). We also study the average number my* of item uses in Type/Token-Taken informetrics and compare it with the classical average number my in Type/Token informetrics. We show that my* >= my always, and that my* is an increasing function of my. A method is presented to actually calculate my* from my and a given a, which is the exponent in Lotka's frequency distribution of Type/Token informetrics. We leave open the problem of developing non-Lotkaian Type/Token-Taken informetrics.
Source: Journal of the American Society for Information Science and Technology. 54(2003) no.7, S.603-610

Egghe, L.: Theory of the topical coverage of multiple databases (2013) 0.02

0.024112135 = product of:
  0.064299025 = sum of:
    0.029945528 = weight(_text_:use in 526) [ClassicSimilarity], result of:
      0.029945528 = score(doc=526,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 526, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=526)
    0.02342914 = weight(_text_:of in 526) [ClassicSimilarity], result of:
      0.02342914 = score(doc=526,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.36282203 = fieldWeight in 526, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=526)
    0.010924355 = product of:
      0.02184871 = sum of:
        0.02184871 = weight(_text_:on in 526) [ClassicSimilarity], result of:
          0.02184871 = score(doc=526,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24056101 = fieldWeight in 526, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=526)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: We present a model that describes which fraction of the literature on a certain topic we will find when we use n (n = 1, 2, ...) databases. It is a generalization of the theory of discovering usability problems. We prove that, in all practical cases, this fraction is a concave function of n, the number of used databases, thereby explaining some graphs that exist in the literature. We also study limiting features of this fraction for n very high and we characterize the case that we find all literature on a certain topic for n high enough.
Source: Journal of the American Society for Information Science and Technology. 64(2013) no.1, S.126-131

Egghe, L.: ¬The measures precision, recall, fallout and miss as a function of the number of retrieved documents and their mutual interrelations (2008) 0.02

0.020338144 = product of:
  0.08135258 = sum of:
    0.059039105 = weight(_text_:retrieval in 2067) [ClassicSimilarity], result of:
      0.059039105 = score(doc=2067,freq=16.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.47264296 = fieldWeight in 2067, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2067)
    0.02231347 = weight(_text_:of in 2067) [ClassicSimilarity], result of:
      0.02231347 = score(doc=2067,freq=32.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34554482 = fieldWeight in 2067, product of:
          5.656854 = tf(freq=32.0), with freq of:
            32.0 = termFreq=32.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2067)
  0.25 = coord(2/8)

Abstract: In this paper, for the first time, we present global curves for the measures precision, recall, fallout and miss in function of the number of retrieved documents. Different curves apply for different retrieval systems, for which we give exact definitions in terms of a retrieval density function: perverse retrieval, perfect retrieval, random retrieval, normal retrieval, hereby extending results of Buckland and Gey and of Egghe in the following sense: mathematically more advanced methods yield a better insight into these curves, more types of retrieval are considered and, very importantly, the theory is developed for the "complete" set of measures: precision, recall, fallout and miss. Next we study the interrelationships between precision, recall, fallout and miss in these different types of retrieval, hereby again extending results of Buckland and Gey (incl. a correction) and of Egghe. In the case of normal retrieval we prove that precision in function of recall and recall in function of miss is a concavely decreasing relationship while recall in function of fallout is a concavely increasing relationship. We also show, by producing examples, that the relationships between fallout and precision, miss and precision and miss and fallout are not always convex or concave.

Egghe, L.; Rousseau, R.: ¬A theoretical study of recall and precision using a topological approach to information retrieval (1998) 0.02

0.019854382 = product of:
  0.07941753 = sum of:
    0.066795126 = weight(_text_:retrieval in 3267) [ClassicSimilarity], result of:
      0.066795126 = score(doc=3267,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.5347345 = fieldWeight in 3267, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=3267)
    0.012622404 = weight(_text_:of in 3267) [ClassicSimilarity], result of:
      0.012622404 = score(doc=3267,freq=4.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.19546966 = fieldWeight in 3267, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=3267)
  0.25 = coord(2/8)

Abstract: Topologies for information retrieval systems are generated by certain subsets, called retrievals. Shows how recall and precision can be expressed using only retrievals. Investigates different types of retrieval systems: both threshold systems and close match systems and both optimal and non-optimal retrieval. Highlights the relation with the hypergeometric and some non-standard distributions.

Egghe, L.; Guns, R.; Rousseau, R.; Leuven, K.U.: Erratum (2012) 0.02

0.018812343 = product of:
  0.05016625 = sum of:
    0.011156735 = weight(_text_:of in 4992) [ClassicSimilarity], result of:
      0.011156735 = score(doc=4992,freq=2.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.17277241 = fieldWeight in 4992, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=4992)
    0.0110352645 = product of:
      0.022070529 = sum of:
        0.022070529 = weight(_text_:on in 4992) [ClassicSimilarity], result of:
          0.022070529 = score(doc=4992,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24300331 = fieldWeight in 4992, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.078125 = fieldNorm(doc=4992)
      0.5 = coord(1/2)
    0.02797425 = product of:
      0.0559485 = sum of:
        0.0559485 = weight(_text_:22 in 4992) [ClassicSimilarity], result of:
          0.0559485 = score(doc=4992,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.38690117 = fieldWeight in 4992, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4992)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Date: 14. 2.2012 12:53:22
Footnote: This article corrects: Thoughts on uncitedness: Nobel laureates and Fields medalists as case studies in: JASIST 62(2011) no.8, S.1637-1644.
Source: Journal of the American Society for Information Science and Technology. 63(2012) no.2, S.429

Egghe, L.; Rousseau, R.: Averaging and globalising quotients of informetric and scientometric data (1996) 0.02

0.018499356 = product of:
  0.049331617 = sum of:
    0.02592591 = weight(_text_:of in 7659) [ClassicSimilarity], result of:
      0.02592591 = score(doc=7659,freq=30.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.4014868 = fieldWeight in 7659, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=7659)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 7659) [ClassicSimilarity], result of:
          0.013242318 = score(doc=7659,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 7659, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=7659)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 7659) [ClassicSimilarity], result of:
          0.033569098 = score(doc=7659,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 7659, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=7659)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: It is possible, using ISI's Journal Citation Reports (JCR), to calculate average impact factors (AIF) for JCR's subject categories but it can be more useful to know the global Impact Factor (GIF) of a subject category and compare the 2 values. Reports results of a study to compare the relationships between AIFs and GIFs of subjects, based on the particular case of the average impact factor of a subfield versus the impact factor of this subfield as a whole, the difference being studied between an average of quotients, denoted as AQ, and a global average, obtained as a quotient of averages, and denoted as GQ. In the case of impact factors, AQ becomes the average impact factor of a field, and GQ becomes its global impact factor. Discusses a number of applications of this technique in the context of informetrics and scientometrics.
Source: Journal of information science. 22(1996) no.3, S.165-170
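
Note: the distinction drawn in the abstract above, between an average of quotients (AQ) and a quotient of averages (GQ), can be seen in a small numerical sketch. The two journals and their citation and publication counts below are invented for illustration and are not taken from the article.

```python
# Hypothetical subfield with two journals (invented numbers).
citations = [10, 200]
publications = [10, 50]

# AQ: average of the per-journal quotients (an "average impact factor").
quotients = [c / p for c, p in zip(citations, publications)]
aq = sum(quotients) / len(quotients)

# GQ: quotient of the averages, equal to total citations / total publications
# (a "global impact factor" of the subfield as a whole).
mean_citations = sum(citations) / len(citations)
mean_publications = sum(publications) / len(publications)
gq = mean_citations / mean_publications

print(aq, gq)
```

Here the larger journal dominates GQ, while AQ weights both journals equally, so the two values differ.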

Egghe, L.: Untangling Herdan's law and Heaps' law : mathematical and informetric arguments (2007) 0.02

0.016511794 = product of:
  0.044031452 = sum of:
    0.020873476 = weight(_text_:retrieval in 271) [ClassicSimilarity], result of:
      0.020873476 = score(doc=271,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 271, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=271)
    0.017640345 = weight(_text_:of in 271) [ClassicSimilarity], result of:
      0.017640345 = score(doc=271,freq=20.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.27317715 = fieldWeight in 271, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=271)
    0.0055176322 = product of:
      0.0110352645 = sum of:
        0.0110352645 = weight(_text_:on in 271) [ClassicSimilarity], result of:
          0.0110352645 = score(doc=271,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.121501654 = fieldWeight in 271, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=271)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Herdan's law in linguistics and Heaps' law in information retrieval are different formulations of the same phenomenon. Stated briefly and in linguistic terms they state that vocabularies' sizes are concave increasing power laws of texts' sizes. This study investigates these laws from a purely mathematical and informetric point of view. A general informetric argument shows that the problem of proving these laws is, in fact, ill-posed. Using the more general terminology of sources and items, the author shows by presenting exact formulas from Lotkaian informetrics that the total number T of sources is not only a function of the total number A of items, but is also a function of several parameters (e.g., the parameters occurring in Lotka's law). Consequently, it is shown that a fixed T (or A) value can lead to different possible A (respectively, T) values. Limiting the T(A)-variability to increasing samples (e.g., in a text as done in linguistics) the author then shows, in a purely mathematical way, that for large sample sizes T ~ A**phi, where phi is a constant, phi < 1 but close to 1, hence roughly, Heaps' or Herdan's law can be proved without using any linguistic or informetric argument. The author also shows that for smaller samples, phi is not a constant but essentially decreases as confirmed by practical examples. Finally, an exact informetric argument on random sampling in the items shows that, in most cases, T = T(A) is a concavely increasing function, in accordance with practical examples.
Source: Journal of the American Society for Information Science and Technology. 58(2007) no.5, S.702-709

Egghe, L.: Vector retrieval, fuzzy retrieval and the universal fuzzy IR surface for IR evaluation (2004) 0.02

0.015415024 = product of:
  0.061660096 = sum of:
    0.050615493 = weight(_text_:retrieval in 2531) [ClassicSimilarity], result of:
      0.050615493 = score(doc=2531,freq=6.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.40520695 = fieldWeight in 2531, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2531)
    0.011044604 = weight(_text_:of in 2531) [ClassicSimilarity], result of:
      0.011044604 = score(doc=2531,freq=4.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.17103596 = fieldWeight in 2531, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2531)
  0.25 = coord(2/8)

Abstract: It is shown that vector information retrieval (IR) and general fuzzy IR use two types of fuzzy set operations: the original "Zadeh min-max operations" and the so-called "probabilistic sum and algebraic product operations". The universal IR surface, valid for classical 0-1 IR (i.e. where ordinary sets are used) and used in IR evaluation, is extended to and reproved for vector IR, using the probabilistic sum and algebraic product model. We also show (by counterexample) that using the "Zadeh min-max" fuzzy model yields a breakdown of this IR surface.

Egghe, L.; Rousseau, R.: Duality in information retrieval and the hypergeometric distribution (1997) 0.01

0.014039168 = product of:
  0.056156673 = sum of:
    0.047231287 = weight(_text_:retrieval in 647) [ClassicSimilarity], result of:
      0.047231287 = score(doc=647,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.37811437 = fieldWeight in 647, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=647)
    0.008925388 = weight(_text_:of in 647) [ClassicSimilarity], result of:
      0.008925388 = score(doc=647,freq=2.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.13821793 = fieldWeight in 647, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=647)
  0.25 = coord(2/8)

Abstract: Asserts that duality is an important topic in informetrics, especially in connection with the classical informetric laws. Yet this concept is less studied in information retrieval. It deals with the unification or symmetry between queries and documents, search formulation versus indexing, and relevant versus retrieved documents. Elaborates these ideas and highlights the connection with the hypergeometric distribution
Source: Journal of documentation. 53(1997) no.5, S.499-496

Egghe, L.: A new short proof of Naranan's theorem, explaining Lotka's law and Zipf's law (2010) 0.01

0.012268836 = product of:
  0.049075343 = sum of:
    0.029945528 = weight(_text_:use in 3432) [ClassicSimilarity], result of:
      0.029945528 = score(doc=3432,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 3432, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3432)
    0.019129815 = weight(_text_:of in 3432) [ClassicSimilarity], result of:
      0.019129815 = score(doc=3432,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.29624295 = fieldWeight in 3432, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3432)
  0.25 = coord(2/8)

Abstract: Naranan's important theorem, published in Nature in 1970, states that if the number of journals grows exponentially and if the number of articles in each journal grows exponentially (at the same rate for each journal), then the system satisfies Lotka's law, and a formula for Lotka's exponent is given as a function of the growth rates of the journals and the articles. This brief communication re-proves this result by showing that the system satisfies Zipf's law, which is equivalent to Lotka's law. The proof is short and algebraic and does not use infinitesimal arguments.
Source: Journal of the American Society for Information Science and Technology. 61(2010) no.12, S.2581-2583

Egghe, L.; Guns, R.: Applications of the generalized law of Benford to informetric data (2012) 0.01

0.011967305 = product of:
  0.04786922 = sum of:
    0.025667597 = weight(_text_:use in 376) [ClassicSimilarity], result of:
      0.025667597 = score(doc=376,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 376, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=376)
    0.022201622 = weight(_text_:of in 376) [ClassicSimilarity], result of:
      0.022201622 = score(doc=376,freq=22.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34381276 = fieldWeight in 376, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=376)
  0.25 = coord(2/8)

Abstract: In a previous work (Egghe, 2011), the first author showed that Benford's law (describing the logarithmic distribution of the numbers 1, 2, ..., 9 as first digits of data in decimal form) is related to the classical law of Zipf with exponent 1. The work of Campanario and Coslado (2011), however, shows that Benford's law does not always fit practical data in a statistical sense. In this article, we use a generalization of Benford's law related to the general law of Zipf with exponent β > 0. Using data from Campanario and Coslado, we apply nonlinear least squares to determine the optimal β and show that this generalized law of Benford fits the data better than the classical law of Benford.
Source: Journal of the American Society for Information Science and Technology. 63(2012) no.8, S.1662-1665

Egghe, L.: Mathematical theories of citation (1998) 0.01

0.011108146 = product of:
  0.044432584 = sum of:
    0.026776161 = weight(_text_:of in 5125) [ClassicSimilarity], result of:
      0.026776161 = score(doc=5125,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.41465375 = fieldWeight in 5125, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=5125)
    0.017656423 = product of:
      0.035312846 = sum of:
        0.035312846 = weight(_text_:on in 5125) [ClassicSimilarity], result of:
          0.035312846 = score(doc=5125,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.3888053 = fieldWeight in 5125, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=5125)
      0.5 = coord(1/2)
  0.25 = coord(2/8)

Abstract: Focuses on possible mathematical theories of citation and on the intrinsic problems related to it. Sheds light on aspects of mathematical complexity as encountered in, for example, fractal theory and Mandelbrot's law. Also discusses dynamical aspects of citation theory as reflected in evolutions of journal rankings, centres of gravity or of the set of source journals. Makes some comments in this connection on growth and obsolescence
Footnote: Contribution to a thematic issue devoted to 'Theories of citation?'

Egghe, L.; Rousseau, R.: A measure for the cohesion of weighted networks (2003) 0.01
```
0.010748647 = product of:
  0.04299459 = sum of:
    0.021389665 = weight(_text_:use in 5157) [ClassicSimilarity], result of:
      0.021389665 = score(doc=5157,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 5157, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5157)
    0.021604925 = weight(_text_:of in 5157) [ClassicSimilarity], result of:
      0.021604925 = score(doc=5157,freq=30.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.33457235 = fieldWeight in 5157, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5157)
  0.25 = coord(2/8)
```

Abstract: Measurement of the degree of interconnectedness in graph-like networks of hyperlinks or citations can indicate the existence of research fields and assist in comparative evaluation of research efforts. In this issue we begin with Egghe and Rousseau, who review compactness measures and investigate the compactness of a network as a weighted graph with dissimilarity values characterizing the arcs between nodes. They make use of a generalization of the Botafogo, Rivlin, Shneiderman (BRS) compactness measure, which treats the distance between unreachable nodes not as infinity but rather as the number of nodes in the network. The dissimilarity values are determined by summing the reciprocals of the weights of the arcs in the shortest chain between two nodes, where no weight is smaller than one. The BRS measure is then the maximum value for the sum of the dissimilarity measures less the actual sum, divided by the difference between the maximum and minimum. The Wiener index, the sum of all elements in the dissimilarity matrix divided by two, is then computed for Small's particle physics co-citation data, as well as the BRS measure, the dissimilarity values and shortest paths. The compactness measure for the weighted network is smaller than for the unweighted one. When the bibliographic coupling network is utilized, it is shown to be less compact than the co-citation network, which indicates that the new measure produces results that conform to an obvious case.
Source: Journal of the American Society for Information Science and Technology. 54(2003) no.3, S.193-202

Egghe, L.: Existence theorem of the quadruple (P, R, F, M) : precision, recall, fallout and miss (2007) 0.01

0.010004126 = product of:
  0.040016502 = sum of:
    0.025048172 = weight(_text_:retrieval in 2011) [ClassicSimilarity], result of:
      0.025048172 = score(doc=2011,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.20052543 = fieldWeight in 2011, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2011)
    0.014968331 = weight(_text_:of in 2011) [ClassicSimilarity], result of:
      0.014968331 = score(doc=2011,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 2011, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2011)
  0.25 = coord(2/8)

Abstract: In an earlier paper [Egghe, L. (2004). A universal method of information retrieval evaluation: the "missing" link M and the universal IR surface. Information Processing and Management, 40, 21-30] we showed that, given an IR system, if P denotes precision, R recall, F fallout and M miss (re-introduced in the paper mentioned above), we have the following relationship between P, R, F and M: P/(1-P)*(1-R)/R*F/(1-F)*(1-M)/M = 1. In this paper we prove the (more difficult) converse: given any four rational numbers in the interval ]0, 1[ satisfying the above equation, there exists an IR system such that these four numbers (in any order) are the precision, recall, fallout and miss of this IR system. As a consequence we show that any three rational numbers in ]0, 1[ represent any three measures taken from precision, recall, fallout and miss of a certain IR system. We also show that this result holds for two numbers instead of three.
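
The relationship P/(1-P)*(1-R)/R*F/(1-F)*(1-M)/M = 1 quoted in this abstract can be checked on a small example. The sketch below assumes the usual 2x2 contingency-table definitions of the four measures, which the abstract does not spell out; the counts a, b, c, d and the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def prfm(a: int, b: int, c: int, d: int):
    """Return (P, R, F, M) from contingency counts (assumed definitions).

    a: retrieved and relevant        b: retrieved and non-relevant
    c: not retrieved and relevant    d: not retrieved and non-relevant
    """
    precision = Fraction(a, a + b)  # share of retrieved docs that are relevant
    recall = Fraction(a, a + c)     # share of relevant docs that are retrieved
    fallout = Fraction(b, b + d)    # share of non-relevant docs that are retrieved
    miss = Fraction(c, c + d)       # share of non-retrieved docs that are relevant
    return precision, recall, fallout, miss


def odds_product(p, r, f, m):
    """Left-hand side of the identity from the abstract."""
    return (p / (1 - p)) * ((1 - r) / r) * (f / (1 - f)) * ((1 - m) / m)


# Hypothetical system: 1000 documents, 50 retrieved, 40 relevant.
P, R, F, M = prfm(a=30, b=20, c=10, d=940)
print(P, R, F, M, odds_product(P, R, F, M))
```

Exact rational arithmetic is used so that the product comes out as exactly 1 rather than a floating-point approximation. Under these definitions each factor reduces to a ratio of two counts (a/b, c/a, b/d, d/c), which is why the product cancels.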

Egghe, L.: Sampling and concentration values of incomplete bibliographies (2002) 0.01

0.009519008 = product of:
  0.038076032 = sum of:
    0.024696484 = weight(_text_:of in 450) [ClassicSimilarity], result of:
      0.024696484 = score(doc=450,freq=20.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.38244802 = fieldWeight in 450, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=450)
    0.013379549 = product of:
      0.026759097 = sum of:
        0.026759097 = weight(_text_:on in 450) [ClassicSimilarity], result of:
          0.026759097 = score(doc=450,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.29462588 = fieldWeight in 450, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=450)
      0.5 = coord(1/2)
  0.25 = coord(2/8)

Abstract: This article studies concentration aspects of bibliographies. More particularly, we study the impact of incompleteness of such a bibliography on its concentration values (i.e., its degree of inequality of production of its sources). Incompleteness is modeled by sampling in the complete bibliography. The model is general enough to comprise truncation of a bibliography as well as a systematic sample on sources or items. In all cases we prove that the sampled bibliography (or incomplete one) has a higher concentration value than the complete one. These models, hence, shed some light on the measurement of production inequality in incomplete bibliographies.
Source: Journal of the American Society for Information Science and Technology. 53(2002) no.4, S.271-281

Egghe, L.: Mathematical study of h-index sequences (2009) 0.01
```
0.009291917 = product of:
  0.03716767 = sum of:
    0.021389665 = weight(_text_:use in 4217) [ClassicSimilarity], result of:
      0.021389665 = score(doc=4217,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 4217, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4217)
    0.015778005 = weight(_text_:of in 4217) [ClassicSimilarity], result of:
      0.015778005 = score(doc=4217,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.24433708 = fieldWeight in 4217, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4217)
  0.25 = coord(2/8)
```

Abstract: This paper studies mathematical properties of h-index sequences as developed by Liang [Liang, L. (2006). h-Index sequence and h-index matrix: Constructions and applications. Scientometrics, 69(1), 153-159]. For practical reasons, Liang studies such sequences where the time goes backwards, while it is more logical to use the time going forward (real career periods). Both types of h-index sequences are studied here and their interrelations are revealed. We show cases where these sequences are convex, linear and concave. We also show that when one of the sequences is convex, the other one is concave, showing that the reverse-time sequence, in general, cannot be used to derive similar properties of the (difficult to obtain) forward-time sequence. We show that both sequences are the same if and only if the author produces the same number of papers per year. If the author produces an increasing number of papers per year, then Liang's h-sequences are above the "normal" ones. All these results are also valid for g- and R-sequences. The results are confirmed by the h-, g- and R-sequences (forward and reverse time) of the author.
