Search (4 results, page 1 of 1)

  • author_ss:"Egghe, L."
  • theme_ss:"Informetrie"
  1. Egghe, L.; Rousseau, R.: Duality in information retrieval and the hypergeometric distribution (1997) 0.01
    0.0064842156 = product of:
      0.019452646 = sum of:
        0.019452646 = product of:
          0.058357935 = sum of:
            0.058357935 = weight(_text_:retrieval in 647) [ClassicSimilarity], result of:
              0.058357935 = score(doc=647,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.37811437 = fieldWeight in 647, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=647)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Asserts that duality is an important topic in informetrics, especially in connection with the classical informetric laws. Yet this concept is less studied in information retrieval. It deals with the unification or symmetry between queries and documents, search formulation versus indexing, and relevant versus retrieved documents. Elaborates these ideas and highlights the connection with the hypergeometric distribution.
  2. Egghe, L.; Rousseau, R.: Averaging and globalising quotients of informetric and scientometric data (1996) 0.00
    0.0046085827 = product of:
      0.013825747 = sum of:
        0.013825747 = product of:
          0.04147724 = sum of:
            0.04147724 = weight(_text_:22 in 7659) [ClassicSimilarity], result of:
              0.04147724 = score(doc=7659,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.23214069 = fieldWeight in 7659, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7659)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Journal of information science. 22(1996) no.3, pp. 165-170
  3. Egghe, L.: Type/Token-Taken informetrics (2003) 0.00
    0.0028656456 = product of:
      0.008596936 = sum of:
        0.008596936 = product of:
          0.025790809 = sum of:
            0.025790809 = weight(_text_:retrieval in 1608) [ClassicSimilarity], result of:
              0.025790809 = score(doc=1608,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.16710453 = fieldWeight in 1608, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1608)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Type/Token-Taken informetrics is a new part of informetrics that studies the use of items rather than the items themselves. Here, items are the objects that are produced by the sources (e.g., journals producing articles, authors producing papers, etc.). In linguistics a source is also called a type (e.g., a word), and an item a token (e.g., the use of words in texts). In informetrics, types that occur often, for example, in a database will also be requested often, for example, in information retrieval. The relative use of these occurrences will be higher than their relative occurrences themselves; hence, the name Type/Token-Taken informetrics. This article studies the frequency distribution of Type/Token-Taken informetrics, starting from that of Type/Token informetrics (i.e., source-item relationships). We also study the average number μ* of item uses in Type/Token-Taken informetrics and compare it with the classical average number μ in Type/Token informetrics. We show that μ* >= μ always, and that μ* is an increasing function of μ. A method is presented to actually calculate μ* from μ and a given α, which is the exponent in Lotka's frequency distribution of Type/Token informetrics. We leave open the problem of developing non-Lotkaian Type/Token-Taken informetrics.
  4. Egghe, L.: Untangling Herdan's law and Heaps' law : mathematical and informetric arguments (2007) 0.00
    0.0028656456 = product of:
      0.008596936 = sum of:
        0.008596936 = product of:
          0.025790809 = sum of:
            0.025790809 = weight(_text_:retrieval in 271) [ClassicSimilarity], result of:
              0.025790809 = score(doc=271,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.16710453 = fieldWeight in 271, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=271)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Herdan's law in linguistics and Heaps' law in information retrieval are different formulations of the same phenomenon. Stated briefly and in linguistic terms, they assert that vocabulary sizes are concave increasing power laws of text sizes. This study investigates these laws from a purely mathematical and informetric point of view. A general informetric argument shows that the problem of proving these laws is, in fact, ill-posed. Using the more general terminology of sources and items, the author shows by presenting exact formulas from Lotkaian informetrics that the total number T of sources is not only a function of the total number A of items, but is also a function of several parameters (e.g., the parameters occurring in Lotka's law). Consequently, it is shown that a fixed T (or A) value can lead to different possible A (respectively, T) values. Limiting the T(A)-variability to increasing samples (e.g., in a text as done in linguistics), the author then shows, in a purely mathematical way, that for large sample sizes T ~ A**phi, where phi is a constant, phi < 1 but close to 1; hence, roughly, Heaps' or Herdan's law can be proved without using any linguistic or informetric argument. The author also shows that for smaller samples, phi is not a constant but essentially decreases, as confirmed by practical examples. Finally, an exact informetric argument on random sampling in the items shows that, in most cases, T = T(A) is a concavely increasing function, in accordance with practical examples.
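
The score breakdowns listed under each result can be re-derived by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names `idf` and `classic_score` are hypothetical helpers introduced only for illustration. The queryNorm, fieldNorm, and coord values are copied from the listing rather than recomputed.

```python
import math


def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    # Hypothetical helper: rebuilds one term's contribution as shown in the explain tree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # "queryWeight" line
    field_weight = math.sqrt(freq) * term_idf * field_norm  # "fieldWeight" line
    score = query_weight * field_weight                     # "weight(...)" line
    for c in coords:
        score *= c                                          # each "coord(1/3)" factor
    return score


# Result 1 (doc=647): term "retrieval", freq=4.0, docFreq=5836, maxDocs=44218.
score_647 = classic_score(4.0, 5836, 44218, 0.051022716, 0.0625, coords=(1 / 3, 1 / 3))

# Result 2 (doc=7659): term "22", freq=2.0, docFreq=3622, maxDocs=44218.
score_7659 = classic_score(2.0, 3622, 44218, 0.051022716, 0.046875, coords=(1 / 3, 1 / 3))

print(score_647, score_7659)
```

Up to single-precision rounding in the listing, the two values should agree with the top-level scores shown for results 1 and 2. Results 3 and 4 share the same inputs (freq=2.0, fieldNorm=0.0390625), which is why their scores are identical.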