Search (101 results, page 1 of 6)

  • theme_ss:"Automatisches Indexieren"
  1. Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.12
    0.11828627 = product of:
      0.23657253 = sum of:
        0.23657253 = sum of:
          0.1137505 = weight(_text_:y in 1952) [ClassicSimilarity], result of:
            0.1137505 = score(doc=1952,freq=2.0), product of:
              0.21393733 = queryWeight, product of:
                4.8124003 = idf(docFreq=976, maxDocs=44218)
                0.04445543 = queryNorm
              0.53170013 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8124003 = idf(docFreq=976, maxDocs=44218)
                0.078125 = fieldNorm(doc=1952)
          0.06259105 = weight(_text_:k in 1952) [ClassicSimilarity], result of:
            0.06259105 = score(doc=1952,freq=2.0), product of:
              0.15869603 = queryWeight, product of:
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.04445543 = queryNorm
              0.39440846 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.078125 = fieldNorm(doc=1952)
          0.060230974 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
            0.060230974 = score(doc=1952,freq=2.0), product of:
              0.15567535 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04445543 = queryNorm
              0.38690117 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
    
    Date
    16. 8.1998 12:51:22
    Footnote
    Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp. 513-517.
    Source
    Proceedings of the 11th annual conference on research and development in information retrieval. Ed.: Y. Chiaramella
  2. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.05
    0.0495827 = product of:
      0.0991654 = sum of:
        0.0991654 = product of:
          0.1487481 = sum of:
            0.08851712 = weight(_text_:k in 4157) [ClassicSimilarity], result of:
              0.08851712 = score(doc=4157,freq=4.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.5577778 = fieldWeight in 4157, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
            0.060230974 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.060230974 = score(doc=4157,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Ed. by Marlies Ockenfeld and Gerhard J. Mantwill
  3. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.03
    0.032752544 = product of:
      0.06550509 = sum of:
        0.06550509 = product of:
          0.09825762 = sum of:
            0.050072845 = weight(_text_:k in 3581) [ClassicSimilarity], result of:
              0.050072845 = score(doc=3581,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.31552678 = fieldWeight in 3581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3581)
            0.04818478 = weight(_text_:22 in 3581) [ClassicSimilarity], result of:
              0.04818478 = score(doc=3581,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.30952093 = fieldWeight in 3581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3581)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Date
    24. 3.2006 12:22:02
  4. Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.02
    0.02351221 = product of:
      0.04702442 = sum of:
        0.04702442 = product of:
          0.07053663 = sum of:
            0.045500204 = weight(_text_:y in 1264) [ClassicSimilarity], result of:
              0.045500204 = score(doc=1264,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.21268006 = fieldWeight in 1264, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1264)
            0.025036423 = weight(_text_:k in 1264) [ClassicSimilarity], result of:
              0.025036423 = score(doc=1264,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.15776339 = fieldWeight in 1264, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1264)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
  5. Wan, T.-L.; Evens, M.; Wan, Y.-W.; Pao, Y.-Y.: Experiments with automatic indexing and a relational thesaurus in a Chinese information retrieval system (1997) 0.02
    0.02298586 = product of:
      0.04597172 = sum of:
        0.04597172 = product of:
          0.13791516 = sum of:
            0.13791516 = weight(_text_:y in 956) [ClassicSimilarity], result of:
              0.13791516 = score(doc=956,freq=6.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.6446522 = fieldWeight in 956, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=956)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  6. Sparck Jones, K.: Automatic keyword classification for information retrieval (1971) 0.02
    0.020863686 = product of:
      0.04172737 = sum of:
        0.04172737 = product of:
          0.1251821 = sum of:
            0.1251821 = weight(_text_:k in 5176) [ClassicSimilarity], result of:
              0.1251821 = score(doc=5176,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.7888169 = fieldWeight in 5176, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.15625 = fieldNorm(doc=5176)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  7. SIGIR'92 : Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1992) 0.02
    0.020573184 = product of:
      0.041146368 = sum of:
        0.041146368 = product of:
          0.061719548 = sum of:
            0.039812677 = weight(_text_:y in 6671) [ClassicSimilarity], result of:
              0.039812677 = score(doc=6671,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.18609504 = fieldWeight in 6671, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=6671)
            0.021906871 = weight(_text_:k in 6671) [ClassicSimilarity], result of:
              0.021906871 = score(doc=6671,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.13804297 = fieldWeight in 6671, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=6671)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Content
    HARMAN, D.: Relevance feedback revisited; AALBERSBERG, I.J.: Incremental relevance feedback; TAGUE-SUTCLIFFE, J.: Measuring the informativeness of a retrieval process; LEWIS, D.D.: An evaluation of phrasal and clustered representations on a text categorization task; BLOSSEVILLE, M.J., G. HÉBRAIL, M.G. MONTEIL and N. PÉNOT: Automatic document classification: natural language processing, statistical analysis, and expert system techniques used together; MASAND, B., G. LINOFF and D. WALTZ: Classifying news stories using memory based reasoning; KEEN, E.M.: Term position ranking: some new test results; CROUCH, C.J. and B. YANG: Experiments in automatic statistical thesaurus construction; GREFENSTETTE, G.: Use of syntactic context to produce term association lists for text retrieval; ANICK, P.G. and R.A. FLYNN: Versioning of full-text information retrieval system; BURKOWSKI, F.J.: Retrieval activities in a database consisting of heterogeneous collections; DEERWESTER, S.C., K. WACLENA and M. LaMAR: A textual object management system; NIE, J.-Y.: Towards a probabilistic modal logic for semantic-based information retrieval; WANG, A.W., S.K.M. WONG and Y.Y. YAO: An analysis of vector space models based on computational geometry; BARTELL, B.T., G.W. COTTRELL and R.K. BELEW: Latent semantic indexing is an optimal special case of multidimensional scaling; GLAVITSCH, U. and P. SCHÄUBLE: A system for retrieving speech documents; MARGULIS, E.L.: N-Poisson document modelling; HESS, M.: An incrementally extensible document retrieval system based on linguistics and logical principles; COOPER, W.S., F.C. GEY and D.P. DABNEY: Probabilistic retrieval based on staged logistic regression; FUHR, N.: Integration of probabilistic fact and text retrieval; CROFT, B., L.A. SMITH and H. TURTLE: A loosely-coupled integration of a text retrieval system and an object-oriented database system; DUMAIS, S.T. and J. NIELSEN: Automating the assignment of submitted manuscripts to reviewers; GOST, M.A. and M. MASOTTI: Design of an OPAC database to permit different subject searching accesses; ROBERTSON, A.M. and P. WILLETT: Searching for historical word forms in a database of 17th century English text using spelling correction methods; FOX, E.A., Q.F. CHEN and L.S. HEATH: A faster algorithm for constructing minimal perfect hash functions; MOFFAT, A. and J. ZOBEL: Parameterised compression for sparse bitmaps; GRANDI, F., P. TIBERIO and P. ZEZULA: Frame-sliced partitioned parallel signature files; ALLEN, B.: Cognitive differences in end user searching of a CD-ROM index; SONNENWALD, D.H.: Developing a theory to guide the process of designing information retrieval systems; CUTTING, D.R., J.O. PEDERSEN, D. KARGER and J.W. TUKEY: Scatter/Gather: a cluster-based approach to browsing large document collections; CHALMERS, M. and P. CHITSON: Bead: Explorations in information visualization; WILLIAMSON, C. and B. SHNEIDERMAN: The dynamic HomeFinder: evaluating dynamic queries in a real-estate information exploring system
  8. Sparck Jones, K.; Tait, J.I.: Automatic search term variant generation (1984) 0.02
    0.016690949 = product of:
      0.033381898 = sum of:
        0.033381898 = product of:
          0.10014569 = sum of:
            0.10014569 = weight(_text_:k in 2918) [ClassicSimilarity], result of:
              0.10014569 = score(doc=2918,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.63105357 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.125 = fieldNorm(doc=2918)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  9. Sparck Jones, K.; Jackson, D.M.: The use of automatically obtained keyword classification for information retrieval (1970) 0.02
    0.016690949 = product of:
      0.033381898 = sum of:
        0.033381898 = product of:
          0.10014569 = sum of:
            0.10014569 = weight(_text_:k in 5177) [ClassicSimilarity], result of:
              0.10014569 = score(doc=5177,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.63105357 = fieldWeight in 5177, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.125 = fieldNorm(doc=5177)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  10. Sparck Jones, K.: Index term weighting (1973) 0.02
    0.016690949 = product of:
      0.033381898 = sum of:
        0.033381898 = product of:
          0.10014569 = sum of:
            0.10014569 = weight(_text_:k in 5491) [ClassicSimilarity], result of:
              0.10014569 = score(doc=5491,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.63105357 = fieldWeight in 5491, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.125 = fieldNorm(doc=5491)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  11. Klinger, K.-H.: Automatische Inhaltserschließung einer Volltextdatenbank : Machbarkeitsstudie am Beispiel der FAZ (1994) 0.02
    0.016690949 = product of:
      0.033381898 = sum of:
        0.033381898 = product of:
          0.10014569 = sum of:
            0.10014569 = weight(_text_:k in 2766) [ClassicSimilarity], result of:
              0.10014569 = score(doc=2766,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.63105357 = fieldWeight in 2766, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.125 = fieldNorm(doc=2766)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  12. Lepsky, K.: Automatische Indexierung (2012) 0.02
    0.016690949 = product of:
      0.033381898 = sum of:
        0.033381898 = product of:
          0.10014569 = sum of:
            0.10014569 = weight(_text_:k in 442) [ClassicSimilarity], result of:
              0.10014569 = score(doc=442,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.63105357 = fieldWeight in 442, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.125 = fieldNorm(doc=442)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  13. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02
    0.016061593 = product of:
      0.032123186 = sum of:
        0.032123186 = product of:
          0.09636956 = sum of:
            0.09636956 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.09636956 = score(doc=402,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 22(1986) no.6, pp. 465-476
  14. Polity, Y.: Vers une ergonomie linguistique (1994) 0.02
    0.015166735 = product of:
      0.03033347 = sum of:
        0.03033347 = product of:
          0.09100041 = sum of:
            0.09100041 = weight(_text_:y in 36) [ClassicSimilarity], result of:
              0.09100041 = score(doc=36,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.4253601 = fieldWeight in 36, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0625 = fieldNorm(doc=36)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  15. Lepsky, K.: Maschinelles Indexieren zur Verbesserung der sachlichen Suche im OPAC : DFG-Projekt an der Universitäts- und Landesbibliothek Düsseldorf (1994) 0.01
    0.0146045815 = product of:
      0.029209163 = sum of:
        0.029209163 = product of:
          0.087627485 = sum of:
            0.087627485 = weight(_text_:k in 2882) [ClassicSimilarity], result of:
              0.087627485 = score(doc=2882,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.5521719 = fieldWeight in 2882, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2882)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  16. Fuhr, N.; Niewelt, B.: Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.01
    0.014053894 = product of:
      0.028107788 = sum of:
        0.028107788 = product of:
          0.08432336 = sum of:
            0.08432336 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
              0.08432336 = score(doc=262,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.5416616 = fieldWeight in 262, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=262)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    20.10.2000 12:22:23
  17. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.01
    0.014053894 = product of:
      0.028107788 = sum of:
        0.028107788 = product of:
          0.08432336 = sum of:
            0.08432336 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.08432336 = score(doc=6265,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    Information outlook. 9(2005) no.8, pp. 22-23
  18. Chung, Y.M.; Lee, J.Y.: A corpus-based approach to comparative evaluation of statistical term association measures (2001) 0.01
    0.013405627 = product of:
      0.026811253 = sum of:
        0.026811253 = product of:
          0.080433756 = sum of:
            0.080433756 = weight(_text_:y in 5769) [ClassicSimilarity], result of:
              0.080433756 = score(doc=5769,freq=4.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.37596878 = fieldWeight in 5769, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5769)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Statistical association measures have been widely applied in information retrieval research, usually employing a clustering of documents or terms on the basis of their relationships. Applications of the association measures for term clustering include automatic thesaurus construction and query expansion. This research evaluates the similarity of six association measures by comparing the relationship and behavior they demonstrate in various analyses of a test corpus. Analysis techniques include comparisons of highly ranked term pairs and term clusters, analyses of the correlation among the association measures using Pearson's correlation coefficient and MDS mapping, and an analysis of the impact of a term frequency on the association values by means of z-score. The major findings of the study are as follows: First, the most similar association measures are mutual information and Yule's coefficient of colligation Y, whereas cosine and Jaccard coefficients, as well as X**2 statistic and likelihood ratio, demonstrate quite similar behavior for terms with high frequency. Second, among all the measures, the X**2 statistic is the least affected by the frequency of terms. Third, although cosine and Jaccard coefficients tend to emphasize high frequency terms, mutual information and Yule's Y seem to overestimate rare terms
  19. Yang, T.-H.; Hsieh, Y.-L.; Liu, S.-H.; Chang, Y.-C.; Hsu, W.-L.: A flexible template generation and matching method with applications for publication reference metadata extraction (2021) 0.01
    0.013405627 = product of:
      0.026811253 = sum of:
        0.026811253 = product of:
          0.080433756 = sum of:
            0.080433756 = weight(_text_:y in 63) [ClassicSimilarity], result of:
              0.080433756 = score(doc=63,freq=4.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.37596878 = fieldWeight in 63, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=63)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  20. Lepsky, K.; Zimmermann, H.H.: Katalogerweiterung durch Scanning und Automatische Dokumenterschließung : Das DFG-Projekt KASCADE (1998) 0.01
    0.012518212 = product of:
      0.025036424 = sum of:
        0.025036424 = product of:
          0.07510927 = sum of:
            0.07510927 = weight(_text_:k in 3938) [ClassicSimilarity], result of:
              0.07510927 = score(doc=3938,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.47329018 = fieldWeight in 3938, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3938)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
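
The score explanations in the list above all follow the same arithmetic: tf is the square root of the term frequency, queryWeight is idf times queryNorm, fieldWeight is tf times idf times fieldNorm, and the per-term weights are summed and then scaled by the coord factors. The Python sketch below reproduces that arithmetic for checking a tree by hand. It is not the search engine's own code: the function names are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption taken from Lucene's classic TF-IDF similarity that happens to match the idf values printed above.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """One 'weight(_text_:term in doc)' node of an explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # queryWeight
    field_weight = math.sqrt(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(term_scores, coords) -> float:
    """Sum the term weights, then apply each coord(x/y) factor in turn."""
    total = sum(term_scores)
    for coord in coords:
        total *= coord
    return total


# Constants read off the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.04445543

# Result 1 (doc 1952): terms y, k and 22, fieldNorm 0.078125, coord(1/2).
result_1 = document_score(
    [
        term_score(2.0, 976, MAX_DOCS, QUERY_NORM, 0.078125),   # y
        term_score(2.0, 3384, MAX_DOCS, QUERY_NORM, 0.078125),  # k
        term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.078125),  # 22
    ],
    coords=[0.5],
)

# Result 6 (doc 5176): term k only, fieldNorm 0.15625, coord(1/3) and coord(1/2).
result_6 = document_score(
    [term_score(2.0, 3384, MAX_DOCS, QUERY_NORM, 0.15625)],
    coords=[1 / 3, 0.5],
)

print(f"result 1: {result_1:.6f}")
print(f"result 6: {result_6:.6f}")
```

The two printed values can be compared with the 0.11828627 and 0.020863686 shown for results 1 and 6. Small differences in the last digits are expected, because the page's figures appear to come from single-precision arithmetic while Python uses double precision.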
    

Languages

  • e 51
  • d 47
  • chi 1
  • f 1
  • ru 1

Types

  • a 91
  • el 5
  • x 5
  • m 4
  • s 2