Search (101 results, page 1 of 6)

  • year_i:[1990 TO 2000}
  • theme_ss:"Automatisches Indexieren"
  1. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.03
    0.03227781 = product of:
      0.053796347 = sum of:
        0.011641062 = product of:
          0.05820531 = sum of:
            0.05820531 = weight(_text_:problem in 530) [ClassicSimilarity], result of:
              0.05820531 = score(doc=530,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.3282676 = fieldWeight in 530, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=530)
          0.2 = coord(1/5)
        0.022345824 = weight(_text_:of in 530) [ClassicSimilarity], result of:
          0.022345824 = score(doc=530,freq=16.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34207192 = fieldWeight in 530, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=530)
        0.019809462 = product of:
          0.039618924 = sum of:
            0.039618924 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
              0.039618924 = score(doc=530,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.2708308 = fieldWeight in 530, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=530)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO) to the problem of document indexing: the system incorporates NLP techniques to determine the subject of the text of documents and to associate them with relevant semantic indexes. Briefly describes the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics, and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
    Source
    International forum on information and documentation. 22(1997) no.1, S.17-28
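
  A note on the score breakdowns: the nested "product of / sum of" trees shown with each hit are Lucene explain output for its ClassicSimilarity (TF-IDF) ranking. As a minimal sketch, assuming the standard ClassicSimilarity formulas (the helper below is illustrative and not part of the search system), the leaf score for the term "problem" in document 530 above can be recomputed as follows:

    import math

    def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
        # ClassicSimilarity building blocks, as labelled in the explain output
        tf = math.sqrt(freq)                           # 1.4142135 for freq=2.0
        idf = math.log(max_docs / (doc_freq + 1)) + 1  # 4.244485 for docFreq=1723, maxDocs=44218
        query_weight = idf * query_norm                # 0.17731056
        field_weight = tf * idf * field_norm           # 0.3282676
        return query_weight * field_weight             # leaf score for the term

    score = classic_term_score(freq=2.0, doc_freq=1723, max_docs=44218,
                               query_norm=0.04177434, field_norm=0.0546875)
    print(round(score, 8))  # ~0.05820531, matching the trace above

  That leaf value is scaled by coord(1/5) = 0.2 to 0.011641062, the three term contributions sum to 0.053796347, and the outer coord(3/5) = 0.6 yields the 0.03227781 shown next to the first hit.
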
  2. Plaunt, C.; Norgard, B.A.: An association-based method for automatic indexing with a controlled vocabulary (1998) 0.03
    0.025122102 = product of:
      0.04187017 = sum of:
        0.0117592495 = product of:
          0.058796246 = sum of:
            0.058796246 = weight(_text_:problem in 1794) [ClassicSimilarity], result of:
              0.058796246 = score(doc=1794,freq=4.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.33160037 = fieldWeight in 1794, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1794)
          0.2 = coord(1/5)
        0.015961302 = weight(_text_:of in 1794) [ClassicSimilarity], result of:
          0.015961302 = score(doc=1794,freq=16.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.24433708 = fieldWeight in 1794, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1794)
        0.0141496165 = product of:
          0.028299233 = sum of:
            0.028299233 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
              0.028299233 = score(doc=1794,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.19345059 = fieldWeight in 1794, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1794)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and controlled vocabulary subject headings assigned to those records by human indexers, using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial match information retrieval problem. We consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document
    Date
    11. 9.2000 19:53:22
    Source
    Journal of the American Society for Information Science. 49(1998) no.10, S.888-902
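
  The association step described in the abstract above uses a likelihood ratio statistic. As a hypothetical sketch only (the exact statistic, counts, and thresholds used by Plaunt and Norgard may differ), a common choice is the log-likelihood ratio over a 2x2 contingency table of lexical-clue versus subject-heading co-occurrence:

    import math

    def _log_likelihood(k, n, p):
        # log-likelihood of k successes in n trials under success probability p
        p = min(max(p, 1e-12), 1 - 1e-12)
        return k * math.log(p) + (n - k) * math.log(1 - p)

    def llr(clue_and_heading, clue_only, heading_only, neither):
        n1 = clue_and_heading + clue_only   # documents containing the clue
        n2 = heading_only + neither         # documents without the clue
        p1 = clue_and_heading / n1
        p2 = heading_only / n2
        p = (clue_and_heading + heading_only) / (n1 + n2)
        return 2 * (_log_likelihood(clue_and_heading, n1, p1)
                    + _log_likelihood(heading_only, n2, p2)
                    - _log_likelihood(clue_and_heading, n1, p)
                    - _log_likelihood(heading_only, n2, p))

    # invented example counts over a 4,626-document training collection:
    print(llr(clue_and_heading=30, clue_only=70, heading_only=20, neither=4506))

  Clue/heading pairs with high scores would be kept in the association dictionary and used to rank candidate headings for new documents.
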
  3. Gomez, I.: Coping with the problem of subject classification diversity (1996) 0.02
    0.0170663 = product of:
      0.04266575 = sum of:
        0.01646295 = product of:
          0.08231475 = sum of:
            0.08231475 = weight(_text_:problem in 5074) [ClassicSimilarity], result of:
              0.08231475 = score(doc=5074,freq=4.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.46424055 = fieldWeight in 5074, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5074)
          0.2 = coord(1/5)
        0.026202802 = weight(_text_:of in 5074) [ClassicSimilarity], result of:
          0.026202802 = score(doc=5074,freq=22.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.40111488 = fieldWeight in 5074, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5074)
      0.4 = coord(2/5)
    
    Abstract
    The delimitation of a research field in bibliometric studies presents the problem of the diversity of subject classifications used in the sources of input and output data. Classification of documents according to thematic codes or keywords is the most accurate method, mainly used in specialized bibliographic or patent databases. Classification of journals into disciplines presents lower specificity, and some shortcomings such as the change over time of both journals and disciplines and the increasing interdisciplinarity of research. Standardization of subject classifications emerges as an important point in bibliometric studies in order to allow international comparisons, although flexibility is needed to meet the needs of local studies
  4. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02
    0.015834233 = product of:
      0.03958558 = sum of:
        0.011286346 = weight(_text_:of in 4157) [ClassicSimilarity], result of:
          0.011286346 = score(doc=4157,freq=2.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.17277241 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
        0.028299233 = product of:
          0.056598466 = sum of:
            0.056598466 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.056598466 = score(doc=4157,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
  5. Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.02
    0.015834233 = product of:
      0.03958558 = sum of:
        0.011286346 = weight(_text_:of in 374) [ClassicSimilarity], result of:
          0.011286346 = score(doc=374,freq=2.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.17277241 = fieldWeight in 374, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=374)
        0.028299233 = product of:
          0.056598466 = sum of:
            0.056598466 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
              0.056598466 = score(doc=374,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.38690117 = fieldWeight in 374, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=374)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Date
    1. 4.2002 10:22:41
    Footnote
    Translation of the title: Algorithms for selection of positive and negative descriptors from text and automated text indexing
  6. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.02
    0.015311283 = product of:
      0.038278207 = sum of:
        0.01563882 = weight(_text_:of in 4709) [ClassicSimilarity], result of:
          0.01563882 = score(doc=4709,freq=6.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.23940048 = fieldWeight in 4709, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
        0.022639386 = product of:
          0.045278773 = sum of:
            0.045278773 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
              0.045278773 = score(doc=4709,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.30952093 = fieldWeight in 4709, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4709)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Proposes automatic linguistic knowledge acquisition from sublanguage corpora. The system combines existing linguistic knowledge and human intervention with corpus-based techniques. The algorithm involves a gradual approximation which converges linguistic knowledge towards desirable results. The first experiment revealed the characteristics of this algorithm, and the others proved its effectiveness for a real corpus
    Date
    31. 7.1996 9:22:19
  7. Alexander, M.: Retrieving digital data with fuzzy matching (1997) 0.01
    0.01416828 = product of:
      0.0354207 = sum of:
        0.0133040715 = product of:
          0.066520356 = sum of:
            0.066520356 = weight(_text_:problem in 151) [ClassicSimilarity], result of:
              0.066520356 = score(doc=151,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.375163 = fieldWeight in 151, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0625 = fieldNorm(doc=151)
          0.2 = coord(1/5)
        0.02211663 = weight(_text_:of in 151) [ClassicSimilarity], result of:
          0.02211663 = score(doc=151,freq=12.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.33856338 = fieldWeight in 151, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=151)
      0.4 = coord(2/5)
    
    Abstract
    In 1993 the British Library established a programme of activities entitled Initiatives for Access (IFA) to identify and develop computer applications based on the new technologies emerging in the areas of digital and network services. Discusses the problem of the effective retrieval of digital data after its capture, focusing on the product Excalibur EFS, which looks at the way information is stored at its fundamental level and identifies patterns in numbers. Looks at the benefits of Excalibur and outlines other experiments in progress as part of the IFA programme
  8. Riloff, E.: An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01
    0.014163372 = product of:
      0.03540843 = sum of:
        0.0127690425 = weight(_text_:of in 6752) [ClassicSimilarity], result of:
          0.0127690425 = score(doc=6752,freq=4.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.19546966 = fieldWeight in 6752, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
        0.022639386 = product of:
          0.045278773 = sum of:
            0.045278773 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.045278773 = score(doc=6752,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    AutoSlog is a system that addresses the knowledge engineering bottleneck for information extraction. AutoSlog automatically creates domain-specific dictionaries for information extraction, given an appropriate training corpus. Describes experiments with AutoSlog in the terrorism, joint ventures, and microelectronics domains. Compares the performance of AutoSlog across the 3 domains, discusses the lessons learned, and presents results from 2 experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog
    Date
    6. 3.1997 16:22:15
  9. Ward, M.L.: The future of the human indexer (1996) 0.01
    0.013958423 = product of:
      0.034896057 = sum of:
        0.01791652 = weight(_text_:of in 7244) [ClassicSimilarity], result of:
          0.01791652 = score(doc=7244,freq=14.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.2742677 = fieldWeight in 7244, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
        0.016979538 = product of:
          0.033959076 = sum of:
            0.033959076 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
              0.033959076 = score(doc=7244,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.23214069 = fieldWeight in 7244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7244)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Considers the principles of indexing and the intellectual skills involved in order to determine what automatic indexing systems would be required to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and to what depth; reading skills; abstracting skills; and classification skills. Illustrates these features with a detailed description of the abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system, using the criteria described for human indexers. At present, it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form, it may be a useful productivity tool for dealing with large quantities of low-grade texts (should they be wanted in the database)
    Date
    9. 2.1997 18:44:22
    Source
    Journal of librarianship and information science. 28(1996) no.4, S.217-225
  10. Clavel, G.; Walther, F.; Walther, J.: Indexation automatique de fonds bibliotheconomiques (1993) 0.01
    0.013594754 = product of:
      0.033986885 = sum of:
        0.011641062 = product of:
          0.05820531 = sum of:
            0.05820531 = weight(_text_:problem in 6610) [ClassicSimilarity], result of:
              0.05820531 = score(doc=6610,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.3282676 = fieldWeight in 6610, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6610)
          0.2 = coord(1/5)
        0.022345824 = weight(_text_:of in 6610) [ClassicSimilarity], result of:
          0.022345824 = score(doc=6610,freq=16.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34207192 = fieldWeight in 6610, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6610)
      0.4 = coord(2/5)
    
    Abstract
    A discussion of developments to date in the field of computerized indexing, based on presentations given at a seminar held at the Institute of Policy Studies in Paris in Nov 91. The methods tested so far, based on a linguistic approach, whether using natural language or special thesauri, encounter the same central problem - they are only successful when applied to collections of similar types of documents covering very specific subject areas. Despite this, the search for some sort of universal indexing metalanguage continues. In the end, computerized indexing works best when used in conjunction with manual indexing - ideally in the hands of a trained library science professional, who can extract the maximum value from a collection of documents for a particular user population
  11. Milstead, J.L.: Thesauri in a full-text world (1998) 0.01
    0.012797958 = product of:
      0.031994894 = sum of:
        0.017845279 = weight(_text_:of in 2337) [ClassicSimilarity], result of:
          0.017845279 = score(doc=2337,freq=20.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.27317715 = fieldWeight in 2337, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.0141496165 = product of:
          0.028299233 = sum of:
            0.028299233 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
              0.028299233 = score(doc=2337,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.19345059 = fieldWeight in 2337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Despite early claims to the contrary, thesauri continue to find use as access tools for information in the full-text environment. Their mode of use is changing, but this change actually represents an expansion rather than a contradiction of their utility. Thesauri and similar vocabulary tools can complement full-text access by aiding users in focusing their searches, by supplementing the linguistic analysis of the text search engine, and even by serving as one of the tools used by the linguistic engine for its analysis. While human indexing continues to be used for many databases, the trend is to increase the use of machine aids for this purpose. All machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection. In the 21st century, the balance of effort between human and machine will change at both input and output, but thesauri will continue to play an important role for the foreseeable future
    Date
    22. 9.1997 19:16:05
    Imprint
    Urbana-Champaign, IL : Illinois University at Urbana-Champaign, Graduate School of Library and Information Science
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  12. Damerau, F.J.: Generating and evaluating domain-oriented multi-word terms from texts (1993) 0.01
    0.01254489 = product of:
      0.031362224 = sum of:
        0.0133040715 = product of:
          0.066520356 = sum of:
            0.066520356 = weight(_text_:problem in 5814) [ClassicSimilarity], result of:
              0.066520356 = score(doc=5814,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.375163 = fieldWeight in 5814, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5814)
          0.2 = coord(1/5)
        0.018058153 = weight(_text_:of in 5814) [ClassicSimilarity], result of:
          0.018058153 = score(doc=5814,freq=8.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.27643585 = fieldWeight in 5814, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=5814)
      0.4 = coord(2/5)
    
    Abstract
    Examines techniques for automatically generating domain vocabularies from large text collections. Focuses on the problem of generating multi-word vocabulary terms (specifically pairs). Discusses statistical issues associated with word co-occurrences likely to be of use in a natural language interface. Provides a more objective evaluation of the selection procedures. As substantial experimentation with subjects using a working query system is absent, all evaluation is necessarily subjective. Uses a surrogate for experimentation by relying on pre-existing dictionaries as indicators of domain relevance
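
  As an illustrative sketch of the kind of pair generation discussed above (not Damerau's actual selection or evaluation procedure), adjacent word pairs can be counted in a domain corpus and kept when they are markedly more frequent there than in a general reference corpus:

    from collections import Counter

    def candidate_pairs(domain_tokens, reference_tokens, min_count=5, min_ratio=10.0):
        # count adjacent word pairs and their relative frequencies in a token list
        def pair_freqs(tokens):
            pairs = Counter(zip(tokens, tokens[1:]))
            return pairs, max(sum(pairs.values()), 1)

        dom, dom_total = pair_freqs(domain_tokens)
        ref, ref_total = pair_freqs(reference_tokens)
        keep = []
        for pair, count in dom.items():
            dom_rel = count / dom_total
            ref_rel = (ref.get(pair, 0) + 1) / (ref_total + 1)  # add-one smoothing
            if count >= min_count and dom_rel / ref_rel >= min_ratio:
                keep.append((" ".join(pair), count, dom_rel / ref_rel))
        return sorted(keep, key=lambda item: -item[2])

  A pair such as "automatic indexing" would surface from a library-science corpus because its relative frequency there far exceeds its relative frequency in general text.
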
  13. Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.01
    0.01239295 = product of:
      0.030982375 = sum of:
        0.011172912 = weight(_text_:of in 2673) [ClassicSimilarity], result of:
          0.011172912 = score(doc=2673,freq=4.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.17103596 = fieldWeight in 2673, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
        0.019809462 = product of:
          0.039618924 = sum of:
            0.039618924 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
              0.039618924 = score(doc=2673,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.2708308 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Examines techniques that discover features in sets of pre-categorized documents, such that similar documents can be found on the WWW. Examines techniques which will classify training examples with high accuracy, then explains why this is not necessarily useful. Describes a method for extracting word clusters from the raw document features. Results show that the clustering technique is successful in discovering word groups in personal Web pages which can be used to find similar information on the WWW
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue of papers from the 6th International World Wide Web conference, held 7-11 Apr 1997, Santa Clara, California
  14. McKiernan, G.: Automated categorisation of Web resources : a profile of selected projects, research, products, and services (1996) 0.01
    0.005529158 = product of:
      0.02764579 = sum of:
        0.02764579 = weight(_text_:of in 2533) [ClassicSimilarity], result of:
          0.02764579 = score(doc=2533,freq=12.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.42320424 = fieldWeight in 2533, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=2533)
      0.2 = coord(1/5)
    
    Abstract
    Profiles several representative current efforts that apply established as well as more innovative methods of automated classification, organization, or other methods of categorisation of WWW resources
    Source
    New review of information networking. 1996, no.2, S.15-40
  15. Frants, V.I.; Kamenoff, N.I.; Shapiro, J.: One approach to classification of users and automatic clustering of documents (1993) 0.01
    0.005107617 = product of:
      0.025538085 = sum of:
        0.025538085 = weight(_text_:of in 4569) [ClassicSimilarity], result of:
          0.025538085 = score(doc=4569,freq=16.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.39093933 = fieldWeight in 4569, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=4569)
      0.2 = coord(1/5)
    
    Abstract
    Shows how to automatically construct a classification of users and a clustering of documents on the basis of users' information needs by creating clusters of documents and cross-references among clusters using users' search requests. Examines feedback in the construction of this classification and clustering so that the classification can be changed over time to reflect the changing needs of the users
  16. Hodge, G.M.: Computer-assisted database indexing : the state-of-the-art (1994) 0.01
    0.0050474075 = product of:
      0.025237037 = sum of:
        0.025237037 = weight(_text_:of in 7936) [ClassicSimilarity], result of:
          0.025237037 = score(doc=7936,freq=10.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.38633084 = fieldWeight in 7936, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=7936)
      0.2 = coord(1/5)
    
    Abstract
    Discusses the state of the art of computer indexing, defines indexing and computer assistance, and describes the reasons for renewed interest. Identifies the types of computer support in use, using selected operational systems, describes the integration of various computer supports in one database production system, and speculates on the future
  17. Schuegraf, E.J.; Bommel, M.F.van: An automatic document indexing system based on cooperating expert systems : design and development (1993) 0.00
    0.0047777384 = product of:
      0.023888692 = sum of:
        0.023888692 = weight(_text_:of in 6504) [ClassicSimilarity], result of:
          0.023888692 = score(doc=6504,freq=14.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36569026 = fieldWeight in 6504, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=6504)
      0.2 = coord(1/5)
    
    Abstract
    Discusses the design of an automatic indexing system based on two cooperating expert systems and the investigation related to its development. The design combines statistical and artificial intelligence techniques. Examines the choice of content indicators, the effect of stemming, and the identification of characteristic vocabularies for given subject areas. Presents experimental results. Discusses the application of machine learning algorithms to the identification of vocabularies
    Source
    Canadian journal of information and library science. 18(1993) no.2, S.32-50
  18. Can, F.: Incremental clustering for dynamic information processing (1993) 0.00
    0.0047777384 = product of:
      0.023888692 = sum of:
        0.023888692 = weight(_text_:of in 6627) [ClassicSimilarity], result of:
          0.023888692 = score(doc=6627,freq=14.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36569026 = fieldWeight in 6627, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=6627)
      0.2 = coord(1/5)
    
    Abstract
    Clustering of very large document databases is useful for both searching and browsing. The periodic updating of clusters is required due to the dynamic nature of databases. Introduces an algorithm for incremental clustering and discusses the complexity and cost analysis of the algorithm, together with an investigation of its expected behaviour. Shows through empirical testing that the algorithm achieves cost effectiveness and generates statistically valid clusters that are compatible with those of reclustering. The experimental evidence shows that the algorithm creates an effective and efficient retrieval environment
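
  A generic single-pass sketch of the incremental clustering idea characterised above (Can's actual algorithm and its cost analysis are not reproduced here): each incoming document joins the most similar existing cluster, or founds a new cluster if no similarity reaches a threshold.

    import math

    def cosine(a, b):
        # a, b: sparse term -> weight dictionaries
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def add_document(clusters, doc, threshold=0.25):
        # clusters: list of {"centroid": dict, "size": int}, updated in place
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(doc, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": dict(doc), "size": 1})
            return
        n = best["size"]
        for term in set(best["centroid"]) | set(doc):  # running mean of term weights
            best["centroid"][term] = (best["centroid"].get(term, 0.0) * n + doc.get(term, 0.0)) / (n + 1)
        best["size"] = n + 1
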
  19. Prasad, A.R.D.: PROMETHEUS: an automatic indexing system (1996) 0.00
    0.0047777384 = product of:
      0.023888692 = sum of:
        0.023888692 = weight(_text_:of in 5189) [ClassicSimilarity], result of:
          0.023888692 = score(doc=5189,freq=14.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36569026 = fieldWeight in 5189, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=5189)
      0.2 = coord(1/5)
    
    Abstract
    An automatic indexing system using the tools and techniques of artificial intelligence is described. The paper presents the various components of the system, such as the parser, grammar formalism, lexicon, and the frame-based knowledge representation used for semantic representation. The semantic representation is based on the Ranganathan school of thought, especially the Deep Structure of Subject Indexing Languages enunciated by Bhattacharyya. The various steps in indexing are demonstrated by providing an illustration
    Source
    Knowledge organization and change: Proceedings of the Fourth International ISKO Conference, 15-18 July 1996, Library of Congress, Washington, DC. Ed.: R. Green
  20. Stegentritt, E.: Evaluationsresultate des mehrsprachigen Suchsystems CANAL/LS (1998) 0.00
    0.0047777384 = product of:
      0.023888692 = sum of:
        0.023888692 = weight(_text_:of in 7216) [ClassicSimilarity], result of:
          0.023888692 = score(doc=7216,freq=14.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36569026 = fieldWeight in 7216, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=7216)
      0.2 = coord(1/5)
    
    Abstract
    The search system CANAL/LS simplifies the searching of library catalogues by analyzing search questions linguistically and translating them if required. The linguistic analysis reduces the search question words to their basic forms so that they can be compared with basic title forms. Consequently, all variants of words and parts of compounds in German can be found. Presents the results of an analysis of search questions in a catalogue of 45,000 titles in the field of psychology

Types

  • a 99
  • el 1
  • s 1