Search (174 results, page 1 of 9)

  • theme_ss:"Automatisches Indexieren"
  1. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.13
    0.13218892 = product of:
      0.26437783 = sum of:
        0.26437783 = sum of:
          0.16608931 = weight(_text_:indexing in 6265) [ClassicSimilarity], result of:
            0.16608931 = score(doc=6265,freq=4.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.8373461 = fieldWeight in 6265, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.109375 = fieldNorm(doc=6265)
          0.098288536 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
            0.098288536 = score(doc=6265,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.5416616 = fieldWeight in 6265, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.109375 = fieldNorm(doc=6265)
      0.5 = coord(1/2)
    
    Source
    Information outlook. 9(2005) no.8, pp. 22-23
  2. Ward, M.L.: The future of the human indexer (1996) 0.08
    0.082706496 = product of:
      0.16541299 = sum of:
        0.16541299 = sum of:
          0.12328933 = weight(_text_:indexing in 7244) [ClassicSimilarity], result of:
            0.12328933 = score(doc=7244,freq=12.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.6215682 = fieldWeight in 7244, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.046875 = fieldNorm(doc=7244)
          0.042123657 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
            0.042123657 = score(doc=7244,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.23214069 = fieldWeight in 7244, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=7244)
      0.5 = coord(1/2)
    
    Abstract
    Considers the principles of indexing and the intellectual skills involved in order to determine what automatic indexing systems would be required in order to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and to what depth to index; reading skills; abstracting skills; and classification skills. Illustrates these features with a detailed description of abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system and using the criteria described for human indexers. At present, it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form, it may be a useful productivity tool for dealing with large quantities of low-grade texts (should they be wanted in the database)
    Date
    9. 2.1997 18:44:22
  3. Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.08
    0.07704693 = product of:
      0.15409386 = sum of:
        0.15409386 = sum of:
          0.08388777 = weight(_text_:indexing in 1952) [ClassicSimilarity], result of:
            0.08388777 = score(doc=1952,freq=2.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.42292362 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.078125 = fieldNorm(doc=1952)
          0.0702061 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
            0.0702061 = score(doc=1952,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.38690117 = fieldWeight in 1952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
    
    Date
    16. 8.1998 12:51:22
  4. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.08
    0.07704693 = product of:
      0.15409386 = sum of:
        0.15409386 = sum of:
          0.08388777 = weight(_text_:indexing in 4157) [ClassicSimilarity], result of:
            0.08388777 = score(doc=4157,freq=2.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.42292362 = fieldWeight in 4157, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.078125 = fieldNorm(doc=4157)
          0.0702061 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
            0.0702061 = score(doc=4157,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.38690117 = fieldWeight in 4157, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Ed.: Marlies Ockenfeld and Gerhard J. Mantwill
  5. Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.08
    0.07704693 = product of:
      0.15409386 = sum of:
        0.15409386 = sum of:
          0.08388777 = weight(_text_:indexing in 374) [ClassicSimilarity], result of:
            0.08388777 = score(doc=374,freq=2.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.42292362 = fieldWeight in 374, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.078125 = fieldNorm(doc=374)
          0.0702061 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
            0.0702061 = score(doc=374,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.38690117 = fieldWeight in 374, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=374)
      0.5 = coord(1/2)
    
    Date
    1. 4.2002 10:22:41
    Footnote
    Translation of the title: Algorithms for selection of positive and negative descriptors from text and automated text indexing
  6. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.08
    0.07704693 = product of:
      0.15409386 = sum of:
        0.15409386 = sum of:
          0.08388777 = weight(_text_:indexing in 2759) [ClassicSimilarity], result of:
            0.08388777 = score(doc=2759,freq=2.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.42292362 = fieldWeight in 2759, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.078125 = fieldNorm(doc=2759)
          0.0702061 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
            0.0702061 = score(doc=2759,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.38690117 = fieldWeight in 2759, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=2759)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  7. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.07
    0.06609446 = product of:
      0.13218892 = sum of:
        0.13218892 = sum of:
          0.083044656 = weight(_text_:indexing in 530) [ClassicSimilarity], result of:
            0.083044656 = score(doc=530,freq=4.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.41867304 = fieldWeight in 530, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
          0.049144268 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
            0.049144268 = score(doc=530,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.2708308 = fieldWeight in 530, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
      0.5 = coord(1/2)
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques, in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO), to the problem of document indexing by referring to a system which incorporates natural language processing techniques to determine the subject of the text of documents and to associate them with relevant semantic indexes. Describes briefly the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
    Source
    International forum on information and documentation. 22(1997) no.1, pp. 17-28
  8. Milstead, J.L.: Thesauri in a full-text world (1998) 0.05
    0.047210332 = product of:
      0.094420664 = sum of:
        0.094420664 = sum of:
          0.059317615 = weight(_text_:indexing in 2337) [ClassicSimilarity], result of:
            0.059317615 = score(doc=2337,freq=4.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.29905218 = fieldWeight in 2337, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.03510305 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
            0.03510305 = score(doc=2337,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.19345059 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
      0.5 = coord(1/2)
    
    Abstract
    Despite early claims to the contrary, thesauri continue to find use as access tools for information in the full-text environment. Their mode of use is changing, but this change actually represents an expansion rather than a contraction of their utility. Thesauri and similar vocabulary tools can complement full-text access by aiding users in focusing their searches, by supplementing the linguistic analysis of the text search engine, and even by serving as one of the tools used by the linguistic engine for its analysis. While human indexing continues to be used for many databases, the trend is to increase the use of machine aids for this purpose. All machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection. In the 21st century, the balance of effort between human and machine will change at both input and output, but thesauri will continue to play an important role for the foreseeable future
    Date
    22. 9.1997 19:16:05
  9. Mesquita, L.A.P.; Souza, R.R.; Baracho Porto, R.M.A.: Noun phrases in automatic indexing : a structural analysis of the distribution of relevant terms in doctoral theses (2014) 0.04
    0.043100797 = product of:
      0.08620159 = sum of:
        0.08620159 = sum of:
          0.05811915 = weight(_text_:indexing in 1442) [ClassicSimilarity], result of:
            0.05811915 = score(doc=1442,freq=6.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.2930101 = fieldWeight in 1442, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.03125 = fieldNorm(doc=1442)
          0.028082438 = weight(_text_:22 in 1442) [ClassicSimilarity], result of:
            0.028082438 = score(doc=1442,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.15476047 = fieldWeight in 1442, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=1442)
      0.5 = coord(1/2)
    
    Abstract
    The main objective of this research was to analyze whether relevant terms show a characteristic distribution over a scientific text that could serve as a criterion for automatic indexing. The terms considered in this study were only full noun phrases contained in the texts themselves. The texts considered were a total of 98 doctoral theses from eight areas of knowledge at the same university. Initially, 20 full noun phrases were automatically extracted from each text as candidates for the most relevant terms, and the author of each text assigned a relevance value from 0 to 6 (not relevant to highly relevant, respectively) to each of the 20 noun phrases sent. Only 22.1% of the noun phrases were considered not relevant. The relevance values assigned by the authors were associated with the positions of the terms in the text. Each full noun phrase found in the text was counted as one valid linear position. The results show the values of this distribution for two types of position: linear, with values consolidated into ten equal consecutive parts; and structural, considering parts of the text (such as introduction, development and conclusion). As a result of considerable importance, all areas of knowledge related to the Natural Sciences showed a characteristic behavior in the distribution of relevant terms, and all areas of knowledge related to the Social Sciences likewise shared a characteristic distribution behavior, but one distinct from that of the Natural Sciences. The difference in distribution behavior between the Natural and Social Sciences can be clearly visualized through graphs. All behaviors, including the general behavior of all areas of knowledge together, were characterized in polynomial equations and can be applied in future as criteria for automatic indexing. To date, this work is unprecedented for two reasons: it presents a method for characterizing the distribution of relevant terms in a scientific text, and, through this method, it points out a quantitative difference between the Natural and Social Sciences.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  10. Humphrey, S.M.: Automatic indexing of documents from journal descriptors : a preliminary investigation (1999) 0.04
    0.03979147 = product of:
      0.07958294 = sum of:
        0.07958294 = product of:
          0.15916587 = sum of:
            0.15916587 = weight(_text_:indexing in 3769) [ClassicSimilarity], result of:
              0.15916587 = score(doc=3769,freq=20.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.80244124 = fieldWeight in 3769, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3769)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A new, fully automated approach for indexing documents is presented based on associating textwords in a training set of bibliographic citations with the indexing of journals. This journal-level indexing is in the form of a consistent, timely set of journal descriptors (JDs) indexing the individual journals themselves. This indexing is maintained in journal records in a serials authority database. The advantage of this novel approach is that the training set does not depend on previous manual indexing of thousands of documents (i.e., any such indexing already in the training set is not used), but rather the relatively small intellectual effort of indexing at the journal level, usually a matter of a few thousand unique journals for which retrospective indexing to maintain consistency and currency may be feasible. If successful, JD indexing would provide topical categorization of documents outside the training set, i.e., journal articles, monographs, Web documents, reports from the grey literature, etc., and therefore be applied in searching. Because JDs are quite general, corresponding to subject domains, their most probable use would be for improving or refining search results
  11. Wan, T.-L.; Evens, M.; Wan, Y.-W.; Pao, Y.-Y.: Experiments with automatic indexing and a relational thesaurus in a Chinese information retrieval system (1997) 0.04
    0.03884058 = product of:
      0.07768116 = sum of:
        0.07768116 = product of:
          0.15536232 = sum of:
            0.15536232 = weight(_text_:indexing in 956) [ClassicSimilarity], result of:
              0.15536232 = score(doc=956,freq=14.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.78326553 = fieldWeight in 956, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=956)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article describes a series of experiments with an interactive Chinese information retrieval system named CIRS and an interactive relational thesaurus. Two important issues have been explored: whether thesauri enhance the retrieval effectiveness of Chinese documents, and whether automatic indexing can compete with manual indexing in a Chinese information retrieval system. Recall and precision are used to measure and evaluate the effectiveness of the system. Statistical analysis of the recall and precision measures suggests that the use of the relational thesaurus does improve the retrieval effectiveness both in the automatic indexing environment and in the manual indexing environment and that automatic indexing is at least as good as manual indexing
  12. Plaunt, C.; Norgard, B.A.: An association-based method for automatic indexing with a controlled vocabulary (1998) 0.04
    0.038523465 = product of:
      0.07704693 = sum of:
        0.07704693 = sum of:
          0.041943885 = weight(_text_:indexing in 1794) [ClassicSimilarity], result of:
            0.041943885 = score(doc=1794,freq=2.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.21146181 = fieldWeight in 1794, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1794)
          0.03510305 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
            0.03510305 = score(doc=1794,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.19345059 = fieldWeight in 1794, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1794)
      0.5 = coord(1/2)
    
    Date
    11. 9.2000 19:53:22
  13. Bloomfield, M.: Indexing : neglected and poorly understood (2001) 0.04
    0.037749495 = product of:
      0.07549899 = sum of:
        0.07549899 = product of:
          0.15099798 = sum of:
            0.15099798 = weight(_text_:indexing in 5439) [ClassicSimilarity], result of:
              0.15099798 = score(doc=5439,freq=18.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.76126254 = fieldWeight in 5439, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5439)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The growth of the Internet has highlighted the use of machine indexing. The difficulties in using the Internet as a searching device can be frustrating. The use of the term "Python" is given as an example. Machine indexing is noted as "rotten" and human indexing as "capricious." The problem seems to be a lack of a theoretical foundation for the art of indexing. What librarians have learned over the last hundred years has yet to yield a consistent approach to what really works best in preparing index terms and in the ability of our customers to search the various indexes. An attempt is made to consider the elements of indexing, their pros and cons. The argument is made that machine indexing is far too prolific in its production of index terms. Neither librarians nor computer programmers have made much progress to improve Internet indexing. Human indexing has had the same problems for over fifty years.
  14. Gil-Leiva, I.: SISA-automatic indexing system for scientific articles : experiments with location heuristics rules versus TF-IDF rules (2017) 0.04
    0.037749495 = product of:
      0.07549899 = sum of:
        0.07549899 = product of:
          0.15099798 = sum of:
            0.15099798 = weight(_text_:indexing in 3622) [ClassicSimilarity], result of:
              0.15099798 = score(doc=3622,freq=18.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.76126254 = fieldWeight in 3622, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3622)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Indexing is contextualized and a brief description is provided of some of the most used automatic indexing systems. We describe SISA, a system which uses location heuristics rules and statistical rules such as term frequency (TF) or TF-IDF to obtain automatic or semi-automatic indexing, depending on the user's preference. The aim of this research is to ascertain which rules (location heuristics rules or TF-IDF rules) provide the best indexing terms. SISA is used to obtain the automatic indexing of 200 scientific articles on fruit growing written in Portuguese. It uses, on the one hand, location heuristics rules founded on the value of certain parts of the articles for indexing such as titles, abstracts, keywords, headings, first paragraph, conclusions and references and, on the other, TF-IDF rules. The indexing is then evaluated to ascertain retrieval performance through recall, precision and f-measure. Automatic indexing of the articles with location heuristics rules provided the best results with the evaluation measures.
  15. Kim, P.K.: An automatic indexing of compound words based on mutual information for Korean text retrieval (1995) 0.04
    0.037515752 = product of:
      0.075031504 = sum of:
        0.075031504 = product of:
          0.15006301 = sum of:
            0.15006301 = weight(_text_:indexing in 620) [ClassicSimilarity], result of:
              0.15006301 = score(doc=620,freq=10.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.7565488 = fieldWeight in 620, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0625 = fieldNorm(doc=620)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Presents an automatic indexing technique for compound words suitable for an agglutinative language, specifically Korean. Discusses some construction conditions for compound words and the rules for decomposing compound words to enhance the exhaustivity of indexing, demonstrating that this system, mutual information, enhances both the exhaustivity of indexing and the specificity of terms. Suggests that the construction conditions and rules for decomposition presented may be used in multilingual information retrieval systems to translate the indexing terms of the specific language into those of the language required
  16. Li, Z.: Research on dynamic morphological indexing (1998) 0.04
    0.037515752 = product of:
      0.075031504 = sum of:
        0.075031504 = product of:
          0.15006301 = sum of:
            0.15006301 = weight(_text_:indexing in 3242) [ClassicSimilarity], result of:
              0.15006301 = score(doc=3242,freq=10.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.7565488 = fieldWeight in 3242, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3242)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Notes that in automatic indexing of Chinese words using dictionary matching methods, there is some difficulty in the indexing of proper nouns. Presents a solution called dynamic morphological indexing, based on work using automatic indexing of archive documents. Presents the algorithm for this solution
  17. Hodge, G.M.: Computer-assisted database indexing : the state-of-the-art (1994) 0.04
    0.03632447 = product of:
      0.07264894 = sum of:
        0.07264894 = product of:
          0.14529788 = sum of:
            0.14529788 = weight(_text_:indexing in 7936) [ClassicSimilarity], result of:
              0.14529788 = score(doc=7936,freq=6.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.7325252 = fieldWeight in 7936, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.078125 = fieldNorm(doc=7936)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Discusses the state of the art of computer indexing, defines indexing and computer assistance, and describes the reasons for renewed interest. Identifies the types of computer support in use using selected operational systems, describes the integration of various computer supports in one database production system, and speculates on the future
  18. Olsgaard, J.N.; Evans, E.J.: Improving keyword indexing (1981) 0.04
    0.03632447 = product of:
      0.07264894 = sum of:
        0.07264894 = product of:
          0.14529788 = sum of:
            0.14529788 = weight(_text_:indexing in 4996) [ClassicSimilarity], result of:
              0.14529788 = score(doc=4996,freq=6.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.7325252 = fieldWeight in 4996, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4996)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This communication examines some of the most frequently cited criticisms of keyword indexing. These criticisms include (1) absence of general subject headings, (2) limited entry points, and (3) irrelevant indexing. Some solutions are suggested to meet these criticisms.
  19. Silvester, J.P.; Genuardi, M.T.: Machine-aided indexing from the analysis of natural language text (1994) 0.04
    0.035590567 = product of:
      0.07118113 = sum of:
        0.07118113 = product of:
          0.14236227 = sum of:
            0.14236227 = weight(_text_:indexing in 2989) [ClassicSimilarity], result of:
              0.14236227 = score(doc=2989,freq=4.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.7177252 = fieldWeight in 2989, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2989)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Challenges in indexing electronic text and images. Ed.: R. Fidel et al
  20. Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.04
    0.035590567 = product of:
      0.07118113 = sum of:
        0.07118113 = product of:
          0.14236227 = sum of:
            0.14236227 = weight(_text_:indexing in 1660) [ClassicSimilarity], result of:
              0.14236227 = score(doc=1660,freq=16.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.7177252 = fieldWeight in 1660, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1660)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A corpus of 242 abstracts of Arabic documents on computer science and information systems using the Proceedings of the Saudi Arabian National Conferences as a source was put together. Reports on the design and building of an automatic information retrieval system from scratch to handle Arabic data. Both automatic and manual indexing techniques were implemented. Experiments using measures of recall and precision have demonstrated that automatic indexing is at least as effective as manual indexing and more effective in some cases. Automatic indexing is both cheaper and faster. Results suggest that a wider coverage of the literature can be achieved with less money and produce as good results as with manual indexing. Compares the retrieval results using words as index terms versus stems and roots, and confirms the results obtained by Al-Kharashi and Abu-Salem with smaller corpora that root indexing is more effective than word indexing
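
The score trees printed under each hit follow the TF-IDF scheme labelled [ClassicSimilarity] in the listing. As a minimal sketch, the arithmetic can be reproduced as below. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are the ones ClassicSimilarity is documented to use, and they agree with the printed values. queryNorm depends on the whole query and cannot be recomputed from one hit, so it is copied from the listing as a constant. The function names are hypothetical and chosen only for this illustration.

```python
import math

# Constants copied from the explain output above (not recomputed here).
MAX_DOCS = 44218
QUERY_NORM = 0.051817898


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq: float, doc_freq: int, field_norm: float) -> float:
    """weight = queryWeight * fieldWeight for one query term in one document."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                   # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def hit_score(term_weights: list[float], coords: list[float]) -> float:
    """Sum the per-term weights, then apply every coord factor in the tree."""
    score = sum(term_weights)
    for coord in coords:
        score *= coord
    return score


# Result 1 (doc 6265): terms "indexing" and "22", one coord(1/2) factor.
score_1 = hit_score(
    [term_weight(4.0, 2614, 0.109375), term_weight(2.0, 3622, 0.109375)],
    coords=[0.5],
)

# Result 10 (doc 3769): only "indexing" matches, and two coord(1/2) factors apply.
score_10 = hit_score([term_weight(20.0, 2614, 0.046875)], coords=[0.5, 0.5])

print(f"result 1:  {score_1:.8f}  (listing reports 0.13218892)")
print(f"result 10: {score_10:.8f}  (listing reports 0.03979147)")
```

Small differences in the last digits are to be expected, since the listing's values appear to be single-precision floats while Python computes in double precision. The sketch also makes the ranking easier to read: result 10 uses "indexing" 20 times but matches only one of the two terms and is halved twice, so it ranks below hits that use the word far less often but also match "22".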
