Search (61 results, page 1 of 4)

  • × theme_ss:"Automatisches Indexieren"
  1. Hauer, M.: Automatische Indexierung (2000) 0.09
    0.08566328 = product of:
      0.2141582 = sum of:
        0.13040888 = weight(_text_:index in 5887) [ClassicSimilarity], result of:
          0.13040888 = score(doc=5887,freq=2.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.5793543 = fieldWeight in 5887, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.09375 = fieldNorm(doc=5887)
        0.08374932 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
          0.08374932 = score(doc=5887,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.46428138 = fieldWeight in 5887, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5887)
      0.4 = coord(2/5)
    
    Object
    Index-5.0
    Source
    Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt
  2. Ward, M.L.: ¬The future of the human indexer (1996) 0.05
    0.05363507 = product of:
      0.13408767 = sum of:
        0.092213005 = weight(_text_:index in 7244) [ClassicSimilarity], result of:
          0.092213005 = score(doc=7244,freq=4.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.40966535 = fieldWeight in 7244, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
        0.04187466 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
          0.04187466 = score(doc=7244,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.23214069 = fieldWeight in 7244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
      0.4 = coord(2/5)
    
    Abstract
    Considers the principles of indexing and the intellectual skills involved in order to determine what automatic indexing systems would be required in order to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and what depth to index; reading skills; abstracting skills; and classification skills, Illustrates these features with a detailed description of abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system and using the criteria described for human indexers. At present, it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form, it may be a useful productivity tool for dealing with large quantities of low grade texts (should they be wanted in the database)
    Date
    9. 2.1997 18:44:22
  3. Sparck Jones, K.: Index term weighting (1973) 0.03
    0.0347757 = product of:
      0.1738785 = sum of:
        0.1738785 = weight(_text_:index in 5491) [ClassicSimilarity], result of:
          0.1738785 = score(doc=5491,freq=2.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.7724724 = fieldWeight in 5491, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.125 = fieldNorm(doc=5491)
      0.2 = coord(1/5)
    
  4. Leung, C.-H.; Kan, W.-K.: ¬A statistical learning approach to automatic indexing of controlled index terms (1997) 0.03
    0.031943526 = product of:
      0.15971762 = sum of:
        0.15971762 = weight(_text_:index in 6497) [ClassicSimilarity], result of:
          0.15971762 = score(doc=6497,freq=12.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.7095612 = fieldWeight in 6497, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=6497)
      0.2 = coord(1/5)
    
    Abstract
    A statistical learning approach to assigning controlled index terms is presented. In this approach, there are two processes: (1) the learning process and (2) the indexing process. The learning process constructs a relationship between an index term and the words relevant and irrelevant to it, based on the positive training set and negative training set, and those not indexed by it, respectively. The indexing process determines whether an index term is assigned to a certain document, based on the relationship constructed by the learning process, and the text found in the document. Furthermore, a learning feedback technique is introduced. This technique used in the learning process modifies the relationship between an index term and its relevant and irrelevant words to improve the learning performance and, thus, the indexing performance. Experimental results have shown that the statistical learning approach and the learning feedback technique are practical means to automatic indexing of controlled index terms
  5. Thönssen, B.: Automatische Indexierung und Schnittstellen zu Thesauri (1988) 0.03
    0.030737672 = product of:
      0.15368836 = sum of:
        0.15368836 = weight(_text_:index in 30) [ClassicSimilarity], result of:
          0.15368836 = score(doc=30,freq=4.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.6827756 = fieldWeight in 30, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.078125 = fieldNorm(doc=30)
      0.2 = coord(1/5)
    
    Abstract
    Über eine Schnittstelle zwischen Programmen zur automatischen Indexierung (PRIMUS-IDX) und zur maschinellen Thesaurusverwaltung (INDEX) sollen große Textmengen schnell, kostengünstig und konsistent erschlossen und verbesserte Recherchemöglichkeiten geschaffen werden. Zielvorstellung ist ein Verfahren, das auf PCs ablauffähig ist und speziell deutschsprachige Texte bearbeiten kann
    Object
    INDEX
  6. Moens, M.F.: Automatic indexing and abstracting of document texts (2000) 0.03
    0.030737672 = product of:
      0.15368836 = sum of:
        0.15368836 = weight(_text_:index in 6892) [ClassicSimilarity], result of:
          0.15368836 = score(doc=6892,freq=4.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.6827756 = fieldWeight in 6892, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.078125 = fieldNorm(doc=6892)
      0.2 = coord(1/5)
    
    Content
    Need for indexing and abstracting texts; attributes of texts; text representations and their use; selection of natural language index terms; assignment of controlled language index texts; automatic abstracting; applications
  7. Cohen, J.D.: Highlights: language- and domain-independent automatic indexing terms for abstracting (1995) 0.03
    0.026352067 = product of:
      0.13176033 = sum of:
        0.13176033 = weight(_text_:index in 1793) [ClassicSimilarity], result of:
          0.13176033 = score(doc=1793,freq=6.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.5853582 = fieldWeight in 1793, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1793)
      0.2 = coord(1/5)
    
    Abstract
    Presents a model of drawing index terms from text. The approach uses no stop list, stemmer, or other language and domain specific component, allowing operation in any language or domain with only trivial modification. The method uses n-grams counts, achieving a function similar to, but more general than, a stemmer. The generated index terms, called 'highlights', are suitable for identifying the topic for perusal and selection. An extension is also described and demonstrated which selects index terms to represent a subset of documents, distinguishing them from the corpus. Presents some experimental results, showing operation in English, Spanish, German, Georgian, Russian and Japanese
  8. O'Kane, K.C.: Generating hierarchical document indices from common denominators in large document collections (1996) 0.03
    0.026352067 = product of:
      0.13176033 = sum of:
        0.13176033 = weight(_text_:index in 4037) [ClassicSimilarity], result of:
          0.13176033 = score(doc=4037,freq=6.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.5853582 = fieldWeight in 4037, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4037)
      0.2 = coord(1/5)
    
    Abstract
    Describes an effective, simple and efficient algorithm for computer generation of hierarchical indices from Document Term matrices by means of calculating common denominator vectors from the document vector set. This procedure produces an intuitive, user friendly hierarchical index of a document collection not unlike that which would be expected had a manual indexer set about to create an index or outline of a collection. The resulting index, when presented with a graphical user interface, provides the user with a natural easily comprehended view of the document collection, permits general browsing and informal search activities with an access method that requires no keyboard entry or prior knowledge of the vocabulary
  9. Mansour, N.; Haraty, R.A.; Daher, W.; Houri, M.: ¬An auto-indexing method for Arabic text (2008) 0.03
    0.026081776 = product of:
      0.13040888 = sum of:
        0.13040888 = weight(_text_:index in 2103) [ClassicSimilarity], result of:
          0.13040888 = score(doc=2103,freq=8.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.5793543 = fieldWeight in 2103, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=2103)
      0.2 = coord(1/5)
    
    Abstract
    This work addresses the information retrieval problem of auto-indexing Arabic documents. Auto-indexing a text document refers to automatically extracting words that are suitable for building an index for the document. In this paper, we propose an auto-indexing method for Arabic text documents. This method is mainly based on morphological analysis and on a technique for assigning weights to words. The morphological analysis uses a number of grammatical rules to extract stem words that become candidate index words. The weight assignment technique computes weights for these words relative to the container document. The weight is based on how spread is the word in a document and not only on its rate of occurrence. The candidate index words are then sorted in descending order by weight so that information retrievers can select the more important index words. We empirically verify the usefulness of our method using several examples. For these examples, we obtained an average recall of 46% and an average precision of 64%.
  10. Nicoletti, M.: Automatische Indexierung (2001) 0.03
    0.026081776 = product of:
      0.13040888 = sum of:
        0.13040888 = weight(_text_:index in 4326) [ClassicSimilarity], result of:
          0.13040888 = score(doc=4326,freq=2.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.5793543 = fieldWeight in 4326, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.09375 = fieldNorm(doc=4326)
      0.2 = coord(1/5)
    
    Content
    Inhalt: 1. Aufgabe - 2. Ermittlung von Mehrwortgruppen - 2.1 Definition - 3. Kennzeichnung der Mehrwortgruppen - 4. Grundformen - 5. Term- und Dokumenthäufigkeit --- Termgewichtung - 6. Steuerungsinstrument Schwellenwert - 7. Invertierter Index. Vgl. unter: http://www.grin.com/de/e-book/104966/automatische-indexierung.
  11. Oberhauser, O.; Labner, J.: OPAC-Erweiterung durch automatische Indexierung : Empirische Untersuchung mit Daten aus dem Österreichischen Verbundkatalog (2002) 0.02
    0.022587484 = product of:
      0.11293741 = sum of:
        0.11293741 = weight(_text_:index in 883) [ClassicSimilarity], result of:
          0.11293741 = score(doc=883,freq=6.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.50173557 = fieldWeight in 883, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=883)
      0.2 = coord(1/5)
    
    Abstract
    In Anlehnung an die in den neunziger Jahren durchgeführten Erschließungsprojekte MILOS I und MILOS II, die die Eignung eines Verfahrens zur automatischen Indexierung für Bibliothekskataloge zum Thema hatten, wurde eine empirische Untersuchung anhand einer repräsentativen Stichprobe von Titelsätzen aus dem Österreichischen Verbundkatalog durchgeführt. Ziel war die Prüfung und Bewertung der Einsatzmöglichkeit dieses Verfahrens in den Online-Katalogen des Verbundes. Der Realsituation der OPAC-Benutzung gemäß wurde ausschließlich die Auswirkung auf den automatisch generierten Begriffen angereicherten Basic Index ("Alle Felder") untersucht. Dazu wurden 100 Suchanfragen zunächst im ursprünglichen Basic Index und sodann im angereicherten Basic Index in einem OPAC unter Aleph 500 durchgeführt. Die Tests erbrachten einen Zuwachs an relevanten Treffern bei nur leichten Verlusten an Precision, eine Reduktion der Nulltreffer-Ergebnisse sowie Aufschlüsse über die Auswirkung einer vorhandenen verbalen Sacherschließung.
  12. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02
    0.022333153 = product of:
      0.11166576 = sum of:
        0.11166576 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.11166576 = score(doc=402,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.2 = coord(1/5)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  13. Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.02
    0.019541508 = product of:
      0.09770754 = sum of:
        0.09770754 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
          0.09770754 = score(doc=262,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.5416616 = fieldWeight in 262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=262)
      0.2 = coord(1/5)
    
    Date
    20.10.2000 12:22:23
  14. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02
    0.019541508 = product of:
      0.09770754 = sum of:
        0.09770754 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.09770754 = score(doc=6265,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.2 = coord(1/5)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
  15. Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.02
    0.018822905 = product of:
      0.09411452 = sum of:
        0.09411452 = weight(_text_:index in 1842) [ClassicSimilarity], result of:
          0.09411452 = score(doc=1842,freq=6.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.418113 = fieldWeight in 1842, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1842)
      0.2 = coord(1/5)
    
    Abstract
    Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In Machine Learning applications concerned with the automatic clustering or classification of texts, often feature vectors are needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper is concluded by the presentation of some qualitative and quantitative results comparing statistical and morphological methods.
  16. Ladewig, C.; Henkes, M.: Verfahren zur automatischen inhaltlichen Erschließung von elektronischen Texten : ASPECTIX (2001) 0.02
    0.018442601 = product of:
      0.092213005 = sum of:
        0.092213005 = weight(_text_:index in 5794) [ClassicSimilarity], result of:
          0.092213005 = score(doc=5794,freq=4.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.40966535 = fieldWeight in 5794, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=5794)
      0.2 = coord(1/5)
    
    Abstract
    Das Verfahren zur automatischen syntaktischen inhaltlichen Erschließung von elektronischen Texten, AspectiX, basiert auf einem Index, dessen Elemente mit einer universellen Aspekt-Klassifikation verknüpft sind, die es erlauben, ein syntaktisches Retrieval durchzuführen. Mit diesen, auf den jeweiligen Suchgegenstand inhaltlich bezogenen Klassifikationselementen, werden die Informationen in elektronischen Texten mit bekannten Suchalgorithmen abgefragt und die Ergebnisse entsprechend der Aspektverknüpfung ausgewertet. Mit diesen Aspekten ist es möglich, unbekannte Textdokumente automatisch fachgebiets- und sprachunabhängig nach Inhalten zu klassifizieren und beim Suchen in einem Textcorpus nicht nur auf die Verwendung von Zeichenfolgen angewiesen zu sein wie bei Suchmaschinen im WWW. Der Index kann bei diesen Vorgängen intellektuell und automatisch weiter ausgebaut werden und liefert Ergebnisse im Retrieval von nahezu 100 Prozent Precision, bei gleichzeitig nahezu 100 Prozent Recall. Damit ist das Verfahren AspectiX allen anderen Recherchetools um bis zu 40 Prozent an Precision bzw. Recall überlegen, wie an zahlreichen Recherchen in drei Datenbanken, die unterschiedlich groß und thematisch unähnlich sind, nachgewiesen wird
  17. Pirkola, A.: Morphological typology of languages for IR (2001) 0.02
    0.018442601 = product of:
      0.092213005 = sum of:
        0.092213005 = weight(_text_:index in 4476) [ClassicSimilarity], result of:
          0.092213005 = score(doc=4476,freq=4.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.40966535 = fieldWeight in 4476, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=4476)
      0.2 = coord(1/5)
    
    Abstract
    This paper presents a morphological classification of languages from the IR perspective. Linguistic typology research has shown that the morphological complexity of every language in the world can be described by two variables, index of synthesis and index of fusion. These variables provide a theoretical basis for IR research handling morphological issues. A common theoretical framework is needed in particular because of the increasing significance of cross-language retrieval research and CLIR systems processing different languages. The paper elaborates the linguistic morphological typology for the purposes of IR research. It studies how the indexes of synthesis and fusion could be used as practical tools in mono- and cross-lingual IR research. The need for semantic and syntactic typologies is discussed. The paper also reviews studies made in different languages on the effects of morphology and stemming in IR.
  18. Bloomfield, M.: Indexing : neglected and poorly understood (2001) 0.02
    0.018442601 = product of:
      0.092213005 = sum of:
        0.092213005 = weight(_text_:index in 5439) [ClassicSimilarity], result of:
          0.092213005 = score(doc=5439,freq=4.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.40966535 = fieldWeight in 5439, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=5439)
      0.2 = coord(1/5)
    
    Abstract
    The growth of the Internet has highlighted the use of machine indexing. The difficulties in using the Internet as a searching device can be frustrating. The use of the term "Python" is given as an example. Machine indexing is noted as "rotten" and human indexing as "capricious." The problem seems to be a lack of a theoretical foundation for the art of indexing. What librarians have learned over the last hundred years has yet to yield a consistent approach to what really works best in preparing index terms and in the ability of our customers to search the various indexes. An attempt is made to consider the elements of indexing, their pros and cons. The argument is made that machine indexing is far too prolific in its production of index terms. Neither librarians nor computer programmers have made much progress to improve Internet indexing. Human indexing has had the same problems for over fifty years.
  19. Asula, M.; Makke, J.; Freienthal, L.; Kuulmets, H.-A.; Sirel, R.: Kratt: developing an automatic subject indexing tool for the National Library of Estonia : how to transfer metadata information among work cluster members (2021) 0.02
    0.018442601 = product of:
      0.092213005 = sum of:
        0.092213005 = weight(_text_:index in 723) [ClassicSimilarity], result of:
          0.092213005 = score(doc=723,freq=4.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.40966535 = fieldWeight in 723, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=723)
      0.2 = coord(1/5)
    
    Abstract
    Manual subject indexing in libraries is a time-consuming and costly process and the quality of the assigned subjects is affected by the cataloger's knowledge on the specific topics contained in the book. Trying to solve these issues, we exploited the opportunities arising from artificial intelligence to develop Kratt: a prototype of an automatic subject indexing tool. Kratt is able to subject index a book independent of its extent and genre with a set of keywords present in the Estonian Subject Thesaurus. It takes Kratt approximately one minute to subject index a book, outperforming humans 10-15 times. Although the resulting keywords were not considered satisfactory by the catalogers, the ratings of a small sample of regular library users showed more promise. We also argue that the results can be enhanced by including a bigger corpus for training the model and applying more careful preprocessing techniques.
  20. Faraj, N.: Analyse d'une methode d'indexation automatique basée sur une analyse syntaxique de texte (1996) 0.02
    0.01738785 = product of:
      0.08693925 = sum of:
        0.08693925 = weight(_text_:index in 685) [ClassicSimilarity], result of:
          0.08693925 = score(doc=685,freq=2.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.3862362 = fieldWeight in 685, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0625 = fieldNorm(doc=685)
      0.2 = coord(1/5)
    
    Abstract
    Evaluates an automatic indexing method based on syntactical text analysis combined with statistical analysis. Tests many combinations for the choice of term categories and weighting methods. The experiment, conducted on a software engineering corpus, shows systematic improvement in the use of syntactic term phrases compared to using only individual words as index terms

Years

Languages

  • e 39
  • d 20
  • f 1
  • ru 1
  • More… Less…

Types

  • a 53
  • x 5
  • el 3
  • m 2
  • s 1
  • More… Less…