Search (50 results, page 1 of 3)

Malone, L.C.; Wildman-Pepe, J.; Driscoll, J.R.: Evaluation of an automated keywording system (1990) 0.03

0.032706924 = product of:
  0.081767306 = sum of:
    0.02060168 = product of:
      0.04120336 = sum of:
        0.04120336 = weight(_text_:problems in 4999) [ClassicSimilarity], result of:
          0.04120336 = score(doc=4999,freq=2.0), product of:
            0.15058853 = queryWeight, product of:
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.036484417 = queryNorm
            0.27361554 = fieldWeight in 4999, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.046875 = fieldNorm(doc=4999)
      0.5 = coord(1/2)
    0.061165623 = product of:
      0.12233125 = sum of:
        0.12233125 = weight(_text_:exercises in 4999) [ClassicSimilarity], result of:
          0.12233125 = score(doc=4999,freq=2.0), product of:
            0.25947425 = queryWeight, product of:
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.036484417 = queryNorm
            0.47145814 = fieldWeight in 4999, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.046875 = fieldNorm(doc=4999)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: An automated keywording system has been designed to artificially behave as a human "expert" indexer. The system was designed to keyword 100 to 800 word documents representing lessons learned from military exercises and operations. A set of 74 documents can be keyworded on an IBM PS/2 model 80 in about five minutes. This paper presents a variety of ways of statistically documenting improvements in the development of an automated keywording system over time. It is not only beneficial to have some measure of system performance at a given time; it is also useful, as attempts are made to improve a system, to assess whether statistically significant improvements have actually been made. Furthermore, it is useful to identify the source of any existing problems so that they can be rectified. The specifics of the automated system that was evaluated are described, and the performance measures used are discussed.
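
The score breakdown for this first hit can be checked by hand. The following is a minimal sketch in Python, not the search engine's own code: it assumes the usual ClassicSimilarity definitions (tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1))) and takes queryNorm, fieldNorm and the coord factors directly from the explain output above. The function names are illustrative only.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_weight(freq: float, doc_freq: int, max_docs: int,
                query_norm: float, field_norm: float) -> float:
    """weight = queryWeight * fieldWeight for a single query term."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                     # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 4999.
MAX_DOCS = 44218
QUERY_NORM = 0.036484417
FIELD_NORM = 0.046875

problems = term_weight(2.0, 1937, MAX_DOCS, QUERY_NORM, FIELD_NORM)
exercises = term_weight(2.0, 97, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Each term sits in a sub-query with coord(1/2); the outer query has coord(2/5).
score = 0.4 * (0.5 * problems + 0.5 * exercises)

print(f"problems  = {problems:.8f}")
print(f"exercises = {exercises:.8f}")
print(f"score     = {score:.9f}")
```

The printed values should agree with the listing (0.04120336, 0.12233125 and 0.032706924) up to the single-precision rounding used by the engine.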

Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.02

0.016534507 = product of:
  0.04133627 = sum of:
    0.024035294 = product of:
      0.048070587 = sum of:
        0.048070587 = weight(_text_:problems in 5001) [ClassicSimilarity], result of:
          0.048070587 = score(doc=5001,freq=2.0), product of:
            0.15058853 = queryWeight, product of:
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.036484417 = queryNorm
            0.31921813 = fieldWeight in 5001, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
      0.5 = coord(1/2)
    0.017300973 = product of:
      0.034601945 = sum of:
        0.034601945 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
          0.034601945 = score(doc=5001,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.2708308 = fieldWeight in 5001, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: A study was done to test the effectiveness of retrieval using title word searching. It was based on actual search profiles used in the Mechanized Information Center at Ohio State University, in order to replicate actual searching conditions as closely as possible. Fewer than 50% of the relevant titles were retrieved by keywords in titles. The low rate of retrieval can be attributed to three sources: the titles themselves, user and information specialist ignorance of the subject vocabulary in use, and general language problems. Across fields it was found that the social sciences had the best retrieval rate, with science having the next best, and arts and humanities the lowest. Ways to enhance and supplement keyword in title searching on the computer and in printed indexes are discussed.
Date: 14. 3.1996 13:22:21

Malone, L.C.; Driscoll, J.R.; Pepe, J.W.: Modeling the performance of an automated keywording system (1991) 0.02

0.016310833 = product of:
  0.08155417 = sum of:
    0.08155417 = product of:
      0.16310833 = sum of:
        0.16310833 = weight(_text_:exercises in 6682) [ClassicSimilarity], result of:
          0.16310833 = score(doc=6682,freq=2.0), product of:
            0.25947425 = queryWeight, product of:
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.036484417 = queryNorm
            0.62861085 = fieldWeight in 6682, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.11192 = idf(docFreq=97, maxDocs=44218)
              0.0625 = fieldNorm(doc=6682)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Abstract: Presents a model for predicting the performance of a computerised keyword assigning and indexing system. Statistical procedures were investigated in order to protect against incorrect keywording by the system, which behaves as an expert system designed to mimic the behaviour of human keyword indexers and works on documents representing lessons learned from military exercises and operations.

Busch, D.: Domänenspezifische hybride automatische Indexierung von bibliographischen Metadaten (2019) 0.01

0.014172435 = product of:
  0.035431087 = sum of:
    0.02060168 = product of:
      0.04120336 = sum of:
        0.04120336 = weight(_text_:problems in 5628) [ClassicSimilarity], result of:
          0.04120336 = score(doc=5628,freq=2.0), product of:
            0.15058853 = queryWeight, product of:
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.036484417 = queryNorm
            0.27361554 = fieldWeight in 5628, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.046875 = fieldNorm(doc=5628)
      0.5 = coord(1/2)
    0.014829405 = product of:
      0.02965881 = sum of:
        0.02965881 = weight(_text_:22 in 5628) [ClassicSimilarity], result of:
          0.02965881 = score(doc=5628,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.23214069 = fieldWeight in 5628, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=5628)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: At the Fraunhofer-Informationszentrum Raum und Bau (IRB), specialist literature in the field of planning and building is indexed bibliographically. The resulting documents (metadata records) are used, among other things, in producing the IRB's bibliographic databases. Fig. 1 shows a document that describes a journal article. The documents are indexed with descriptors from a nomenclature (Schlagwortliste IRB). A descriptor is "a term that can be used on its own, is unambiguously suited to characterizing content, and is permitted in the documentation system concerned". At present, indexing is carried out intellectually by human experts. Intellectual indexing is time-consuming and expensive. One solution to the problem is automatic indexing, in which descriptors are assigned by a computer program. Such computer programs are also referred to below as classifiers. This contribution deals with a system for the automatic indexing of German-language documents in the construction domain with descriptors from the Schlagwortliste IRB.
Source: B.I.T.online. 22(2019) H.6, S.465-469

Zeng, L.: Automatic indexing for Chinese text : problems and progress (1992) 0.01

0.009614117 = product of:
  0.048070587 = sum of:
    0.048070587 = product of:
      0.096141174 = sum of:
        0.096141174 = weight(_text_:problems in 1289) [ClassicSimilarity], result of:
          0.096141174 = score(doc=1289,freq=2.0), product of:
            0.15058853 = queryWeight, product of:
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.036484417 = queryNorm
            0.63843626 = fieldWeight in 1289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.109375 = fieldNorm(doc=1289)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.01

0.007909016 = product of:
  0.039545078 = sum of:
    0.039545078 = product of:
      0.079090156 = sum of:
        0.079090156 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.079090156 = score(doc=402,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: Information processing and management. 22(1986) no.6, S.465-476
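
The fieldNorm values in this listing (0.125, 0.109375, 0.09375, 0.078125, ...) all fit in three mantissa bits. That is consistent with a length norm of 1/sqrt(number of terms in the field) stored with very low precision, as older Lucene versions are understood to do. The sketch below is a simplified illustration under that assumption: it truncates to three mantissa bits and ignores exponent clamping and index-time boosts. The field lengths used are hypothetical, since the listing does not report them.

```python
import math

def truncate_to_3bit_mantissa(x: float) -> float:
    """Keep only three fractional mantissa bits of x (truncating, not rounding)."""
    if x <= 0.0:
        return 0.0
    m, e = math.frexp(x)               # x == m * 2**e with 0.5 <= m < 1
    m_trunc = math.floor(m * 16) / 16  # 1.fff form has 3 bits, so m has steps of 1/16
    return math.ldexp(m_trunc, e)

def approx_field_norm(num_terms: int) -> float:
    """Assumed length norm: 1/sqrt(num_terms), then low-precision truncation."""
    return truncate_to_3bit_mantissa(1.0 / math.sqrt(num_terms))

# Hypothetical field lengths, chosen only to illustrate the quantization.
for n in (64, 100, 256, 400):
    print(n, approx_field_norm(n))
```

Under these assumptions a 64-term field gives exactly 0.125, the value shown for doc 402, while a 100-term field is truncated from 0.1 down to 0.09375. Several different field lengths therefore map to the same fieldNorm.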

Humphrey, S.M.: Automatic indexing of documents from journal descriptors : a preliminary investigation (1999) 0.01
```
0.007095774 = product of:
  0.03547887 = sum of:
    0.03547887 = product of:
      0.07095774 = sum of:
        0.07095774 = weight(_text_:etc in 3769) [ClassicSimilarity], result of:
          0.07095774 = score(doc=3769,freq=2.0), product of:
            0.19761753 = queryWeight, product of:
              5.4164915 = idf(docFreq=533, maxDocs=44218)
              0.036484417 = queryNorm
            0.35906604 = fieldWeight in 3769, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4164915 = idf(docFreq=533, maxDocs=44218)
              0.046875 = fieldNorm(doc=3769)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract: A new, fully automated approach for indexing documents is presented based on associating textwords in a training set of bibliographic citations with the indexing of journals. This journal-level indexing is in the form of a consistent, timely set of journal descriptors (JDs) indexing the individual journals themselves. This indexing is maintained in journal records in a serials authority database. The advantage of this novel approach is that the training set does not depend on previous manual indexing of thousands of documents (i.e., any such indexing already in the training set is not used), but rather on the relatively small intellectual effort of indexing at the journal level, usually a matter of a few thousand unique journals for which retrospective indexing to maintain consistency and currency may be feasible. If successful, JD indexing would provide topical categorization of documents outside the training set, i.e., journal articles, monographs, Web documents, reports from the grey literature, etc., and could therefore be applied in searching. Because JDs are quite general, corresponding to subject domains, their most probable use would be for improving or refining search results.

Banerjee, K.; Johnson, M.: Improving access to archival collections with automated entity extraction (2015) 0.01
```
0.007095774 = product of:
  0.03547887 = sum of:
    0.03547887 = product of:
      0.07095774 = sum of:
        0.07095774 = weight(_text_:etc in 2144) [ClassicSimilarity], result of:
          0.07095774 = score(doc=2144,freq=2.0), product of:
            0.19761753 = queryWeight, product of:
              5.4164915 = idf(docFreq=533, maxDocs=44218)
              0.036484417 = queryNorm
            0.35906604 = fieldWeight in 2144, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4164915 = idf(docFreq=533, maxDocs=44218)
              0.046875 = fieldNorm(doc=2144)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract: The complexity and diversity of archival resources make constructing rich metadata records time-consuming and expensive, which in turn limits access to these valuable materials. However, significant automation of the metadata creation process would dramatically reduce the cost of providing access points, improve access to individual resources, and establish connections between resources that would otherwise remain unknown. Using a case study at Oregon Health & Science University as a lens to examine the conceptual and technical challenges associated with automated extraction of access points, we discuss using publicly accessible APIs to extract entities (i.e. people, places, concepts, etc.) from digital and digitized objects. We describe why Linked Open Data is not well suited for a use case such as ours. We conclude with recommendations about how this method can be used in archives as well as for other library applications.

Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.01

0.0069203894 = product of:
  0.034601945 = sum of:
    0.034601945 = product of:
      0.06920389 = sum of:
        0.06920389 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
          0.06920389 = score(doc=262,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.5416616 = fieldWeight in 262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=262)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 20.10.2000 12:22:23

Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.01

0.0069203894 = product of:
  0.034601945 = sum of:
    0.034601945 = product of:
      0.06920389 = sum of:
        0.06920389 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.06920389 = score(doc=6265,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: Information outlook. 9(2005) no.8, S.22-23

Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.01

0.005931762 = product of:
  0.02965881 = sum of:
    0.02965881 = product of:
      0.05931762 = sum of:
        0.05931762 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
          0.05931762 = score(doc=58,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.46428138 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 14. 6.2015 22:12:44

Hauer, M.: Automatische Indexierung (2000) 0.01

0.005931762 = product of:
  0.02965881 = sum of:
    0.02965881 = product of:
      0.05931762 = sum of:
        0.05931762 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
          0.05931762 = score(doc=5887,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.46428138 = fieldWeight in 5887, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5887)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt

Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.01

0.005931762 = product of:
  0.02965881 = sum of:
    0.02965881 = product of:
      0.05931762 = sum of:
        0.05931762 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
          0.05931762 = score(doc=2051,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.46428138 = fieldWeight in 2051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=2051)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 14. 6.2015 22:12:56

Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019) 0.01

0.005931762 = product of:
  0.02965881 = sum of:
    0.02965881 = product of:
      0.05931762 = sum of:
        0.05931762 = weight(_text_:22 in 5629) [ClassicSimilarity], result of:
          0.05931762 = score(doc=5629,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.46428138 = fieldWeight in 5629, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5629)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: B.I.T.online. 22(2019) H.2, S.163-166

Koryconski, C.; Newell, A.F.: Natural-language processing and automatic indexing (1990) 0.01

0.0054937815 = product of:
  0.027468907 = sum of:
    0.027468907 = product of:
      0.054937813 = sum of:
        0.054937813 = weight(_text_:problems in 2313) [ClassicSimilarity], result of:
          0.054937813 = score(doc=2313,freq=2.0), product of:
            0.15058853 = queryWeight, product of:
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.036484417 = queryNorm
            0.36482072 = fieldWeight in 2313, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.0625 = fieldNorm(doc=2313)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Abstract: The task of producing satisfactory indexes by automatic means has been tackled on two fronts: by statistical analysis of text and by attempting content analysis of the text in much the same way as a human indexer does. Though statistical techniques have a lot to offer for free-text database systems, neither method has had much success with back-of-the-book indexing. This review examines some problems associated with the application of natural-language processing techniques to book texts. See also the reply by K.P. Jones.

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: ¬The automatic indexing system AIR/PHYS : from research to application (1988) 0.00

0.004943135 = product of:
  0.024715675 = sum of:
    0.024715675 = product of:
      0.04943135 = sum of:
        0.04943135 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.04943135 = score(doc=1952,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 16. 8.1998 12:51:22

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.00

0.004943135 = product of:
  0.024715675 = sum of:
    0.024715675 = product of:
      0.04943135 = sum of:
        0.04943135 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.04943135 = score(doc=4157,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill

Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.00

0.004943135 = product of:
  0.024715675 = sum of:
    0.024715675 = product of:
      0.04943135 = sum of:
        0.04943135 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
          0.04943135 = score(doc=374,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.38690117 = fieldWeight in 374, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=374)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 1. 4.2002 10:22:41

Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.00

0.004943135 = product of:
  0.024715675 = sum of:
    0.024715675 = product of:
      0.04943135 = sum of:
        0.04943135 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
          0.04943135 = score(doc=2759,freq=2.0), product of:
            0.12776221 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036484417 = queryNorm
            0.38690117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 1. 2.2016 18:25:22

Vledutz-Stokolov, N.: Concept recognition in an automatic text-processing system for the life sciences (1987) 0.00
```
0.004855863 = product of:
  0.024279313 = sum of:
    0.024279313 = product of:
      0.048558626 = sum of:
        0.048558626 = weight(_text_:problems in 2849) [ClassicSimilarity], result of:
          0.048558626 = score(doc=2849,freq=4.0), product of:
            0.15058853 = queryWeight, product of:
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.036484417 = queryNorm
            0.322459 = fieldWeight in 2849, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1274753 = idf(docFreq=1937, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2849)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract: This article describes a natural-language text-processing system designed as an automatic aid to subject indexing at BIOSIS. The intellectual procedure the system should model is a deep indexing with a controlled vocabulary of biological concepts, the Concept Headings (CHs). On average, ten CHs are assigned to each article by BIOSIS indexers. The automatic procedure consists of two stages: (1) translation of natural-language biological titles into title-semantic representations which are in the constructed formalized language of Concept Primitives, and (2) translation of the latter representations into the language of CHs. The first stage is performed by matching the titles against the system's Semantic Vocabulary (SV). The SV currently contains approximately 15,000 biological natural-language terms and their translations in the language of Concept Primitives. For the ambiguous terms, the SV contains the algorithmic rules of term disambiguation, rules based on semantic analysis of the contexts. The second stage of the automatic procedure is performed by matching the title representations against the CH definitions, formulated as Boolean search strategies in the language of Concept Primitives. Three experiments performed with the system and their results are described. The most typical problems the system encounters, the problems of lexical and situational ambiguities, are discussed. The disambiguation techniques employed are described and demonstrated in many examples.
