Search (24 results, page 1 of 2)

  • × year_i:[1990 TO 2000}
  • × theme_ss:"Automatisches Indexieren"
  1. Plaunt, C.; Norgard, B.A.: An association-based method for automatic indexing with a controlled vocabulary (1998) 0.05
    0.05479938 = product of:
      0.10959876 = sum of:
        0.10959876 = sum of:
          0.078771055 = weight(_text_:subject in 1794) [ClassicSimilarity], result of:
            0.078771055 = score(doc=1794,freq=12.0), product of:
              0.16275941 = queryWeight, product of:
                3.576596 = idf(docFreq=3361, maxDocs=44218)
                0.04550679 = queryNorm
              0.48397237 = fieldWeight in 1794, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                3.576596 = idf(docFreq=3361, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1794)
          0.03082771 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
            0.03082771 = score(doc=1794,freq=2.0), product of:
              0.15935703 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04550679 = queryNorm
              0.19345059 = fieldWeight in 1794, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1794)
      0.5 = coord(1/2)
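    The indented block above is Lucene's ClassicSimilarity "explain" output for the top hit. As a sanity check, the printed factors can be recombined in a few lines. This is a minimal sketch assuming the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1))); queryNorm is taken as given, since it depends on query clauses not shown on this page.
    ```python
    import math

    MAX_DOCS = 44218
    QUERY_NORM = 0.04550679  # from the explain output; depends on the full query

    def idf(doc_freq: int) -> float:
        """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
        return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

    def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
        """One term's contribution: queryWeight * fieldWeight."""
        query_weight = idf(doc_freq) * QUERY_NORM                    # idf * queryNorm
        field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm  # tf * idf * fieldNorm
        return query_weight * field_weight

    subject = term_score(freq=12.0, doc_freq=3361, field_norm=0.0390625)
    term_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.0390625)
    coord = 1 / 2  # only 1 of the 2 top-level query clauses matched

    print(coord * (subject + term_22))  # ~0.05479938, matching the explain output
    ```
    The remaining explain blocks on this page recombine the same way; only freq, docFreq, and fieldNorm change from hit to hit.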
    
    Abstract
    In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and the controlled vocabulary subject headings assigned to those records by human indexers, using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial-match information retrieval problem: we consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document.
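    As a concrete illustration of the two-stage method, here is a minimal sketch of an association dictionary built with Dunning's log-likelihood ratio, the usual form of the likelihood ratio statistic mentioned above. The record layout and the summing of association scores at deployment time are illustrative assumptions, not Plaunt and Norgard's actual implementation.
    ```python
    import math
    from collections import Counter

    def llr(k11, k12, k21, k22):
        """Dunning's log-likelihood ratio for a 2x2 contingency table."""
        n = k11 + k12 + k21 + k22
        g2 = 0.0
        for k, row, col in [(k11, k11 + k12, k11 + k21), (k12, k11 + k12, k12 + k22),
                            (k21, k21 + k22, k11 + k21), (k22, k21 + k22, k12 + k22)]:
            if k > 0:
                g2 += k * math.log(k * n / (row * col))
        return 2 * g2

    def train(records):
        """records: iterable of (set of lexical items, set of assigned headings)."""
        n, term_df, head_df, pair_df = 0, Counter(), Counter(), Counter()
        for terms, heads in records:
            n += 1
            term_df.update(terms)
            head_df.update(heads)
            pair_df.update((t, h) for t in terms for h in heads)
        dictionary = {}
        for (t, h), k11 in pair_df.items():
            k12 = term_df[t] - k11     # records with the term but not the heading
            k21 = head_df[h] - k11     # records with the heading but not the term
            k22 = n - k11 - k12 - k21  # records with neither
            dictionary.setdefault(t, {})[h] = llr(k11, k12, k21, k22)
        return dictionary

    def assign(dictionary, terms, top_k=5):
        """Deployment stage: rank headings by summed association (an assumption)."""
        scores = Counter()
        for t in terms:
            for h, s in dictionary.get(t, {}).items():
                scores[h] += s
        return scores.most_common(top_k)
    ```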
    Date
    11. 9.2000 19:53:22
  2. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.04
    0.0440901 = product of:
      0.0881802 = sum of:
        0.0881802 = sum of:
          0.045021407 = weight(_text_:subject in 530) [ClassicSimilarity], result of:
            0.045021407 = score(doc=530,freq=2.0), product of:
              0.16275941 = queryWeight, product of:
                3.576596 = idf(docFreq=3361, maxDocs=44218)
                0.04550679 = queryNorm
              0.27661324 = fieldWeight in 530, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.576596 = idf(docFreq=3361, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
          0.043158792 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
            0.043158792 = score(doc=530,freq=2.0), product of:
              0.15935703 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04550679 = queryNorm
              0.2708308 = fieldWeight in 530, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
      0.5 = coord(1/2)
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques, in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO), to the problem of document indexing, referring to a system which incorporates NLP techniques to determine the subject of document texts and to associate them with relevant semantic indexes. Briefly describes the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics, and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
    Source
    International forum on information and documentation. 22(1997) no.1, S.17-28
  3. Milstead, J.L.: Thesauri in a full-text world (1998) 0.03
    0.03149293 = product of:
      0.06298586 = sum of:
        0.06298586 = sum of:
          0.032158148 = weight(_text_:subject in 2337) [ClassicSimilarity], result of:
            0.032158148 = score(doc=2337,freq=2.0), product of:
              0.16275941 = queryWeight, product of:
                3.576596 = idf(docFreq=3361, maxDocs=44218)
                0.04550679 = queryNorm
              0.19758089 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.576596 = idf(docFreq=3361, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.03082771 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
            0.03082771 = score(doc=2337,freq=2.0), product of:
              0.15935703 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04550679 = queryNorm
              0.19345059 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
      0.5 = coord(1/2)
    
    Date
    22. 9.1997 19:16:05
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  4. Micco, M.; Popp, R.: Improving library subject access (ILSA) : a theory of clustering based in classification (1994) 0.03
    0.025167733 = product of:
      0.050335467 = sum of:
        0.050335467 = product of:
          0.10067093 = sum of:
            0.10067093 = weight(_text_:subject in 7715) [ClassicSimilarity], result of:
              0.10067093 = score(doc=7715,freq=10.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.61852604 = fieldWeight in 7715, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7715)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The ILSA prototype was developed using an object-oriented multimedia user interface on six NeXT workstations with two databases: the first with 100,000 MARC records and the second with 20,000 additional records enhanced with table-of-contents data. The items are grouped into subject clusters consisting of the classification number and the first subject heading assigned. Every other distinct keyword in the MARC record is linked to the subject cluster in an automated natural language mapping scheme, which leads the user from the term entered to the controlled vocabulary of the subject clusters in which the term appeared. The use of a hierarchical classification number (Dewey) makes it possible to broaden or narrow a search at will
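    A minimal sketch of the clustering and mapping scheme, assuming hypothetical record fields (class_no, subject_headings, keywords); real MARC parsing and the NeXT interface are out of scope.
    ```python
    from collections import defaultdict

    def build_clusters(records):
        """records: iterable of dicts with 'class_no', 'subject_headings',
        and 'keywords' (every other distinct keyword in the record)."""
        keyword_to_clusters = defaultdict(set)
        for rec in records:
            # A subject cluster is the classification number plus the first heading.
            cluster = (rec["class_no"], rec["subject_headings"][0])
            for kw in rec["keywords"]:
                keyword_to_clusters[kw.lower()].add(cluster)
        return keyword_to_clusters

    def lookup(keyword_to_clusters, entered_term):
        """Lead the user from a free-text term to the controlled clusters."""
        return sorted(keyword_to_clusters.get(entered_term.lower(), set()))

    def broaden(cluster):
        """Crudely truncate a Dewey number to widen a search, e.g. 025.47 -> 025.4."""
        class_no, heading = cluster
        return (class_no[:-1].rstrip("."), heading)
    ```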
  5. Hirawa, M.: Role of keywords in the network searching era (1998) 0.02
    0.02227982 = product of:
      0.04455964 = sum of:
        0.04455964 = product of:
          0.08911928 = sum of:
            0.08911928 = weight(_text_:subject in 3446) [ClassicSimilarity], result of:
              0.08911928 = score(doc=3446,freq=6.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.5475522 = fieldWeight in 3446, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3446)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A survey of Japanese OPACs available on the Internet was conducted on the use of keywords for subject access. The findings suggest that present OPACs are not capable of storing subject-oriented information. Currently available keyword access derives from a merely title-based retrieval system. Contents data should be added to bibliographic records as an efficient way of providing subject access, and costings for this process should be estimated. Word standardisation issues must also be addressed
  6. Gomez, I.: Coping with the problem of subject classification diversity (1996) 0.02
    0.019494843 = product of:
      0.038989685 = sum of:
        0.038989685 = product of:
          0.07797937 = sum of:
            0.07797937 = weight(_text_:subject in 5074) [ClassicSimilarity], result of:
              0.07797937 = score(doc=5074,freq=6.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.4791082 = fieldWeight in 5074, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5074)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The delimitation of a research field in bibliometric studies presents the problem of the diversity of subject classifications used in the sources of input and output data. Classification of documents according to thematic codes or keywords is the most accurate method, mainly used in specialized bibliographic or patent databases. Classification of journals into disciplines presents lower specificity, and some shortcomings such as the change over time of both journals and disciplines and the increasing interdisciplinarity of research. Standardization of subject classifications emerges as an important point in bibliometric studies in order to allow international comparisons, although flexibility is needed to meet the needs of local studies
  7. Yongcheng, W.; Xiaoming, G.; Lixia, W.: Automatic indexing on subject of Chinese text (1998) 0.02
    0.016079074 = product of:
      0.032158148 = sum of:
        0.032158148 = product of:
          0.064316295 = sum of:
            0.064316295 = weight(_text_:subject in 3241) [ClassicSimilarity], result of:
              0.064316295 = score(doc=3241,freq=2.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.39516178 = fieldWeight in 3241, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3241)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  8. Milstead, J.L.: Methodologies for subject analysis in bibliographic databases (1992) 0.02
    0.01591747 = product of:
      0.03183494 = sum of:
        0.03183494 = product of:
          0.06366988 = sum of:
            0.06366988 = weight(_text_:subject in 2311) [ClassicSimilarity], result of:
              0.06366988 = score(doc=2311,freq=4.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.3911902 = fieldWeight in 2311, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2311)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The goal of the study was to determine the state of the art of subject analysis as applied to large bibliographic data bases. The intent was to gather and evaluate information, casting it in a form that could be applied by management. There was no attempt to determine actual costs or trade-offs among costs and possible benefits. Commercial automatic indexing packages were also reviewed. The overall conclusion was that data base producers should begin working seriously on upgrading their thesauri and codifying their indexing policies as a means of moving toward development of machine aids to indexing, but that fully automatic indexing is not yet ready for wholesale implementation
  9. Losee, R.M.: A Gray code based ordering for documents on shelves : classification for browsing and retrieval (1992) 0.02
    0.01591747 = product of:
      0.03183494 = sum of:
        0.03183494 = product of:
          0.06366988 = sum of:
            0.06366988 = weight(_text_:subject in 2335) [ClassicSimilarity], result of:
              0.06366988 = score(doc=2335,freq=4.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.3911902 = fieldWeight in 2335, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2335)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A document classifier places documents together in a linear arrangement for browsing or high-speed access by human or computerised information retrieval systems. Requirements for document classification and browsing systems are developed from similarity measures, distance measures, and the notion of subject aboutness. A requirement that documents be arranged in decreasing order of similarity as the distance from a given document increases often cannot be met. Based on these requirements, information-theoretic considerations, and the Gray code, a classification system is proposed that can classify documents without human intervention. A measure of classifier performance is developed and used to evaluate experimental results comparing the distance between subject headings assigned to documents given classifications from the proposed system and the Library of Congress Classification (LCC) system
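    A small sketch of the Gray code idea, assuming documents have been reduced to binary subject-feature vectors (the derivation of those features from document content is not covered here). Reading a feature pattern as a binary-reflected Gray code and ranking it by the code's inverse yields a linear order in which adjacent patterns differ in exactly one feature.
    ```python
    def gray_to_rank(g: int) -> int:
        """Invert the binary-reflected Gray code g = n ^ (n >> 1)."""
        rank = 0
        while g:
            rank ^= g
            g >>= 1
        return rank

    def shelf_order(docs):
        """docs: dict mapping a document id to a tuple of 0/1 subject features
        (hypothetical). Returns ids ordered for shelving or browsing."""
        def rank(bits):
            g = int("".join(map(str, bits)), 2)  # read the pattern as a Gray code
            return gray_to_rank(g)
        return sorted(docs, key=lambda d: rank(docs[d]))

    books = {"a": (0, 0, 1), "b": (0, 1, 1), "c": (0, 1, 0), "d": (1, 1, 0)}
    print(shelf_order(books))  # ['a', 'b', 'c', 'd']: each neighbour differs in one feature
    ```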
  10. Shafer, K.: Scorpion Project explores using Dewey to organize the Web (1996) 0.02
    0.01591747 = product of:
      0.03183494 = sum of:
        0.03183494 = product of:
          0.06366988 = sum of:
            0.06366988 = weight(_text_:subject in 6750) [ClassicSimilarity], result of:
              0.06366988 = score(doc=6750,freq=4.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.3911902 = fieldWeight in 6750, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6750)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    As the amount of accessible information on the WWW increases, so will the cost of accessing it, even if search services remain free, due to the increasing amount of time users will have to spend finding needed items. Considers what the seemingly unorganized Web and the organized world of libraries can offer each other. The OCLC Scorpion Project is attempting to combine indexing and cataloguing, specifically focusing on building tools for automatic subject recognition using the techniques of library science and information retrieval. If subject headings or concept domains can be automatically assigned to electronic items, improved filtering tools for searching can be produced
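    As an illustration of subject recognition cast as retrieval, one might rank classes by the similarity of their textual descriptions to the item's text. This toy sketch is not Scorpion's actual method, and the class description texts are assumed inputs.
    ```python
    import math
    from collections import Counter

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def recognise_subject(item_text, class_descriptions, top_k=3):
        """class_descriptions: dict Dewey number -> description text (assumed)."""
        item = Counter(item_text.lower().split())
        return sorted(class_descriptions,
                      key=lambda c: cosine(item, Counter(class_descriptions[c].lower().split())),
                      reverse=True)[:top_k]
    ```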
  11. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02
    0.015413855 = product of:
      0.03082771 = sum of:
        0.03082771 = product of:
          0.06165542 = sum of:
            0.06165542 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.06165542 = score(doc=4157,freq=2.0), product of:
                0.15935703 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04550679 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
  12. Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.02
    0.015413855 = product of:
      0.03082771 = sum of:
        0.03082771 = product of:
          0.06165542 = sum of:
            0.06165542 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
              0.06165542 = score(doc=374,freq=2.0), product of:
                0.15935703 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04550679 = queryNorm
                0.38690117 = fieldWeight in 374, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=374)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 4.2002 10:22:41
  13. Schuegraf, E.J.; Bommel, M.F.van: An automatic document indexing system based on cooperating expert systems : design and development (1993) 0.01
    0.012863259 = product of:
      0.025726518 = sum of:
        0.025726518 = product of:
          0.051453035 = sum of:
            0.051453035 = weight(_text_:subject in 6504) [ClassicSimilarity], result of:
              0.051453035 = score(doc=6504,freq=2.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.31612942 = fieldWeight in 6504, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6504)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Discusses the design of an automatic indexing system based on two cooperating expert systems and the investigation related to its development. The design combines statistical and artificial intelligence techniques. Examines choice of content indicators, the effect of stemming and the identification of characteristic vocabularies for given subject areas. Presents experimental results. Discusses the application of machine learning algorithms to the identification of vocabularies
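    One plausible reading of the vocabulary-identification step is to extract terms that are markedly more frequent in a subject area's documents than in the collection at large; the ratio threshold and add-one smoothing below are arbitrary assumptions, not the authors' design.
    ```python
    from collections import Counter

    def characteristic_vocabulary(area_docs, all_docs, min_ratio=3.0, min_count=5):
        """Terms whose relative frequency in the subject area exceeds their
        collection-wide relative frequency by min_ratio. Docs are plain strings."""
        area = Counter(t for d in area_docs for t in d.lower().split())
        coll = Counter(t for d in all_docs for t in d.lower().split())
        n_area, n_coll = sum(area.values()), sum(coll.values())
        out = []
        for term, k in area.items():
            if k < min_count:
                continue
            ratio = (k / n_area) / ((coll[term] + 1) / n_coll)  # add-one smoothing
            if ratio >= min_ratio:
                out.append((term, round(ratio, 1)))
        return sorted(out, key=lambda x: -x[1])
    ```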
  14. Konings, E.; Gramsbergen, E.: Automatische onderwerpsondexering van een bibliotheekscatalogus : Ervaringen van de Bibliotheek TU Delft (1999) 0.01
    0.012863259 = product of:
      0.025726518 = sum of:
        0.025726518 = product of:
          0.051453035 = sum of:
            0.051453035 = weight(_text_:subject in 3263) [ClassicSimilarity], result of:
              0.051453035 = score(doc=3263,freq=2.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.31612942 = fieldWeight in 3263, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3263)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    Translation of the title: Experiences at Delft Technical University: automatic subject indexing of a library catalogue
  15. Prasad, A.R.D.: PROMETHEUS: an automatic indexing system (1996) 0.01
    0.012863259 = product of:
      0.025726518 = sum of:
        0.025726518 = product of:
          0.051453035 = sum of:
            0.051453035 = weight(_text_:subject in 5189) [ClassicSimilarity], result of:
              0.051453035 = score(doc=5189,freq=2.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.31612942 = fieldWeight in 5189, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5189)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    An automatic indexing system using the tools and techniques of artificial intelligence is described. The paper presents the various components of the system, such as the parser, grammar formalism, lexicon, and the frame-based knowledge representation used for semantic representation. The semantic representation is based on the Ranganathan school of thought, especially the Deep Structure of Subject Indexing Languages enunciated by Bhattacharyya. The various steps in indexing are demonstrated by means of an illustration
  16. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.01
    0.012331083 = product of:
      0.024662167 = sum of:
        0.024662167 = product of:
          0.049324334 = sum of:
            0.049324334 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
              0.049324334 = score(doc=4709,freq=2.0), product of:
                0.15935703 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04550679 = queryNorm
                0.30952093 = fieldWeight in 4709, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4709)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  17. Riloff, E.: An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01
    0.012331083 = product of:
      0.024662167 = sum of:
        0.024662167 = product of:
          0.049324334 = sum of:
            0.049324334 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.049324334 = score(doc=6752,freq=2.0), product of:
                0.15935703 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04550679 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    6. 3.1997 16:22:15
  18. Clavel, G.; Walther, F.; Walther, J.: Indexation automatique de fonds bibliotheconomiques (1993) 0.01
    0.011255352 = product of:
      0.022510704 = sum of:
        0.022510704 = product of:
          0.045021407 = sum of:
            0.045021407 = weight(_text_:subject in 6610) [ClassicSimilarity], result of:
              0.045021407 = score(doc=6610,freq=2.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.27661324 = fieldWeight in 6610, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6610)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A discussion of developments to date in the field of computerized indexing, based on presentations given at a seminar held at the Institute of Policy Studies in Paris in Nov 91. The methods tested so far, based on a linguistic approach, whether using natural language or special thesauri, encounter the same central problem - they are only successful when applied to collections of similar types of documents covering very specific subject areas. Despite this, the search for some sort of universal indexing metalanguage continues. In the end, computerized indexing works best when used in conjunction with manual indexing - ideally in the hands of a trained library science professional, who can extract the maximum value from a collection of documents for a particular user population
  19. Driscoll, J.R.; Rajala, D.A.; Shaffer, W.H.: ¬The operation and performance of an artificially intelligent keywording system (1991) 0.01
    0.011255352 = product of:
      0.022510704 = sum of:
        0.022510704 = product of:
          0.045021407 = sum of:
            0.045021407 = weight(_text_:subject in 6681) [ClassicSimilarity], result of:
              0.045021407 = score(doc=6681,freq=2.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.27661324 = fieldWeight in 6681, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6681)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Presents a new approach to text analysis for automating the key phrase indexing process, using artificial intelligence techniques. This mimics the behaviour of human experts by using a rule base consisting of insertion and deletion rules generated by subject-matter experts. The insertion rules are based on the idea that some phrases found in a text imply or trigger other phrases. The deletion rules apply to semantically ambiguous phrases where text presence alone does not determine appropriateness as a key phrase. The insertion and deletion rules are used to transform a list of found phrases to a list of key phrases for indexing a document. Statistical data are provided to demonstrate the performance of this expert rule based system
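    A minimal sketch of the insertion/deletion rule mechanism as described, with hypothetical rule shapes: a map from trigger phrases to implied phrases, and per-phrase predicates deciding whether an ambiguous phrase is appropriate in context. The real system's rule format is not specified in the abstract.
    ```python
    def apply_rules(found_phrases, insertion_rules, deletion_rules, text):
        """found_phrases: phrases detected in the text.
        insertion_rules: dict trigger phrase -> set of implied phrases.
        deletion_rules: dict ambiguous phrase -> keep_if(text) predicate."""
        keyphrases = set(found_phrases)
        for phrase in found_phrases:                    # insertion: phrases imply others
            keyphrases |= insertion_rules.get(phrase, set())
        for phrase, keep_if in deletion_rules.items():  # deletion: drop ambiguous ones
            if phrase in keyphrases and not keep_if(text):
                keyphrases.discard(phrase)
        return sorted(keyphrases)

    print(apply_rules(
        ["magnetic tape", "windows"],
        {"magnetic tape": {"data storage"}},
        {"windows": lambda text: "operating system" in text},
        "report on magnetic tape archives",
    ))  # ['data storage', 'magnetic tape'] -- 'windows' dropped as inappropriate
    ```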
  20. Lepsky, K.; Siepmann, J.; Zimmermann, A.: Automatische Indexierung für Online-Kataloge : Ergebnisse eines Retrievaltests (1996) 0.01
    0.011255352 = product of:
      0.022510704 = sum of:
        0.022510704 = product of:
          0.045021407 = sum of:
            0.045021407 = weight(_text_:subject in 3251) [ClassicSimilarity], result of:
              0.045021407 = score(doc=3251,freq=2.0), product of:
                0.16275941 = queryWeight, product of:
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.04550679 = queryNorm
                0.27661324 = fieldWeight in 3251, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.576596 = idf(docFreq=3361, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3251)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Examines the effectiveness of automated indexing and presents the results of a study of information retrieval from a segment (40,000 items) of the ULB Düsseldorf database. The segment was selected randomly and all the documents included were indexed automatically. The search topics included 50 subject areas ranging from economic growth to alternative energy sources. While there were 876 relevant documents in the database segment in total for the 50 search topics, the recall per topic ranged from 1 to 244 references, with the average being 17.52 documents per topic. Therefore it seems that, in the immediate future, automatic indexing should be used in combination with intellectual indexing