Search (40 results, page 1 of 2)

  • year_i:[1990 TO 2000}
  • theme_ss:"Automatisches Indexieren"
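  The two active filters use Solr query syntax: in year_i:[1990 TO 2000} the square bracket makes 1990 inclusive and the curly brace makes 2000 exclusive, i.e. publication years 1990-1999. Below is a minimal sketch of reproducing this result page over HTTP; only the two fq values and the field names are taken from the page, while the core URL and the free-text query are assumptions:

      import requests

      SOLR_SELECT = "http://localhost:8983/solr/literature/select"  # hypothetical core

      params = {
          "q": "automatisches indexieren",   # free-text query (assumed)
          # Range filter: [ is inclusive, } is exclusive, so years 1990-1999.
          "fq": ['year_i:[1990 TO 2000}', 'theme_ss:"Automatisches Indexieren"'],
          "rows": 20,                        # 40 hits / 2 pages, as shown above
          "debugQuery": "true",              # yields score explanations like those below
          "wt": "json",
      }

      response = requests.get(SOLR_SELECT, params=params)
      print(response.json()["response"]["numFound"])  # expect 40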
  1. Milstead, J.L.: Thesauri in a full-text world (1998) 0.05
    0.049989868 = product of:
      0.099979736 = sum of:
        0.02586502 = weight(_text_:data in 2337) [ClassicSimilarity], result of:
          0.02586502 = score(doc=2337,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.17468026 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.07411472 = sum of:
          0.042392377 = weight(_text_:processing in 2337) [ClassicSimilarity], result of:
            0.042392377 = score(doc=2337,freq=2.0), product of:
              0.18956426 = queryWeight, product of:
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.046827413 = queryNorm
              0.22363065 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.03172234 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
            0.03172234 = score(doc=2337,freq=2.0), product of:
              0.16398162 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046827413 = queryNorm
              0.19345059 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
      0.5 = coord(2/4)
    
    Date
    22. 9.1997 19:16:05
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
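  The score tree above is Lucene's ClassicSimilarity explanation, and it can be checked against the documented formulas: tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, each term contributing queryWeight * fieldWeight, with coord(2/4) scaling the sum because 2 of 4 query clauses matched. A minimal Python check for result 1, with the two norm constants copied from the tree:

      import math

      # Lucene ClassicSimilarity (TFIDFSimilarity) primitives:
      def tf(freq):
          return math.sqrt(freq)

      def idf(doc_freq, max_docs):
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      QUERY_NORM = 0.046827413   # copied from the explanation tree
      FIELD_NORM = 0.0390625     # lengthNorm of doc 2337, also copied

      def term_score(freq, doc_freq, max_docs=44218):
          i = idf(doc_freq, max_docs)
          query_weight = i * QUERY_NORM              # idf * queryNorm
          field_weight = tf(freq) * i * FIELD_NORM   # tf * idf * fieldNorm
          return query_weight * field_weight

      data       = term_score(2.0, 5088)   # -> 0.02586502
      processing = term_score(2.0, 2097)   # -> 0.042392377
      term_22    = term_score(2.0, 3622)   # -> 0.03172234

      # 2 of 4 query clauses matched, hence the final coord(2/4) factor:
      total = (data + (processing + term_22)) * (2 / 4)
      print(total)                         # -> ~0.049989868, as shown above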
  2. Fox, C.: Lexical analysis and stoplists (1992) 0.05
    0.046219878 = product of:
      0.092439756 = sum of:
        0.058525857 = weight(_text_:data in 3502) [ClassicSimilarity], result of:
          0.058525857 = score(doc=3502,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.3952563 = fieldWeight in 3502, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=3502)
        0.033913903 = product of:
          0.067827806 = sum of:
            0.067827806 = weight(_text_:processing in 3502) [ClassicSimilarity], result of:
              0.067827806 = score(doc=3502,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.35780904 = fieldWeight in 3502, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3502)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Lexical analysis is a fundamental operation in both query processing and automatic indexing, and filtering stoplist words is an important step in the automatic indexing process. Presents basic algorithms and data structures for lexical analysis, and shows how stoplist word removal can be efficiently incorporated into lexical analysis
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes u. R. Baeza-Yates
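  The single-pass stoplist filtering Fox describes can be sketched as a tokenizer with the stoplist check folded into the lexing loop. Fox's chapter uses hashing or finite automata for the lookup; a set membership test is the simplest equivalent, and the stoplist contents here are a token sample:

      import re

      # A token stoplist; production lists run to several hundred words.
      STOPLIST = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})

      TOKEN = re.compile(r"[a-z0-9]+")

      def lexical_analysis(text):
          # Tokenize and drop stoplist words in one pass: the stoplist
          # check lives inside the lexer rather than in a separate
          # filtering stage over an intermediate token list.
          for match in TOKEN.finditer(text.lower()):
              token = match.group()
              if token not in STOPLIST:   # O(1) set lookup per token
                  yield token

      print(list(lexical_analysis("Lexical analysis is a fundamental operation")))
      # -> ['lexical', 'analysis', 'fundamental', 'operation']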
  3. Milstead, J.L.: Methodologies for subject analysis in bibliographic databases (1992) 0.04
    0.040442396 = product of:
      0.08088479 = sum of:
        0.051210128 = weight(_text_:data in 2311) [ClassicSimilarity], result of:
          0.051210128 = score(doc=2311,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.34584928 = fieldWeight in 2311, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2311)
        0.029674664 = product of:
          0.05934933 = sum of:
            0.05934933 = weight(_text_:processing in 2311) [ClassicSimilarity], result of:
              0.05934933 = score(doc=2311,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.3130829 = fieldWeight in 2311, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2311)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The goal of the study was to determine the state of the art of subject analysis as applied to large bibliographic data bases. The intent was to gather and evaluate information, casting it in a form that could be applied by management. There was no attempt to determine actual costs or trade-offs among costs and possible benefits. Commercial automatic indexing packages were also reviewed. The overall conclusion was that data base producers should begin working seriously on upgrading their thesauri and codifying their indexing policies as a means of moving toward development of machine aids to indexing, but that fully automatic indexing is not yet ready for wholesale implementation
    Source
    Information processing and management. 28(1992) no.3, S.407-431
  4. Polity, Y.: Vers une ergonomie linguistique (1994) 0.04
    0.03764897 = product of:
      0.07529794 = sum of:
        0.04138403 = weight(_text_:data in 36) [ClassicSimilarity], result of:
          0.04138403 = score(doc=36,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2794884 = fieldWeight in 36, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=36)
        0.033913903 = product of:
          0.067827806 = sum of:
            0.067827806 = weight(_text_:processing in 36) [ClassicSimilarity], result of:
              0.067827806 = score(doc=36,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.35780904 = fieldWeight in 36, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0625 = fieldNorm(doc=36)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Analyzed a special type of man-machine interaction, that of searching an information system with natural language. A model of full-text processing for information retrieval was proposed that considered the system's users and how they employ information. Describes how INIST (the National Institute for Scientific and Technical Information) is developing computer-assisted indexing as an aid to improving relevance when retrieving information from bibliographic data banks
  5. Driscoll, J.R.; Rajala, D.A.; Shaffer, W.H.: ¬The operation and performance of an artificially intelligent keywording system (1991) 0.03
    0.032942846 = product of:
      0.06588569 = sum of:
        0.036211025 = weight(_text_:data in 6681) [ClassicSimilarity], result of:
          0.036211025 = score(doc=6681,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24455236 = fieldWeight in 6681, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6681)
        0.029674664 = product of:
          0.05934933 = sum of:
            0.05934933 = weight(_text_:processing in 6681) [ClassicSimilarity], result of:
              0.05934933 = score(doc=6681,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.3130829 = fieldWeight in 6681, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6681)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Presents a new approach to text analysis for automating the key phrase indexing process, using artificial intelligence techniques. This mimics the behaviour of human experts by using a rule base consisting of insertion and deletion rules generated by subject-matter experts. The insertion rules are based on the idea that some phrases found in a text imply or trigger other phrases. The deletion rules apply to semantically ambiguous phrases whose presence in the text alone does not determine their appropriateness as key phrases. The insertion and deletion rules are used to transform a list of found phrases into a list of key phrases for indexing a document. Statistical data are provided to demonstrate the performance of this expert rule-based system
    Source
    Information processing and management. 27(1991) no.1, S.43-54
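  The insertion/deletion mechanism described in the abstract above lends itself to a compact sketch. The rule contents below are invented for illustration; the paper's actual rules came from subject-matter experts:

      # Insertion rules: a phrase found in the text triggers other phrases.
      INSERTION_RULES = {
          "retrieval": {"information retrieval"},
          "thesaurus": {"controlled vocabulary"},
      }

      # Deletion rules: a semantically ambiguous phrase is kept only if a
      # supporting phrase co-occurs in the found-phrase list.
      DELETION_RULES = {
          "indexing": {"automatic", "subject"},
      }

      def key_phrases(found_phrases):
          phrases = set(found_phrases)
          for phrase in found_phrases:                     # apply insertions
              phrases |= INSERTION_RULES.get(phrase, set())
          for phrase, support in DELETION_RULES.items():   # apply deletions
              if phrase in phrases and not (support & phrases):
                  phrases.discard(phrase)
          return sorted(phrases)

      print(key_phrases(["retrieval", "indexing"]))
      # -> ['information retrieval', 'retrieval']; 'indexing' lacked support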
  6. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.03
    0.032085977 = product of:
      0.12834391 = sum of:
        0.12834391 = sum of:
          0.08393263 = weight(_text_:processing in 530) [ClassicSimilarity], result of:
            0.08393263 = score(doc=530,freq=4.0), product of:
              0.18956426 = queryWeight, product of:
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.046827413 = queryNorm
              0.4427661 = fieldWeight in 530, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
          0.044411276 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
            0.044411276 = score(doc=530,freq=2.0), product of:
              0.16398162 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046827413 = queryNorm
              0.2708308 = fieldWeight in 530, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
      0.25 = coord(1/4)
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO) to the problem of document indexing: a system which incorporates NLP techniques to determine the subject of the text of documents and to associate them with relevant semantic indexes. Describes briefly the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics, and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
    Source
    International forum on information and documentation. 22(1997) no.1, S.17-28
  7. Chowdhury, G.G.: Natural language processing and information retrieval : pt.1: basic issues; pt.2: major applications (1991) 0.02
    0.018356439 = product of:
      0.073425755 = sum of:
        0.073425755 = product of:
          0.14685151 = sum of:
            0.14685151 = weight(_text_:processing in 3313) [ClassicSimilarity], result of:
              0.14685151 = score(doc=3313,freq=6.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.7746793 = fieldWeight in 3313, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3313)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Reviews the basic issues and procedures involved in natural language processing of textual material for final use in information retrieval. Covers: natural language processing; natural language understanding; syntactic and semantic analysis; parsing; knowledge bases and knowledge representation
  8. Salton, G.; Allen, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine-readable data (1994) 0.02
    0.018105512 = product of:
      0.07242205 = sum of:
        0.07242205 = weight(_text_:data in 1168) [ClassicSimilarity], result of:
          0.07242205 = score(doc=1168,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.48910472 = fieldWeight in 1168, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.109375 = fieldNorm(doc=1168)
      0.25 = coord(1/4)
    
  9. Jones, K.P.: Natural-language processing and automatic indexing : a reply (1990) 0.02
    0.016956951 = product of:
      0.067827806 = sum of:
        0.067827806 = product of:
          0.13565561 = sum of:
            0.13565561 = weight(_text_:processing in 394) [ClassicSimilarity], result of:
              0.13565561 = score(doc=394,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.7156181 = fieldWeight in 394, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.125 = fieldNorm(doc=394)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
  10. Taylor, S.L.: Integrating natural language understanding with document structure analysis (1994) 0.02
    0.016588641 = product of:
      0.066354565 = sum of:
        0.066354565 = product of:
          0.13270913 = sum of:
            0.13270913 = weight(_text_:processing in 1794) [ClassicSimilarity], result of:
              0.13270913 = score(doc=1794,freq=10.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.7000747 = fieldWeight in 1794, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1794)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Document understanding, the interpretation of a document from its image form, is a technology area which benefits greatly from the integration of natural language processing with image processing. Develops a prototype of an Intelligent Document Understanding System (IDUS) which employs several technologies in a cooperative fashion: image processing, optical character recognition, document structure analysis, and text understanding. Discusses the areas of research during the development of IDUS where the integration of natural language processing and image processing proved most beneficial: document structure analysis, OCR correction, and text analysis. Discusses 2 applications which are supported by IDUS: text retrieval and automatic generation of hypertext links
  11. Alexander, M.: Retrieving digital data with fuzzy matching (1997) 0.01
    0.014631464 = product of:
      0.058525857 = sum of:
        0.058525857 = weight(_text_:data in 151) [ClassicSimilarity], result of:
          0.058525857 = score(doc=151,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.3952563 = fieldWeight in 151, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=151)
      0.25 = coord(1/4)
    
    Abstract
    In 1993 the British Library established a programme of activities entitled Initiatives for Access (IFA) to identify and develop computer applications based on the new technologies emerging in the areas of digital and network services. Discusses the problem of the effective retrieval of digital data after its capture, focusing on the product Excalibur EFS, which looks at the way information is stored at its fundamental level and identifies patterns in numbers. Looks at the benefits of Excalibur and outlines other experiments in progress as part of the IFA programme
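  Excalibur EFS's adaptive pattern recognition is proprietary, so the following is not the product's algorithm; character n-gram overlap (a Dice coefficient over trigram sets) is a common stand-in that shows how fuzzy matching tolerates the errors typical of captured, OCR-damaged data:

      def ngrams(term, n=3):
          padded = f"  {term.lower()} "   # pad so word edges form n-grams
          return {padded[i:i + n] for i in range(len(padded) - n + 1)}

      def similarity(a, b, n=3):
          # Dice coefficient over character trigram sets: tolerant of the
          # single-character substitutions typical of OCR errors.
          ga, gb = ngrams(a, n), ngrams(b, n)
          return 2 * len(ga & gb) / (len(ga) + len(gb))

      # An OCR-garbled form still scores highly against the intended term:
      print(similarity("indexing", "indcxing"))   # -> ~0.67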
  12. Gödert, W.; Liebig, M.: Maschinelle Indexierung auf dem Prüfstand : Ergebnisse eines Retrievaltests zum MILOS II Projekt (1997) 0.01
    0.012802532 = product of:
      0.051210128 = sum of:
        0.051210128 = weight(_text_:data in 1174) [ClassicSimilarity], result of:
          0.051210128 = score(doc=1174,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.34584928 = fieldWeight in 1174, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1174)
      0.25 = coord(1/4)
    
    Abstract
    The test ran from Nov 95 to Aug 96 at the Fachhochschule für Bibliothekswesen (College of Librarianship), Cologne. The test basis was a database of 190,000 book titles published between 1990 and 1995. MILOS II mechanised indexing methods proved helpful in avoiding or reducing the number of unsatisfied or zero-result retrieval searches. Retrieval via mechanised indexing is 3 times more successful than via title keyword data. MILOS II also used a standardised semantic vocabulary. Mechanised indexing demands high-quality software and output data
  13. Koryconski, C.; Newell, A.F.: Natural-language processing and automatic indexing (1990) 0.01
    0.011990375 = product of:
      0.0479615 = sum of:
        0.0479615 = product of:
          0.095923 = sum of:
            0.095923 = weight(_text_:processing in 2313) [ClassicSimilarity], result of:
              0.095923 = score(doc=2313,freq=4.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.5060184 = fieldWeight in 2313, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2313)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The task of producing satisfactory indexes by automatic means has been tackled on two fronts: by statistical analysis of text and by attempting content analysis of the text in much the same way as a human indexer does. Though statistical techniques have a lot to offer for free-text database systems, neither method has had much success with back-of-the-book indexing. This review examines some problems associated with the application of natural-language processing techniques to book texts. - See also the reply by K.P. Jones
  14. Pritchard, J.: Information retrieval : smarter indexing (1991) 0.01
    0.010598094 = product of:
      0.042392377 = sum of:
        0.042392377 = product of:
          0.08478475 = sum of:
            0.08478475 = weight(_text_:processing in 4890) [ClassicSimilarity], result of:
              0.08478475 = score(doc=4890,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.4472613 = fieldWeight in 4890, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4890)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Describes full text retrieval (FTR), which indexes every occurrence of every word except defined 'stop' words. This permits much more sophisticated searching than keyword indexing. Also discusses document image processing (DIP). Lists suppliers and users of the software and describes the experiences of ESOO's Planning Division with Computer Intertrade Ltd. (CIL) ImagePro DIP and their operational practices
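  FTR as described in this entry, indexing every occurrence of every non-stop word, is in essence a positional inverted index. A minimal sketch, with the stoplist and sample documents invented:

      from collections import defaultdict

      STOP = {"a", "an", "and", "of", "the", "with"}

      def build_index(docs):
          # Positional inverted index: every occurrence of every word
          # except stop words, which is FTR as described above.
          index = defaultdict(list)   # term -> [(doc_id, position), ...]
          for doc_id, text in docs.items():
              for pos, word in enumerate(text.lower().split()):
                  if word not in STOP:
                      index[word].append((doc_id, pos))
          return index

      docs = {1: "indexing every occurrence of every word",
              2: "keyword indexing"}
      print(build_index(docs)["indexing"])   # -> [(1, 0), (2, 1)]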
  15. Alexander, M.: Retrieving digital data with fuzzy matching (1996) 0.01
    0.0103460075 = product of:
      0.04138403 = sum of:
        0.04138403 = weight(_text_:data in 6961) [ClassicSimilarity], result of:
          0.04138403 = score(doc=6961,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2794884 = fieldWeight in 6961, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=6961)
      0.25 = coord(1/4)
    
  16. Hirawa, M.: Role of keywords in the network searching era (1998) 0.01
    0.0103460075 = product of:
      0.04138403 = sum of:
        0.04138403 = weight(_text_:data in 3446) [ClassicSimilarity], result of:
          0.04138403 = score(doc=3446,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2794884 = fieldWeight in 3446, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=3446)
      0.25 = coord(1/4)
    
    Abstract
    A survey of Japanese OPACs available on the Internet was conducted concerning the use of keywords for subject access. The findings suggest that present OPACs are not capable of storing subject-oriented information. Currently available keyword access derives from a merely title-based retrieval system. Contents data should be added to bibliographic records as an efficient way of providing subject access, and costings for this process should be estimated. Word standardisation issues must also be addressed
  17. Micco, M.; Popp, R.: Improving library subject access (ILSA) : a theory of clustering based in classification (1994) 0.01
    0.009052756 = product of:
      0.036211025 = sum of:
        0.036211025 = weight(_text_:data in 7715) [ClassicSimilarity], result of:
          0.036211025 = score(doc=7715,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24455236 = fieldWeight in 7715, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7715)
      0.25 = coord(1/4)
    
    Abstract
    The ILSA prototype was developed using an object-oriented multimedia user interface on six NeXT workstations with two databases: the first with 100,000 MARC records and the second with 20,000 additional records enhanced with table of contents data. The items are grouped into subject clusters consisting of the classification number and the first subject heading assigned. Every other distinct keyword in the MARC record is linked to the subject cluster in an automated natural language mapping scheme, which leads the user from the term entered to the controlled vocabulary of the subject clusters in which the term appeared. The use of a hierarchical classification number (Dewey) makes it possible to broaden or narrow a search at will
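  The keyword-to-cluster mapping described above reduces to an inverted structure from keywords to (classification number, first subject heading) pairs. A sketch under that reading of the abstract, with invented sample records:

      from collections import defaultdict

      # keyword -> set of subject clusters, a cluster being the pair
      # (classification number, first subject heading) as in ILSA.
      keyword_to_clusters = defaultdict(set)

      def add_record(class_number, first_heading, keywords):
          cluster = (class_number, first_heading)
          for kw in keywords:
              keyword_to_clusters[kw.lower()].add(cluster)

      # Invented sample records:
      add_record("025.48", "Subject cataloging", ["indexing", "vocabulary"])
      add_record("006.35", "Natural language processing", ["indexing", "parsing"])

      # A user entering 'indexing' is led to every cluster the term appeared
      # in, then broadens or narrows via the hierarchical Dewey number:
      print(sorted(keyword_to_clusters["indexing"]))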
  18. Gomez, I.: Coping with the problem of subject classification diversity (1996) 0.01
    0.009052756 = product of:
      0.036211025 = sum of:
        0.036211025 = weight(_text_:data in 5074) [ClassicSimilarity], result of:
          0.036211025 = score(doc=5074,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24455236 = fieldWeight in 5074, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5074)
      0.25 = coord(1/4)
    
    Abstract
    The delimitation of a research field in bibliometric studies presents the problem of the diversity of subject classifications used in the sources of input and output data. Classification of documents according to thematic codes or keywords is the most accurate method, mainly used in specialized bibliographic or patent databases. Classification of journals into disciplines presents lower specificity and some shortcomings, such as the change over time of both journals and disciplines and the increasing interdisciplinarity of research. Standardization of subject classifications emerges as an important point in bibliometric studies in order to allow international comparisons, although flexibility is needed to meet the needs of local studies
  19. Wacholder, N.; Byrd, R.J.: Retrieving information from full text using linguistic knowledge (1994) 0.01
    0.008992782 = product of:
      0.035971127 = sum of:
        0.035971127 = product of:
          0.071942255 = sum of:
            0.071942255 = weight(_text_:processing in 8524) [ClassicSimilarity], result of:
              0.071942255 = score(doc=8524,freq=4.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.3795138 = fieldWeight in 8524, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=8524)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Examines how techniques in the field of natural language processing can be applied to the analysis of text in information retrieval. State-of-the-art text searching programs cannot distinguish, for example, between occurrences of AIDS the sickness and aids as tools, or between library school and school library, nor can they equate such terms as online and on-line, which are variants of the same form. To make these distinctions, systems must incorporate knowledge about the meaning of words in context. Research in natural language processing has concentrated on the automatic 'understanding' of language: how to analyze the grammatical structure and meaning of text. Although many aspects of this research remain experimental, describes how these techniques can be used to recognize spelling variants, names, acronyms, and abbreviations
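  Equating variants such as online and on-line is the easy half of the problem: it needs only term normalization before comparison. A minimal sketch follows; the folding rules are illustrative, not the authors' system, and the AIDS/aids case by contrast needs contextual knowledge that such rules cannot supply:

      import re

      def normalize(term):
          # Fold case and collapse hyphen/space variants so that 'online',
          # 'on-line' and 'On-Line' compare equal.
          return re.sub(r"[-\s]+", "", term.lower())

      assert normalize("on-line") == normalize("online") == normalize("On-Line")
      print(normalize("data base"))   # -> 'database'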
  20. Frants, V.I.; Kamenoff, N.I.; Shapiro, J.: ¬One approach to classification of users and automatic clustering of documents (1993) 0.01
    0.008478476 = product of:
      0.033913903 = sum of:
        0.033913903 = product of:
          0.067827806 = sum of:
            0.067827806 = weight(_text_:processing in 4569) [ClassicSimilarity], result of:
              0.067827806 = score(doc=4569,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.35780904 = fieldWeight in 4569, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4569)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 29(1993) no.2, S.187-195