Search (17 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Automatisches Indexieren"
  • year_i:[1990 TO 2000}
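  The three active filters above are Solr-style filter queries; note that the year range [1990 TO 2000} is half-open (1990 included, 2000 excluded). As a minimal sketch, assuming a hypothetical local Solr endpoint and handler (only the fq values are copied from the filter list), the search could be reproduced like this:

```python
from urllib.parse import urlencode

# Hypothetical Solr request reproducing the active facet filters above.
# Endpoint, core name, and rows are assumptions; the fq values are verbatim.
params = [
    ("q", "*:*"),
    ("fq", 'language_ss:"e"'),
    ("fq", 'theme_ss:"Automatisches Indexieren"'),
    ("fq", "year_i:[1990 TO 2000}"),  # half-open range: 1990 <= year < 2000
    ("rows", "20"),
]
url = "http://localhost:8983/solr/catalog/select?" + urlencode(params)
print(url)
```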
  1. Damerau, F.J.: Generating and evaluating domain-oriented multi-word terms from texts (1993) 0.01
    0.006899295 = product of:
      0.02759718 = sum of:
        0.016360147 = product of:
          0.04908044 = sum of:
            0.04908044 = weight(_text_:problem in 5814) [ClassicSimilarity], result of:
              0.04908044 = score(doc=5814,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.375163 = fieldWeight in 5814, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5814)
          0.33333334 = coord(1/3)
        0.011237033 = product of:
          0.033711098 = sum of:
            0.033711098 = weight(_text_:29 in 5814) [ClassicSimilarity], result of:
              0.033711098 = score(doc=5814,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.31092256 = fieldWeight in 5814, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5814)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
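    The tree above is Lucene explain output for ClassicSimilarity (TF-IDF) scoring: each term contributes queryWeight × fieldWeight, with tf(freq) = sqrt(freq), queryWeight = idf × queryNorm, and fieldWeight = tf × idf × fieldNorm; coord factors then scale for the fraction of query clauses that matched. A small sketch recomputing the numbers above (all inputs copied from the tree; only the arithmetic is new):

```python
import math

# Inputs copied from the "problem in 5814" branch of the explain tree.
freq, idf, query_norm, field_norm = 2.0, 4.244485, 0.030822188, 0.0625

tf = math.sqrt(freq)                      # 1.4142135 = tf(freq=2.0)
query_weight = idf * query_norm           # 0.13082431 = queryWeight
field_weight = tf * idf * field_norm      # 0.375163   = fieldWeight
term_score = query_weight * field_weight  # 0.04908044

# coord(1/3) scales each matching sub-clause; coord(2/8) = 0.25 scales the
# whole document score because only 2 of 8 query clauses matched doc 5814.
other_term = 0.033711098                  # the "29 in 5814" clause above
total = (term_score / 3 + other_term / 3) * 0.25
print(round(term_score, 8), round(total, 9))  # ~0.04908044  ~0.006899295
```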
    
    Abstract
    Examines techniques for automatically generating domain vocabularies from large text collections. Focuses on the problem of generating multi-word vocabulary terms (specifically pairs). Discusses statistical issues associated with word co-occurrences likely to be of use in a natural language interface. Since substantial experimentation with subjects using a working query system is absent, all evaluation is necessarily subjective; as a surrogate for experimentation, relies on pre-existing dictionaries as indicators of domain relevance in order to provide a more objective evaluation of the selection procedures
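    A minimal sketch of the general idea behind statistical pair selection, comparing observed bigram counts with the counts expected if the two words occurred independently; this is an illustration only, not Damerau's actual selection statistic:

```python
from collections import Counter

def scored_pairs(tokens, min_count=2):
    # Count unigrams and adjacent pairs, then compare observed pair
    # frequency with the count expected under independence.
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    for (w1, w2), obs in bigrams.items():
        if obs >= min_count:
            expected = unigrams[w1] * unigrams[w2] / n
            yield (w1, w2), obs / expected  # > 1: co-occur more than chance

text = "automatic indexing of text uses automatic indexing rules".split()
for pair, ratio in sorted(scored_pairs(text), key=lambda kv: -kv[1]):
    print(pair, round(ratio, 2))
```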
    Source
    Information processing and management. 29(1993) no.4, S.433-447
  2. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.01
    0.006014771 = product of:
      0.024059083 = sum of:
        0.014315128 = product of:
          0.042945385 = sum of:
            0.042945385 = weight(_text_:problem in 530) [ClassicSimilarity], result of:
              0.042945385 = score(doc=530,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.3282676 = fieldWeight in 530, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=530)
          0.33333334 = coord(1/3)
        0.009743956 = product of:
          0.029231867 = sum of:
            0.029231867 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
              0.029231867 = score(doc=530,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.2708308 = fieldWeight in 530, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=530)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO) to the problem of document indexing: the system uses NLP to determine the subject of a document's text and to associate the document with relevant semantic indexes. Briefly describes the overall system, the details of its implementation on a corpus of scientific abstracts on environmental topics, and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
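    The recall and precision measures used in that evaluation are the standard ones; a minimal sketch with invented document IDs:

```python
# Standard set-based recall/precision over retrieved vs. relevant documents.
def recall_precision(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

r, p = recall_precision(retrieved=["d1", "d2", "d3", "d4"],
                        relevant=["d1", "d3", "d5"])
print(f"recall={r:.2f} precision={p:.2f}")  # recall=0.67 precision=0.50
```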
    Source
    International forum on information and documentation. 22(1997) no.1, S.17-28
  3. Plaunt, C.; Norgard, B.A.: ¬An association-based method for automatic indexing with a controlled vocabulary (1998) 0.01
    0.005355108 = product of:
      0.021420432 = sum of:
        0.014460463 = product of:
          0.04338139 = sum of:
            0.04338139 = weight(_text_:problem in 1794) [ClassicSimilarity], result of:
              0.04338139 = score(doc=1794,freq=4.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.33160037 = fieldWeight in 1794, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1794)
          0.33333334 = coord(1/3)
        0.0069599687 = product of:
          0.020879906 = sum of:
            0.020879906 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
              0.020879906 = score(doc=1794,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.19345059 = fieldWeight in 1794, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1794)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and the controlled vocabulary subject headings assigned to those records by human indexers, using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial-match information retrieval problem: we consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document, based on the clues contained in that document
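    A sketch of the kind of likelihood-ratio association measure the abstract names, here Dunning's log-likelihood ratio over a 2×2 contingency table of term presence vs. heading assignment; the counts and the term/heading pair below are invented, and the authors' exact implementation may differ:

```python
import math

def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio for a 2x2 table.
    k11: records with both term and heading; k12: term only;
    k21: heading only; k22: neither."""
    def h(*ks):  # sum k*log(k/N) over nonzero cells
        n = sum(ks)
        return sum(k * math.log(k / n) for k in ks if k > 0)
    return 2 * (h(k11, k12, k21, k22)
                - h(k11 + k12, k21 + k22)
                - h(k11 + k21, k12 + k22))

# e.g. term "neural" vs. heading "Neural networks" over 1,000 records
print(round(llr(30, 20, 50, 900), 2))  # large value = strong association
```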
    Date
    11. 9.2000 19:53:22
  4. Wulfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web page categories (1997) 0.00
    0.00489409 = product of:
      0.03915272 = sum of:
        0.03915272 = product of:
          0.05872908 = sum of:
            0.029497212 = weight(_text_:29 in 2673) [ClassicSimilarity], result of:
              0.029497212 = score(doc=2673,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.27205724 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
            0.029231867 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
              0.029231867 = score(doc=2673,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.2708308 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.6666667 = coord(2/3)
      0.125 = coord(1/8)
    
    Date
    1. 8.1996 22:08:06
    Source
    Computer networks and ISDN systems. 29(1997) no.8, S.1147-1156
  5. Gomez, I.: Coping with the problem of subject classification diversity (1996) 0.00
    0.0025305813 = product of:
      0.02024465 = sum of:
        0.02024465 = product of:
          0.06073395 = sum of:
            0.06073395 = weight(_text_:problem in 5074) [ClassicSimilarity], result of:
              0.06073395 = score(doc=5074,freq=4.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.46424055 = fieldWeight in 5074, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5074)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    The delimitation of a research field in bibliometric studies presents the problem of the diversity of subject classifications used in the sources of input and output data. Classification of documents according to thematic codes or keywords is the most accurate method, mainly used in specialized bibliographic or patent databases. Classification of journals into disciplines offers lower specificity and some shortcomings, such as the change over time of both journals and disciplines and the increasing interdisciplinarity of research. Standardization of subject classifications emerges as an important point in bibliometric studies in order to allow international comparisons, although flexibility is needed to meet the needs of local studies
  6. Alexander, M.: Retrieving digital data with fuzzy matching (1997) 0.00
    0.0020450184 = product of:
      0.016360147 = sum of:
        0.016360147 = product of:
          0.04908044 = sum of:
            0.04908044 = weight(_text_:problem in 151) [ClassicSimilarity], result of:
              0.04908044 = score(doc=151,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.375163 = fieldWeight in 151, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0625 = fieldNorm(doc=151)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    In 1993 the British Library established a programme of activities entitled Initiatives for Access (IFA) to identify and develop computer applications based on the new technologies emerging in the areas of digital and network services. Discusses the problem of the effective retrieval of digital data after its capture, focusing on the product Excalibur EFS, which looks at the way information is stored at its most fundamental level and identifies patterns in numbers. Looks at the benefits of Excalibur and outlines other experiments in progress as part of the IFA programme
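    Excalibur EFS's adaptive pattern recognition is proprietary, so purely as an illustration of fuzzy matching in the generic sense, a sketch that retrieves terms despite capture (e.g. OCR) errors using similarity from the standard library:

```python
from difflib import SequenceMatcher

# Generic similarity-based fuzzy matching: find words in captured text
# that resemble the query term despite character-level noise.
def fuzzy_find(query, words, threshold=0.8):
    return [w for w in words
            if SequenceMatcher(None, query.lower(), w.lower()).ratio() >= threshold]

ocr_text = "digitaI Iibrary lnitiatives for Access programme".split()
print(fuzzy_find("initiatives", ocr_text))  # matches the garbled 'lnitiatives'
```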
  7. Salton, G.; Allan, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine-readable texts (1994) 0.00
    0.0017557865 = product of:
      0.014046292 = sum of:
        0.014046292 = product of:
          0.042138875 = sum of:
            0.042138875 = weight(_text_:29 in 1949) [ClassicSimilarity], result of:
              0.042138875 = score(doc=1949,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.38865322 = fieldWeight in 1949, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1949)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    16. 8.1998 12:30:29
  8. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.00
    0.0017399922 = product of:
      0.013919937 = sum of:
        0.013919937 = product of:
          0.04175981 = sum of:
            0.04175981 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.04175981 = score(doc=4157,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Ed. by Marlies Ockenfeld and Gerhard J. Mantwill
  9. Korycinski, C.; Newell, A.F.: Natural-language processing and automatic indexing (1990) 0.00
    0.0014046291 = product of:
      0.011237033 = sum of:
        0.011237033 = product of:
          0.033711098 = sum of:
            0.033711098 = weight(_text_:29 in 2313) [ClassicSimilarity], result of:
              0.033711098 = score(doc=2313,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.31092256 = fieldWeight in 2313, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2313)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Indexer. 17(1990), S.21-29
  10. Frants, V.I.; Kamenoff, N.I.; Shapiro, J.: ¬One approach to classification of users and automatic clustering of documents (1993) 0.00
    0.0014046291 = product of:
      0.011237033 = sum of:
        0.011237033 = product of:
          0.033711098 = sum of:
            0.033711098 = weight(_text_:29 in 4569) [ClassicSimilarity], result of:
              0.033711098 = score(doc=4569,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.31092256 = fieldWeight in 4569, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4569)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Information processing and management. 29(1993) no.2, S.187-195
  11. Haas, S.; He, S.: Toward the automatic identification of sublanguage vocabulary (1993) 0.00
    0.0014046291 = product of:
      0.011237033 = sum of:
        0.011237033 = product of:
          0.033711098 = sum of:
            0.033711098 = weight(_text_:29 in 4891) [ClassicSimilarity], result of:
              0.033711098 = score(doc=4891,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.31092256 = fieldWeight in 4891, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4891)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Information processing and management. 29(1993) no.6, S.721-744
  12. Molto, M.: Improving full text search performance through textual analysis (1993) 0.00
    0.0014046291 = product of:
      0.011237033 = sum of:
        0.011237033 = product of:
          0.033711098 = sum of:
            0.033711098 = weight(_text_:29 in 5099) [ClassicSimilarity], result of:
              0.033711098 = score(doc=5099,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.31092256 = fieldWeight in 5099, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5099)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Information processing and management. 29(1993) no.5, S.614-632
  13. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.00
    0.0013919937 = product of:
      0.01113595 = sum of:
        0.01113595 = product of:
          0.03340785 = sum of:
            0.03340785 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
              0.03340785 = score(doc=4709,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.30952093 = fieldWeight in 4709, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4709)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    31. 7.1996 9:22:19
  14. Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.00
    0.0013919937 = product of:
      0.01113595 = sum of:
        0.01113595 = product of:
          0.03340785 = sum of:
            0.03340785 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.03340785 = score(doc=6752,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    6. 3.1997 16:22:15
  15. Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.00
    0.0010534719 = product of:
      0.008427775 = sum of:
        0.008427775 = product of:
          0.025283325 = sum of:
            0.025283325 = weight(_text_:29 in 1660) [ClassicSimilarity], result of:
              0.025283325 = score(doc=1660,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23319192 = fieldWeight in 1660, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1660)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    29. 7.1998 17:40:01
  16. Ward, M.L.: ¬The future of the human indexer (1996) 0.00
    0.0010439953 = product of:
      0.008351962 = sum of:
        0.008351962 = product of:
          0.025055885 = sum of:
            0.025055885 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
              0.025055885 = score(doc=7244,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23214069 = fieldWeight in 7244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7244)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    9. 2.1997 18:44:22
  17. Milstead, J.L.: Thesauri in a full-text world (1998) 0.00
    8.699961E-4 = product of:
      0.0069599687 = sum of:
        0.0069599687 = product of:
          0.020879906 = sum of:
            0.020879906 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
              0.020879906 = score(doc=2337,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.19345059 = fieldWeight in 2337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2337)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    22. 9.1997 19:16:05