Search (214 results, page 2 of 11)

Frants, V.I.; Kamenoff, N.I.; Shapiro, J.: One approach to classification of users and automatic clustering of documents (1993) 0.00

0.004463867 = product of:
  0.0133916 = sum of:
    0.0133916 = product of:
      0.0267832 = sum of:
        0.0267832 = weight(_text_:of in 4569) [ClassicSimilarity], result of:
          0.0267832 = score(doc=4569,freq=16.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.39093933 = fieldWeight in 4569, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=4569)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Shows how to automatically construct a classification of users and a clustering of documents on the basis of users' information needs by creating clusters of documents and cross-references among clusters using users' search requests. Examines feedback in the construction of this classification and clustering so that the classification can be changed over time to reflect the changing needs of the users
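
The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the formulas of Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any library. All input values are copied from the breakdown for doc 4569, and fieldNorm and queryNorm are taken as given rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute the final score from the leaves of the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm   # fieldWeight
    score = query_weight * field_weight                  # weight(_text_:of)
    for coord in coords:                                 # coord(1/2), coord(1/3)
        score *= coord
    return score


# Values for doc 4569, copied from the breakdown above.
score_4569 = explain_score(
    freq=16.0,
    doc_freq=25162,
    max_docs=44218,
    query_norm=0.043811057,
    field_norm=0.0625,
    coords=(1 / 2, 1 / 3),
)
print(f"idf   = {classic_idf(25162, 44218):.7f}")
print(f"score = {score_4569:.9f}")
```

Lucene computes these values in single precision, so the last digits printed by this sketch may differ slightly from the listing.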

Luhn, H.P.: A statistical approach to the mechanical encoding and searching of literary information (1957) 0.00

0.004463867 = product of:
  0.0133916 = sum of:
    0.0133916 = product of:
      0.0267832 = sum of:
        0.0267832 = weight(_text_:of in 5453) [ClassicSimilarity], result of:
          0.0267832 = score(doc=5453,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.39093933 = fieldWeight in 5453, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.125 = fieldNorm(doc=5453)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: IBM journal of research and development. 1(1957), S.309-317

Salton, G.; Yang, C.S.: On the specification of term values in automatic indexing (1973) 0.00

0.004463867 = product of:
  0.0133916 = sum of:
    0.0133916 = product of:
      0.0267832 = sum of:
        0.0267832 = weight(_text_:of in 5476) [ClassicSimilarity], result of:
          0.0267832 = score(doc=5476,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.39093933 = fieldWeight in 5476, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.125 = fieldNorm(doc=5476)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Journal of documentation. 29(1973), S.351-372

Croft, W.B.: Clustering large files of documents using the single link method (1977) 0.00

0.004463867 = product of:
  0.0133916 = sum of:
    0.0133916 = product of:
      0.0267832 = sum of:
        0.0267832 = weight(_text_:of in 5489) [ClassicSimilarity], result of:
          0.0267832 = score(doc=5489,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.39093933 = fieldWeight in 5489, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.125 = fieldNorm(doc=5489)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Journal of the American Society for Information Science. 28(1977), S.341-344

Abdul, H.; Khoo, C.: Automatic indexing of medical literature using phrase matching : an exploratory study 0.00

0.004463867 = product of:
  0.0133916 = sum of:
    0.0133916 = product of:
      0.0267832 = sum of:
        0.0267832 = weight(_text_:of in 3601) [ClassicSimilarity], result of:
          0.0267832 = score(doc=3601,freq=16.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.39093933 = fieldWeight in 3601, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3601)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Reports the 1st part of a study to apply the technique of phrase matching to the automatic assignment of MeSH subject headings and subheadings to abstracts of periodical articles.
Source: Health information: new directions. Proceedings of the Joint Conference of the Health Libraries Sections of the Australian Library and Information Association and New Zealand Library Association, Auckland, New Zealand, 12.-16.11.1989
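
The five records above share the score 0.004463867 although their term frequencies differ (16 versus 4). A small sketch of why, assuming the ClassicSimilarity term-frequency factor sqrt(freq): the only per-document factors in the breakdown are tf and fieldNorm, and their product is the same in both cases. The helper name is hypothetical, and the values are copied from the listing.

```python
import math


def tf_times_field_norm(freq: float, field_norm: float) -> float:
    """Per-document part of fieldWeight: sqrt(freq) * fieldNorm."""
    return math.sqrt(freq) * field_norm


# freq=16, fieldNorm=0.0625 (docs 4569 and 3601)
longer_record = tf_times_field_norm(16.0, 0.0625)
# freq=4, fieldNorm=0.125 (docs 5453, 5476 and 5489)
shorter_record = tf_times_field_norm(4.0, 0.125)

print(longer_record, shorter_record)
# Identical products, so idf, queryNorm and coord being equal, the scores tie.
print(math.isclose(longer_record, shorter_record))
```

Because every other factor (idf, queryNorm, coord) is identical across these hits, the ranking among them is decided entirely by this product.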

Hodge, G.M.: Computer-assisted database indexing : the state-of-the-art (1994) 0.00

0.0044112457 = product of:
  0.013233736 = sum of:
    0.013233736 = product of:
      0.026467472 = sum of:
        0.026467472 = weight(_text_:of in 7936) [ClassicSimilarity], result of:
          0.026467472 = score(doc=7936,freq=10.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.38633084 = fieldWeight in 7936, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=7936)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Discusses the state-of-the-art of computer indexing, defines indexing and computer assistance, and describes the reasons for renewed interest. Identifies the types of computer support in use using selected operational systems, describes the integration of various computer supports in one database production system, and speculates on the future

Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.00
```
0.0044112457 = product of:
  0.013233736 = sum of:
    0.013233736 = product of:
      0.026467472 = sum of:
        0.026467472 = weight(_text_:of in 1842) [ClassicSimilarity], result of:
          0.026467472 = score(doc=1842,freq=40.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.38633084 = fieldWeight in 1842, product of:
              6.3245554 = tf(freq=40.0), with freq of:
                40.0 = termFreq=40.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1842)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In Machine Learning applications concerned with the automatic clustering or classification of texts, often feature vectors are needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper is concluded by the presentation of some qualitative and quantitative results comparing statistical and morphological methods.

Source

TKE 2005: Proc. of Terminology and Knowledge Engineering (TKE) 2005

Croft, W.B.: Automatic indexing : file organization and display for information retrieval (1989) 0.00

0.0044112457 = product of:
  0.013233736 = sum of:
    0.013233736 = product of:
      0.026467472 = sum of:
        0.026467472 = weight(_text_:of in 2412) [ClassicSimilarity], result of:
          0.026467472 = score(doc=2412,freq=10.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.38633084 = fieldWeight in 2412, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=2412)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Indexing: the state of our knowledge and the state of our ignorance. Proceedings of the 20th Annual Meeting of the American Society of Indexers, New York City, May 13, 1988. Ed.: B.H. Weinberg

Fagan, J.L.: The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval (1989) 0.00
```
0.004184875 = product of:
  0.012554625 = sum of:
    0.012554625 = product of:
      0.02510925 = sum of:
        0.02510925 = weight(_text_:of in 1845) [ClassicSimilarity], result of:
          0.02510925 = score(doc=1845,freq=36.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36650562 = fieldWeight in 1845, product of:
              6.0 = tf(freq=36.0), with freq of:
                36.0 = termFreq=36.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1845)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

It may be possible to improve the quality of automatic indexing systems by using complex descriptors, for example, phrases, in addition to the simple descriptors (words or word stems) that are normally used in automatically constructed representations of document content. This study is directed toward the goal of developing effective methods of identifying phrases in natural language text from which good quality phrase descriptors can be constructed. The effectiveness of one method, a simple nonsyntactic phrase indexing procedure, has been tested on five experimental document collections. The results have been analyzed in order to identify the inadequacies of the procedure, and to determine what kinds of information about text structure are needed in order to construct phrase descriptors that are good indicators of document content. Two primary conclusions have been reached: (1) In the retrieval experiments, the nonsyntactic phrase construction procedure did not consistently yield substantial improvements in effectiveness. It is therefore not likely that phrase indexing of this kind will prove to be an important method of enhancing the performance of automatic document indexing and retrieval systems in operational environments. (2) Many of the shortcomings of the nonsyntactic approach can be overcome by incorporating syntactic information into the phrase construction process. However, a general syntactic analysis facility may be required, since many useful sources of phrases cannot be exploited if only a limited inventory of syntactic patterns can be recognized. Further research should be conducted into methods of incorporating automatic syntactic analysis into content analysis for document retrieval.

Source

Journal of the American Society for Information Science. 40(1989) no.2, S.115-132

Schuegraf, E.J.; Bommel, M.F.van: An automatic document indexing system based on cooperating expert systems : design and development (1993) 0.00
```
0.004175565 = product of:
  0.012526695 = sum of:
    0.012526695 = product of:
      0.02505339 = sum of:
        0.02505339 = weight(_text_:of in 6504) [ClassicSimilarity], result of:
          0.02505339 = score(doc=6504,freq=14.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36569026 = fieldWeight in 6504, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=6504)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Discusses the design of an automatic indexing system based on two cooperating expert systems and the investigation related to its development. The design combines statistical and artificial intelligence techniques. Examines choice of content indicators, the effect of stemming and the identification of characteristic vocabularies for given subject areas. Presents experimental results. Discusses the application of machine learning algorithms to the identification of vocabularies

Source

Canadian journal of information and library science. 18(1993) no.2, S.32-50

Can, F.: Incremental clustering for dynamic information processing (1993) 0.00

0.004175565 = product of:
  0.012526695 = sum of:
    0.012526695 = product of:
      0.02505339 = sum of:
        0.02505339 = weight(_text_:of in 6627) [ClassicSimilarity], result of:
          0.02505339 = score(doc=6627,freq=14.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36569026 = fieldWeight in 6627, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=6627)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Clustering of very large document databases is useful for both searching and browsing. The periodic updating of clusters is required due to the dynamic nature of databases. Introduces an algorithm for incremental clustering and discusses the complexity and cost of analysis of the algorithm together with an investigation of its expected behaviour. Shows through empirical testing that the algorithm achieves cost effectiveness and generates statistically valid clusters that are compatible with those of reclustering. The experimental evidence shows that the algorithm creates an effective and efficient retrieval environment

Prasad, A.R.D.: PROMETHEUS: an automatic indexing system (1996) 0.00
```
0.004175565 = product of:
  0.012526695 = sum of:
    0.012526695 = product of:
      0.02505339 = sum of:
        0.02505339 = weight(_text_:of in 5189) [ClassicSimilarity], result of:
          0.02505339 = score(doc=5189,freq=14.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36569026 = fieldWeight in 5189, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=5189)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

An automatic indexing system using the tools and techniques of artificial intelligence is described. The paper presents the various components of the system, such as the parser, grammar formalism, lexicon, and the frame-based knowledge representation for semantic representation. The semantic representation is based on the Ranganathan school of thought, especially that of Deep Structure of Subject Indexing Languages enunciated by Bhattacharyya. An attempt is made to demonstrate the various steps in indexing by providing an illustration

Source

Knowledge organization and change: Proceedings of the Fourth International ISKO Conference, 15-18 July 1996, Library of Congress, Washington, DC. Ed.: R. Green

Dow Jones unveils knowledge indexing system (1997) 0.00

0.004175565 = product of:
  0.012526695 = sum of:
    0.012526695 = product of:
      0.02505339 = sum of:
        0.02505339 = weight(_text_:of in 751) [ClassicSimilarity], result of:
          0.02505339 = score(doc=751,freq=14.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36569026 = fieldWeight in 751, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=751)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Dow Jones Interactive Publishing has developed a sophisticated automatic knowledge indexing system that will allow searchers of the Dow Jones News / Retrieval service to get highly targeted results from a search in the service's Publications Library. Instead of relying on a thesaurus of company names, the new system uses a combination of that basic algorithm plus unique rules based on the editorial styles of individual publications in the Library. Dow Jones has also announced its acceptance of the definitions of 'selected full text' and 'full text' from Bibliodata's Fulltext Sources Online directory

Gödert, W.: Detecting multiword phrases in mathematical text corpora (2012) 0.00

0.004175565 = product of:
  0.012526695 = sum of:
    0.012526695 = product of:
      0.02505339 = sum of:
        0.02505339 = weight(_text_:of in 466) [ClassicSimilarity], result of:
          0.02505339 = score(doc=466,freq=14.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36569026 = fieldWeight in 466, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=466)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: We present an approach for detecting multiword phrases in mathematical text corpora. The method used is based on characteristic features of mathematical terminology. It makes use of a software tool named Lingo, which allows words to be identified by means of previously defined dictionaries for specific word classes such as adjectives, personal names or nouns. The detection of multiword groups is done algorithmically. Possible advantages of the method for indexing and information retrieval and conclusions for applying dictionary-based methods of automatic indexing instead of stemming procedures are discussed.

Bonzi, S.: Representation of concepts in text : a comparison of within-document frequency, anaphora, and synonymy (1991) 0.00
```
0.004142815 = product of:
  0.012428444 = sum of:
    0.012428444 = product of:
      0.024856888 = sum of:
        0.024856888 = weight(_text_:of in 4933) [ClassicSimilarity], result of:
          0.024856888 = score(doc=4933,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36282203 = fieldWeight in 4933, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4933)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Investigates the 3 major ways by which a concept may be represented in text: within-document frequency, anaphoric reference, and synonyms, in order to determine which provides the optimal means of representation. Analyses a sample of 60 abstracts, drawn at random from the abstracting journals of 4 disciplines. Results show that in general, initial within-document frequency is higher for keyword terms. Additionally, frequency of keyword terms referenced anaphorically or with intellectually related terms is higher than that of other keyword terms. It appears that initial document length influences both the number and impact of both anaphoric resolutions and intellectually related terms

Source

Canadian journal of information science. 16(1991) no.3, S.21-31

Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.00
```
0.004142815 = product of:
  0.012428444 = sum of:
    0.012428444 = product of:
      0.024856888 = sum of:
        0.024856888 = weight(_text_:of in 7209) [ClassicSimilarity], result of:
          0.024856888 = score(doc=7209,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36282203 = fieldWeight in 7209, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7209)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

The Nordic WAIS/WWW project sponsored by NORDINFO is a joint project between Lund University Library and the National Technological Library of Denmark. It aims to improve the existing networked information discovery and retrieval tools Wide Area Information System (WAIS) and World Wide Web (WWW), and to move towards unifying WWW and WAIS. Details current results focusing on the WAIS side of the project. Describes research into automatic indexing and classification of WAIS sources, development of an orientation tool for WAIS, and development of a WAIS index of WWW resources

Source

Internet world and document delivery world international 94: Proceedings of the 2nd Annual Conference, London, May 1994

Bookstein, A.; Klein, S.T.; Raita, T.: Clumping properties of content-bearing words (1998) 0.00
```
0.004142815 = product of:
  0.012428444 = sum of:
    0.012428444 = product of:
      0.024856888 = sum of:
        0.024856888 = weight(_text_:of in 442) [ClassicSimilarity], result of:
          0.024856888 = score(doc=442,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36282203 = fieldWeight in 442, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=442)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Information Retrieval Systems identify content-bearing words, and possibly also assign weights, as part of the process of formulating requests. For optimal retrieval efficiency, it is desirable that this be done automatically. This article defines the notion of serial clustering of words in text, and explores the value of such clustering as an indicator of a word's bearing content. This approach is flexible in the sense that it is sensitive to context: a term may be assessed as content-bearing within one collection, but not another. Our approach, being numerical, may also be of value in assigning weights to terms in requests. Experimental support is obtained from natural text databases in three different languages

Source

Journal of the American Society for Information Science. 49(1998) no.2, S.102-114

Pulgarin, A.; Gil-Leiva, I.: Bibliometric analysis of the automatic indexing literature : 1956-2000 (2004) 0.00
```
0.004142815 = product of:
  0.012428444 = sum of:
    0.012428444 = product of:
      0.024856888 = sum of:
        0.024856888 = weight(_text_:of in 2566) [ClassicSimilarity], result of:
          0.024856888 = score(doc=2566,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.36282203 = fieldWeight in 2566, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2566)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

We present a bibliometric study of a corpus of 839 bibliographic references about automatic indexing, covering the period 1956-2000. We analyse the distribution of authors and works, the obsolescence and its dispersion, and the distribution of the literature by topic, year, and source type. We conclude that: (i) there has been a constant interest on the part of researchers; (ii) the most studied topics were the techniques and methods employed and the general aspects of automatic indexing; (iii) the productivity of the authors does fit a Lotka distribution (Dmax=0.02 and critical value=0.054); (iv) the annual aging factor is 95%; and (v) the dispersion of the literature is low.

Salton, G.; Araya, J.: On the use of clustered file organizations in information search and retrieval (1990) 0.00

0.0041003237 = product of:
  0.01230097 = sum of:
    0.01230097 = product of:
      0.02460194 = sum of:
        0.02460194 = weight(_text_:of in 2409) [ClassicSimilarity], result of:
          0.02460194 = score(doc=2409,freq=6.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.3591007 = fieldWeight in 2409, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.09375 = fieldNorm(doc=2409)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Imprint: Edmonton, Alberta : Univ. of Alberta, Faculty of Extension

Humphrey, S.M.: Automatic indexing of documents from journal descriptors : a preliminary investigation (1999) 0.00
```
0.0041003237 = product of:
  0.01230097 = sum of:
    0.01230097 = product of:
      0.02460194 = sum of:
        0.02460194 = weight(_text_:of in 3769) [ClassicSimilarity], result of:
          0.02460194 = score(doc=3769,freq=24.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.3591007 = fieldWeight in 3769, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3769)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

A new, fully automated approach for indexing documents is presented based on associating textwords in a training set of bibliographic citations with the indexing of journals. This journal-level indexing is in the form of a consistent, timely set of journal descriptors (JDs) indexing the individual journals themselves. This indexing is maintained in journal records in a serials authority database. The advantage of this novel approach is that the training set does not depend on previous manual indexing of thousands of documents (i.e., any such indexing already in the training set is not used), but rather the relatively small intellectual effort of indexing at the journal level, usually a matter of a few thousand unique journals for which retrospective indexing to maintain consistency and currency may be feasible. If successful, JD indexing would provide topical categorization of documents outside the training set, i.e., journal articles, monographs, Web documents, reports from the grey literature, etc., and therefore be applied in searching. Because JDs are quite general, corresponding to subject domains, their most probable use would be for improving or refining search results

Source

Journal of the American Society for Information Science. 50(1999) no.8, S.661-674
