Search (44 results, page 1 of 3)

  • Filter: theme_ss:"Automatisches Indexieren"
  1. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.09
    0.08782172 = product of:
      0.17564344 = sum of:
        0.17564344 = sum of:
          0.12764274 = weight(_text_:abstracts in 530) [ClassicSimilarity], result of:
            0.12764274 = score(doc=530,freq=2.0), product of:
              0.2890173 = queryWeight, product of:
                5.7104354 = idf(docFreq=397, maxDocs=44218)
                0.05061213 = queryNorm
              0.44164395 = fieldWeight in 530, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7104354 = idf(docFreq=397, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
          0.048000712 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
            0.048000712 = score(doc=530,freq=2.0), product of:
              0.17723505 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05061213 = queryNorm
              0.2708308 = fieldWeight in 530, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=530)
      0.5 = coord(1/2)
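    The explain trees attached to each result are Lucene ClassicSimilarity (TF-IDF) score breakdowns. As an aid to reading them, here is a minimal Python sketch (an editorial illustration, not part of the search system) that reproduces the score above from the listed factors, assuming Lucene's documented classic formulas: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a coord factor for the fraction of matching query clauses.

        import math

        def classic_term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
            """One term's contribution in a ClassicSimilarity explain tree."""
            tf = math.sqrt(freq)                              # 1.4142135 for freq=2.0
            idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # 5.7104354 for docFreq=397
            query_weight = idf * query_norm                   # 0.2890173
            field_weight = tf * idf * field_norm              # 0.44164395
            return query_weight * field_weight                # 0.12764274

        # The two term weights of result 1 (doc 530), then coord(1/2) = 0.5:
        w_abstracts = classic_term_weight(2.0, 397, 44218, 0.05061213, 0.0546875)
        w_22 = classic_term_weight(2.0, 3622, 44218, 0.05061213, 0.0546875)
        print((w_abstracts + w_22) * 0.5)   # ~0.08782172, the score shown above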
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO) to the problem of document indexing: the system incorporates natural language processing techniques to determine the subject of document texts and to associate them with relevant semantic indexes. Briefly describes the overall system, the details of its implementation on a corpus of scientific abstracts on environmental topics, and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision.
    Source
    International forum on information and documentation. 22(1997) no.1, S.17-28
  2. Plaunt, C.; Norgard, B.A.: ¬An association-based method for automatic indexing with a controlled vocabulary (1998) 0.06
    0.062729806 = product of:
      0.12545961 = sum of:
        0.12545961 = sum of:
          0.09117339 = weight(_text_:abstracts in 1794) [ClassicSimilarity], result of:
            0.09117339 = score(doc=1794,freq=2.0), product of:
              0.2890173 = queryWeight, product of:
                5.7104354 = idf(docFreq=397, maxDocs=44218)
                0.05061213 = queryNorm
              0.31545997 = fieldWeight in 1794, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7104354 = idf(docFreq=397, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1794)
          0.034286223 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
            0.034286223 = score(doc=1794,freq=2.0), product of:
              0.17723505 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05061213 = queryNorm
              0.19345059 = fieldWeight in 1794, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1794)
      0.5 = coord(1/2)
    
    Abstract
    In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation onto a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and the controlled vocabulary subject headings assigned to those records by human indexers, using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial-match information retrieval problem: we consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document. (An illustrative sketch of this association-based assignment follows this entry.)
    Date
    11. 9.2000 19:53:22
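    The two-stage method of this entry can be pictured with a small sketch. This is an illustrative reconstruction under stated assumptions, not the authors' code: it uses the log-likelihood ratio (G2) as the association statistic, binary term/heading occurrence per record, additive voting at deployment, and a conventional significance cutoff; all names are hypothetical.

        import math
        from collections import Counter, defaultdict

        def g2(k11, k12, k21, k22):
            """Log-likelihood ratio for a 2x2 contingency table."""
            n = k11 + k12 + k21 + k22
            r1, r2 = k11 + k12, k21 + k22
            c1, c2 = k11 + k21, k12 + k22
            def part(obs, row, col):
                return obs * math.log(obs * n / (row * col)) if obs > 0 else 0.0
            return 2.0 * (part(k11, r1, c1) + part(k12, r1, c2)
                          + part(k21, r2, c1) + part(k22, r2, c2))

        def build_dictionary(records, threshold=10.83):   # cutoff is an assumption
            """records: (set of lexical items, set of assigned headings) per document."""
            n = len(records)
            t_df, h_df, pair = Counter(), Counter(), Counter()
            for terms, heads in records:
                t_df.update(terms); h_df.update(heads)
                pair.update((t, h) for t in terms for h in heads)
            assoc = defaultdict(dict)
            for (t, h), k11 in pair.items():
                k12 = t_df[t] - k11
                k21 = h_df[h] - k11
                k22 = n - k11 - k12 - k21
                if (score := g2(k11, k12, k21, k22)) >= threshold:
                    assoc[t][h] = score
            return assoc

        def assign_headings(assoc, terms, top_k=5):
            """Deployment stage: vote for headings associated with a new document's terms."""
            votes = Counter()
            for t in terms:
                for h, s in assoc.get(t, {}).items():
                    votes[h] += s
            return [h for h, _ in votes.most_common(top_k)]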
  3. Haas, S.; He, S.: Toward the automatic identification of sublanguage vocabulary (1993) 0.05
    0.051575456 = product of:
      0.10315091 = sum of:
        0.10315091 = product of:
          0.20630182 = sum of:
            0.20630182 = weight(_text_:abstracts in 4891) [ClassicSimilarity], result of:
              0.20630182 = score(doc=4891,freq=4.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.7138044 = fieldWeight in 4891, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4891)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes a method developed for the automatic identification of sublanguage vocabulary words as they occur in abstracts. Describes the sublanguage vocabulary identification procedures, using abstracts from computer science and library and information science as sublanguage sources. Evaluates the results using three criteria. Discusses the practical and theoretical significance of this research and plans for further experiments.
  4. Gil-Leiva, I.; Munoz, J.V.R.: Analisis de los descriptores de diferentes areas del conocimiento indizades en bases de datos del CSIC : Aplicacion a la indizacion automatica (1997) 0.04
    0.03868159 = product of:
      0.07736318 = sum of:
        0.07736318 = product of:
          0.15472636 = sum of:
            0.15472636 = weight(_text_:abstracts in 2637) [ClassicSimilarity], result of:
              0.15472636 = score(doc=2637,freq=4.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.5353533 = fieldWeight in 2637, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2637)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Studies the value of scientific articles' titles and abstracts as sources of terms for document indexing in relation to 6 areas of knowledge: library and information science, medicine, chemistry, biology, psychology and physics, indexed in the databases ISOC, IME and ICYT of the CSIC. Also examines the syntagmatic structures of the indexing terms found in the field 'descriptors', as well as the relationship between length of document and number of descriptors. Concludes that if the abstracts are not well made and the titles are not precise, they are not definitive sources for the extraction of concepts; that the most common syntactic structure is the noun phrase, followed by noun+adjective and noun+noun; and that no significant relationship was found between the length of a document and the number of descriptors assigned to it.
  5. Garfield, E.: ¬The relationship between mechanical indexing, structural linguistics and information retrieval (1992) 0.04
    0.036469355 = product of:
      0.07293871 = sum of:
        0.07293871 = product of:
          0.14587742 = sum of:
            0.14587742 = weight(_text_:abstracts in 3632) [ClassicSimilarity], result of:
              0.14587742 = score(doc=3632,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.50473595 = fieldWeight in 3632, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3632)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    It is possible to locate over 60% of indexing terms used in the Current List of Medical Literature by analysing the titles of the articles. Citation indexes contain 'noise' and lack many pertinent citations. Mechanical indexing or analysis of text must begin with some linguistic technique. Discusses Harris' methods of structural linguistics, discourse analysis and transformational analysis. Provides 3 examples with references, abstracts and index entries
  6. Abdul, H.; Khoo, C.: Automatic indexing of medical literature using phrase matching : an exploratory study 0.04
    0.036469355 = product of:
      0.07293871 = sum of:
        0.07293871 = product of:
          0.14587742 = sum of:
            0.14587742 = weight(_text_:abstracts in 3601) [ClassicSimilarity], result of:
              0.14587742 = score(doc=3601,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.50473595 = fieldWeight in 3601, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3601)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Reports the first part of a study applying the technique of phrase matching to the automatic assignment of MeSH subject headings and subheadings to abstracts of periodical articles.
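    A naive illustration of the phrase-matching idea, assuming greedy longest-match-first scanning over a controlled vocabulary; the MeSH phrases and the matching rules here are placeholders, since the abstract does not spell out the study's actual procedure.

        import re

        def phrase_match(text, vocabulary):
            """Greedy longest-match-first lookup of vocabulary phrases in text."""
            words = re.findall(r"[a-z0-9]+", text.lower())
            phrases = {tuple(p.lower().split()) for p in vocabulary}
            longest = max(len(p) for p in phrases)
            matched, i = [], 0
            while i < len(words):
                for n in range(min(longest, len(words) - i), 0, -1):
                    if tuple(words[i:i + n]) in phrases:
                        matched.append(" ".join(words[i:i + n]))
                        i += n
                        break
                else:
                    i += 1
            return matched

        # Hypothetical vocabulary and abstract fragment:
        mesh = ["Myocardial Infarction", "Aspirin", "Risk Factors"]
        print(phrase_match("Aspirin use and risk factors for myocardial infarction.", mesh))
        # -> ['aspirin', 'risk factors', 'myocardial infarction']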
  7. Renouf, A.: Sticking to the text : a corpus linguist's view of language (1993) 0.03
    0.031910684 = product of:
      0.06382137 = sum of:
        0.06382137 = product of:
          0.12764274 = sum of:
            0.12764274 = weight(_text_:abstracts in 2314) [ClassicSimilarity], result of:
              0.12764274 = score(doc=2314,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.44164395 = fieldWeight in 2314, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2314)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Corpus linguistics is the study of large, computer-held bodies of text. Some corpus linguists are concerned with language description for its own sake. On the corpus-linguistic continuum, the study of raw ASCII text is situated at one end, and the study of heavily pre-coded text at the other. Discusses the use of word frequency to identify changes in the lexicon, word repetition and word positioning in automatic abstracting, and word clusters in automatic text retrieval. Compares the machine extract with manual abstracts. Abstractors and indexers may find themselves taking the original wording of the text more into account as the focus moves towards the electronic medium and away from hard copy.
  8. Bonzi, S.: Representation of concepts in text : a comparison of within-document frequency, anaphora, and synonymy (1991) 0.03
    0.031910684 = product of:
      0.06382137 = sum of:
        0.06382137 = product of:
          0.12764274 = sum of:
            0.12764274 = weight(_text_:abstracts in 4933) [ClassicSimilarity], result of:
              0.12764274 = score(doc=4933,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.44164395 = fieldWeight in 4933, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4933)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Investigates the 3 major ways by which a concept may be represented in text: within-document frequency, anaphoric reference, and synonymy, in order to determine which provides the optimal means of representation. Analyzes a sample of 60 abstracts, drawn at random from the abstracting journals of 4 disciplines. Results show that, in general, initial within-document frequency is higher for keyword terms. Additionally, the frequency of keyword terms referenced anaphorically or with intellectually related terms is higher than that of other keyword terms. It appears that document length influences both the number and the impact of anaphoric resolutions and intellectually related terms.
  9. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03
    0.027428979 = product of:
      0.054857958 = sum of:
        0.054857958 = product of:
          0.109715916 = sum of:
            0.109715916 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.109715916 = score(doc=402,freq=2.0), product of:
                0.17723505 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05061213 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  10. Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.03
    0.027352015 = product of:
      0.05470403 = sum of:
        0.05470403 = product of:
          0.10940806 = sum of:
            0.10940806 = weight(_text_:abstracts in 1660) [ClassicSimilarity], result of:
              0.10940806 = score(doc=1660,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.37855196 = fieldWeight in 1660, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1660)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A corpus of 242 abstracts of Arabic documents on computer science and information systems was put together, using the Proceedings of the Saudi Arabian National Conferences as a source. Reports on the design and building, from scratch, of an automatic information retrieval system to handle Arabic data. Both automatic and manual indexing techniques were implemented. Experiments using measures of recall and precision have demonstrated that automatic indexing is at least as effective as manual indexing, and more effective in some cases; it is also both cheaper and faster. The results suggest that a wider coverage of the literature can be achieved with less money, with results as good as those of manual indexing. Compares the retrieval results using words as index terms versus stems and roots, and confirms the results obtained by Al-Kharashi and Abu-Salem with smaller corpora, namely that root indexing is more effective than word indexing.
  11. Roberts, D.; Souter, C.: ¬The automation of controlled vocabulary subject indexing of medical journal articles (2000) 0.03
    0.027352015 = product of:
      0.05470403 = sum of:
        0.05470403 = product of:
          0.10940806 = sum of:
            0.10940806 = weight(_text_:abstracts in 711) [ClassicSimilarity], result of:
              0.10940806 = score(doc=711,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.37855196 = fieldWeight in 711, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.046875 = fieldNorm(doc=711)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article discusses the possibility of the automation of sophisticated subject indexing of medical journal articles. Approaches to subject descriptor assignment in information retrieval research are usually based either upon the manual descriptors in the database or upon generation of search parameters from the text of the article. The principles of the Medline indexing system are described, followed by a summary of a pilot project based upon the AMED database. The results suggest that a more extended study, based upon Medline, should encompass various components: extraction of 'concept strings' from titles and abstracts of records, based upon linguistic features characteristic of medical literature; use of the Unified Medical Language System (UMLS) for identification of controlled vocabulary descriptors; and coordination of descriptors, utilising features of the Medline indexing system. The emphasis should be on system manipulation of data, based upon input, available resources and specifically designed rules.
  12. Gil-Leiva, I.: SISA-automatic indexing system for scientific articles : experiments with location heuristics rules versus TF-IDF rules (2017) 0.03
    0.027352015 = product of:
      0.05470403 = sum of:
        0.05470403 = product of:
          0.10940806 = sum of:
            0.10940806 = weight(_text_:abstracts in 3622) [ClassicSimilarity], result of:
              0.10940806 = score(doc=3622,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.37855196 = fieldWeight in 3622, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3622)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Indexing is contextualized and a brief description is provided of some of the most used automatic indexing systems. We describe SISA, a system which uses location heuristics rules and statistical rules such as term frequency (TF) or TF-IDF to obtain automatic or semi-automatic indexing, depending on the user's preference. The aim of this research is to ascertain which rules (location heuristics rules or TF-IDF rules) provide the best indexing terms. SISA is used to obtain the automatic indexing of 200 scientific articles on fruit growing written in Portuguese. It uses, on the one hand, location heuristics rules founded on the value for indexing of certain parts of the articles, such as titles, abstracts, keywords, headings, first paragraphs, conclusions and references, and, on the other, TF-IDF rules. The indexing is then evaluated for retrieval performance through recall, precision and f-measure. Automatic indexing of the articles with location heuristics rules provided the best results on these evaluation measures.
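    For contrast with SISA's richer rule sets, here is a bare-bones sketch of the two rule families: a location heuristic that weights terms by the part of the article they occur in, and a TF-IDF ranking against a background corpus. The section weights and the smoothed IDF variant are assumptions, not SISA's actual parameters.

        import math
        from collections import Counter

        SECTION_WEIGHTS = {"title": 3.0, "keywords": 3.0, "abstract": 2.0, "body": 1.0}

        def location_scores(sections, weights=SECTION_WEIGHTS):
            """Score terms by where they occur (weights are illustrative)."""
            scores = Counter()
            for name, tokens in sections.items():
                for t in tokens:
                    scores[t] += weights.get(name, 1.0)
            return scores

        def tfidf_scores(doc_tokens, corpus_token_sets):
            """Rank a document's tokens by TF-IDF against a background corpus."""
            n = len(corpus_token_sets)
            tf = Counter(doc_tokens)
            def idf(t):
                df = sum(1 for d in corpus_token_sets if t in d)
                return math.log((1 + n) / (1 + df)) + 1.0   # smoothed variant
            return Counter({t: f * idf(t) for t, f in tf.items()})

    Either score can then be thresholded or cut at top-k to yield candidate index terms, which is the kind of comparison the article evaluates with recall, precision and f-measure.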
  13. Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.03
    0.027352015 = product of:
      0.05470403 = sum of:
        0.05470403 = product of:
          0.10940806 = sum of:
            0.10940806 = weight(_text_:abstracts in 720) [ClassicSimilarity], result of:
              0.10940806 = score(doc=720,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.37855196 = fieldWeight in 720, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.046875 = fieldNorm(doc=720)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining operation task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are fundamental especially for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately, we achieved good results with the toolkit BERT (Bidirectional Encoder Representations from Transformers). The project and its workflows represent a preview of what may lie ahead in the future of crafting metadata using text mining techniques to enhance discoverability.
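    A minimal sketch of the BERT classification step, using the Hugging Face Transformers API; the model checkpoint, label meanings, and example sentence are placeholders (the project's own data, labels, and fine-tuning setup are not given in the abstract), and the classification head must be fine-tuned on labeled abstracts before its outputs mean anything.

        import torch
        from transformers import AutoTokenizer, AutoModelForSequenceClassification

        tok = AutoTokenizer.from_pretrained("bert-base-uncased")
        model = AutoModelForSequenceClassification.from_pretrained(
            "bert-base-uncased", num_labels=2)   # assume 0 = basic, 1 = applied

        abstract = "We evaluate a new catalyst for industrial wastewater treatment."
        inputs = tok(abstract, truncation=True, return_tensors="pt")
        with torch.no_grad():
            probs = model(**inputs).logits.softmax(-1)
        print(probs)   # class probabilities; meaningful only after fine-tuning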
  14. Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.02
    0.024000356 = product of:
      0.048000712 = sum of:
        0.048000712 = product of:
          0.096001424 = sum of:
            0.096001424 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
              0.096001424 = score(doc=262,freq=2.0), product of:
                0.17723505 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05061213 = queryNorm
                0.5416616 = fieldWeight in 262, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=262)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20.10.2000 12:22:23
  15. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02
    0.024000356 = product of:
      0.048000712 = sum of:
        0.048000712 = product of:
          0.096001424 = sum of:
            0.096001424 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.096001424 = score(doc=6265,freq=2.0), product of:
                0.17723505 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05061213 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
  16. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.02
    0.020571733 = product of:
      0.041143466 = sum of:
        0.041143466 = product of:
          0.08228693 = sum of:
            0.08228693 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.08228693 = score(doc=58,freq=2.0), product of:
                0.17723505 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05061213 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:44
  17. Hauer, M.: Automatische Indexierung (2000) 0.02
    0.020571733 = product of:
      0.041143466 = sum of:
        0.041143466 = product of:
          0.08228693 = sum of:
            0.08228693 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
              0.08228693 = score(doc=5887,freq=2.0), product of:
                0.17723505 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05061213 = queryNorm
                0.46428138 = fieldWeight in 5887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5887)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt
  18. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.02
    0.020571733 = product of:
      0.041143466 = sum of:
        0.041143466 = product of:
          0.08228693 = sum of:
            0.08228693 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
              0.08228693 = score(doc=2051,freq=2.0), product of:
                0.17723505 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05061213 = queryNorm
                0.46428138 = fieldWeight in 2051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:56
  19. Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019) 0.02
    0.020571733 = product of:
      0.041143466 = sum of:
        0.041143466 = product of:
          0.08228693 = sum of:
            0.08228693 = weight(_text_:22 in 5629) [ClassicSimilarity], result of:
              0.08228693 = score(doc=5629,freq=2.0), product of:
                0.17723505 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05061213 = queryNorm
                0.46428138 = fieldWeight in 5629, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5629)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    B.I.T.online. 22(2019) H.2, S.163-166
  20. Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.02
    0.018234678 = product of:
      0.036469355 = sum of:
        0.036469355 = product of:
          0.07293871 = sum of:
            0.07293871 = weight(_text_:abstracts in 3081) [ClassicSimilarity], result of:
              0.07293871 = score(doc=3081,freq=2.0), product of:
                0.2890173 = queryWeight, product of:
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.05061213 = queryNorm
                0.25236797 = fieldWeight in 3081, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7104354 = idf(docFreq=397, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3081)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The thesis analyzes the dynamic development and use of thesaurus terms, and additionally focuses on the factors that influence the number of index terms per document or journal. MeSH and the corresponding database MEDLINE served as the objects of study. The main findings are: 1. The MeSH thesaurus has grown logarithmically through three distinct phases. Such a thesaurus should follow the equation T = 3,076.6 ln(d) - 22,695 + 0.0039d (T = terms, ln = natural logarithm, d = documents). To construct such a thesaurus, one accordingly needs about 1,600 documents on different topics within the thesaurus's domain. The dynamic development of thesauri such as MeSH requires the introduction of one new term per 256 newly indexed documents. 2. The distribution of thesaurus terms yields three categories: heavily, normally and rarely used headings. The last group is in a test phase, while in the first and second categories the newly added descriptors drive thesaurus growth. 3. There is a logarithmic relationship between the number of index terms per article and its page count, for articles of one to twenty-one pages. 4. Journal articles that appear in MEDLINE with abstracts receive almost two descriptors more. 5. The findability of non-English-language documents in MEDLINE is lower than that of English-language documents. 6. Articles from journals with an impact factor of 0 to 15 do not receive more index terms than those from the other journals covered by MEDLINE. 7. Within an indexing system, different journals carry more or less weight in their findability. The distribution of index terms per page showed that there are three categories of publications in MEDLINE, and moreover a few strongly favoured journals.
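    The growth equation in finding 1 can be checked numerically. A small Python sketch; the interpretation comments are derived from the formula itself and match the abstract's own figures.

        import math

        def expected_terms(d):
            """Thesaurus size predicted by the article's equation (d = documents)."""
            return 3076.6 * math.log(d) - 22695 + 0.0039 * d

        # The curve crosses zero just below d = 1,600, the abstract's "about 1,600
        # documents" needed to start such a thesaurus; and the linear coefficient
        # 0.0039 = 1/256.4 gives the long-run rate of one new term per ~256 documents.
        print(expected_terms(1_600))   # ~ 10, i.e. the thesaurus has just begun
        print(1 / 0.0039)              # ~ 256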

Languages

  • e (English) 27
  • d (German) 15
  • ru (Russian) 1
  • sp (Spanish) 1

Types

  • a 40
  • x 3
  • el 2
  • m 1