Search (64 results, page 1 of 4)

  • theme_ss:"Automatisches Indexieren"
  1. Garfield, E.: The relationship between mechanical indexing, structural linguistics and information retrieval (1992) 0.09
    0.09304918 = product of:
      0.13957377 = sum of:
        0.097319394 = weight(_text_:citation in 3632) [ClassicSimilarity], result of:
          0.097319394 = score(doc=3632,freq=2.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.4144783 = fieldWeight in 3632, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0625 = fieldNorm(doc=3632)
        0.042254377 = product of:
          0.084508754 = sum of:
            0.084508754 = weight(_text_:index in 3632) [ClassicSimilarity], result of:
              0.084508754 = score(doc=3632,freq=2.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.3862362 = fieldWeight in 3632, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3632)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
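    The indented block above is Lucene's "explain" output for this hit: each matching term contributes queryWeight (idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), and clause sums are scaled by the coord factors. The following minimal sketch of that arithmetic, treating queryNorm and fieldNorm as given constants (the statistics behind them are not displayed), reproduces the 0.09304918 score:
      import math

      # Sketch of ClassicSimilarity scoring for result 1 (doc 3632), assuming Lucene's
      # classic formulas: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)).
      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          tf = math.sqrt(freq)
          idf = 1.0 + math.log(max_docs / (doc_freq + 1))
          query_weight = idf * query_norm        # e.g. 0.23479973 for "citation"
          field_weight = tf * idf * field_norm   # e.g. 0.4144783  for "citation"
          return query_weight * field_weight

      QUERY_NORM, MAX_DOCS = 0.050071523, 44218
      citation = term_score(2.0, 1104, MAX_DOCS, QUERY_NORM, 0.0625)  # ~0.0973194
      index    = term_score(2.0, 1520, MAX_DOCS, QUERY_NORM, 0.0625)  # ~0.0845088
      # "index" sits in a clause matching 1 of 2 sub-queries (coord(1/2)); the whole
      # query matches 2 of 3 clauses (coord(2/3)).
      score = (citation + 0.5 * index) * (2.0 / 3.0)
      print(round(score, 8))  # ~0.09304918, the displayed score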
    
    Abstract
    It is possible to locate over 60% of indexing terms used in the Current List of Medical Literature by analysing the titles of the articles. Citation indexes contain 'noise' and lack many pertinent citations. Mechanical indexing or analysis of text must begin with some linguistic technique. Discusses Harris' methods of structural linguistics, discourse analysis and transformational analysis. Provides 3 examples with references, abstracts and index entries
  2. Hauer, M.: Automatische Indexierung (2000) 0.07
    0.06939038 = product of:
      0.20817113 = sum of:
        0.20817113 = sum of:
          0.12676314 = weight(_text_:index in 5887) [ClassicSimilarity], result of:
            0.12676314 = score(doc=5887,freq=2.0), product of:
              0.21880072 = queryWeight, product of:
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.050071523 = queryNorm
              0.5793543 = fieldWeight in 5887, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.09375 = fieldNorm(doc=5887)
          0.081407994 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
            0.081407994 = score(doc=5887,freq=2.0), product of:
              0.17534193 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050071523 = queryNorm
              0.46428138 = fieldWeight in 5887, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.09375 = fieldNorm(doc=5887)
      0.33333334 = coord(1/3)
    
    Object
    Index-5.0
    Source
    Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt
  3. Ward, M.L.: The future of the human indexer (1996) 0.04
    0.04344636 = product of:
      0.13033907 = sum of:
        0.13033907 = sum of:
          0.089635074 = weight(_text_:index in 7244) [ClassicSimilarity], result of:
            0.089635074 = score(doc=7244,freq=4.0), product of:
              0.21880072 = queryWeight, product of:
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.050071523 = queryNorm
              0.40966535 = fieldWeight in 7244, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.369764 = idf(docFreq=1520, maxDocs=44218)
                0.046875 = fieldNorm(doc=7244)
          0.040703997 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
            0.040703997 = score(doc=7244,freq=2.0), product of:
              0.17534193 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050071523 = queryNorm
              0.23214069 = fieldWeight in 7244, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=7244)
      0.33333334 = coord(1/3)
    
    Abstract
    Considers the principles of indexing and the intellectual skills involved, in order to determine what would be required of automatic indexing systems to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and at what depth; reading skills; abstracting skills; and classification skills. Illustrates these features with a detailed description of the abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system, using the criteria described for human indexers. At present, it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form, it may be a useful productivity tool for dealing with large quantities of low-grade texts (should they be wanted in the database)
    Date
    9. 2.1997 18:44:22
  4. Blank, I.; Rokach, L.; Shani, G.: Leveraging metadata to recommend keywords for academic papers (2016) 0.04
    0.040549748 = product of:
      0.12164924 = sum of:
        0.12164924 = weight(_text_:citation in 3232) [ClassicSimilarity], result of:
          0.12164924 = score(doc=3232,freq=8.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.5180979 = fieldWeight in 3232, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3232)
      0.33333334 = coord(1/3)
    
    Abstract
    Users of research databases, such as CiteSeerX, Google Scholar, and Microsoft Academic, often search for papers using a set of keywords. Unfortunately, many authors avoid listing sufficient keywords for their papers. As such, these applications may need to automatically associate good descriptive keywords with papers. When the full text of the paper is available, this problem has been thoroughly studied. In many cases, however, due to copyright limitations, research databases do not have access to the full text. On the other hand, such databases typically maintain metadata, such as the title and abstract and the citation network of each paper. In this paper we study the problem of predicting which keywords are appropriate for a research paper, using different methods based on the citation network and available metadata. Our main goal is to provide search engines with the ability to extract keywords from the available metadata. However, our system can also be used for other applications, such as recommending keywords for the authors of new papers. We create a data set of research papers, their citation network, keywords, and other metadata, containing over 470K papers and more than 2 million keywords. We compare our methods with predicting keywords using the title and abstract, in offline experiments and in a user study, concluding that the citation network provides much better predictions.
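    A minimal sketch of the general idea (an illustration only, not the authors' actual models): candidate keywords for a paper can be scored by how often they already occur among the keywords of its neighbours in the citation network. All identifiers below are hypothetical.
      from collections import Counter
      from typing import Dict, List, Set

      def recommend_keywords(paper: str,
                             neighbours: Dict[str, Set[str]],   # paper -> cited/citing papers
                             keywords: Dict[str, Set[str]],     # paper -> known keywords
                             top_k: int = 5) -> List[str]:
          # Count how many citation neighbours already carry each keyword.
          counts: Counter = Counter()
          for other in neighbours.get(paper, set()):
              counts.update(keywords.get(other, set()))
          return [kw for kw, _ in counts.most_common(top_k)]

      # Toy usage with made-up paper identifiers.
      neighbours = {"p1": {"p2", "p3"}}
      keywords = {"p2": {"automatic indexing", "keyword extraction"},
                  "p3": {"keyword extraction", "citation analysis"}}
      print(recommend_keywords("p1", neighbours, keywords))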
  5. Gábor, K.; Zargayouna, H.; Tellier, I.; Buscaldi, D.; Charnois, T.: A typology of semantic relations dedicated to scientific literature analysis (2016) 0.03
    0.028384823 = product of:
      0.08515447 = sum of:
        0.08515447 = weight(_text_:citation in 2933) [ClassicSimilarity], result of:
          0.08515447 = score(doc=2933,freq=2.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.3626685 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
      0.33333334 = coord(1/3)
    
    Abstract
    We propose a method for improving access to scientific literature by analyzing the content of research papers beyond citation links and topic tracking. Our model relies on a typology of explicit semantic relations. These relations are instantiated in the abstract/introduction part of the papers and can be identified automatically using textual data and external ontologies. Preliminary results show a promising precision in unsupervised relationship classification.
  6. Sparck Jones, K.: Index term weighting (1973) 0.03
    0.028169585 = product of:
      0.084508754 = sum of:
        0.084508754 = product of:
          0.16901751 = sum of:
            0.16901751 = weight(_text_:index in 5491) [ClassicSimilarity], result of:
              0.16901751 = score(doc=5491,freq=2.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.7724724 = fieldWeight in 5491, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.125 = fieldNorm(doc=5491)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
  7. Leung, C.-H.; Kan, W.-K.: A statistical learning approach to automatic indexing of controlled index terms (1997) 0.03
    0.02587542 = product of:
      0.07762626 = sum of:
        0.07762626 = product of:
          0.15525252 = sum of:
            0.15525252 = weight(_text_:index in 6497) [ClassicSimilarity], result of:
              0.15525252 = score(doc=6497,freq=12.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.7095612 = fieldWeight in 6497, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6497)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    A statistical learning approach to assigning controlled index terms is presented. The approach comprises two processes: (1) the learning process and (2) the indexing process. The learning process constructs a relationship between an index term and the words relevant and irrelevant to it, based on a positive training set (documents indexed by the term) and a negative training set (documents not indexed by it). The indexing process determines whether an index term is assigned to a given document, based on the relationship constructed by the learning process and the text found in the document. Furthermore, a learning feedback technique is introduced. This technique, used in the learning process, modifies the relationship between an index term and its relevant and irrelevant words to improve the learning performance and, thus, the indexing performance. Experimental results have shown that the statistical learning approach and the learning feedback technique are practical means for automatic indexing of controlled index terms
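    A rough sketch of such a two-process scheme (not the authors' exact formulation): words from the positive training set push a document towards the index term, words from the negative set push it away, and a feedback pass nudges the weights for misclassified training documents.
      from collections import Counter
      from typing import Iterable, List

      class TermClassifier:
          """Learns, for one controlled index term, word weights from positive and
          negative training documents; assigns the term when the summed weight of a
          document's words exceeds a threshold."""

          def __init__(self, threshold: float = 0.0, rate: float = 0.1):
              self.weights: Counter = Counter()
              self.threshold = threshold
              self.rate = rate

          def assign(self, doc: Iterable[str]) -> bool:
              return sum(self.weights[w] for w in doc) > self.threshold

          def fit(self, positives: List[List[str]], negatives: List[List[str]],
                  feedback_rounds: int = 3) -> None:
              for doc in positives:
                  self.weights.update(doc)      # words of indexed documents count positively
              for doc in negatives:
                  self.weights.subtract(doc)    # words of non-indexed documents count negatively
              # Learning feedback: adjust weights when training documents are misclassified.
              labelled = [(d, 1) for d in positives] + [(d, -1) for d in negatives]
              for _ in range(feedback_rounds):
                  for doc, label in labelled:
                      predicted = 1 if self.assign(doc) else -1
                      if predicted != label:
                          for word in doc:
                              self.weights[word] += self.rate * label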
  8. Thönssen, B.: Automatische Indexierung und Schnittstellen zu Thesauri (1988) 0.02
    0.024898633 = product of:
      0.0746959 = sum of:
        0.0746959 = product of:
          0.1493918 = sum of:
            0.1493918 = weight(_text_:index in 30) [ClassicSimilarity], result of:
              0.1493918 = score(doc=30,freq=4.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.6827756 = fieldWeight in 30, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.078125 = fieldNorm(doc=30)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Via an interface between programs for automatic indexing (PRIMUS-IDX) and for machine-based thesaurus management (INDEX), large volumes of text are to be indexed quickly, cost-effectively and consistently, and improved retrieval options are to be created. The goal is a procedure that runs on PCs and can handle German-language texts in particular
    Object
    INDEX
  9. Moens, M.F.: Automatic indexing and abstracting of document texts (2000) 0.02
    0.024898633 = product of:
      0.0746959 = sum of:
        0.0746959 = product of:
          0.1493918 = sum of:
            0.1493918 = weight(_text_:index in 6892) [ClassicSimilarity], result of:
              0.1493918 = score(doc=6892,freq=4.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.6827756 = fieldWeight in 6892, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6892)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Content
    Need for indexing and abstracting texts; attributes of texts; text representations and their use; selection of natural language index terms; assignment of controlled language index terms; automatic abstracting; applications
  10. Cohen, J.D.: Highlights: language- and domain-independent automatic indexing terms for abstracting (1995) 0.02
    0.021346133 = product of:
      0.064038396 = sum of:
        0.064038396 = product of:
          0.12807679 = sum of:
            0.12807679 = weight(_text_:index in 1793) [ClassicSimilarity], result of:
              0.12807679 = score(doc=1793,freq=6.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5853582 = fieldWeight in 1793, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1793)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Presents a model of drawing index terms from text. The approach uses no stop list, stemmer, or other language- and domain-specific component, allowing operation in any language or domain with only trivial modification. The method uses n-gram counts, achieving a function similar to, but more general than, a stemmer. The generated index terms, called 'highlights', are suitable for identifying the topic of a document for perusal and selection. An extension is also described and demonstrated which selects index terms to represent a subset of documents, distinguishing them from the corpus. Presents some experimental results, showing operation in English, Spanish, German, Georgian, Russian and Japanese
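    A loose sketch of the n-gram idea (the scoring below is illustrative, not Cohen's exact formula): character n-grams stand in for language-specific stemming, and words whose n-grams are unusually frequent in the document relative to a background corpus are proposed as 'highlight' terms.
      from collections import Counter
      from typing import List

      def char_ngrams(text: str, n: int = 4) -> List[str]:
          # Pad with spaces so short words still yield n-grams.
          text = f" {text.lower()} "
          return [text[i:i + n] for i in range(len(text) - n + 1)]

      def highlights(doc: str, background: str, n: int = 4, top_k: int = 10) -> List[str]:
          doc_grams = Counter(char_ngrams(doc, n))
          bg_grams = Counter(char_ngrams(background, n))

          def word_score(word: str) -> float:
              # A word scores high when its n-grams are frequent here but rare in general.
              grams = char_ngrams(word, n)
              return sum(doc_grams[g] / (1 + bg_grams[g]) for g in grams) / max(len(grams), 1)

          words = set(doc.lower().split())
          return sorted(words, key=word_score, reverse=True)[:top_k]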
  11. O'Kane, K.C.: Generating hierarchical document indices from common denominators in large document collections (1996) 0.02
    0.021346133 = product of:
      0.064038396 = sum of:
        0.064038396 = product of:
          0.12807679 = sum of:
            0.12807679 = weight(_text_:index in 4037) [ClassicSimilarity], result of:
              0.12807679 = score(doc=4037,freq=6.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5853582 = fieldWeight in 4037, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4037)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Describes an effective, simple and efficient algorithm for the computer generation of hierarchical indices from document-term matrices, by calculating common denominator vectors from the set of document vectors. This procedure produces an intuitive, user-friendly hierarchical index of a document collection, not unlike what would be expected had a manual indexer set about creating an index or outline of the collection. The resulting index, when presented with a graphical user interface, gives the user a natural, easily comprehended view of the document collection and permits general browsing and informal search activities with an access method that requires no keyboard entry or prior knowledge of the vocabulary
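    One possible reading of the "common denominator" idea, heavily simplified and not taken from the paper itself: at each level the term shared by the largest group of documents becomes an index heading, that group is placed under it, and the procedure recurses on the remaining vocabulary.
      from collections import Counter
      from typing import Dict, List, Set

      def build_outline(docs: Dict[str, Set[str]], used: Set[str] = frozenset(),
                        depth: int = 0, max_depth: int = 3) -> List[str]:
          if not docs or depth >= max_depth:
              return []
          counts = Counter(t for terms in docs.values() for t in terms if t not in used)
          outline: List[str] = []
          remaining = dict(docs)
          for term, _ in counts.most_common():
              group = {d: t for d, t in remaining.items() if term in t}
              if len(group) < 2:          # headings only for terms shared by 2+ documents
                  continue
              outline.append("  " * depth + term)
              outline.extend(build_outline(group, used | {term}, depth + 1, max_depth))
              remaining = {d: t for d, t in remaining.items() if d not in group}
          return outline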
  12. Mansour, N.; Haraty, R.A.; Daher, W.; Houri, M.: An auto-indexing method for Arabic text (2008) 0.02
    0.02112719 = product of:
      0.06338157 = sum of:
        0.06338157 = product of:
          0.12676314 = sum of:
            0.12676314 = weight(_text_:index in 2103) [ClassicSimilarity], result of:
              0.12676314 = score(doc=2103,freq=8.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5793543 = fieldWeight in 2103, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2103)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This work addresses the information retrieval problem of auto-indexing Arabic documents. Auto-indexing a text document refers to automatically extracting words that are suitable for building an index for the document. In this paper, we propose an auto-indexing method for Arabic text documents. This method is mainly based on morphological analysis and on a technique for assigning weights to words. The morphological analysis uses a number of grammatical rules to extract stem words that become candidate index words. The weight assignment technique computes weights for these words relative to the containing document. The weight is based on how spread out the word is within a document, not only on its rate of occurrence. The candidate index words are then sorted in descending order by weight so that information retrievers can select the more important index words. We empirically verify the usefulness of our method using several examples. For these examples, we obtained an average recall of 46% and an average precision of 64%.
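    A sketch in the spirit of that weighting (the paper's exact formula is not reproduced here, and the Arabic morphological stemming step is left out): a word's weight combines its frequency with how many segments of the document it occurs in, so a term spread across the text outranks one concentrated in a single passage.
      from collections import Counter
      from typing import List

      def spread_weights(segments: List[List[str]]) -> Counter:
          # Overall term frequency across the document.
          tf: Counter = Counter(w for seg in segments for w in seg)
          # Number of segments (e.g. paragraphs) in which each word occurs.
          spread: Counter = Counter()
          for seg in segments:
              spread.update(set(seg))
          n_segs = max(len(segments), 1)
          return Counter({w: tf[w] * (spread[w] / n_segs) for w in tf})

      # Toy usage: two "paragraphs", already tokenised (and, in the real system, stemmed).
      doc = [["retrieval", "index", "index"], ["retrieval", "weight"]]
      print(spread_weights(doc).most_common(3))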
  13. Nicoletti, M.: Automatische Indexierung (2001) 0.02
    0.02112719 = product of:
      0.06338157 = sum of:
        0.06338157 = product of:
          0.12676314 = sum of:
            0.12676314 = weight(_text_:index in 4326) [ClassicSimilarity], result of:
              0.12676314 = score(doc=4326,freq=2.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5793543 = fieldWeight in 4326, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4326)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Content
    Contents: 1. Task - 2. Identification of multi-word groups - 2.1 Definition - 3. Marking of multi-word groups - 4. Base forms - 5. Term and document frequency --- term weighting - 6. Threshold value as control instrument - 7. Inverted index. See: http://www.grin.com/de/e-book/104966/automatische-indexierung.
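    A minimal sketch of the final steps of this outline: weight the (already extracted and lemmatised) terms per document, keep only those above a threshold, and store them in an inverted index mapping each term to its postings. The weighting shown is a simple relative frequency, chosen only for illustration.
      from collections import Counter, defaultdict
      from typing import Dict, List

      def build_inverted_index(docs: Dict[str, List[str]],
                               threshold: float = 0.1) -> Dict[str, List[tuple]]:
          index: Dict[str, List[tuple]] = defaultdict(list)
          for doc_id, terms in docs.items():
              counts = Counter(terms)
              total = sum(counts.values())
              for term, freq in counts.items():
                  weight = freq / total            # simple relative-frequency weight
                  if weight >= threshold:          # threshold as control instrument
                      index[term].append((doc_id, round(weight, 3)))
          return dict(index)

      print(build_inverted_index({"d1": ["automatische", "indexierung", "indexierung"]}))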
  14. Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.02
    0.020274874 = product of:
      0.06082462 = sum of:
        0.06082462 = weight(_text_:citation in 3627) [ClassicSimilarity], result of:
          0.06082462 = score(doc=3627,freq=2.0), product of:
            0.23479973 = queryWeight, product of:
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.050071523 = queryNorm
            0.25904894 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6892867 = idf(docFreq=1104, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
      0.33333334 = coord(1/3)
    
    Abstract
    A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).
  15. Oberhauser, O.; Labner, J.: OPAC-Erweiterung durch automatische Indexierung : Empirische Untersuchung mit Daten aus dem Österreichischen Verbundkatalog (2002) 0.02
    0.018296685 = product of:
      0.05489005 = sum of:
        0.05489005 = product of:
          0.1097801 = sum of:
            0.1097801 = weight(_text_:index in 883) [ClassicSimilarity], result of:
              0.1097801 = score(doc=883,freq=6.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.50173557 = fieldWeight in 883, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.046875 = fieldNorm(doc=883)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Following the MILOS I and MILOS II projects of the 1990s, which examined the suitability of an automatic indexing procedure for library catalogues, an empirical study was carried out on a representative sample of title records from the Austrian union catalogue (Österreichischer Verbundkatalog). The aim was to test and evaluate whether this procedure could be deployed in the union's online catalogues. In line with real OPAC usage, only the effect on the basic index ("all fields") enriched with automatically generated terms was examined. To this end, 100 queries were run first against the original basic index and then against the enriched basic index in an OPAC under Aleph 500. The tests showed an increase in relevant hits with only slight losses in precision, a reduction in zero-hit results, and insights into the effect of existing verbal subject indexing.
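    An illustrative sketch of this kind of before/after comparison (not the project's actual evaluation code): run each query against the original and the enriched basic index, then count zero-hit queries and the gain in relevant hits. The search callables and relevance judgements are assumed inputs.
      from typing import Callable, Dict, List, Set

      def compare_indexes(queries: List[str],
                          search_original: Callable[[str], Set[str]],
                          search_enriched: Callable[[str], Set[str]],
                          relevant: Dict[str, Set[str]]) -> Dict[str, int]:
          zero_before = zero_after = gained_relevant = 0
          for q in queries:
              before, after = search_original(q), search_enriched(q)
              zero_before += not before              # query with no hits in the original index
              zero_after += not after                # query with no hits in the enriched index
              gained_relevant += len((after - before) & relevant.get(q, set()))
          return {"zero_hits_before": zero_before,
                  "zero_hits_after": zero_after,
                  "relevant_hits_gained": gained_relevant}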
  16. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02
    0.018090667 = product of:
      0.054272 = sum of:
        0.054272 = product of:
          0.108544 = sum of:
            0.108544 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.108544 = score(doc=402,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  17. Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.02
    0.015829332 = product of:
      0.047487997 = sum of:
        0.047487997 = product of:
          0.09497599 = sum of:
            0.09497599 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
              0.09497599 = score(doc=262,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5416616 = fieldWeight in 262, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=262)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    20.10.2000 12:22:23
  18. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02
    0.015829332 = product of:
      0.047487997 = sum of:
        0.047487997 = product of:
          0.09497599 = sum of:
            0.09497599 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.09497599 = score(doc=6265,freq=2.0), product of:
                0.17534193 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050071523 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
  19. Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.02
    0.015247239 = product of:
      0.045741715 = sum of:
        0.045741715 = product of:
          0.09148343 = sum of:
            0.09148343 = weight(_text_:index in 1842) [ClassicSimilarity], result of:
              0.09148343 = score(doc=1842,freq=6.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.418113 = fieldWeight in 1842, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1842)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In Machine Learning applications concerned with the automatic clustering or classification of texts, often feature vectors are needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper is concluded by the presentation of some qualitative and quantitative results comparing statistical and morphological methods.
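    A minimal sketch of one shared statistical core, under the assumption that the same relative-frequency comparison serves both tasks (illustrative only; the paper's own methods and its morphological criteria are not reproduced): tf-idf scores the terms of one document against the collection for indexing, while a domain-vs-reference frequency ratio flags candidate terminology.
      import math
      from collections import Counter
      from typing import Dict, List

      def tf_idf(doc: List[str], collection: List[List[str]]) -> Dict[str, float]:
          # Automatic indexing view: weight a document's terms against the collection.
          n_docs = len(collection)
          df = Counter(t for d in collection for t in set(d))
          tf = Counter(doc)
          return {t: tf[t] * math.log(n_docs / max(df[t], 1)) for t in tf}

      def domain_ratio(domain: List[str], reference: List[str]) -> Dict[str, float]:
          # Terminology extraction view: relative frequency in the domain corpus
          # divided by (smoothed) relative frequency in a general reference corpus.
          dom, ref = Counter(domain), Counter(reference)
          dom_n, ref_n = sum(dom.values()), sum(ref.values())
          return {t: (dom[t] / dom_n) / ((ref[t] + 1) / ref_n) for t in dom}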
  20. Ladewig, C.; Henkes, M.: Verfahren zur automatischen inhaltlichen Erschließung von elektronischen Texten : ASPECTIX (2001) 0.01
    0.01493918 = product of:
      0.044817537 = sum of:
        0.044817537 = product of:
          0.089635074 = sum of:
            0.089635074 = weight(_text_:index in 5794) [ClassicSimilarity], result of:
              0.089635074 = score(doc=5794,freq=4.0), product of:
                0.21880072 = queryWeight, product of:
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.050071523 = queryNorm
                0.40966535 = fieldWeight in 5794, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.369764 = idf(docFreq=1520, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5794)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The AspectiX procedure for the automatic syntactic subject indexing of electronic texts is based on an index whose elements are linked to a universal aspect classification, which makes syntactic retrieval possible. Using these classification elements, which relate in content to the respective search object, the information in electronic texts is queried with known search algorithms and the results are evaluated according to the aspect links. With these aspects it is possible to classify unknown text documents automatically by content, independently of subject area and language, and, when searching a text corpus, not to be restricted to matching character strings as web search engines are. During these processes the index can be expanded further both intellectually and automatically, and it delivers retrieval results of nearly 100 percent precision with, at the same time, nearly 100 percent recall. The AspectiX procedure is thus superior to all other retrieval tools by up to 40 percent in precision or recall, as demonstrated by numerous searches in three databases of differing size and dissimilar subject matter

Languages

  • e 42
  • d 20
  • f 1
  • ru 1

Types

  • a 56
  • x 5
  • el 4
  • m 2
  • s 1