Search (216 results, page 1 of 11)

Stock, M.: Textwortmethode und Übersetzungsrelation : Eine Methode zum Aufbau von kombinierten Literaturnachweis- und Terminologiedatenbanken (1989) 0.02

0.022424987 = product of:
  0.07848745 = sum of:
    0.007323784 = product of:
      0.03661892 = sum of:
        0.03661892 = weight(_text_:retrieval in 3412) [ClassicSimilarity], result of:
          0.03661892 = score(doc=3412,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.33420905 = fieldWeight in 3412, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=3412)
      0.2 = coord(1/5)
    0.07116366 = product of:
      0.14232732 = sum of:
        0.14232732 = weight(_text_:zugriff in 3412) [ClassicSimilarity], result of:
          0.14232732 = score(doc=3412,freq=2.0), product of:
            0.2160124 = queryWeight, product of:
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.03622214 = queryNorm
            0.65888494 = fieldWeight in 3412, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.078125 = fieldNorm(doc=3412)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Geisteswissenschaftliche Fachinformation erfordert eine enge Kooperation zwischen Literaturnachweis- und Terminologieinformationssystemen. Eine geeignete Dokumentationsmethode für die Auswertung geisteswissen- schaftlicher Literatur ist die Textwortwethode. Dem originalsprachig aufgenommenen Begriffsrepertoire ist ein einheitssprachiger Zugriff beizuordnen, der einerseits ein vollständiges und genaues Retrieval garantiert und andererseits den Aufbau fachspezifischer Wörterbücher vorantreibt

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: ¬The automatic indexing system AIR/PHYS : from research to application (1988) 0.02

0.01746637 = product of:
  0.06113229 = sum of:
    0.03659429 = product of:
      0.091485724 = sum of:
        0.051786967 = weight(_text_:retrieval in 1952) [ClassicSimilarity], result of:
          0.051786967 = score(doc=1952,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.47264296 = fieldWeight in 1952, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
        0.03969876 = weight(_text_:system in 1952) [ClassicSimilarity], result of:
          0.03969876 = score(doc=1952,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.3479797 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.4 = coord(2/5)
    0.024538001 = product of:
      0.049076002 = sum of:
        0.049076002 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.049076002 = score(doc=1952,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Date: 16. 8.1998 12:51:22
Footnote: Wiederabgedruckt in: Readings in information retrieval. Ed.: K. Sparck Jones u. P. Willett. San Francisco: Morgan Kaufmann 1997. S.513-517.
Source: Proceedings of the 11th annual conference on research and development in information retrieval. Ed.: Y. Chiaramella

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.01

0.014565388 = product of:
  0.050978854 = sum of:
    0.011718053 = product of:
      0.058590267 = sum of:
        0.058590267 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
          0.058590267 = score(doc=402,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.5347345 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.2 = coord(1/5)
    0.0392608 = product of:
      0.0785216 = sum of:
        0.0785216 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.0785216 = score(doc=402,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Source: Information processing and management. 22(1986) no.6, S.465-476

Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.01

0.014089733 = product of:
  0.049314063 = sum of:
    0.029683663 = product of:
      0.07420915 = sum of:
        0.029295133 = weight(_text_:retrieval in 3581) [ClassicSimilarity], result of:
          0.029295133 = score(doc=3581,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.26736724 = fieldWeight in 3581, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3581)
        0.044914022 = weight(_text_:system in 3581) [ClassicSimilarity], result of:
          0.044914022 = score(doc=3581,freq=4.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.3936941 = fieldWeight in 3581, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=3581)
      0.4 = coord(2/5)
    0.0196304 = product of:
      0.0392608 = sum of:
        0.0392608 = weight(_text_:22 in 3581) [ClassicSimilarity], result of:
          0.0392608 = score(doc=3581,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.30952093 = fieldWeight in 3581, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3581)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Lingo ist ein frei verfügbares System (open source) zur automatischen Indexierung der deutschen Sprache. Bei der Entwicklung von lingo standen hohe Konfigurierbarkeit und Flexibilität des Systems für unterschiedliche Einsatzmöglichkeiten im Vordergrund. Der Beitrag zeigt den Nutzen einer linguistisch basierten automatischen Indexierung für das Information Retrieval auf. Die für eine Retrievalverbesserung zur Verfügung stehende linguistische Funktionalität von lingo wird vorgestellt und an Beispielen erläutert: Grundformerkennung, Kompositumerkennung bzw. Kompositumzerlegung, Wortrelationierung, lexikalische und algorithmische Mehrwortgruppenerkennung, OCR-Fehlerkorrektur. Der offene Systemaufbau von lingo wird beschrieben, mögliche Einsatzszenarien und Anwendungsgrenzen werden benannt.
Date: 24. 3.2006 12:22:02

Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.01

0.013541961 = product of:
  0.04739686 = sum of:
    0.030220259 = product of:
      0.075550646 = sum of:
        0.036250874 = weight(_text_:retrieval in 530) [ClassicSimilarity], result of:
          0.036250874 = score(doc=530,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.33085006 = fieldWeight in 530, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=530)
        0.039299767 = weight(_text_:system in 530) [ClassicSimilarity], result of:
          0.039299767 = score(doc=530,freq=4.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.34448233 = fieldWeight in 530, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=530)
      0.4 = coord(2/5)
    0.0171766 = product of:
      0.0343532 = sum of:
        0.0343532 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
          0.0343532 = score(doc=530,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.2708308 = fieldWeight in 530, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=530)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Describes an application of Natural Language Processing (NLP) techniques, in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO), to the problem of document indexing by referring to a system which incorporates natural language processing techniques to determine the subject of the text of documents and to associate them with relevant semantic indexes. Describes briefly the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
Source: International forum on information and documentation. 22(1997) no.1, S.17-28

Plaunt, C.; Norgard, B.A.: ¬An association-based method for automatic indexing with a controlled vocabulary (1998) 0.01

0.007866439 = product of:
  0.027532537 = sum of:
    0.015263537 = product of:
      0.03815884 = sum of:
        0.01830946 = weight(_text_:retrieval in 1794) [ClassicSimilarity], result of:
          0.01830946 = score(doc=1794,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.16710453 = fieldWeight in 1794, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1794)
        0.01984938 = weight(_text_:system in 1794) [ClassicSimilarity], result of:
          0.01984938 = score(doc=1794,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.17398985 = fieldWeight in 1794, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1794)
      0.4 = coord(2/5)
    0.0122690005 = product of:
      0.024538001 = sum of:
        0.024538001 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
          0.024538001 = score(doc=1794,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.19345059 = fieldWeight in 1794, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1794)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4.626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and controlled vocabulary subject headings assigned to those records by human indexers using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictiony to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial match information retrieval problem. We consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document
Date: 11. 9.2000 19:53:22

Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.01

0.007837114 = product of:
  0.027429897 = sum of:
    0.010253297 = product of:
      0.051266484 = sum of:
        0.051266484 = weight(_text_:retrieval in 5001) [ClassicSimilarity], result of:
          0.051266484 = score(doc=5001,freq=8.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.46789268 = fieldWeight in 5001, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
      0.2 = coord(1/5)
    0.0171766 = product of:
      0.0343532 = sum of:
        0.0343532 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
          0.0343532 = score(doc=5001,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.2708308 = fieldWeight in 5001, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: A study was done to test the effectiveness of retrieval using title word searching. It was based on actual search profiles used in the Mechanized Information Center at Ohio State University, in order ro replicate as closely as possible actual searching conditions. Fewer than 50% of the relevant titles were retrieved by keywords in titles. The low rate of retrieval can be attributes to three sources: titles themselves, user and information specialist ignorance of the subject vocabulary in use, and to general language problems. Across fields it was found that the social sciences had the best retrieval rate, with science having the next best, and arts and humanities the lowest. Ways to enhance and supplement keyword in title searching on the computer and in printed indexes are discussed.
Date: 14. 3.1996 13:22:21

Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.01

0.007423487 = product of:
  0.025982203 = sum of:
    0.006351802 = product of:
      0.03175901 = sum of:
        0.03175901 = weight(_text_:system in 4709) [ClassicSimilarity], result of:
          0.03175901 = score(doc=4709,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.27838376 = fieldWeight in 4709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
      0.2 = coord(1/5)
    0.0196304 = product of:
      0.0392608 = sum of:
        0.0392608 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
          0.0392608 = score(doc=4709,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.30952093 = fieldWeight in 4709, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4709)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Proposes automatic linguistic knowledge acquisition from sublanguage corpora. The system combines existing linguistic knowledge and human intervention with corpus based techniques. The algorithm involves a gradual approximation which works to converge linguistic knowledge gradually towards desirable results. The 1st experiment revealed the characteristic of this algorithm and the others proved the effectiveness of this algorithm for a real corpus
Date: 31. 7.1996 9:22:19

Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01

0.007423487 = product of:
  0.025982203 = sum of:
    0.006351802 = product of:
      0.03175901 = sum of:
        0.03175901 = weight(_text_:system in 6752) [ClassicSimilarity], result of:
          0.03175901 = score(doc=6752,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.27838376 = fieldWeight in 6752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
      0.2 = coord(1/5)
    0.0196304 = product of:
      0.0392608 = sum of:
        0.0392608 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
          0.0392608 = score(doc=6752,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.30952093 = fieldWeight in 6752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: AutoSlog is a system that addresses the knowledge engineering bottleneck for information extraction. AutoSlog automatically creates domain specific dictionaries for information extraction, given an appropriate training corpus. Describes experiments with AutoSlog in terrorism, joint ventures and microelectronics domains. Compares the performance of AutoSlog across the 3 domains, discusses the lessons learned and presents results from 2 experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog
Date: 6. 3.1997 16:22:15

Probst, M.; Mittelbach, J.: Maschinelle Indexierung in der Sacherschließung wissenschaftlicher Bibliotheken (2006) 0.01

0.007282694 = product of:
  0.025489427 = sum of:
    0.0058590267 = product of:
      0.029295133 = sum of:
        0.029295133 = weight(_text_:retrieval in 1755) [ClassicSimilarity], result of:
          0.029295133 = score(doc=1755,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.26736724 = fieldWeight in 1755, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=1755)
      0.2 = coord(1/5)
    0.0196304 = product of:
      0.0392608 = sum of:
        0.0392608 = weight(_text_:22 in 1755) [ClassicSimilarity], result of:
          0.0392608 = score(doc=1755,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.30952093 = fieldWeight in 1755, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1755)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Obwohl fast alle größeren Bibliotheken intellektuelle Sacherschließung betreiben, sind elektronische Kataloge für die zielgerichtete sachliche Suche nur eingeschränkt nutzbar. Durch maschinelle Indexierung können ohne nennenswerten personellen Mehraufwand ausreichend große Datenmengen für Informationsretrievalsysteme erzeugt und somit die Auffindbarkeit von Dokumenten erhöht werden. Geeignete Sprachanalysetechniken zur Indextermerzeugung sind bekannt und bieten im Gegensatz zur gebräuchlichen Freitextinvertierung entscheidende Vorteile beim Retrieval. Im Fokus steht die Betrachtung der Vor- und Nachteile der gängigen Indexierungssysteme MILOS und intelligentCAPTURE.
Date: 22. 3.2008 12:35:19

Martins, A.L.; Souza, R.R.; Ribeiro de Mello, H.: ¬The use of noun phrases in information retrieval : proposing a mechanism for automatic classification (2014) 0.01
```
0.0070448667 = product of:
  0.024657032 = sum of:
    0.014841831 = product of:
      0.037104577 = sum of:
        0.014647567 = weight(_text_:retrieval in 1441) [ClassicSimilarity], result of:
          0.014647567 = score(doc=1441,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.13368362 = fieldWeight in 1441, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=1441)
        0.022457011 = weight(_text_:system in 1441) [ClassicSimilarity], result of:
          0.022457011 = score(doc=1441,freq=4.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.19684705 = fieldWeight in 1441, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03125 = fieldNorm(doc=1441)
      0.4 = coord(2/5)
    0.0098152 = product of:
      0.0196304 = sum of:
        0.0196304 = weight(_text_:22 in 1441) [ClassicSimilarity], result of:
          0.0196304 = score(doc=1441,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.15476047 = fieldWeight in 1441, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1441)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```
Abstract

This paper presents a research on syntactic structures known as noun phrases (NP) being applied to increase the effectiveness and efficiency of the mechanisms for the document's classification. Our hypothesis is the fact that the NP can be used instead of single words as a semantic aggregator to reduce the number of words that will be used for the classification system without losing its semantic coverage, increasing its efficiency. The experiment divided the documents classification process in three phases: a) NP preprocessing b) system training; and c) classification experiments. In the first step, a corpus of digitalized texts was submitted to a natural language processing platform1 in which the part-of-speech tagging was done, and them PERL scripts pertaining to the PALAVRAS package were used to extract the Noun Phrases. The preprocessing also involved the tasks of a) removing NP low meaning pre-modifiers, as quantifiers; b) identification of synonyms and corresponding substitution for common hyperonyms; and c) stemming of the relevant words contained in the NP, for similitude checking with other NPs. The first tests with the resulting documents have demonstrated its effectiveness. We have compared the structural similarity of the documents before and after the whole pre-processing steps of phase one. The texts maintained the consistency with the original and have kept the readability. The second phase involves submitting the modified documents to a SVM algorithm to identify clusters and classify the documents. The classification rules are to be established using a machine learning approach. Finally, tests will be conducted to check the effectiveness of the whole process.

Source

Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

¬The smart retrieval system : experiments in automatic document processing (1971) 0.01

0.0069776163 = product of:
  0.048843313 = sum of:
    0.048843313 = product of:
      0.12210828 = sum of:
        0.058590267 = weight(_text_:retrieval in 2330) [ClassicSimilarity], result of:
          0.058590267 = score(doc=2330,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.5347345 = fieldWeight in 2330, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=2330)
        0.06351802 = weight(_text_:system in 2330) [ClassicSimilarity], result of:
          0.06351802 = score(doc=2330,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.5567675 = fieldWeight in 2330, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.125 = fieldNorm(doc=2330)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)

Dattola, R.T.: FIRST: Flexible information retrieval system for text (1979) 0.01

0.0069776163 = product of:
  0.048843313 = sum of:
    0.048843313 = product of:
      0.12210828 = sum of:
        0.058590267 = weight(_text_:retrieval in 5172) [ClassicSimilarity], result of:
          0.058590267 = score(doc=5172,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.5347345 = fieldWeight in 5172, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=5172)
        0.06351802 = weight(_text_:system in 5172) [ClassicSimilarity], result of:
          0.06351802 = score(doc=5172,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.5567675 = fieldWeight in 5172, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.125 = fieldNorm(doc=5172)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)

Salton, G.: Another look at automatic text-retrieval systems (1986) 0.01

0.006947495 = product of:
  0.04863246 = sum of:
    0.04863246 = product of:
      0.12158115 = sum of:
        0.08188239 = weight(_text_:retrieval in 1356) [ClassicSimilarity], result of:
          0.08188239 = score(doc=1356,freq=10.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.74731416 = fieldWeight in 1356, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=1356)
        0.03969876 = weight(_text_:system in 1356) [ClassicSimilarity], result of:
          0.03969876 = score(doc=1356,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.3479797 = fieldWeight in 1356, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.078125 = fieldNorm(doc=1356)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)

Footnote: Bezugnahme auf: Blair, D.C.: An evaluation of retrieval effectiveness for a full-text document-retrieval system. Comm. ACM 28(1985) S.280-299. - Vgl. auch: Blair, D.C.: Full text retrieval ... Int. Class. 13(1986) S.18-23; Blair, D.C., M.E. Maron: full-text information retrieval ... Inf. Proc. Man. 26(1990) S.437-447.

Lassalle, E.: Text retrieval : from a monolingual system to a multilingual system (1993) 0.01

0.0065628807 = product of:
  0.045940164 = sum of:
    0.045940164 = product of:
      0.11485041 = sum of:
        0.036250874 = weight(_text_:retrieval in 7403) [ClassicSimilarity], result of:
          0.036250874 = score(doc=7403,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.33085006 = fieldWeight in 7403, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7403)
        0.078599535 = weight(_text_:system in 7403) [ClassicSimilarity], result of:
          0.078599535 = score(doc=7403,freq=16.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.68896466 = fieldWeight in 7403, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7403)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)

Abstract: Describes the TELMI monolingual text retrieval system and its future extension, a multilingual system. TELMI is designed for medium sized databases containing short texts. The characteristics of the system are fine-grained natural language processing (NLP); an open domain and a large scale knowledge base; automated indexing based on conceptual representation of texts and reusability of the NLP tools. Discusses the French MINITEL service, the MGS information service and the TELMI research system covering the full text system; NLP architecture; the lexical level; the syntactic level; the semantic level and an example of the use of a generic system

Wan, T.-L.; Evens, M.; Wan, Y.-W.; Pao, Y.-Y.: Experiments with automatic indexing and a relational thesaurus in a Chinese information retrieval system (1997) 0.01

0.0064511965 = product of:
  0.045158375 = sum of:
    0.045158375 = product of:
      0.112895936 = sum of:
        0.05731767 = weight(_text_:retrieval in 956) [ClassicSimilarity], result of:
          0.05731767 = score(doc=956,freq=10.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.5231199 = fieldWeight in 956, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=956)
        0.055578265 = weight(_text_:system in 956) [ClassicSimilarity], result of:
          0.055578265 = score(doc=956,freq=8.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.4871716 = fieldWeight in 956, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=956)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)

Abstract: This article describes a series of experiments with an interactive Chinese information retrieval system named CIRS and an interactive relational thesaurus. 2 important issues have been explored: whether thesauri enhance the retrieval effectiveness of Chinese documents, and whether automatic indexing can complete with manual indexing in a Chinese information retrieval system. Recall and precision are used to measure and evaluate the effectiveness of the system. Statistical analysis of the recall and precision measures suggest that the use of the relational thesaurus does improve the retrieval effectiveness both in the automatic indexing environment and in the manual indexing environment and that automatic indexing is at least as good as manual indexing

Renz, M.: Automatische Inhaltserschließung im Zeichen von Wissensmanagement (2001) 0.01

0.0063723573 = product of:
  0.02230325 = sum of:
    0.0051266486 = product of:
      0.025633242 = sum of:
        0.025633242 = weight(_text_:retrieval in 5671) [ClassicSimilarity], result of:
          0.025633242 = score(doc=5671,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.23394634 = fieldWeight in 5671, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5671)
      0.2 = coord(1/5)
    0.0171766 = product of:
      0.0343532 = sum of:
        0.0343532 = weight(_text_:22 in 5671) [ClassicSimilarity], result of:
          0.0343532 = score(doc=5671,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.2708308 = fieldWeight in 5671, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5671)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Methoden der automatischen Inhaltserschließung werden seit mehr als 30 Jahren entwickelt, ohne in luD-Kreisen auf merkliche Akzeptanz zu stoßen. Gegenwärtig führen jedoch die steigende Informationsflut und der Bedarf an effizienten Zugriffsverfahren im Informations- und Wissensmanagement in breiten Anwenderkreisen zu einem wachsenden Interesse an diesen Methoden, zu verstärkten Anstrengungen in Forschung und Entwicklung und zu neuen Produkten. In diesem Beitrag werden verschiedene Ansätze zu intelligentem und inhaltsbasiertem Retrieval und zur automatischen Inhaltserschließung diskutiert sowie kommerziell vertriebene Softwarewerkzeuge und Lösungen präsentiert. Abschließend wird festgestellt, dass in naher Zukunft mit einer zunehmenden Automatisierung von bestimmten Komponenten des Informations- und Wissensmanagements zu rechnen ist, indem Software-Werkzeuge zur automatischen Inhaltserschließung in den Workflow integriert werden
Date: 22. 3.2001 13:14:48

Advances in intelligent retrieval: Proc. of a conference ... Wadham College, Oxford, 16.-17.4.1985 (1986) 0.01
```
0.005679251 = product of:
  0.039754756 = sum of:
    0.039754756 = product of:
      0.099386886 = sum of:
        0.058130726 = weight(_text_:retrieval in 1384) [ClassicSimilarity], result of:
          0.058130726 = score(doc=1384,freq=14.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.5305404 = fieldWeight in 1384, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1384)
        0.041256163 = weight(_text_:system in 1384) [ClassicSimilarity], result of:
          0.041256163 = score(doc=1384,freq=6.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.36163113 = fieldWeight in 1384, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=1384)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```
Content

Enthält die Beiträge: ADDIS, T.: Extended relational analysis: a design approach to knowledge-based systems; PARKINSON, D.: Supercomputers and non-numeric processing; McGREGOR, D.R. u. J.R. MALONE: An architectural approach to advances in information retrieval; ALLEN, M.J. u. O.S. HARRISON: Word processing and information retrieval: some practical problems; MURTAGH, F.: Clustering and nearest neighborhood searching; ENSER, P.G.B.: Experimenting with the automatic classification of books; TESKEY, N. u. Z. RAZAK: An analysis of ranking for free text retrieval systems; ZARRI, G.P.: Interactive information retrieval: an artificial intelligence approach to deal with biographical data; HANCOX, P. u. F. SMITH: A case system processor for the PRECIS indexing language; ROUAULT, J.: Linguistic methods in information retrieval systems; ARAGON-RAMIREZ, V. u. C.D. PAICE: Design of a system for the online elucidation of natural language search statements; BROOKS, H.M., P.J. DANIELS u. N.J. BELKIN: Problem descriptions and user models: developing an intelligent interface for document retrieval systems; BLACK, W.J., P. HARGREAVES u. P.B. MAYES: HEADS: a cataloguing advisory system; BELL, D.A.: An architecture for integrating data, knowledge, and information bases

Ward, M.L.: ¬The future of the human indexer (1996) 0.01

0.005567615 = product of:
  0.01948665 = sum of:
    0.0047638514 = product of:
      0.023819257 = sum of:
        0.023819257 = weight(_text_:system in 7244) [ClassicSimilarity], result of:
          0.023819257 = score(doc=7244,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.20878783 = fieldWeight in 7244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
      0.2 = coord(1/5)
    0.0147228 = product of:
      0.0294456 = sum of:
        0.0294456 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
          0.0294456 = score(doc=7244,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.23214069 = fieldWeight in 7244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Considers the principles of indexing and the intellectual skills involved in order to determine what automatic indexing systems would be required in order to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and what depth to index; reading skills; abstracting skills; and classification skills, Illustrates these features with a detailed description of abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system and using the criteria described for human indexers. At present, it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form, it may be a useful productivity tool for dealing with large quantities of low grade texts (should they be wanted in the database)
Date: 9. 2.1997 18:44:22

Busch, D.: Domänenspezifische hybride automatische Indexierung von bibliographischen Metadaten (2019) 0.01

0.005567615 = product of:
  0.01948665 = sum of:
    0.0047638514 = product of:
      0.023819257 = sum of:
        0.023819257 = weight(_text_:system in 5628) [ClassicSimilarity], result of:
          0.023819257 = score(doc=5628,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.20878783 = fieldWeight in 5628, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=5628)
      0.2 = coord(1/5)
    0.0147228 = product of:
      0.0294456 = sum of:
        0.0294456 = weight(_text_:22 in 5628) [ClassicSimilarity], result of:
          0.0294456 = score(doc=5628,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.23214069 = fieldWeight in 5628, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=5628)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Abstract: Im Fraunhofer-Informationszentrum Raum und Bau (IRB) wird Fachliteratur im Bereich Planen und Bauen bibliographisch erschlossen. Die daraus resultierenden Dokumente (Metadaten-Einträge) werden u.a. bei der Produktion der bibliographischen Datenbanken des IRB verwendet. In Abb. 1 ist ein Dokument dargestellt, das einen Zeitschriftenartikel beschreibt. Die Dokumente werden mit Deskriptoren von einer Nomenklatur (Schlagwortliste IRB) indexiert. Ein Deskriptor ist "eine Benennung., die für sich allein verwendbar, eindeutig zur Inhaltskennzeichnung geeignet und im betreffenden Dokumentationssystem zugelassen ist". Momentan wird die Indexierung intellektuell von menschlichen Experten durchgeführt. Die intellektuelle Indexierung ist zeitaufwendig und teuer. Eine Lösung des Problems besteht in der automatischen Indexierung, bei der die Zuordnung von Deskriptoren durch ein Computerprogramm erfolgt. Solche Computerprogramme werden im Folgenden auch als Klassifikatoren bezeichnet. In diesem Beitrag geht es um ein System zur automatischen Indexierung von deutschsprachigen Dokumenten im Bereich Bauwesen mit Deskriptoren aus der Schlagwortliste IRB.
Source: B.I.T.online. 22(2019) H.6, S.465-469

Search (216 results, page 1 of 11)

Authors

Years

Languages

Types

Themes

Subjects

Classifications