Search (304 results, page 1 of 16)

Panyr, J.: Vektorraum-Modell und Clusteranalyse in Information-Retrieval-Systemen (1987) 0.11

0.11161989 = product of:
  0.29765305 = sum of:
    0.016991155 = weight(_text_:information in 2322) [ClassicSimilarity], result of:
      0.016991155 = score(doc=2322,freq=6.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.2687516 = fieldWeight in 2322, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=2322)
    0.05044969 = weight(_text_:retrieval in 2322) [ClassicSimilarity], result of:
      0.05044969 = score(doc=2322,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.46309367 = fieldWeight in 2322, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=2322)
    0.2302122 = weight(_text_:modell in 2322) [ClassicSimilarity], result of:
      0.2302122 = score(doc=2322,freq=8.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        1.0630126 = fieldWeight in 2322, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.0625 = fieldNorm(doc=2322)
  0.375 = coord(3/8)
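
The score breakdown above can be recomputed by hand. The following is a minimal Python sketch, assuming the formulas implied by the numbers in the listing: `tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`, `queryWeight = idf * queryNorm`, and `fieldWeight = tf * idf * fieldNorm`. The function and variable names are illustrative and not part of any Lucene API. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency, as implied by the idf(...) lines in the listing
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)                       # tf(freq=...)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # queryWeight
    field_weight = tf * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight

# Constants copied from the explain output for doc 2322
MAX_DOCS = 44218
QUERY_NORM = 0.036014426
FIELD_NORM = 0.0625

# (freq, docFreq) for the terms "information", "retrieval", "modell"
terms = [(6.0, 20772), (6.0, 5836), (8.0, 293)]

total = sum(term_score(f, df, MAX_DOCS, QUERY_NORM, FIELD_NORM) for f, df in terms)
final_score = total * (3 / 8)  # coord(3/8): 3 of 8 query clauses matched

print(round(final_score, 6))
```

The result agrees with the listed 0.11161989 to within float rounding, which is also the value shown (rounded to 0.11) next to the title.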

Abstract: Starting from theoretical approaches to indexing, the classical vector space model for automatic indexing (together with the discrimination value model) is explained. Clustering in information retrieval systems is treated as a natural logical consequence of this model and is covered in all its forms (i.e. as document classification, term classification, or combined document and term classification). The search strategies in pre-classified document collections (cluster search) are then described in detail. Finally, the sensible application of cluster analysis in information retrieval systems is briefly discussed.

Experimentelles und praktisches Information Retrieval : Festschrift für Gerhard Lustig (1992) 0.07
```
0.07421538 = product of:
  0.19790769 = sum of:
    0.01802184 = weight(_text_:information in 4) [ClassicSimilarity], result of:
      0.01802184 = score(doc=4,freq=12.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.2850541 = fieldWeight in 4, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4)
    0.05779738 = weight(_text_:retrieval in 4) [ClassicSimilarity], result of:
      0.05779738 = score(doc=4,freq=14.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.5305404 = fieldWeight in 4, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4)
    0.122088455 = weight(_text_:modell in 4) [ClassicSimilarity], result of:
      0.122088455 = score(doc=4,freq=4.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.5637476 = fieldWeight in 4, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.046875 = fieldNorm(doc=4)
  0.375 = coord(3/8)
```
Content

Contains the contributions: SALTON, G.: Effective text understanding in information retrieval; KRAUSE, J.: Intelligentes Information retrieval; FUHR, N.: Konzepte zur Gestaltung zukünftiger Information-Retrieval-Systeme; HÜTHER, H.: Überlegungen zu einem mathematischen Modell für die Type-Token-, die Grundform-Token und die Grundform-Type-Relation; KNORZ, G.: Automatische Generierung inferentieller Links in und zwischen Hyperdokumenten; KONRAD, E.: Zur Effektivitätsbewertung von Information-Retrieval-Systemen; HENRICHS, N.: Retrievalunterstützung durch automatisch generierte Wortfelder; LÜCK, W., W. RITTBERGER and M. SCHWANTNER: Der Einsatz des Automatischen Indexierungs- und Retrieval-System (AIR) im Fachinformationszentrum Karlsruhe; REIMER, U.: Verfahren der Automatischen Indexierung. Benötigtes Vorwissen und Ansätze zu seiner automatischen Akquisition: Ein Überblick; ENDRES-NIGGEMEYER, B.: Dokumentrepräsentation: Ein individuelles prozedurales Modell des Abstracting, des Indexierens und Klassifizierens; SEELBACH, D.: Zur Entwicklung von zwei- und mehrsprachigen lexikalischen Datenbanken und Terminologiedatenbanken; ZIMMERMANN, H.: Der Einfluß der Sprachbarrieren in Europa und Möglichkeiten zu ihrer Minderung; LENDERS, W.: Wörter zwischen Welt und Wissen; PANYR, J.: Frames, Thesauri und automatische Klassifikation (Clusteranalyse); HAHN, U.: Forschungsstrategien und Erkenntnisinteressen in der anwendungsorientierten automatischen Sprachverarbeitung. Überlegungen zu einer ingenieurorientierten Computerlinguistik; KUHLEN, R.: Hypertext und Information Retrieval - mehr als Browsing und Suche.
Grün, S.: Mehrwortbegriffe und Latent Semantic Analysis : Bewertung automatisch extrahierter Mehrwortgruppen mit LSA (2017) 0.05
```
0.054575153 = product of:
  0.14553374 = sum of:
    0.012262309 = weight(_text_:information in 3954) [ClassicSimilarity], result of:
      0.012262309 = score(doc=3954,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.19395474 = fieldWeight in 3954, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3954)
    0.03153106 = weight(_text_:retrieval in 3954) [ClassicSimilarity], result of:
      0.03153106 = score(doc=3954,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.28943354 = fieldWeight in 3954, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3954)
    0.10174038 = weight(_text_:modell in 3954) [ClassicSimilarity], result of:
      0.10174038 = score(doc=3954,freq=4.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.46978965 = fieldWeight in 3954, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3954)
  0.375 = coord(3/8)
```
Abstract

This study examines the potential of multi-word terms for information retrieval. The aim of the work is to use the Latent Semantic Analysis (LSA) method to weight candidates that were rated positively by human assessors higher than candidates that were rated negatively. The positive candidates should accordingly be preferred in an information retrieval ranking. A version of the social science GIRT database (German Indexing and Retrieval Testdatabase) was used as the collection. The automatic indexing system Lingo was used to identify candidates for multi-word terms. The core functions required were lemmatization, identification of compounds, algorithmic multi-word recognition, and weighting of index terms by the LSA model. The multi-word candidates recognized by Lingo and weighted by LSA were then evaluated. First, positive and negative multi-word candidates were selected manually. In the second step of the evaluation, the yield was calculated to obtain the proportion of positive multi-word candidates. In the final step, R-precision was used to calculate how many positively rated multi-word candidates made it to position k of the ranking. The yield of positive multi-word candidates averaged about 39%, while R-precision reached an average of 54%. The LSA model achieves an ambivalent result with a positive tendency.
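
As an illustration of the R-precision measure named in this abstract, here is a minimal Python sketch using the standard textbook definition: with R positively rated candidates in total, R-precision is the share of positives among the top R positions of the ranking. The candidate identifiers and ratings are hypothetical and are not data from the thesis, and the thesis may compute the measure differently in detail.

```python
def r_precision(ranked, positives):
    """Share of positively rated items among the top R ranks, where R = len(positives)."""
    r = len(positives)
    if r == 0:
        raise ValueError("at least one positive item is required")
    top_r = ranked[:r]
    hits = sum(1 for item in top_r if item in positives)
    return hits / r

# Hypothetical ranking of six multi-word candidates, best weight first
ranked_candidates = ["c1", "c2", "c3", "c4", "c5", "c6"]
# Hypothetical manual ratings: three candidates were judged positive
positive_candidates = {"c1", "c3", "c6"}

value = r_precision(ranked_candidates, positive_candidates)
print(value)
```

Here R is 3, the top three ranks hold two positive candidates (c1 and c3), so the value is 2/3.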

Footnote

Master's thesis, degree programme Informationswissenschaft und Sprachtechnologie, Institut für Sprache und Information, Philosophische Fakultät, Heinrich-Heine-Universität Düsseldorf

Imprint

Düsseldorf : Heinrich-Heine-Universität / Philosophische Fakultät / Institut für Sprache und Information
Bredack, J.: Terminologieextraktion von Mehrwortgruppen in kunsthistorischen Fachtexten (2013) 0.05
```
0.04657194 = product of:
  0.12419184 = sum of:
    0.006069533 = weight(_text_:information in 1054) [ClassicSimilarity], result of:
      0.006069533 = score(doc=1054,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.0960027 = fieldWeight in 1054, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1054)
    0.0180215 = weight(_text_:retrieval in 1054) [ClassicSimilarity], result of:
      0.0180215 = score(doc=1054,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.16542503 = fieldWeight in 1054, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1054)
    0.10010081 = weight(_text_:mathematisches in 1054) [ClassicSimilarity], result of:
      0.10010081 = score(doc=1054,freq=2.0), product of:
        0.30533072 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.036014426 = queryNorm
        0.32784387 = fieldWeight in 1054, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1054)
  0.375 = coord(3/8)
```
Abstract

Multi-word groups can be identified in and extracted from electronic texts by means of an algorithmic procedure. The data basis for this work consists of art-historical encyclopedia articles from the Reallexikon zur Deutschen Kunstgeschichte. The linguistic, dictionary-based open-source software Lingo was used in this study. Lingo makes it possible to algorithmically identify and extract particular word sequences from electronic data on the basis of predefined word patterns. These word patterns are based on word classes with which the lexicalized entries in the dictionaries are tagged and thereby defined more precisely. Individual word classes were thus assigned for specialist terminology, proper names, or adjectives. In the present work, function words are additionally included in pattern formation, and new word classes were defined for this purpose. Function words comprise articles, conjunctions, and prepositions. The aim was to extract terminological multi-word groups with art-historical content while deliberately including function words. The extracted multi-word groups were analyzed qualitatively using self-defined criteria. It was found that the use of function words produces terminological multi-word groups that can be put to further use as potential index terms in information retrieval.
Multi-word groups are to be regarded as lexical units and consist of at least two interrelated terms. By combining several technical terms, they convey meaningful information in specialist texts. They convey unambiguous information, because the relationships between the connected technical terms reveal the meaning of a specialist text's content. It is therefore worthwhile to extract multi-word groups from specialist texts, since they represent the content unambiguously. Multi-word groups can thus be used for subject indexing and, for example, provided as index terms in information retrieval. Multi-word groups contain information from a text that is expressed in natural language. Machine methods are used to extract information from an electronic text, because language exhibits structures that can be processed automatically. One possible method for identifying and extracting multi-word groups within electronic specialist texts is an algorithmic procedure. This method recognizes word sequences by forming word patterns that make up a multi-word group in a text. The word patterns thus represent the individual components of a multi-word group. This procedure has already been studied and analyzed on mathematical texts. Relevant multi-word groups representing a mathematical concept or mathematical content were extracted successfully. The indexing system Lingo was used, whose program module sequencer enables the algorithmic identification and extraction of multi-word groups. In the present work, this algorithmic procedure is applied using the Lingo software to extract multi-word groups from art-historical specialist texts.
The data source consists of art-historical encyclopedia articles from the Reallexikon zur Deutschen Kunstgeschichte, which is written in German. The work examines whether positive results, in the sense of terminological multi-word groups with art-historical content, can be produced. In addition, function words are to be included within a multi-word group. Function words are articles, conjunctions, and prepositions, which carry no content meaning on their own but fulfil syntactic functions within a multi-word group. The results are used to analyze whether adding function words within a multi-word group leads to positive results. The aim is accordingly to produce terminological multi-word groups with art-historical content that include function words. In describing the extraction of terminological multi-word groups, particular attention is paid below to the creation of word patterns, since these provide the basis on which the program module sequencer identifies word sequences within the art-historical encyclopedia articles. The indexing results are classified using self-defined criteria that define what is to be understood by a terminological multi-word group.

Salton, G.: Another look at automatic text-retrieval systems (1986) 0.04

0.04128287 = product of:
  0.11008765 = sum of:
    0.012262309 = weight(_text_:information in 1356) [ClassicSimilarity], result of:
      0.012262309 = score(doc=1356,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.19395474 = fieldWeight in 1356, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=1356)
    0.08141284 = weight(_text_:retrieval in 1356) [ClassicSimilarity], result of:
      0.08141284 = score(doc=1356,freq=10.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.74731416 = fieldWeight in 1356, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=1356)
    0.016412502 = product of:
      0.049237505 = sum of:
        0.049237505 = weight(_text_:29 in 1356) [ClassicSimilarity], result of:
          0.049237505 = score(doc=1356,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.38865322 = fieldWeight in 1356, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.078125 = fieldNorm(doc=1356)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)
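
This breakdown differs from the earlier ones in that the term "29" sits inside a nested clause with its own coordination factor, coord(1/3), before the outer coord(3/8) is applied. The following minimal Python sketch recomputes the nested score, assuming the formulas implied by the listing (`tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`). The helper name is illustrative and not part of Lucene, and since the listing does not show the query itself, reading coord(1/3) as "one of three sub-clauses matched" is an interpretation.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # queryWeight * fieldWeight, as laid out in the explain output
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    return (idf * query_norm) * (math.sqrt(freq) * idf * field_norm)

# Constants copied from the explain output for doc 1356
MAX_DOCS, QUERY_NORM, FIELD_NORM = 44218, 0.036014426, 0.078125

information = classic_term_score(2.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)
retrieval = classic_term_score(10.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Nested clause: the single matching term "29", scaled by the inner coord(1/3)
inner = classic_term_score(2.0, 3565, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 3)

# Outer sum of the three matching clauses, scaled by coord(3/8)
score_1356 = (information + retrieval + inner) * (3 / 8)
print(round(score_1356, 6))
```

The inner value corresponds to the listed 0.016412502 and the final value to 0.04128287, up to float rounding.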

Footnote: Refers to: Blair, D.C.: An evaluation of retrieval effectiveness for a full-text document-retrieval system. Comm. ACM 28(1985) pp.280-299. - See also: Blair, D.C.: Full text retrieval ... Int. Class. 13(1986) pp.18-23; Blair, D.C., M.E. Maron: full-text information retrieval ... Inf. Proc. Man. 26(1990) pp.437-447.
Source: Communications of the Association for Computing Machinery. 29(1986), pp.648-656

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.04

0.038961656 = product of:
  0.10389775 = sum of:
    0.019619694 = weight(_text_:information in 402) [ClassicSimilarity], result of:
      0.019619694 = score(doc=402,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.3103276 = fieldWeight in 402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=402)
    0.058254283 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
      0.058254283 = score(doc=402,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.5347345 = fieldWeight in 402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.125 = fieldNorm(doc=402)
    0.026023773 = product of:
      0.07807132 = sum of:
        0.07807132 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.07807132 = score(doc=402,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Source: Information processing and management. 22(1986) no.6, pp.465-476

Kaiser, A.: Computer-unterstütztes Indexieren in Intelligenten Information Retrieval Systemen : Ein Relevanz-Feedback orientierter Ansatz zur Informationserschließung in unformatierten Datenbanken (1993) 0.03
```
0.03363646 = product of:
  0.08969723 = sum of:
    0.0137644075 = weight(_text_:information in 4284) [ClassicSimilarity], result of:
      0.0137644075 = score(doc=4284,freq=28.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.21771365 = fieldWeight in 4284, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0234375 = fieldNorm(doc=4284)
    0.032768033 = weight(_text_:retrieval in 4284) [ClassicSimilarity], result of:
      0.032768033 = score(doc=4284,freq=18.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.30078813 = fieldWeight in 4284, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0234375 = fieldNorm(doc=4284)
    0.043164786 = weight(_text_:modell in 4284) [ClassicSimilarity], result of:
      0.043164786 = score(doc=4284,freq=2.0), product of:
        0.21656582 = queryWeight, product of:
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.036014426 = queryNorm
        0.19931486 = fieldWeight in 4284, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.0133076 = idf(docFreq=293, maxDocs=44218)
          0.0234375 = fieldNorm(doc=4284)
  0.375 = coord(3/8)
```
Abstract

In our time, information has become a very important commodity. It is the basis of any sound decision-making. The flood of information has risen sharply in recent years, and the volume of information will continue to grow for the foreseeable future. It is therefore becoming increasingly important to organize "information about information". It is not possible to be informed down to the last detail about every area one encounters. What is necessary and important, however, is knowing where to find information. Relevant information must be findable as quickly as possible. In practical, computer-supported use, information systems of many kinds serve this purpose. The spectrum ranges from management information systems and expert systems to database systems and information retrieval systems (IR systems). Although the individual types of these information-processing systems are designed for different user groups and different kinds of tasks, very similar problem areas and questions arise in their design:
* The representation and organization of existing knowledge and known facts in the information system (information indexing).
* Finding (or re-finding) relevant information in the information system and guiding the user through it.
An information retrieval system contains unstructured bibliographic or textual documents and thus differs fundamentally from database systems, which usually contain structured data.
Conventional, formatted databases are already widespread in practice today. This is due not least to the existence of the standardized query language SQL and, for relational database systems in particular, to intensive research on improving the structure and performance of the systems. The spread and acceptance of unformatted databases, that is, information retrieval systems, is by contrast far less advanced. One reason lies in the poor user-friendliness of IR systems and in inadequate methods of information indexing. This work aims to develop a method of information indexing in information retrieval systems that puts the needs of the user at the centre and thereby helps to increase the acceptance and spread of information retrieval systems, particularly in office environments. The question is therefore: is it possible to involve the user to a greater degree as early as the document indexing stage, without completely giving up machine support, as is the case with manual indexing? Any retrieval system can be described as a system that consists of a set of documents and a set of queries and that contains a mechanism for determining the documents relevant to a query.
The following parts of an IR system are required for this:
* Information indexing: a component for indexing and representing the stored information. This part serves to describe the content of the documents and to represent it in such a way that a document can be found on the basis of these features. One way of doing this is to assign content-describing descriptors to the documents. Through the indexing process, the documents are translated into an indexing language.
* Query language: a component for formulating the user's queries. This part serves to process the user's query so that the matching documents can be found using the information about the user's needs derived from the query.
* Information output and presentation: a component for outputting the information found in response to the query. This part makes the result of the query available to the user.
Examining all components of an information retrieval system would go beyond the scope of this work. The focus is therefore placed on information indexing. In particular, the (semi-)automatic indexing of documents for the purpose of information retrieval, that is, the process of translating documents into an indexing language, is treated in more detail. This focus was chosen partly because, in my view, the observable lack of acceptance of information retrieval systems can also be explained by the fact that the indexing components used in practice do not yet deliver the range of capabilities that users expect from an "intelligent information retrieval system". The aim of the work is to develop, step by step, a model for automatic indexing that involves the user in indexing to a greater degree than the methods described in the literature and in practice.

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.03

0.031911142 = product of:
  0.085096374 = sum of:
    0.017341524 = weight(_text_:information in 1952) [ClassicSimilarity], result of:
      0.017341524 = score(doc=1952,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.27429342 = fieldWeight in 1952, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=1952)
    0.051489998 = weight(_text_:retrieval in 1952) [ClassicSimilarity], result of:
      0.051489998 = score(doc=1952,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.47264296 = fieldWeight in 1952, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=1952)
    0.01626486 = product of:
      0.048794575 = sum of:
        0.048794575 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.048794575 = score(doc=1952,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Date: 16. 8.1998 12:51:22
Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp.513-517.
Source: Proceedings of the 11th annual conference on research and development in information retrieval. Ed.: Y. Chiaramella

Knorz, G.: Automatische Indexierung (1994) 0.03

0.031573325 = product of:
  0.08419554 = sum of:
    0.020809827 = weight(_text_:information in 4254) [ClassicSimilarity], result of:
      0.020809827 = score(doc=4254,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.3291521 = fieldWeight in 4254, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=4254)
    0.043690715 = weight(_text_:retrieval in 4254) [ClassicSimilarity], result of:
      0.043690715 = score(doc=4254,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.40105087 = fieldWeight in 4254, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=4254)
    0.019695 = product of:
      0.059085 = sum of:
        0.059085 = weight(_text_:29 in 4254) [ClassicSimilarity], result of:
          0.059085 = score(doc=4254,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.46638384 = fieldWeight in 4254, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.09375 = fieldNorm(doc=4254)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Date: 29. 1.2011 17:56:21
Series: Berufsbegleitendes Ergänzungsstudium im Tätigkeitsfeld wissenschaftliche Information und Dokumentation (BETID): Lehrmaterialien; Nr.3
Source: Wissensrepräsentation und Information Retrieval. R.-D. Hennings et al.

Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.03

0.027936364 = product of:
  0.07449697 = sum of:
    0.012139066 = weight(_text_:information in 5001) [ClassicSimilarity], result of:
      0.012139066 = score(doc=5001,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.1920054 = fieldWeight in 5001, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5001)
    0.0509725 = weight(_text_:retrieval in 5001) [ClassicSimilarity], result of:
      0.0509725 = score(doc=5001,freq=8.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.46789268 = fieldWeight in 5001, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5001)
    0.0113854 = product of:
      0.0341562 = sum of:
        0.0341562 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
          0.0341562 = score(doc=5001,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.2708308 = fieldWeight in 5001, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: A study was done to test the effectiveness of retrieval using title word searching. It was based on actual search profiles used in the Mechanized Information Center at Ohio State University, in order to replicate actual searching conditions as closely as possible. Fewer than 50% of the relevant titles were retrieved by keywords in titles. The low rate of retrieval can be attributed to three sources: the titles themselves, user and information specialist ignorance of the subject vocabulary in use, and general language problems. Across fields it was found that the social sciences had the best retrieval rate, with science having the next best, and arts and humanities the lowest. Ways to enhance and supplement keyword in title searching on the computer and in printed indexes are discussed.
Date: 14. 3.1996 13:22:21

Kim, P.K.: ¬An automatic indexing of compound words based on mutual information for Korean text retrieval (1995) 0.03

0.027728135 = product of:
  0.07394169 = sum of:
    0.019619694 = weight(_text_:information in 620) [ClassicSimilarity], result of:
      0.019619694 = score(doc=620,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.3103276 = fieldWeight in 620, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=620)
    0.041192 = weight(_text_:retrieval in 620) [ClassicSimilarity], result of:
      0.041192 = score(doc=620,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.37811437 = fieldWeight in 620, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=620)
    0.013130001 = product of:
      0.03939 = sum of:
        0.03939 = weight(_text_:29 in 620) [ClassicSimilarity], result of:
          0.03939 = score(doc=620,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.31092256 = fieldWeight in 620, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=620)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: Presents an automatic indexing technique for compound words suitable for an agglutinative language, specifically Korean. Discusses some construction conditions for compound words and the rules for decomposing compound words to enhance the exhaustivity of indexing, demonstrating that this system, mutual information, enhances both the exhaustivity of indexing and the specificity of terms. Suggests that the construction conditions and rules for decomposition presented may be used in multilingual information retrieval systems to translate the indexing terms of the specific language into those of the language required
Source: Library and information science. 1995, no.34, S.29-38
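
The third clause in the explanation above is itself a product of a sum and coord(1/3), which suggests a nested query in which one of three sub-clauses matched. A minimal sketch of how the nested and outer coordination factors combine, using the leaf scores printed in the listing (the helper name `coord_sum` is hypothetical and not part of Lucene):

```python
def coord_sum(scores, matched, total):
    """Sum the matching clause scores and scale by the coord factor matched/total."""
    return sum(scores) * (matched / total)

# Inner clause: only the term "29" matched, out of 3 sub-clauses.
inner = coord_sum([0.03939], matched=1, total=3)

# Outer query: "information", "retrieval" and the inner clause matched, out of 8 clauses.
outer = coord_sum([0.019619694, 0.041192, inner], matched=3, total=8)

print(inner)  # compare with 0.013130001 in the listing
print(outer)  # compare with 0.027728135 in the listing
```

Small differences in the last digits are to be expected, since the listing appears to use single-precision arithmetic.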

Jardine, N.; Rijsbergen, C.J. van: ¬The use of hierarchic clustering in information retrieval (1971) 0.03

0.02753261 = product of:
  0.11013044 = sum of:
    0.027746437 = weight(_text_:information in 5170) [ClassicSimilarity], result of:
      0.027746437 = score(doc=5170,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.43886948 = fieldWeight in 5170, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=5170)
    0.082384 = weight(_text_:retrieval in 5170) [ClassicSimilarity], result of:
      0.082384 = score(doc=5170,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.75622874 = fieldWeight in 5170, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.125 = fieldNorm(doc=5170)
  0.25 = coord(2/8)

Source: Information storage and retrieval. 7(1971), S.217-240
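
The score tree for this record can be reproduced from the values it prints. The sketch below assumes the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and takes queryNorm and fieldNorm directly from the listing; the function names are hypothetical.

```python
import math

def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as used by classic TF-IDF similarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight

QUERY_NORM = 0.036014426  # copied from the listing
MAX_DOCS = 44218          # copied from the listing

# Values for doc=5170: both terms occur 4 times, fieldNorm is 0.125.
information = term_score(4.0, 20772, MAX_DOCS, QUERY_NORM, 0.125)
retrieval = term_score(4.0, 5836, MAX_DOCS, QUERY_NORM, 0.125)

# Two of the eight query clauses matched, hence coord(2/8).
total = (information + retrieval) * (2 / 8)
print(total)  # compare with 0.02753261 in the listing
```

The same functions apply to the other records by substituting their freq, docFreq and fieldNorm values; the last digits may differ slightly from the listing because of floating-point precision.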

Sparck Jones, K.; Jackson, D.M.: ¬The use of automatically obtained keyword classification for information retrieval (1970) 0.03

0.02753261 = product of:
  0.11013044 = sum of:
    0.027746437 = weight(_text_:information in 5177) [ClassicSimilarity], result of:
      0.027746437 = score(doc=5177,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.43886948 = fieldWeight in 5177, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=5177)
    0.082384 = weight(_text_:retrieval in 5177) [ClassicSimilarity], result of:
      0.082384 = score(doc=5177,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.75622874 = fieldWeight in 5177, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.125 = fieldNorm(doc=5177)
  0.25 = coord(2/8)

Source: Information storage and retrieval. 5(1970), S.175-201

Kantor, P.B.; Voorhees, E.: Information retrieval with scanned texts (2000) 0.03

0.02753261 = product of:
  0.11013044 = sum of:
    0.027746437 = weight(_text_:information in 3901) [ClassicSimilarity], result of:
      0.027746437 = score(doc=3901,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.43886948 = fieldWeight in 3901, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=3901)
    0.082384 = weight(_text_:retrieval in 3901) [ClassicSimilarity], result of:
      0.082384 = score(doc=3901,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.75622874 = fieldWeight in 3901, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.125 = fieldNorm(doc=3901)
  0.25 = coord(2/8)

Source: Information retrieval. 2(2000), S.165-176

Salton, G.; Allan, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine readable texts (1994) 0.02

0.024406403 = product of:
  0.06508374 = sum of:
    0.012262309 = weight(_text_:information in 1949) [ClassicSimilarity], result of:
      0.012262309 = score(doc=1949,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.19395474 = fieldWeight in 1949, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=1949)
    0.036408927 = weight(_text_:retrieval in 1949) [ClassicSimilarity], result of:
      0.036408927 = score(doc=1949,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.33420905 = fieldWeight in 1949, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=1949)
    0.016412502 = product of:
      0.049237505 = sum of:
        0.049237505 = weight(_text_:29 in 1949) [ClassicSimilarity], result of:
          0.049237505 = score(doc=1949,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.38865322 = fieldWeight in 1949, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.078125 = fieldNorm(doc=1949)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Date: 16. 8.1998 12:30:29
Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.478-483.

RIAO 91 : Computer aided information retrieval. Conference, Barcelona, 2.-4.5.1991 (1991) 0.02

0.024335619 = product of:
  0.097342476 = sum of:
    0.024524618 = weight(_text_:information in 4651) [ClassicSimilarity], result of:
      0.024524618 = score(doc=4651,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.38790947 = fieldWeight in 4651, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.15625 = fieldNorm(doc=4651)
    0.072817855 = weight(_text_:retrieval in 4651) [ClassicSimilarity], result of:
      0.072817855 = score(doc=4651,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.6684181 = fieldWeight in 4651, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.15625 = fieldNorm(doc=4651)
  0.25 = coord(2/8)

Sparck Jones, K.: Automatic keyword classification for information retrieval (1971) 0.02

0.024335619 = product of:
  0.097342476 = sum of:
    0.024524618 = weight(_text_:information in 5176) [ClassicSimilarity], result of:
      0.024524618 = score(doc=5176,freq=2.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.38790947 = fieldWeight in 5176, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.15625 = fieldNorm(doc=5176)
    0.072817855 = weight(_text_:retrieval in 5176) [ClassicSimilarity], result of:
      0.072817855 = score(doc=5176,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.6684181 = fieldWeight in 5176, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.15625 = fieldNorm(doc=5176)
  0.25 = coord(2/8)

Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.02

0.023399828 = product of:
  0.06239954 = sum of:
    0.014714771 = weight(_text_:information in 1660) [ClassicSimilarity], result of:
      0.014714771 = score(doc=1660,freq=8.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.23274569 = fieldWeight in 1660, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1660)
    0.03783727 = weight(_text_:retrieval in 1660) [ClassicSimilarity], result of:
      0.03783727 = score(doc=1660,freq=6.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.34732026 = fieldWeight in 1660, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1660)
    0.0098475 = product of:
      0.0295425 = sum of:
        0.0295425 = weight(_text_:29 in 1660) [ClassicSimilarity], result of:
          0.0295425 = score(doc=1660,freq=2.0), product of:
            0.1266875 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.036014426 = queryNorm
            0.23319192 = fieldWeight in 1660, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=1660)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: A corpus of 242 abstracts of Arabic documents on computer science and information systems using the Proceedings of the Saudi Arabian National Conferences as a source was put together. Reports on the design and building of an automatic information retrieval system from scratch to handle Arabic data. Both automatic and manual indexing techniques were implemented. Experiments using measures of recall and precision have demonstrated that automatic indexing is at least as effective as manual indexing and more effective in some cases. Automatic indexing is both cheaper and faster. Results suggest that a wider coverage of the literature can be achieved with less money and produce results as good as those of manual indexing. Compares the retrieval results using words as index terms versus stems and roots, and confirms the results obtained by Al-Kharashi and Abu-Salem with smaller corpora that root indexing is more effective than word indexing
Date: 29. 7.1998 17:40:01
Source: Journal of the American Society for Information Science. 48(1997) no.10, S.867-881

Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.02

0.0223378 = product of:
  0.059567466 = sum of:
    0.012139066 = weight(_text_:information in 530) [ClassicSimilarity], result of:
      0.012139066 = score(doc=530,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.1920054 = fieldWeight in 530, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=530)
    0.036043 = weight(_text_:retrieval in 530) [ClassicSimilarity], result of:
      0.036043 = score(doc=530,freq=4.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.33085006 = fieldWeight in 530, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=530)
    0.0113854 = product of:
      0.0341562 = sum of:
        0.0341562 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
          0.0341562 = score(doc=530,freq=2.0), product of:
            0.12611638 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.036014426 = queryNorm
            0.2708308 = fieldWeight in 530, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=530)
      0.33333334 = coord(1/3)
  0.375 = coord(3/8)

Abstract: Describes an application of Natural Language Processing (NLP) techniques, in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO), to the problem of document indexing by referring to a system which incorporates natural language processing techniques to determine the subject of the text of documents and to associate them with relevant semantic indexes. Describes briefly the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
Source: International forum on information and documentation. 22(1997) no.1, S.17-28

Dattola, R.T.: FIRST: Flexible information retrieval system for text (1979) 0.02

0.02150018 = product of:
  0.08600072 = sum of:
    0.027746437 = weight(_text_:information in 5172) [ClassicSimilarity], result of:
      0.027746437 = score(doc=5172,freq=4.0), product of:
        0.06322253 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.036014426 = queryNorm
        0.43886948 = fieldWeight in 5172, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=5172)
    0.058254283 = weight(_text_:retrieval in 5172) [ClassicSimilarity], result of:
      0.058254283 = score(doc=5172,freq=2.0), product of:
        0.10894058 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.036014426 = queryNorm
        0.5347345 = fieldWeight in 5172, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.125 = fieldNorm(doc=5172)
  0.25 = coord(2/8)

Source: Journal of the American Society for Information Science. 30(1979), S.9-14
