Search (1374 results, page 1 of 69)

Robertson, A.M.; Willett, P.: Applications of n-grams in textual information systems (1998) 0.11
```
0.111691535 = product of:
  0.5584577 = sum of:
    0.5584577 = weight(_text_:grams in 4715) [ClassicSimilarity], result of:
      0.5584577 = score(doc=4715,freq=8.0), product of:
        0.39198354 = queryWeight, product of:
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.04863741 = queryNorm
        1.4246967 = fieldWeight in 4715, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.0625 = fieldNorm(doc=4715)
  0.2 = coord(1/5)
```
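
The score breakdown above can be checked by hand. The following is a minimal sketch, assuming the formulas that the printed values are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function name `classic_similarity_score` and its parameter names are illustrative and not part of any library; queryNorm, fieldNorm and coord are copied from the listing rather than derived.

```python
import math

def classic_similarity_score(freq, doc_freq, max_docs, field_norm, query_norm, coord):
    """Recompute a single-term score from the values shown in the explain output."""
    tf = math.sqrt(freq)                             # tf(freq=8.0) -> 2.828427
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq=37, maxDocs=44218) -> 8.059301
    query_weight = idf * query_norm                  # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm             # fieldWeight = tf * idf * fieldNorm
    return coord * query_weight * field_weight       # final score = coord * queryWeight * fieldWeight

# Values taken from the explain output for doc 4715.
score = classic_similarity_score(
    freq=8.0, doc_freq=37, max_docs=44218,
    field_norm=0.0625, query_norm=0.04863741, coord=1 / 5,
)
print(round(score, 4))  # approximately 0.1117
```

Small differences in the last digits are expected, since the listing appears to use single-precision floats while Python uses double precision.
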
Abstract

Provides an introduction to the use of n-grams in textual information systems, where an n-gram is a string of n, usually adjacent, characters extracted from a section of continuous text. Applications that can be implemented efficiently and effectively using sets of n-grams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary look-up, text compression, and language identification.

Object

n-grams
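
As an illustration of the n-gram definition in the abstract above, here is a minimal sketch that extracts overlapping character n-grams from a string. It covers only the adjacent-character case; the function name `char_ngrams` is a hypothetical helper and is not taken from the cited paper.

```python
def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`, in order of occurrence."""
    # A string of length L has L - n + 1 n-grams; shorter strings yield none.
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(char_ngrams("retrieval", 3))
# ['ret', 'etr', 'tri', 'rie', 'iev', 'eva', 'val']
```
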

Huffman, S.: Acquaintance : language-independent document categorization by n-grams (1996) 0.10

```
0.09773009 = product of:
  0.48865044 = sum of:
    0.48865044 = weight(_text_:grams in 7530) [ClassicSimilarity], result of:
      0.48865044 = score(doc=7530,freq=2.0), product of:
        0.39198354 = queryWeight, product of:
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.04863741 = queryNorm
        1.2466096 = fieldWeight in 7530, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.109375 = fieldNorm(doc=7530)
  0.2 = coord(1/5)
```

Cohen, J.D.: Highlights: language- and domain-independent automatic indexing terms for abstracting (1995) 0.05
```
0.048865046 = product of:
  0.24432522 = sum of:
    0.24432522 = weight(_text_:grams in 1793) [ClassicSimilarity], result of:
      0.24432522 = score(doc=1793,freq=2.0), product of:
        0.39198354 = queryWeight, product of:
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.04863741 = queryNorm
        0.6233048 = fieldWeight in 1793, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1793)
  0.2 = coord(1/5)
```
Abstract

Presents a model of drawing index terms from text. The approach uses no stop list, stemmer, or other language- and domain-specific component, allowing operation in any language or domain with only trivial modification. The method uses n-gram counts, achieving a function similar to, but more general than, a stemmer. The generated index terms, called 'highlights', are suitable for identifying the topic for perusal and selection. An extension is also described and demonstrated which selects index terms to represent a subset of documents, distinguishing them from the corpus. Presents some experimental results, showing operation in English, Spanish, German, Georgian, Russian and Japanese.

Fox, K.L.; Frieder, O.; Knepper, M.M.; Snowberg, E.J.: SENTINEL: a multiple engine information retrieval and visualization system (1999) 0.05
```
0.048865046 = product of:
  0.24432522 = sum of:
    0.24432522 = weight(_text_:grams in 3547) [ClassicSimilarity], result of:
      0.24432522 = score(doc=3547,freq=2.0), product of:
        0.39198354 = queryWeight, product of:
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.04863741 = queryNorm
        0.6233048 = fieldWeight in 3547, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3547)
  0.2 = coord(1/5)
```
Abstract

We describe a prototype information retrieval system, SENTINEL, under development at Harris Corporation's Information Systems Division. SENTINEL is a fusion of multiple information retrieval technologies, integrating n-grams, a vector space model, and a neural network training rule. One of the primary advantages of SENTINEL is its 3-dimensional visualization capability that is based fully upon the mathematical representation of information with SENTINEL. The 3-dimensional visualization capability provides users with an intuitive understanding, with relevance/query refinement techniques that can be better utilized, resulting in higher retrieval precision.

Pearce, C.; Nicholas, C.: TELLTALE: Experiments in a dynamic hypertext environment for degraded and multilingual data (1996) 0.04
```
0.04188432 = product of:
  0.2094216 = sum of:
    0.2094216 = weight(_text_:grams in 4071) [ClassicSimilarity], result of:
      0.2094216 = score(doc=4071,freq=2.0), product of:
        0.39198354 = queryWeight, product of:
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.04863741 = queryNorm
        0.5342612 = fieldWeight in 4071, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.046875 = fieldNorm(doc=4071)
  0.2 = coord(1/5)
```
Abstract

Methods and tools for finding documents relevant to a user's needs in a document corpus can be found in the information retrieval, library science, and hypertext communities. Typically, these systems provide retrieval capabilities for fairly static corpora; their algorithms are dependent on the language for which they are written, e.g. English, and they do not perform well when presented with misspelled words or text that has been degraded by OCR techniques. In this article, we present experimentation results for the TELLTALE system. TELLTALE is a dynamic hypertext environment that provides full-text search from a hypertext-style user interface for text corpora that may be garbled by OCR or transmission errors, and that may contain languages other than English. TELLTALE uses several techniques based on n-grams (n-character sequences of text). With these results we show that the dynamic linkage mechanisms in TELLTALE are tolerant of garbles in up to 30% of the characters in the body of the texts.

Zerbst, H.-J.; Kaptein, O.: Gegenwärtiger Stand und Entwicklungstendenzen der Sacherschließung : Auswertung einer Umfrage an deutschen wissenschaftlichen und Öffentlichen Bibliotheken (1993) 0.03
```
0.030899638 = product of:
  0.15449819 = sum of:
    0.15449819 = weight(_text_:3a in 7394) [ClassicSimilarity], result of:
      0.15449819 = score(doc=7394,freq=2.0), product of:
        0.41234848 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.04863741 = queryNorm
        0.3746787 = fieldWeight in 7394, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=7394)
  0.2 = coord(1/5)
```
Abstract

Results of a survey from spring 1993. A. Academic libraries: The questionnaire was sent to the members of Section IV of the DBV. Questions: (1a) Which collection is being subject-indexed? (1b) How large is this collection? (1c) Is this collection subject-indexed completely or only selectively (individual subjects, textbooks, dissertations, etc.)? (1d) Since when have the current subject catalogues existed? (2) How is the collection currently subject-indexed? (3a) Which classification is used? (3b) Is there an alphabetical SyK register or access to the class descriptions? (3c) Are there supplementary keys for the aspects place, time, and form? (4) If you maintain an SWK: (a) according to which set of rules? (b) Is there a standardized vocabulary or a thesaurus (possibly only for certain subjects)? (5) In what form do the subject catalogues exist? (6) Does the library take part in cooperative subject indexing, e.g. in a network? [No: 79%] (7) Do you use externally supplied data for subject indexing? [Yes: 46%] (8) Which subject search options are available to users? (9) Are future changes to subject indexing planned? [Yes: 73%]. - B. Public libraries: The survey was addressed to all public libraries in Sections I, II and III of the DBV. Questions: (1) Which subject catalogues do you maintain? (2) Which classifications (Systematiken) underlie the SyK? [ASB: 242; KAB: 333; SfB: 4 (???); SSD: 11; Berliner: 18] (3) Do you maintain your own subject heading register for the SyK or for the classification (Systematik)? (4) Do you maintain the SWK according to ...? [RSWK: 132 (= approx. 60%); other rules: 93] (5) Since when have the current subject catalogues existed? (6) In what form do the subject catalogues exist? (7) To what extent is the collection indexed? (8) Which call numbers do you use? (9) Does the library take part in cooperative subject indexing, e.g. in a network? [No: 96%] (10) Do you use externally supplied data for subject indexing? [Yes: 70%] (11) Where do you obtain this externally supplied data? (12) Do you have an online catalogue system with an OPAC? [Yes: 78; No: 614] (13) Are future changes to subject indexing planned? [No: 458; Yes: 237]. SUMMARY for public libraries: "(i) The introduction of computer-based catalogues remains an issue in the 1990s; (ii) many libraries are starting to build SWKs, with the adoption of external data playing a decisive role; (iii) RSWK are increasingly applied, and use of the SWD for other rules as well has a standardizing effect; (iv) major movement in the 'classification market' is not to be expected in the foreseeable future; (v) for smaller libraries the card catalogue will remain the dominant catalogue form for the foreseeable future; (vi) the considerable backlog in the new federal states can only be cleared over a longer period." ??? SPECIAL LIBRARIES ???

Jacsó, P.: Searching for images by similarity online (1998) 0.03

```
0.029821565 = product of:
  0.14910783 = sum of:
    0.14910783 = weight(_text_:22 in 393) [ClassicSimilarity], result of:
      0.14910783 = score(doc=393,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.8754574 = fieldWeight in 393, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.125 = fieldNorm(doc=393)
  0.2 = coord(1/5)
```

Date: 29.11.2004 13:03:22
Source: Online. 22(1998) no.6, S.99-102

Lutz, H.: Back to business : was CompuServe Unternehmen bietet (1997) 0.03

```
0.029821565 = product of:
  0.14910783 = sum of:
    0.14910783 = weight(_text_:22 in 6569) [ClassicSimilarity], result of:
      0.14910783 = score(doc=6569,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.8754574 = fieldWeight in 6569, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.125 = fieldNorm(doc=6569)
  0.2 = coord(1/5)
```

Date: 22. 2.1997 19:50:29
Source: Cogito. 1997, H.1, S.22-23

Klauß, H.: SISIS : 10. Anwenderforum Berlin-Brandenburg (1999) 0.03

```
0.029821565 = product of:
  0.14910783 = sum of:
    0.14910783 = weight(_text_:22 in 463) [ClassicSimilarity], result of:
      0.14910783 = score(doc=463,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.8754574 = fieldWeight in 463, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.125 = fieldNorm(doc=463)
  0.2 = coord(1/5)
```

Date: 22. 2.1999 10:22:52

fwt: Wie das Gehirn Bilder 'liest' (1999) 0.03

```
0.029821565 = product of:
  0.14910783 = sum of:
    0.14910783 = weight(_text_:22 in 4042) [ClassicSimilarity], result of:
      0.14910783 = score(doc=4042,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.8754574 = fieldWeight in 4042, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.125 = fieldNorm(doc=4042)
  0.2 = coord(1/5)
```

Date: 22. 7.2000 19:01:22

Winterhoff-Spurk, P.: Auf dem Weg in die mediale Klassengesellschaft : Psychologische Beiträge zur Wissenskluft-Forschung (1999) 0.03

```
0.029821565 = product of:
  0.14910783 = sum of:
    0.14910783 = weight(_text_:22 in 4130) [ClassicSimilarity], result of:
      0.14910783 = score(doc=4130,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.8754574 = fieldWeight in 4130, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.125 = fieldNorm(doc=4130)
  0.2 = coord(1/5)
```

Date: 8.11.1999 19:22:39
Source: Medien praktisch. 1999, H.3, S.17-22

Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.03
```
0.027922884 = product of:
  0.13961442 = sum of:
    0.13961442 = weight(_text_:grams in 6068) [ClassicSimilarity], result of:
      0.13961442 = score(doc=6068,freq=2.0), product of:
        0.39198354 = queryWeight, product of:
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.04863741 = queryNorm
        0.35617417 = fieldWeight in 6068, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.059301 = idf(docFreq=37, maxDocs=44218)
          0.03125 = fieldNorm(doc=6068)
  0.2 = coord(1/5)
```
Abstract

Over the past 50 years, a variety of language-related capabilities has been developed in machine translation, information retrieval, speech recognition, text summarization, and so on. These applications rest upon a set of core techniques such as language modeling, information extraction, parsing, generation, and multimedia planning and integration; and they involve methods using statistics, rules, grammars, lexicons, ontologies, training techniques, and so on. It is a puzzling fact that although all of this work deals with language in some form or other, the major applications have each developed a separate research field. For example, there is no reason why speech recognition techniques involving n-grams and hidden Markov models could not have been used in machine translation 15 years earlier than they were, or why some of the lexical and semantic insights from the subarea called Computational Linguistics are still not used in information retrieval.

Veittes, M.: Electronic Book (1995) 0.03

```
0.026358789 = product of:
  0.13179395 = sum of:
    0.13179395 = weight(_text_:22 in 3204) [ClassicSimilarity], result of:
      0.13179395 = score(doc=3204,freq=2.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.77380234 = fieldWeight in 3204, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.15625 = fieldNorm(doc=3204)
  0.2 = coord(1/5)
```

Source: RRZK-Kompass. 1995, Nr.65, S.21-22

Serial cataloguing : modern perspectives and international developments (1992) 0.03

```
0.026358789 = product of:
  0.13179395 = sum of:
    0.13179395 = weight(_text_:22 in 3704) [ClassicSimilarity], result of:
      0.13179395 = score(doc=3704,freq=2.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.77380234 = fieldWeight in 3704, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.15625 = fieldNorm(doc=3704)
  0.2 = coord(1/5)
```

Source: Serials librarian. 22(1992), nos.3/4

Smith, G.: Newspapers on CD-ROM (1992) 0.03

```
0.026358789 = product of:
  0.13179395 = sum of:
    0.13179395 = weight(_text_:22 in 6396) [ClassicSimilarity], result of:
      0.13179395 = score(doc=6396,freq=2.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.77380234 = fieldWeight in 6396, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.15625 = fieldNorm(doc=6396)
  0.2 = coord(1/5)
```

Source: Serials. 5(1992) no.3, S.17-22

Nanfito, N.: The indexed Web : engineering tools for cataloging, storing and delivering Web based documents (1999) 0.03

```
0.02609387 = product of:
  0.13046935 = sum of:
    0.13046935 = weight(_text_:22 in 8727) [ClassicSimilarity], result of:
      0.13046935 = score(doc=8727,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.76602525 = fieldWeight in 8727, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.109375 = fieldNorm(doc=8727)
  0.2 = coord(1/5)
```

Date: 5. 8.2001 12:22:47
Source: Information outlook. 3(1999) no.2, S.18-22

Verkommt das Internet zur reinen Glotze? : Fertige Informationspakete gegen individuelle Suche: das neue 'Push-Prinzip' im Internet ist heftig umstritten (1997) 0.03

```
0.02609387 = product of:
  0.13046935 = sum of:
    0.13046935 = weight(_text_:22 in 7131) [ClassicSimilarity], result of:
      0.13046935 = score(doc=7131,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.76602525 = fieldWeight in 7131, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.109375 = fieldNorm(doc=7131)
  0.2 = coord(1/5)
```

Date: 18. 1.1997 12:15:22
Source: Kölner Stadtanzeiger. Nr.69 vom 22/23.3.1997, S.MZ7

Filk, C.: Online, Internet und Digitalkultur : eine Bibliographie zur jüngsten Diskussion um die Informationsgesellschaft (1996) 0.03

```
0.02609387 = product of:
  0.13046935 = sum of:
    0.13046935 = weight(_text_:22 in 44) [ClassicSimilarity], result of:
      0.13046935 = score(doc=44,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.76602525 = fieldWeight in 44, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.109375 = fieldNorm(doc=44)
  0.2 = coord(1/5)
```

Date: 5. 9.1997 19:22:27
Source: Rundfunk und Geschichte. 22(1996) H.2/3, S.184-193

Advances in librarianship (1998) 0.03

```
0.02609387 = product of:
  0.13046935 = sum of:
    0.13046935 = weight(_text_:22 in 4698) [ClassicSimilarity], result of:
      0.13046935 = score(doc=4698,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.76602525 = fieldWeight in 4698, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.109375 = fieldNorm(doc=4698)
  0.2 = coord(1/5)
```

Issue: Vol.22.
Signature: 78 BAHH 1089-22

Shatz, C.J.; Selkoe, D.J.; Freeman, W.J.: Gehirn und Bewußtsein (1994) 0.02

```
0.022366175 = product of:
  0.111830875 = sum of:
    0.111830875 = weight(_text_:22 in 7578) [ClassicSimilarity], result of:
      0.111830875 = score(doc=7578,freq=4.0), product of:
        0.17031991 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.04863741 = queryNorm
        0.6565931 = fieldWeight in 7578, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=7578)
  0.2 = coord(1/5)
```

Date: 22. 7.2000 18:22:14
