Search (104 results, page 2 of 6)

  • theme_ss:"Computerlinguistik"
  1. Baayen, R.H.; Lieber, R.: Word frequency distributions and lexical semantics (1997) 0.02
    Date
    28. 2.1999 10:48:22
  2. ¬Der Student aus dem Computer (2023) 0.02
    Date
    27. 1.2023 16:22:55
  3. Bellaachia, A.; Amor-Tijani, G.: Proper nouns in English-Arabic cross language information retrieval (2008) 0.02
    Abstract
    Out-of-vocabulary words, mostly proper nouns and technical terms, are one main source of performance degradation in Cross Language Information Retrieval (CLIR) systems. These are words not found in the dictionary. Bilingual dictionaries in general do not cover most proper nouns, which are usually primary keys in the query. As they are spelling variants of each other in most languages, using an approximate string matching technique against the target database index is the common approach taken to find the target language correspondents of the original query key. The n-gram technique has proved to be the most effective of these string matching techniques. An issue arises when the languages involved have different alphabets; transliteration is then applied, based on phonetic similarities between the languages. In this study, transliteration and the n-gram technique are combined to generate possible transliterations in an English-Arabic CLIR system. We refer to this technique as Transliteration N-Gram (TNG). We further enhance TNG by applying part-of-speech disambiguation on the set of transliterations, so that words with a similar spelling but a different meaning are excluded. Experimental results show that TNG gives promising results, and enhanced TNG further improves performance.
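    The candidate-generation-plus-matching idea described in this abstract can be sketched in a few lines of Python. This is a toy illustration, not the authors' TNG implementation: the phonetic mapping table, the sample words, and all names are invented.

      from itertools import product

      # Toy romanization table: each source letter may surface as several
      # Latin spellings (hypothetical values, not the paper's mapping).
      PHONETIC = {"o": ["o", "u"], "a": ["a", "e"]}

      def transliterations(word):
          # Enumerate all candidate spellings under the phonetic mapping.
          options = [PHONETIC.get(ch, [ch]) for ch in word]
          return ["".join(p) for p in product(*options)]

      def ngrams(word, n=2):
          # Set of character n-grams of a word.
          return {word[i:i + n] for i in range(len(word) - n + 1)}

      def best_match(source, index_terms):
          # Rank target-language index terms against every candidate by the
          # Dice coefficient over character bigrams; return the best pair.
          best = (0.0, "", "")
          for cand in transliterations(source):
              for term in index_terms:
                  ga, gb = ngrams(cand), ngrams(term)
                  score = 2 * len(ga & gb) / (len(ga) + len(gb))
                  best = max(best, (score, cand, term))
          return best

      print(best_match("mohamad", ["muhammad", "mohammed", "moderate"]))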
  4. Pepper, S.; Arnaud, P.J.L.: Absolutely PHAB : toward a general model of associative relations (2020) 0.02
    Abstract
    There have been many attempts at classifying the semantic modification relations (R) of N + N compounds, but this work has not led to the acceptance of a definitive scheme, so devising a reusable classification is a worthwhile aim. The scope of this undertaking is extended to other binominal lexemes, i.e. units that contain two thing-morphemes without explicitly stating R, such as prepositional units, N + relational adjective units, etc. The 25-relation taxonomy of Bourque (2014) was tested against over 15,000 binominal lexemes from 106 languages and extended to a 29-relation scheme ("Bourque2") through the introduction of two new reversible relations. Bourque2 is then mapped onto Hatcher's (1960) four-relation scheme (extended by the addition of a fifth relation, similarity, as "Hatcher2"). This results in a two-tier system usable at different degrees of granularity. On account of its semantic proximity to compounding, metonymy is then taken into account, following Janda's (2011) suggestion that it plays a role in word formation; Peirsman and Geeraerts' (2006) inventory of 23 metonymic patterns is mapped onto Bourque2, confirming the identity of metonymic and binominal modification relations. Finally, Blank's (2003) and Koch's (2001) work on lexical semantics justifies the addition to the scheme of a third, superordinate level comprising the three Aristotelian principles of similarity, contiguity and contrast.
  5. Experimentelles und praktisches Information Retrieval : Festschrift für Gerhard Lustig (1992) 0.02
    Content
    Contains the contributions: SALTON, G.: Effective text understanding in information retrieval; KRAUSE, J.: Intelligentes Information Retrieval; FUHR, N.: Konzepte zur Gestaltung zukünftiger Information-Retrieval-Systeme; HÜTHER, H.: Überlegungen zu einem mathematischen Modell für die Type-Token-, die Grundform-Token und die Grundform-Type-Relation; KNORZ, G.: Automatische Generierung inferentieller Links in und zwischen Hyperdokumenten; KONRAD, E.: Zur Effektivitätsbewertung von Information-Retrieval-Systemen; HENRICHS, N.: Retrievalunterstützung durch automatisch generierte Wortfelder; LÜCK, W., W. RITTBERGER u. M. SCHWANTNER: Der Einsatz des Automatischen Indexierungs- und Retrieval-Systems (AIR) im Fachinformationszentrum Karlsruhe; REIMER, U.: Verfahren der Automatischen Indexierung. Benötigtes Vorwissen und Ansätze zu seiner automatischen Akquisition: Ein Überblick; ENDRES-NIGGEMEYER, B.: Dokumentrepräsentation: Ein individuelles prozedurales Modell des Abstracting, des Indexierens und Klassifizierens; SEELBACH, D.: Zur Entwicklung von zwei- und mehrsprachigen lexikalischen Datenbanken und Terminologiedatenbanken; ZIMMERMANN, H.: Der Einfluß der Sprachbarrieren in Europa und Möglichkeiten zu ihrer Minderung; LENDERS, W.: Wörter zwischen Welt und Wissen; PANYR, J.: Frames, Thesauri und automatische Klassifikation (Clusteranalyse); HAHN, U.: Forschungsstrategien und Erkenntnisinteressen in der anwendungsorientierten automatischen Sprachverarbeitung. Überlegungen zu einer ingenieurorientierten Computerlinguistik; KUHLEN, R.: Hypertext und Information Retrieval - mehr als Browsing und Suche.
  6. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I.: Attention Is all you need (2017) 0.02
  7. Frakes, W.B.: Stemming algorithms (1992) 0.02
    Abstract
    Describes stemming algorithms - programs that relate morphologically similar indexing and search terms. Stemming is used to improve retrieval effectiveness and to reduce the size of indexing files. Several approaches to stemming are described - table lookup, affix removal, successor variety, and n-gram. Empirical studies of stemming are summarized. The Porter stemmer is described in detail, and a full implementation in C is presented.
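    As a rough illustration of the table-lookup/affix-removal family surveyed here, a minimal suffix-stripping stemmer can be written in a few lines of Python. The suffix list and the minimum-stem heuristic are invented for the example; this is not the Porter rule set.

      # Suffixes ordered longest-first so the longest applicable one wins.
      SUFFIXES = ["ization", "ational", "ations", "ingly",
                  "ings", "ies", "ing", "ed", "es", "s"]

      def stem(word, min_stem=3):
          # Strip the longest matching suffix, keeping at least min_stem
          # characters so short words are left untouched.
          for suffix in SUFFIXES:
              if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
                  return word[:-len(suffix)]
          return word

      for w in ["indexing", "indexed", "indexes", "connections"]:
          print(w, "->", stem(w))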
  8. Vazov, N.: Identification des différentes structures temporelles dans des textes et leurs rôles dans le raisonnement temporel (1999) 0.02
  9. Ferret, O.; Grau, B.; Masson, N.: Utilisation d'un réseau de cooccurrences lexicales pour améliorer une analyse thématique fondée sur la distribution des mots (1999) 0.02
  10. Kummer, N.: Indexierungstechniken für das japanische Retrieval (2006) 0.02
  11. Koppel, M.; Akiva, N.; Dagan, I.: Feature instability as a criterion for selecting potential style markers (2006) 0.02
  12. Bubenhofer, N.: Einführung in die Korpuslinguistik : Praktische Grundlagen und Werkzeuge (2006) 0.02
  13. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    Date
    15. 3.2000 10:22:37
  14. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02
    Date
    1. 3.2013 14:56:22
  15. Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität (2000) 0.02
    Source
    c't. 2000, H.22, S.230-231
  16. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.02
    Abstract
    This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century, when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
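    The query-likelihood approach sketched in this abstract reduces to a few lines of code. The following is a toy Python example with Jelinek-Mercer smoothing; the documents, query, and smoothing weight are invented for illustration.

      import math
      from collections import Counter

      def lm_score(query, doc, collection, coll_len, lam=0.5):
          # log P(query | document LM), linearly interpolated with the
          # collection model so unseen query terms keep nonzero probability.
          tf = Counter(doc)
          score = 0.0
          for term in query:
              p = lam * tf[term] / len(doc) + (1 - lam) * collection[term] / coll_len
              score += math.log(p) if p > 0 else float("-inf")
          return score

      docs = [["statistical", "language", "model"],
              ["markov", "letter", "sequences"]]
      collection = Counter(t for d in docs for t in d)
      coll_len = sum(collection.values())
      for d in docs:
          print(d, lm_score(["language", "model"], d, collection, coll_len))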
  17. Doval, Y.; Gómez-Rodríguez, C.: Comparing neural- and N-gram-based language models for word segmentation (2019) 0.02
    Abstract
    Word segmentation is the task of inserting or deleting word boundary characters in order to separate character sequences that correspond to words in some language. In this article we propose an approach based on a beam search algorithm and a language model working at the byte/character level, the latter component implemented either as an n-gram model or a recurrent neural network. The resulting system analyzes the text input with no word boundaries one token at a time, which can be a character or a byte, and uses the information gathered by the language model to determine if a boundary must be placed in the current position or not. Our aim is to use this system in a preprocessing step for a microtext normalization system. This means that it needs to cope effectively with the data sparsity present in this kind of text. We also strove to surpass the performance of two readily available word segmentation systems: the well-known and accessible Word Breaker by Microsoft, and the Python module WordSegment by Grant Jenks. The results show that we have met our objectives, and we hope to continue to improve both the precision and the efficiency of our system in the future.
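    The boundary-insertion search described here can be approximated compactly. The sketch below replaces the paper's character-level n-gram/recurrent models with a toy unigram word lexicon (all probabilities invented), but the beam search over boundary positions follows the same idea.

      # Toy unigram lexicon with log probabilities (invented numbers).
      WORD_LOGP = {"the": -2.0, "me": -3.0, "he": -3.5, "theme": -4.0, "them": -4.5}
      UNKNOWN = -12.0  # penalty for any segment not in the lexicon

      def segment(text, beam_width=4):
          # Each hypothesis: (log prob, chars consumed, words so far).
          beams = [(0.0, 0, [])]
          for _ in range(len(text)):  # each step consumes at least one char
              expanded = []
              for logp, pos, words in beams:
                  if pos == len(text):
                      expanded.append((logp, pos, words))
                      continue
                  for end in range(pos + 1, len(text) + 1):
                      piece = text[pos:end]
                      expanded.append((logp + WORD_LOGP.get(piece, UNKNOWN),
                                       end, words + [piece]))
              beams = sorted(expanded, key=lambda b: b[0], reverse=True)[:beam_width]
          return max(beams, key=lambda b: b[0])[2]

      print(segment("themethe"))  # -> ['theme', 'the']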
  18. SIGIR'92 : Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1992) 0.02
    Content
    HARMAN, D.: Relevance feedback revisited; AALBERSBERG, I.J.: Incremental relevance feedback; TAGUE-SUTCLIFFE, J.: Measuring the informativeness of a retrieval process; LEWIS, D.D.: An evaluation of phrasal and clustered representations on a text categorization task; BLOSSEVILLE, M.J., G. HÉBRAIL, M.G. MONTEIL u. N. PÉNOT: Automatic document classification: natural language processing, statistical analysis, and expert system techniques used together; MASAND, B., G. LINOFF u. D. WALTZ: Classifying news stories using memory based reasoning; KEEN, E.M.: Term position ranking: some new test results; CROUCH, C.J. u. B. YANG: Experiments in automatic statistical thesaurus construction; GREFENSTETTE, G.: Use of syntactic context to produce term association lists for text retrieval; ANICK, P.G. u. R.A. FLYNN: Versioning of full-text information retrieval system; BURKOWSKI, F.J.: Retrieval activities in a database consisting of heterogeneous collections; DEERWESTER, S.C., K. WACLENA u. M. LaMAR: A textual object management system; NIE, J.-Y.: Towards a probabilistic modal logic for semantic-based information retrieval; WANG, A.W., S.K.M. WONG u. Y.Y. YAO: An analysis of vector space models based on computational geometry; BARTELL, B.T., G.W. COTTRELL u. R.K. BELEW: Latent semantic indexing is an optimal special case of multidimensional scaling; GLAVITSCH, U. u. P. SCHÄUBLE: A system for retrieving speech documents; MARGULIS, E.L.: N-Poisson document modelling; HESS, M.: An incrementally extensible document retrieval system based on linguistic and logical principles; COOPER, W.S., F.C. GEY u. D.P. DABNEY: Probabilistic retrieval based on staged logistic regression; FUHR, N.: Integration of probabilistic fact and text retrieval; CROFT, B., L.A. SMITH u. H. TURTLE: A loosely-coupled integration of a text retrieval system and an object-oriented database system; DUMAIS, S.T. u. J. NIELSEN: Automating the assignment of submitted manuscripts to reviewers; GOST, M.A. u. M. MASOTTI: Design of an OPAC database to permit different subject searching accesses; ROBERTSON, A.M. u. P. WILLETT: Searching for historical word forms in a database of 17th century English text using spelling correction methods; FOX, E.A., Q.F. CHEN u. L.S. HEATH: A faster algorithm for constructing minimal perfect hash functions; MOFFAT, A. u. J. ZOBEL: Parameterised compression for sparse bitmaps; GRANDI, F., P. TIBERIO u. P. ZEZULA: Frame-sliced partitioned parallel signature files; ALLEN, B.: Cognitive differences in end user searching of a CD-ROM index; SONNENWALD, D.H.: Developing a theory to guide the process of designing information retrieval systems; CUTTING, D.R., J.O. PEDERSEN, D. KARGER u. J.W. TUKEY: Scatter/Gather: a cluster-based approach to browsing large document collections; CHALMERS, M. u. P. CHITSON: Bead: explorations in information visualization; WILLIAMSON, C. u. B. SHNEIDERMAN: The dynamic HomeFinder: evaluating dynamic queries in a real-estate information exploration system
    Editor
    Belkin, N.; Ingwersen, P.; Pejtersen, A.M.
  19. Ekmekcioglu, F.C.; Lynch, M.F.; Willett, P.: Development and evaluation of conflation techniques for the implementation of a document retrieval system for Turkish text databases (1995) 0.02
    Abstract
    Considers language processing techniques necessary for the implementation of a document retrieval system for Turkish text databases. Introduces the main characteristics of the Turkish language. Discusses the development of a stopword list and the evaluation of a stemming algorithm that takes account of the language's morphological structure. A two-level description of Turkish morphology developed at Bilkent University, Ankara, is incorporated into a morphological parser, PC-KIMMO, to carry out stemming in Turkish databases. Describes the evaluation of string similarity measures - n-gram matching techniques - for Turkish. Reports experiments on 6 different Turkish text corpora.
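    For illustration, the kind of n-gram conflation evaluated here can be mimicked with a greedy clustering over bigram similarity. The threshold and the sample words below are invented; this is a sketch of the general technique, not the evaluated system.

      def bigrams(word):
          return {word[i:i + 2] for i in range(len(word) - 1)}

      def similarity(a, b):
          # Dice coefficient over character bigram sets.
          ga, gb = bigrams(a), bigrams(b)
          return 2 * len(ga & gb) / (len(ga) + len(gb)) if ga and gb else 0.0

      def conflate(terms, threshold=0.6):
          # Greedy single pass: a term joins the first class whose
          # representative it resembles closely enough, else starts a new one.
          classes = []
          for term in terms:
              for cls in classes:
                  if similarity(term, cls[0]) >= threshold:
                      cls.append(term)
                      break
              else:
                  classes.append([term])
          return classes

      # Agglutinative variants of Turkish "kitap" (book) and "gazete" (newspaper).
      print(conflate(["kitap", "kitaplar", "kitaplarda", "gazete", "gazeteler"]))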
  20. Melucci, M.; Orio, N.: Design, implementation, and evaluation of a methodology for automatic stemmer generation (2007) 0.02

Languages

  • e 78
  • d 22
  • f 2
  • m 1

Types

  • a 80
  • el 15
  • m 9
  • s 7
  • x 4
  • n 2
  • p 2
  • d 1
