Search (96 results, page 5 of 5)

  • theme_ss:"Computerlinguistik"
  • year_i:[1990 TO 2000}
  1. Kraaij, W.; Pohlmann, R.: Evaluation of a Dutch stemming algorithm (1995) 0.01
    0.00622365 = product of:
      0.01867095 = sum of:
        0.01867095 = weight(_text_:on in 5798) [ClassicSimilarity], result of:
          0.01867095 = score(doc=5798,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 5798, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5798)
      0.33333334 = coord(1/3)
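    The explanation tree above is Lucene ClassicSimilarity output, and the same arithmetic recurs in every entry below. A minimal Python sketch of how the figures combine, reproducing entry 1's numbers; queryNorm is copied from the tree, since it depends on the whole query rather than this one term:

      import math

      # ClassicSimilarity components, with the figures from entry 1
      doc_freq, max_docs = 13325, 44218
      freq = 2.0                      # occurrences of 'on' in the field
      field_norm = 0.0546875          # length normalization for the field
      query_norm = 0.04990557         # copied from the tree; query-dependent
      coord = 1.0 / 3.0               # 1 of 3 query clauses matched

      idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))   # ~2.199415
      tf = math.sqrt(freq)                                 # ~1.4142135

      query_weight = idf * query_norm                      # ~0.10976
      field_weight = tf * idf * field_norm                 # ~0.17010
      print(query_weight * field_weight * coord)           # ~0.00622365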
    
    Abstract
    A stemming algorithm can enhance the recall of a text retrieval system. Describes the development of a Dutch version of the Porter stemming algorithm. The stemmer was evaluated using a method drawn from Paice, based on a list of groups of morphologically related words: ideally, each group should be stemmed to a single common root. The result of applying the stemmer to these groups of words is used to calculate an understemming and an overstemming index. These parameters, together with the diversity of stem group categories that could be generated from the CELEX database, enabled a careful analysis of the effects of each stemming rule. The test suite is highly suited to qualitative comparison of different versions of stemmers.
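    The understemming and overstemming indices can be made concrete. A minimal sketch of the pairwise Paice-style computation, assuming the usual definitions (missed merge pairs over desired merge pairs, and wrong merge pairs over desired non-merge pairs); the word groups and the truncation "stemmer" are illustrative, not from the paper:

      from collections import Counter

      def paice_indices(groups, stem):
          # groups: lists of morphologically related words that should share a stem
          N = sum(len(g) for g in groups)
          stem_totals = Counter(stem(w) for g in groups for w in g)
          gdmt = gumt = gdnt = gwmt = 0.0
          for g in groups:
              n = len(g)
              gdmt += 0.5 * n * (n - 1)        # desired merges within the group
              gdnt += 0.5 * n * (N - n)        # desired non-merges across groups
              for s, u in Counter(stem(w) for w in g).items():
                  gumt += 0.5 * u * (n - u)                  # missed merges
                  gwmt += 0.5 * u * (stem_totals[s] - u)     # wrong merges
          ui = gumt / gdmt if gdmt else 0.0    # understemming index
          oi = gwmt / gdnt if gdnt else 0.0    # overstemming index
          return ui, oi

      groups = [["connect", "connected"], ["connive", "connives"]]
      print(paice_indices(groups, lambda w: w[:4]))   # -> (0.0, 1.0): pure overstemming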
  2. Mustafa el Hadi, W.; Jouis, C.: Natural language processing-based systems for terminological construction and their contribution to information retrieval (1996) 0.01
    0.00622365 = product of:
      0.01867095 = sum of:
        0.01867095 = weight(_text_:on in 6331) [ClassicSimilarity], result of:
          0.01867095 = score(doc=6331,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 6331, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6331)
      0.33333334 = coord(1/3)
    
    Source
    TKE'96: Terminology and knowledge engineering. Proceedings 4th International Congress on Terminology and Knowledge Engineering, 26.-28.8.1996, Wien. Ed.: C. Galinski u. K.-D. Schmitz
  3. Greengrass, M.: Conflation methods for searching databases of Latin text (1996) 0.01
    0.00622365 = product of:
      0.01867095 = sum of:
        0.01867095 = weight(_text_:on in 6987) [ClassicSimilarity], result of:
          0.01867095 = score(doc=6987,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 6987, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6987)
      0.33333334 = coord(1/3)
    
    Abstract
    Describes the results of a project to develop conflation tools for searching databases of Latin text. Reports on the results of a questionnaire sent to 64 users of Latin text retrieval systems. Describes a Latin stemming algorithm that uses a simple longest match with some recoding, but differs from most stemmers in its use of 2 separate suffix dictionaries for processing query and database words. Describes a retrieval system in which a user inputs the principal components of their search terms; these components are stemmed and the resulting stems are matched against the noun-based and verb-based stem dictionaries. Evaluates the system, describing its limitations, and describes a more complex system.
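    A minimal sketch of the longest-match stemming described above, with separate query-side and database-side suffix tables; the Latin suffix lists are tiny illustrative stand-ins for the paper's dictionaries, and the recoding step is omitted:

      QUERY_SUFFIXES = ["ibus", "orum", "arum", "ae", "is", "us", "um", "a", "i"]
      DB_SUFFIXES = QUERY_SUFFIXES + ["o", "em"]   # database-side table differs

      def longest_match_stem(word, suffixes, min_stem=3):
          # try the longest suffixes first; never strip below min_stem letters
          for suf in sorted(suffixes, key=len, reverse=True):
              if word.endswith(suf) and len(word) - len(suf) >= min_stem:
                  return word[:-len(suf)]
          return word

      print(longest_match_stem("rosarum", QUERY_SUFFIXES))   # -> "ros"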
  4. Brenner, E.H.: Beyond Boolean : new approaches in information retrieval; the quest for intuitive online search systems past, present & future (1995) 0.01
    0.00622365 = product of:
      0.01867095 = sum of:
        0.01867095 = weight(_text_:on in 2547) [ClassicSimilarity], result of:
          0.01867095 = score(doc=2547,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 2547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
      0.33333334 = coord(1/3)
    
    Abstract
    The challenge of effectively bringing specific, relevant information from the global sea of data to our fingertips has become an increasingly difficult one. Discusses how the online information industry, founded on Boolean search systems, may be evolving to take advantage of other methods, such as 'term weighting', 'relevance ranking' and 'query by example'.
  5. Kokol, P.; Podgorelec, V.; Zorman, M.; Kokol, T.; Njivar, T.: Computer and natural language texts : a comparison based on long-range correlations (1999) 0.01
    0.00622365 = product of:
      0.01867095 = sum of:
        0.01867095 = weight(_text_:on in 4299) [ClassicSimilarity], result of:
          0.01867095 = score(doc=4299,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 4299, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4299)
      0.33333334 = coord(1/3)
    
  6. Lee, Y.-H.; Evens, M.W.: Natural language interface for an expert system (1998) 0.01
    0.00622365 = product of:
      0.01867095 = sum of:
        0.01867095 = weight(_text_:on in 5108) [ClassicSimilarity], result of:
          0.01867095 = score(doc=5108,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 5108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5108)
      0.33333334 = coord(1/3)
    
    Abstract
    Presents a complete analysis of the underlying principles of natural language interfaces, from the screen manager to the parser/understander. The main focus is on the design and development of a subsystem for understanding natural language input in an expert system. Considers fast response time and user friendliness to be the most important design criteria. The screen manager provides an easy editing capability for users, and the spelling correction system can detect most spelling errors and correct them automatically, quickly and effectively. The Lexical Functional Grammar (LFG) parser and the understander are designed to handle most types of simple sentences, fragments, and ellipses.
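    As one hedged illustration of such a spelling-correction component (the lexicon, the edit-distance measure, and the threshold are assumptions, not the authors' design), a nearest-neighbour corrector:

      def edit_distance(a, b):
          # standard Wagner-Fischer dynamic programme
          prev = list(range(len(b) + 1))
          for i, ca in enumerate(a, 1):
              cur = [i]
              for j, cb in enumerate(b, 1):
                  cur.append(min(prev[j] + 1,                 # deletion
                                 cur[j - 1] + 1,              # insertion
                                 prev[j - 1] + (ca != cb)))   # substitution
              prev = cur
          return prev[-1]

      LEXICON = {"pressure", "valve", "temperature", "sensor"}   # illustrative

      def correct(token, max_dist=2):
          if token in LEXICON:
              return token
          best = min(LEXICON, key=lambda w: edit_distance(token, w))
          return best if edit_distance(token, best) <= max_dist else token

      print(correct("presure"))    # -> "pressure"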
  7. Rorvig, M.; Smith, M.M.; Uemura, A.: The N-gram hypothesis applied to matched sets of visualized Japanese-English technical documents (1999) 0.01
    0.00622365 = product of:
      0.01867095 = sum of:
        0.01867095 = weight(_text_:on in 6675) [ClassicSimilarity], result of:
          0.01867095 = score(doc=6675,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 6675, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6675)
      0.33333334 = coord(1/3)
    
    Abstract
    Shape Recovery Analysis (SHERA), a new visual analytical technique, is applied to the N-gram hypothesis on matched Japanese-English technical documents supplied by the National Center for Science Information Systems (NACSIS) in Japan. The results of the SHERA study reveal compaction in the translation of Japanese subject terms to English subject terms. Surprisingly, the bigram approach to the Japanese data yields a remarkable similarity to the matching visualized English texts.
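    The N-gram hypothesis rests on representing text as overlapping character sequences; a minimal bigram (N=2) extractor, which works identically on transliterated Japanese and English strings:

      def char_bigrams(text):
          # drop whitespace, then slide a two-character window over the string
          s = "".join(text.split())
          return [s[i:i + 2] for i in range(len(s) - 1)]

      print(char_bigrams("joho kensaku"))   # ['jo', 'oh', 'ho', 'ok', ...]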
  8. Semantik, Lexikographie und Computeranwendungen : Workshop ... (Bonn) : 1995.01.27-28 (1996) 0.01
    0.0056345966 = product of:
      0.01690379 = sum of:
        0.01690379 = product of:
          0.03380758 = sum of:
            0.03380758 = weight(_text_:22 in 190) [ClassicSimilarity], result of:
              0.03380758 = score(doc=190,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.19345059 = fieldWeight in 190, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=190)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    14. 4.2007 10:04:22
  9. Dorr, B.J.; Garman, J.; Weinberg, A.: From syntactic encodings to thematic roles : building lexical entries for interlingual MT (1994/95) 0.01
    0.0053345575 = product of:
      0.016003672 = sum of:
        0.016003672 = weight(_text_:on in 4074) [ClassicSimilarity], result of:
          0.016003672 = score(doc=4074,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.14580199 = fieldWeight in 4074, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=4074)
      0.33333334 = coord(1/3)
    
    Abstract
    Aims to construct large-scale lexicons for interlingual machine translation of English, Arabic, Korean, and Spanish. Describes techniques that predict salient linguistic features of a non-English word using the features of its English gloss in a bilingual dictionary. While not exact, owing to inexact glosses and language-to-language variations, these techniques can augment an existing dictionary with reasonable accuracy, thus saving significant time. Conducts 2 experiments that demonstrate the value of these techniques. The 1st tests the feasibility of building a database of thematic grids for over 6500 Arabic verbs based on a mapping between English glosses and the syntactic codes in Longman's Dictionary of Contemporary English (LDOCE). The 2nd tests the automatic classification of verbs into a richer semantic typology from which a more refined set of thematic grids is derived. While human intervention will always be necessary for the construction of a semantic classification from LDOCE, such intervention is significantly minimized as more knowledge about the syntax-semantics relation is introduced.
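    A hedged sketch of the gloss-based transfer idea: predict a foreign verb's thematic grid by looking up its English gloss in a grid table (derived, in the paper, from LDOCE codes). Both dictionaries below are hypothetical stand-ins for the paper's resources:

      # LDOCE-style grid table for English verbs and a bilingual dictionary;
      # both are tiny hypothetical stand-ins
      ENGLISH_GRIDS = {"give": ["agent", "theme", "goal"], "run": ["agent"]}
      BILINGUAL = {"a'taa": "give", "jaraa": "run"}    # invented gloss entries

      def predict_grid(foreign_verb):
          gloss = BILINGUAL.get(foreign_verb)
          # None signals that human intervention is still needed
          return ENGLISH_GRIDS.get(gloss)

      print(predict_grid("a'taa"))    # -> ['agent', 'theme', 'goal']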
  10. Mustafa el Hadi, W.: Automatic term recognition & extraction tools : examining the new interfaces and their effective communication role in LSP discourse (1998) 0.01
    0.0053345575 = product of:
      0.016003672 = sum of:
        0.016003672 = weight(_text_:on in 67) [ClassicSimilarity], result of:
          0.016003672 = score(doc=67,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.14580199 = fieldWeight in 67, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=67)
      0.33333334 = coord(1/3)
    
    Abstract
    In this paper we will discuss the possibility of reorienting NLP (Natural Language Processing) systems towards the extraction not only of terms and their semantic relations, but also towards a variety of other uses: the storage, accessing and retrieval of Language for Special Purposes (LSP) lexical combinations, and the provision of contexts and other information on terms through the integration of further interfaces to terminological databases, term-managing systems and existing NLP systems. The aim of making such interfaces available is to increase the efficiency of the systems and improve terminology-oriented text analysis. Since automatic term extraction is the backbone of many applications such as machine translation (MT), indexing, technical writing, thesaurus construction and knowledge representation, developments in this area will have a significant impact.
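    As a rough illustration of the term-extraction backbone mentioned above, a common pattern-based heuristic over part-of-speech tags; NLTK is an assumed off-the-shelf tagger here, not one of the systems discussed:

      import nltk   # assumes nltk plus its tokenizer and tagger data are installed

      def candidate_terms(text):
          # collect maximal adjective/noun runs as multi-word term candidates
          tagged = nltk.pos_tag(nltk.word_tokenize(text))
          terms, run = [], []
          for word, tag in tagged:
              if tag.startswith(("JJ", "NN")):
                  run.append(word)
              else:
                  if len(run) > 1:
                      terms.append(" ".join(run))
                  run = []
          if len(run) > 1:
              terms.append(" ".join(run))
          return terms

      print(candidate_terms("The parser performs automatic term extraction "
                            "for terminological databases."))
      # -> ['automatic term extraction', 'terminological databases']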
  11. Sikkel, K.: Parsing schemata : a framework for specification and analysis of parsing algorithms (1996) 0.01
    0.0053345575 = product of:
      0.016003672 = sum of:
        0.016003672 = weight(_text_:on in 685) [ClassicSimilarity], result of:
          0.016003672 = score(doc=685,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.14580199 = fieldWeight in 685, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=685)
      0.33333334 = coord(1/3)
    
    Abstract
    Parsing, the syntactic analysis of language, has been studied extensively in computer science and computational linguistics. Computer programs and natural languages share an underlying theory of formal languages and require efficient parsing algorithms. This introduction reviews the theory of parsing from a novel perspective: it provides a formalism that captures the essential traits of a parser, abstracting from fine detail and allowing a uniform description and comparison of a variety of parsers, including Earley, Tomita, LR, Left-Corner, and Head-Corner parsers. The emphasis is on context-free phrase structure grammar and how these parsers can be extended to unification formalisms. The book combines mathematical rigor with high readability and is suitable as a graduate course text.
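    For a flavour of the parser family the book analyses, a compact Earley recognizer; the items (head, body, dot, origin) are close in spirit to parsing-schema items, though the grammar encoding below is an illustrative choice:

      def earley_recognize(grammar, start, tokens):
          # grammar: nonterminal -> list of right-hand-side tuples
          n = len(tokens)
          chart = [set() for _ in range(n + 1)]
          for rhs in grammar[start]:
              chart[0].add((start, rhs, 0, 0))     # (head, body, dot, origin)
          for i in range(n + 1):
              changed = True
              while changed:                        # predict/complete to a fixed point
                  changed = False
                  for head, body, dot, origin in list(chart[i]):
                      if dot < len(body) and body[dot] in grammar:      # predict
                          for rhs in grammar[body[dot]]:
                              item = (body[dot], rhs, 0, i)
                              if item not in chart[i]:
                                  chart[i].add(item)
                                  changed = True
                      elif dot == len(body):                            # complete
                          for h, b, d, o in list(chart[origin]):
                              if d < len(b) and b[d] == head:
                                  item = (h, b, d + 1, o)
                                  if item not in chart[i]:
                                      chart[i].add(item)
                                      changed = True
              if i < n:                                                 # scan
                  for head, body, dot, origin in chart[i]:
                      if dot < len(body) and body[dot] == tokens[i]:
                          chart[i + 1].add((head, body, dot + 1, origin))
          return any(h == start and d == len(b) and o == 0
                     for h, b, d, o in chart[n])

      GRAMMAR = {"S": [("NP", "VP")], "NP": [("det", "n")], "VP": [("v", "NP")]}
      print(earley_recognize(GRAMMAR, "S", ["det", "n", "v", "det", "n"]))   # True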
  12. Chowdhury, A.; McCabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.01
    0.0053345575 = product of:
      0.016003672 = sum of:
        0.016003672 = weight(_text_:on in 1061) [ClassicSimilarity], result of:
          0.016003672 = score(doc=1061,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.14580199 = fieldWeight in 1061, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
      0.33333334 = coord(1/3)
    
    Abstract
    The object of information retrieval is to retrieve all relevant documents for a user query, and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of part-of-speech tagging to improve the index storage overhead and general speed of the system, with only a minimal reduction in precision/recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant parts of speech to index. We show that 90% of precision/recall is achieved with 40% of the document collection's terms. We also show that this is an improvement in overhead with only a 1% reduction in precision/recall.
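    A sketch of the indexing strategy evaluated here: tag the text and index only the parts of speech judged most useful (nouns and adjectives below; the paper determined the best choice experimentally, and NLTK stands in for whatever tagger was actually used):

      import nltk   # again assumes the standard nltk tokenizer/tagger data

      def index_fraction(text, keep=("NN", "JJ")):
          # vocabulary kept when only nouns/adjectives are indexed
          tagged = nltk.pos_tag(nltk.word_tokenize(text.lower()))
          vocab = {w for w, _ in tagged if w.isalpha()}
          kept = {w for w, t in tagged if t.startswith(keep) and w.isalpha()}
          return kept, len(kept) / len(vocab)

      text = "The council voted to raise local property taxes sharply."
      kept, frac = index_fraction(text)
      print(kept, f"{frac:.0%} of the vocabulary indexed")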
  13. Ferber, R.; Wettler, M.; Rapp, R.: An associative model of word selection in the generation of search queries (1995) 0.00
    0.0044454644 = product of:
      0.013336393 = sum of:
        0.013336393 = weight(_text_:on in 3177) [ClassicSimilarity], result of:
          0.013336393 = score(doc=3177,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.121501654 = fieldWeight in 3177, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3177)
      0.33333334 = coord(1/3)
    
    Abstract
    To generate a search query based on an end user request, a database searcher has to select appropriate search terms. These terms can either be taken from the request, or they can be added by the searcher. This selection process is simulated by an associative lexical net; the nodes of the net are the terms used in 94 records of written requests to a psychological information agency and the respective online searches. The weights connecting the nodes are calculated from the co-occurrences of these terms in the abstracts of the database PsycLit. To simulate the term selection process of a query, the nodes of all terms used in the written requests are activated, and 1 or more spreading activation cycles are performed. The result of the simulation is a ranking of the terms according to the activities of their nodes. Simulations for all 94 records show a low mean activity rank for the terms selected from the request; the mean activity rank for new terms added by the searcher is lower than the mean activity rank for those terms of the request that were not used in the query.
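    A minimal sketch of one spreading-activation cycle over such an associative net; the terms and edge weights below are invented for illustration, where the study derived them from co-occurrences in PsycLit abstracts:

      WEIGHTS = {                       # directed term-to-term association weights
          ("anxiety", "stress"): 0.6,
          ("anxiety", "therapy"): 0.3,
          ("children", "school"): 0.5,
          ("stress", "therapy"): 0.4,
      }

      def spread(activation, cycles=1):
          act = dict(activation)
          for _ in range(cycles):
              nxt = dict(act)
              for (src, dst), w in WEIGHTS.items():
                  nxt[dst] = nxt.get(dst, 0.0) + w * act.get(src, 0.0)
              act = nxt
          return sorted(act.items(), key=lambda kv: -kv[1])    # ranked terms

      # activate the terms that occur in the written request, then rank
      print(spread({"anxiety": 1.0, "children": 1.0}))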
  14. From information to knowledge : conceptual and content analysis by computer (1995) 0.00
    0.0044454644 = product of:
      0.013336393 = sum of:
        0.013336393 = weight(_text_:on in 5392) [ClassicSimilarity], result of:
          0.013336393 = score(doc=5392,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.121501654 = fieldWeight in 5392, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5392)
      0.33333334 = coord(1/3)
    
    Content
    SCHMIDT, K.M.: Concepts - content - meaning: an introduction; DUCHASTEL, J. et al.: The SACAO project: using computation toward textual data analysis; PAQUIN, L.-C. and L. DUPUY: An approach to expertise transfer: computer-assisted text analysis; HOGENRAAD, R., Y. BESTGEN and J.-L. NYSTEN: Terrorist rhetoric: texture and architecture; MOHLER, P.P.: On the interaction between reading and computing: an interpretative approach to content analysis; LANCASHIRE, I.: Computer tools for cognitive stylistics; MERGENTHALER, E.: An outline of knowledge based text analysis; NAMENWIRTH, J.Z.: Ideography in computer-aided content analysis; WEBER, R.P. and J.Z. NAMENWIRTH: Content-analytic indicators: a self-critique; McKINNON, A.: Optimizing the aberrant frequency word technique; ROSATI, R.: Factor analysis in classical archaeology: export patterns of Attic pottery trade; PETRILLO, P.S.: Old and new worlds: ancient coinage and modern technology; DARANYI, S., S. MARJAI et al.: Caryatids and the measurement of semiosis in architecture; ZARRI, G.P.: Intelligent information retrieval: an application in the field of historical biographical data; BOUCHARD, G., R. ROY et al.: Computers and genealogy: from family reconstitution to population reconstruction; DEMÉLAS-BOHY, M.-D. and M. RENAUD: Instability, networks and political parties: a political history expert system prototype; DARANYI, S., A. ABRANYI and G. KOVACS: Knowledge extraction from ethnopoetic texts by multivariate statistical methods; FRAUTSCHI, R.L.: Measures of narrative voice in French prose fiction applied to textual samples from the enlightenment to the twentieth century; DANNENBERG, R. et al.: A project in computer music: the musician's workbench
  15. Abu-Salem, H.; Al-Omari, M.; Evens, M.W.: Stemming methodologies over individual query words for an Arabic information retrieval system (1999) 0.00
    0.0044454644 = product of:
      0.013336393 = sum of:
        0.013336393 = weight(_text_:on in 3672) [ClassicSimilarity], result of:
          0.013336393 = score(doc=3672,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.121501654 = fieldWeight in 3672, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3672)
      0.33333334 = coord(1/3)
    
    Abstract
    Stemming is one of the most important factors that affect the performance of information retrieval systems. This article investigates how to improve the performance of an Arabic information retrieval system by imposing the retrieval method over individual words of a query depending on the importance of the WORD, the STEM, or the ROOT of the query terms in the database. This method, called Mixed Stemming, computes term importance using a weighting scheme that uses the Term Frequency (TF) and the Inverse Document Frequency (IDF), called TFxIDF. An extended version of the Arabic IRS system is designed, implemented, and evaluated to reduce the number of irrelevant documents retrieved. The results of the experiment suggest that the proposed method outperforms the Word index method using the TFxIDF weighting scheme. It also outperforms the Stem index method using the Binary weighting scheme but not the Stem index method using the TFxIDF weighting scheme, and it likewise outperforms the Root index method using the Binary weighting scheme but not the Root index method using the TFxIDF weighting scheme.
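    For illustration, the TFxIDF weight driving a per-word choice among the WORD, STEM, and ROOT indexes; the selection rule sketched here (search the highest-weighted form) and the Arabic forms and counts are assumptions, since the paper derives its method empirically:

      import math

      def tfxidf(tf, df, n_docs):
          # standard TFxIDF; df is the form's document frequency
          return tf * math.log(n_docs / (1.0 + df))

      def pick_index(word, stem, root, doc_freq, tf, n_docs):
          forms = {"WORD": word, "STEM": stem, "ROOT": root}
          weights = {k: tfxidf(tf, doc_freq.get(v, 0), n_docs)
                     for k, v in forms.items()}
          return max(weights, key=weights.get)     # hypothetical rule

      doc_freq = {"kataba": 40, "katab": 200, "ktb": 900}   # invented counts
      print(pick_index("kataba", "katab", "ktb", doc_freq, tf=3, n_docs=10000))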
  16. Hutchins, W.J.; Somers, H.L.: An introduction to machine translation (1992) 0.00
    0.0044454644 = product of:
      0.013336393 = sum of:
        0.013336393 = weight(_text_:on in 4512) [ClassicSimilarity], result of:
          0.013336393 = score(doc=4512,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.121501654 = fieldWeight in 4512, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4512)
      0.33333334 = coord(1/3)
    
    Abstract
    The translation of foreign language texts by computers was one of the first tasks that the pioneers of computing and artificial intelligence set themselves. Machine translation is again becoming an important field of research and development as the need for translations of technical and commercial documentation grows well beyond the capacity of the translation profession. This is the first textbook of machine translation, providing a full course on both the general characteristics of machine translation systems and the computational linguistic foundations of the field. The book assumes no previous knowledge of machine translation and provides the basic background in linguistics and computational linguistics, artificial intelligence, natural language processing and information science.

Languages

  • e 87
  • d 3
  • ru 3
  • chi 2
  • f 1

Types

  • a 81
  • s 8
  • m 7
  • el 3
  • d 1
  • r 1
