Search (492 results, page 2 of 25)

  • theme_ss:"Computerlinguistik" (active filter)
  1. Yang, C.C.; Li, K.W.: Automatic construction of English/Chinese parallel corpora (2003) 0.03
    0.025556937 = product of:
      0.07028157 = sum of:
        0.02793805 = weight(_text_:wide in 1683) [ClassicSimilarity], result of:
          0.02793805 = score(doc=1683,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.1958137 = fieldWeight in 1683, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.03125 = fieldNorm(doc=1683)
        0.015156886 = weight(_text_:web in 1683) [ClassicSimilarity], result of:
          0.015156886 = score(doc=1683,freq=2.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.14422815 = fieldWeight in 1683, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=1683)
        0.008771234 = weight(_text_:information in 1683) [ClassicSimilarity], result of:
          0.008771234 = score(doc=1683,freq=8.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.1551638 = fieldWeight in 1683, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1683)
        0.018415406 = weight(_text_:retrieval in 1683) [ClassicSimilarity], result of:
          0.018415406 = score(doc=1683,freq=4.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.18905719 = fieldWeight in 1683, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=1683)
      0.36363637 = coord(4/11)
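    The tree above is Lucene's standard "explain" output for ClassicSimilarity. As a reading aid, a minimal Python sketch (constants copied from the tree; the script itself is not part of the record) reproduces the displayed score up to floating-point rounding:

        import math

        QUERY_NORM = 0.032201413   # queryNorm, shared by every term of the query

        def term_weight(freq, idf, field_norm):
            # weight = queryWeight * fieldWeight, with
            #   queryWeight = idf * queryNorm
            #   fieldWeight = sqrt(freq) * idf * fieldNorm
            return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)

        terms = [  # (freq, idf, fieldNorm) for wide, web, information, retrieval
            (2.0, 4.4307585, 0.03125),
            (2.0, 3.2635105, 0.03125),
            (8.0, 1.7554779, 0.03125),
            (4.0, 3.024915, 0.03125),
        ]
        score = sum(term_weight(*t) for t in terms) * (4 / 11)   # coord(4/11)
        print(score)   # ~0.0255569, matching the 0.025556937 shown above

    Every other score tree on this page decomposes the same way.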
    
    Abstract
    As the demand for global information increases significantly, multilingual corpora have become a valuable linguistic resource for applications in cross-lingual information retrieval and natural language processing. Dictionaries are the most typical tools for crossing the boundaries that exist between different languages. However, general-purpose dictionaries are not sensitive to genre or domain, and it is impractical to manually construct tailored bilingual dictionaries or sophisticated multilingual thesauri for large applications. Corpus-based approaches, which do not share the limitations of dictionaries, provide a statistical translation model with which to cross the language boundary. Many domain-specific parallel or comparable corpora are employed in machine translation and cross-lingual information retrieval; most of these are corpora between Indo-European languages, such as English/French and English/Spanish, while Asian/Indo-European corpora, especially English/Chinese, are relatively sparse. The objective of the present research is to construct an English/Chinese parallel corpus automatically from the World Wide Web. In this paper, an alignment method based on dynamic programming is presented to identify one-to-one Chinese and English title pairs. The method includes alignment at the title, word and character levels. The longest common subsequence (LCS) is applied to find the most reliable Chinese translation of an English word. Because a word in one language may translate into two or more repeated words in another, the edit operation of deletion is used to resolve redundancy. A score function is then proposed to determine the optimal title pairs. Experiments were conducted to investigate the performance of the proposed method, using the daily press release articles of the Hong Kong SAR government as the test bed; the precision of the result is 0.998, while the recall is 0.806. Release and speech articles published by the Hongkong & Shanghai Banking Corporation Limited were also used to test the method, yielding a precision of 1.00 and a recall of 0.948.
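    As an illustration of the matching primitive the abstract names, here is a minimal dynamic-programming LCS sketch (function names are ours, not the authors'):

        def lcs_length(a: str, b: str) -> int:
            # Longest common subsequence by dynamic programming -- the
            # matching primitive used to score candidate translation pairs.
            dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
            for i, ca in enumerate(a, 1):
                for j, cb in enumerate(b, 1):
                    dp[i][j] = dp[i - 1][j - 1] + 1 if ca == cb \
                               else max(dp[i - 1][j], dp[i][j - 1])
            return dp[len(a)][len(b)]

        def lcs_score(a: str, b: str) -> float:
            # Length-normalized similarity in [0, 1] for ranking title pairs.
            return 2 * lcs_length(a, b) / (len(a) + len(b)) if a and b else 0.0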
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.8, S.730-742
  2. Li, Q.; Chen, Y.P.; Myaeng, S.-H.; Jin, Y.; Kang, B.-Y.: Concept unification of terms in different languages via web mining for Information Retrieval (2009) 0.02
    0.024870243 = product of:
      0.09119089 = sum of:
        0.037892215 = weight(_text_:web in 4215) [ClassicSimilarity], result of:
          0.037892215 = score(doc=4215,freq=8.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.36057037 = fieldWeight in 4215, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4215)
        0.013428154 = weight(_text_:information in 4215) [ClassicSimilarity], result of:
          0.013428154 = score(doc=4215,freq=12.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.23754507 = fieldWeight in 4215, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4215)
        0.039870523 = weight(_text_:retrieval in 4215) [ClassicSimilarity], result of:
          0.039870523 = score(doc=4215,freq=12.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.40932083 = fieldWeight in 4215, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4215)
      0.27272728 = coord(3/11)
    
    Abstract
    For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalents in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language and then unifies the extracted terms and their equivalents in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries that cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves the performance of both Mono-Lingual and Cross-Language Information Retrieval.
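    A toy sketch of the unification step, assuming a pre-mined equivalence table (the paper mines these pairs from the Web; the CONCEPT_ token and all names here are illustrative only):

        import re

        def unify_index_units(text, equivalences):
            # `equivalences` maps an English term to its native-language
            # equivalent (mined from the Web in the paper; supplied by hand
            # here). Both surface forms are rewritten to one shared token so
            # that indexing treats them as a single unit.
            for english, native in equivalences.items():
                token = "CONCEPT_" + english.replace(" ", "_").lower()
                text = re.sub(re.escape(english), token, text, flags=re.IGNORECASE)
                text = text.replace(native, token)
            return text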
    Source
    Information processing and management. 45(2009) no.2, S.246-262
  3. Rahmstorf, G.: Rückkehr von Ordnung in die Informationstechnik? (2000) 0.02
    0.024824534 = product of:
      0.13653493 = sum of:
        0.0065784254 = weight(_text_:information in 5504) [ClassicSimilarity], result of:
          0.0065784254 = score(doc=5504,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.116372846 = fieldWeight in 5504, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5504)
        0.1299565 = weight(_text_:kongress in 5504) [ClassicSimilarity], result of:
          0.1299565 = score(doc=5504,freq=4.0), product of:
            0.21127632 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.032201413 = queryNorm
            0.61510205 = fieldWeight in 5504, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.046875 = fieldNorm(doc=5504)
      0.18181819 = coord(2/11)
    
    Series
    (Gemeinsamer Kongress der Bundesvereinigung Deutscher Bibliotheksverbände e.V. (BDB) und der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI); Bd.1) (Tagungen der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V.; Bd.3)
    Source
    Information und Öffentlichkeit: 1. Gemeinsamer Kongress der Bundesvereinigung Deutscher Bibliotheksverbände e.V. (BDB) und der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI), Leipzig, 20.-23.3.2000. Zugleich 90. Deutscher Bibliothekartag, 52. Jahrestagung der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI). Hrsg.: G. Ruppelt u. H. Neißer
  4. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word family members (2006) 0.02
    0.023962976 = product of:
      0.08786424 = sum of:
        0.041907072 = weight(_text_:wide in 5896) [ClassicSimilarity], result of:
          0.041907072 = score(doc=5896,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.29372054 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
        0.039378747 = weight(_text_:web in 5896) [ClassicSimilarity], result of:
          0.039378747 = score(doc=5896,freq=6.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.37471575 = fieldWeight in 5896, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
        0.0065784254 = weight(_text_:information in 5896) [ClassicSimilarity], result of:
          0.0065784254 = score(doc=5896,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.116372846 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
      0.27272728 = coord(3/11)
    
    Abstract
    Word usage is of interest to linguists for its own sake as well as to social scientists and others who seek to track the spread of ideas, for example, in public debates over political decisions. The historical evolution of language can be analyzed with the tools of corpus linguistics through evolving corpora and the Web. But word usage statistics can only be gathered for known words. In this article, techniques are described and tested for identifying new words from the Web, focusing on the case when the words are related to a topic and have a hybrid form with a common sequence of letters. The results highlight the need to employ a combination of search techniques and show the wide potential of hybrid word family investigations in linguistics and social science.
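    A hedged sketch of the kind of word-family search the article describes (the stem, corpus, and length threshold are illustrative stand-ins, not the authors' procedure):

        import re
        from collections import Counter

        def hybrid_candidates(pages, stem="euro", min_len=6):
            # Collect words sharing a common letter sequence (a toy stand-in
            # for hybrid-word family identification) and rank them by
            # frequency across the crawled pages.
            pattern = re.compile(rf"\b\w*{re.escape(stem)}\w*\b", re.IGNORECASE)
            counts = Counter(
                m.lower()
                for page in pages
                for m in pattern.findall(page)
                if len(m) >= min_len and m.lower() != stem
            )
            return counts.most_common()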
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.10, S.1326-1337
  5. Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.02
    0.023931006 = product of:
      0.08774702 = sum of:
        0.039378747 = weight(_text_:web in 2342) [ClassicSimilarity], result of:
          0.039378747 = score(doc=2342,freq=6.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.37471575 = fieldWeight in 2342, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
        0.009303299 = weight(_text_:information in 2342) [ClassicSimilarity], result of:
          0.009303299 = score(doc=2342,freq=4.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.16457605 = fieldWeight in 2342, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
        0.039064977 = weight(_text_:retrieval in 2342) [ClassicSimilarity], result of:
          0.039064977 = score(doc=2342,freq=8.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.40105087 = fieldWeight in 2342, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
      0.27272728 = coord(3/11)
    
    Abstract
    Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. For English-German, machine translation was also utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that dictionary coverage had an effect on the results. On average, the results of query translation were better than in the traditional laboratory tests. Originality/value - This research shows that query translation on the web is beneficial, especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.
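    A minimal sketch of dictionary-based query translation of the kind tested in the paper, assuming a toy bilingual dictionary (all names are illustrative):

        def translate_query(query, bilingual_dict):
            # Every source-language word is replaced by all of its
            # target-language dictionary equivalents; out-of-vocabulary
            # words pass through untranslated (a known failure mode the
            # paper ties to dictionary coverage).
            target_words = []
            for word in query.lower().split():
                target_words.extend(bilingual_dict.get(word, [word]))
            return " ".join(target_words)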
  6. Galvez, C.; Moya-Anegón, F. de; Solana, V.H.: Term conflation methods in information retrieval : non-linguistic and linguistic approaches (2005) 0.02
    0.02150004 = product of:
      0.078833476 = sum of:
        0.041907072 = weight(_text_:wide in 4394) [ClassicSimilarity], result of:
          0.041907072 = score(doc=4394,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.29372054 = fieldWeight in 4394, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4394)
        0.009303299 = weight(_text_:information in 4394) [ClassicSimilarity], result of:
          0.009303299 = score(doc=4394,freq=4.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.16457605 = fieldWeight in 4394, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4394)
        0.02762311 = weight(_text_:retrieval in 4394) [ClassicSimilarity], result of:
          0.02762311 = score(doc=4394,freq=4.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.2835858 = fieldWeight in 4394, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4394)
      0.27272728 = coord(3/11)
    
    Abstract
    Purpose - To propose a categorization of the different conflation procedures into the two basic approaches, non-linguistic and linguistic techniques, and to justify the application of normalization methods within the framework of linguistic techniques. Design/methodology/approach - Presents a range of term conflation methods that can be used in information retrieval. The uniterm and multiterm variants can be considered equivalent units for the purposes of automatic indexing. Stemming algorithms, segmentation rules, association measures and clustering techniques are well-evaluated non-linguistic methods, and experiments with these techniques show a wide variety of results. Alternatively, lemmatisation and the use of syntactic pattern-matching, through equivalence relations represented in finite-state transducers (FST), are emerging methods for the recognition and standardization of terms. Findings - The survey attempts to point out the positive and negative effects of the linguistic approach and its potential as a term conflation method. Originality/value - Outlines the importance of FSTs for the normalization of term variants.
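    For illustration, a toy non-linguistic conflation step of the suffix-stripping kind the survey covers (real systems use Porter-style rule sets or FST lemmatization; this fragment and its suffix list are ours):

        SUFFIXES = ("ization", "ations", "ation", "ings", "ing", "ies", "es", "s")

        def naive_stem(token: str) -> str:
            # Strip the longest matching suffix, keeping a minimal stem
            # length; uniterm variants such as "indexing" and "indexes"
            # then conflate to the same form, "index".
            for suffix in SUFFIXES:
                if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                    return token[: -len(suffix)]
            return token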
  7. Artemenko, O.; Shramko, M.: Entwicklung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten (2005) 0.02
    0.021248309 = product of:
      0.058432847 = sum of:
        0.024445795 = weight(_text_:wide in 572) [ClassicSimilarity], result of:
          0.024445795 = score(doc=572,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.171337 = fieldWeight in 572, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.02734375 = fieldNorm(doc=572)
        0.01875569 = weight(_text_:web in 572) [ClassicSimilarity], result of:
          0.01875569 = score(doc=572,freq=4.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.17847323 = fieldWeight in 572, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=572)
        0.003837415 = weight(_text_:information in 572) [ClassicSimilarity], result of:
          0.003837415 = score(doc=572,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.06788416 = fieldWeight in 572, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02734375 = fieldNorm(doc=572)
        0.011393951 = weight(_text_:retrieval in 572) [ClassicSimilarity], result of:
          0.011393951 = score(doc=572,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.11697317 = fieldWeight in 572, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02734375 = fieldNorm(doc=572)
      0.36363637 = coord(4/11)
    
    Abstract
    With the spread of the Internet, the number of documents available on the World Wide Web keeps growing. Guaranteeing Internet users efficient access to the information they want is becoming a major challenge for the modern information society. A variety of tools is already in use to help users orient themselves in the growing flood of information. The enormous amount of unstructured and distributed information is, however, not the only difficulty to be overcome in developing such tools. The increasing multilinguality of Web content creates a need for language-identification software that identifies the language(s) of electronic documents for targeted further processing. Such language identifiers can, for example, be used effectively in multilingual information retrieval, since processes of automatic index construction such as stemming, stop-word extraction, etc. build on the results of language identification. This thesis presents the new system "LangIdent" for language identification of electronic text documents, intended primarily for teaching and research at the Universität Hildesheim. "LangIdent" contains a selection of common algorithms for monolingual language identification, which the user can select and configure interactively. In addition, a new algorithm was implemented in the system that makes it possible to identify the languages in which a multilingual document is written. The identification is not limited to a mere enumeration of the detected languages; rather, the text is split into monolingual passages, each labelled with the identified language.
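    For illustration, a common character-n-gram identification algorithm of the kind such systems implement (a Cavnar/Trenkle-style sketch, not LangIdent's actual code):

        from collections import Counter

        def ngram_profile(text, n=3, top=300):
            # Rank the most frequent character n-grams of a text.
            grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
            return [g for g, _ in grams.most_common(top)]

        def out_of_place(doc_profile, lang_profile):
            # Cavnar/Trenkle distance: sum of rank displacements, with a
            # maximum penalty for n-grams missing from the language profile.
            pos = {g: r for r, g in enumerate(lang_profile)}
            max_penalty = len(lang_profile)
            return sum(abs(r - pos[g]) if g in pos else max_penalty
                       for r, g in enumerate(doc_profile))

        def identify(text, trained_profiles):
            # trained_profiles: {language name: profile built from training text}
            doc = ngram_profile(text)
            return min(trained_profiles,
                       key=lambda lang: out_of_place(doc, trained_profiles[lang]))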
  8. Mauldin, M.L.: Conceptual information retrieval : a case study in adaptive partial parsing (1991) 0.02
    0.020993965 = product of:
      0.1154668 = sum of:
        0.02909089 = weight(_text_:information in 121) [ClassicSimilarity], result of:
          0.02909089 = score(doc=121,freq=22.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.51462007 = fieldWeight in 121, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=121)
        0.086375915 = weight(_text_:retrieval in 121) [ClassicSimilarity], result of:
          0.086375915 = score(doc=121,freq=22.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.88675684 = fieldWeight in 121, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=121)
      0.18181819 = coord(2/11)
    
    LCSH
    FERRET (Information retrieval system)
    Information storage and retrieval
    RSWK
    Freitextsuche / Information Retrieval
    Information Retrieval / Expertensystem
    Syntaktische Analyse / Information Retrieval
    Subject
    Freitextsuche / Information Retrieval
    Information Retrieval / Expertensystem
    Syntaktische Analyse / Information Retrieval
    FERRET (Information retrieval system)
    Information storage and retrieval
  9. Byrne, C.C.; McCracken, S.A.: An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    0.020488083 = product of:
      0.07512297 = sum of:
        0.018606598 = weight(_text_:information in 4483) [ClassicSimilarity], result of:
          0.018606598 = score(doc=4483,freq=4.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.3291521 = fieldWeight in 4483, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=4483)
        0.039064977 = weight(_text_:retrieval in 4483) [ClassicSimilarity], result of:
          0.039064977 = score(doc=4483,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.40105087 = fieldWeight in 4483, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=4483)
        0.017451387 = product of:
          0.05235416 = sum of:
            0.05235416 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
              0.05235416 = score(doc=4483,freq=2.0), product of:
                0.11276386 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032201413 = queryNorm
                0.46428138 = fieldWeight in 4483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4483)
          0.33333334 = coord(1/3)
      0.27272728 = coord(3/11)
    
    Date
    15. 3.2000 10:22:37
    Source
    Journal of information science. 25(1999) no.2, S.113-131
  10. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.02
    0.020421594 = product of:
      0.07487918 = sum of:
        0.042364784 = weight(_text_:web in 604) [ClassicSimilarity], result of:
          0.042364784 = score(doc=604,freq=10.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.40312994 = fieldWeight in 604, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=604)
        0.00949514 = weight(_text_:information in 604) [ClassicSimilarity], result of:
          0.00949514 = score(doc=604,freq=6.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.16796975 = fieldWeight in 604, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=604)
        0.023019256 = weight(_text_:retrieval in 604) [ClassicSimilarity], result of:
          0.023019256 = score(doc=604,freq=4.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.23632148 = fieldWeight in 604, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=604)
      0.27272728 = coord(3/11)
    
    Abstract
    Modern information retrieval systems use keywords within documents as indexing terms for the search of relevant documents. As Chinese is an ideographic, character-based language, the words in its texts are not delimited by white space, so indexing of Chinese documents is impossible without a proper segmentation algorithm. Many Chinese segmentation algorithms have been proposed in the past; traditional algorithms cannot operate without a large dictionary or a large corpus of training data. Nowadays, the Web has become the largest corpus, which is ideal for Chinese segmentation. Although most search engines have problems segmenting texts into proper words, they maintain huge databases of documents and of the frequencies of character sequences in those documents. These databases are important potential resources for segmentation. In this paper, we propose a segmentation algorithm that mines Web data with the help of search engines. In addition, the Romanized pinyin of the Chinese language indicates word boundaries in the text, and our algorithm is the first to utilize Romanized pinyin for segmentation. It is the first unified segmentation algorithm for the Chinese language from different geographical areas, and it is also domain-independent because of the nature of the Web. Experiments have been conducted on the datasets of a recent Chinese segmentation competition. The results show that our algorithm outperforms the traditional algorithms in terms of precision and recall. Moreover, our algorithm can effectively deal with the problems of segmentation ambiguity, new word (unknown word) detection, and stop words.
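    A hedged sketch of frequency-driven segmentation: dynamic programming over chunk scores, where the counts would come from search-engine databases as in the paper (here they are passed in as a toy dictionary; the penalty value is our assumption):

        import math

        def segment(text, freq, max_word_len=4):
            # Pick the segmentation that maximizes the summed log-frequency
            # of its chunks. `freq` maps candidate words to counts; unknown
            # chunks get a small default (0.5) as a penalty.
            n = len(text)
            best = [0.0] + [float("-inf")] * n
            back = [0] * (n + 1)
            for i in range(1, n + 1):
                for j in range(max(0, i - max_word_len), i):
                    score = best[j] + math.log(freq.get(text[j:i], 0.5))
                    if score > best[i]:
                        best[i], back[i] = score, j
            chunks, i = [], n
            while i > 0:            # follow back-pointers to recover chunks
                chunks.append(text[back[i]:i])
                i = back[i]
            return chunks[::-1]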
    Footnote
    Contribution to a thematic issue, "Mining Web resources for enhancing information retrieval".
    Source
    Journal of the American Society for Information Science and Technology. 58(2007) no.12, S.1820-1837
  11. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.02
    0.019955693 = product of:
      0.109756306 = sum of:
        0.041907072 = weight(_text_:wide in 5480) [ClassicSimilarity], result of:
          0.041907072 = score(doc=5480,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.29372054 = fieldWeight in 5480, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
        0.067849234 = weight(_text_:konstanz in 5480) [ClassicSimilarity], result of:
          0.067849234 = score(doc=5480,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.37373447 = fieldWeight in 5480, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
      0.18181819 = coord(2/11)
    
    Abstract
    (Automatic) document classification is generally defined as content-based assignment of one or more predefined categories to documents. Usually, machine learning, statistical pattern recognition, or neural network approaches are used to construct classifiers automatically. In this paper we thoroughly evaluate a wide variety of these methods on a document classification task for German text. We evaluate different feature construction and selection methods and various classifiers. Our main results are: (1) feature selection is necessary not only to reduce learning and classification time, but also to avoid overfitting (even for Support Vector Machines); (2) surprisingly, our morphological analysis does not improve classification quality compared to a letter 5-gram approach; (3) Support Vector Machines are significantly better than all other classification methods
    Imprint
    Konstanz : UVK, Universitätsverlag
  12. Luo, Z.; Yu, Y.; Osborne, M.; Wang, T.: Structuring tweets for improving Twitter search (2015) 0.02
    0.019902337 = product of:
      0.07297523 = sum of:
        0.018946107 = weight(_text_:web in 2335) [ClassicSimilarity], result of:
          0.018946107 = score(doc=2335,freq=2.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.18028519 = fieldWeight in 2335, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2335)
        0.0109640425 = weight(_text_:information in 2335) [ClassicSimilarity], result of:
          0.0109640425 = score(doc=2335,freq=8.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.19395474 = fieldWeight in 2335, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2335)
        0.043065086 = weight(_text_:retrieval in 2335) [ClassicSimilarity], result of:
          0.043065086 = score(doc=2335,freq=14.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.442117 = fieldWeight in 2335, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2335)
      0.27272728 = coord(3/11)
    
    Abstract
    Spam and wildly varying documents make searching in Twitter challenging. Most Twitter search systems generally treat a Tweet as plain text when modeling relevance. However, a series of conventions allows users to Tweet in structured ways, using a combination of different blocks of text. These blocks include plain text, hashtags, links, mentions, etc. Each block encodes a variety of communicative intent, and the sequence of these blocks captures changing discourse. Previous work shows that exploiting this structural information can improve the retrieval of structured documents (e.g., web pages). In this study we utilize the structure of Tweets, induced by these blocks, for Twitter retrieval and Twitter opinion retrieval. For Twitter retrieval, a set of features derived from the blocks of text and their combinations is used in a learning-to-rank scenario. We show that structuring Tweets can achieve state-of-the-art performance. Our approach does not rely on social media features, but when we do add this additional information, performance improves significantly. For Twitter opinion retrieval, we explore the question of whether structural information derived from the body of Tweets and opinionatedness ratings of Tweets can improve performance. Experimental results show that retrieval using a novel unsupervised opinionatedness feature based on structuring Tweets achieves performance comparable with a supervised method using manually tagged Tweets. Topic-related structured Tweet sets are shown to help with query-dependent opinion retrieval.
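    A minimal sketch of the structuring step, splitting a Tweet into typed blocks (the block names and regular expressions are illustrative; feature extraction and learning-to-rank would build on these blocks):

        import re

        BLOCK_PATTERNS = [  # illustrative block types from the abstract
            ("link", re.compile(r"https?://\S+")),
            ("mention", re.compile(r"@\w+")),
            ("hashtag", re.compile(r"#\w+")),
        ]

        def structure_tweet(text):
            # Split a Tweet into a sequence of typed blocks; remaining
            # spans become plain-text blocks.
            spans = sorted((m.start(), m.end(), name)
                           for name, pattern in BLOCK_PATTERNS
                           for m in pattern.finditer(text))
            blocks, pos = [], 0
            for start, end, name in spans:
                if start < pos:      # skip overlaps (e.g. '#' inside a URL)
                    continue
                if text[pos:start].strip():
                    blocks.append(("text", text[pos:start].strip()))
                blocks.append((name, text[start:end]))
                pos = end
            if text[pos:].strip():
                blocks.append(("text", text[pos:].strip()))
            return blocks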
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.12, S.2522-2539
  13. Korman, D.Z.; Mack, E.; Jett, J.; Renear, A.H.: Defining textual entailment (2018) 0.02
    0.019863743 = product of:
      0.072833724 = sum of:
        0.041907072 = weight(_text_:wide in 4284) [ClassicSimilarity], result of:
          0.041907072 = score(doc=4284,freq=2.0), product of:
            0.14267668 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.032201413 = queryNorm
            0.29372054 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.011394167 = weight(_text_:information in 4284) [ClassicSimilarity], result of:
          0.011394167 = score(doc=4284,freq=6.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.20156369 = fieldWeight in 4284, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.019532489 = weight(_text_:retrieval in 4284) [ClassicSimilarity], result of:
          0.019532489 = score(doc=4284,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.20052543 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
      0.27272728 = coord(3/11)
    
    Abstract
    Textual entailment is a relationship that obtains between fragments of text when one fragment in some sense implies the other fragment. The automation of textual entailment recognition supports a wide variety of text-based tasks, including information retrieval, information extraction, question answering, text summarization, and machine translation. Much ingenuity has been devoted to developing algorithms for identifying textual entailments, but relatively little to saying what textual entailment actually is. This article is a review of the logical and philosophical issues involved in providing an adequate definition of textual entailment. We show that many natural definitions of textual entailment are refuted by counterexamples, including the most widely cited definition of Dagan et al. We then articulate and defend the following revised definition: T textually entails H =df typically, a human reading T would be justified in inferring the proposition expressed by H from the proposition expressed by T. We also show that textual entailment is context-sensitive, nontransitive, and nonmonotonic.
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.6, S.763-772
  14. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.02
    0.019201802 = product of:
      0.07040661 = sum of:
        0.037892215 = weight(_text_:web in 2861) [ClassicSimilarity], result of:
          0.037892215 = score(doc=2861,freq=8.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.36057037 = fieldWeight in 2861, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
        0.00949514 = weight(_text_:information in 2861) [ClassicSimilarity], result of:
          0.00949514 = score(doc=2861,freq=6.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.16796975 = fieldWeight in 2861, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
        0.023019256 = weight(_text_:retrieval in 2861) [ClassicSimilarity], result of:
          0.023019256 = score(doc=2861,freq=4.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.23632148 = fieldWeight in 2861, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
      0.27272728 = coord(3/11)
    
    Abstract
    Today's conventional search engines hardly provide the content most relevant to a user's search query, because the context and semantics of the request are not analyzed to their full extent. Hence the need for semantic web search (SWS), an emerging area of web search that combines Natural Language Processing and Artificial Intelligence. The objective of the work presented here is to design, develop and implement a semantic search engine, SIEU (Semantic Information Extraction in University Domain), confined to the university domain. SIEU uses an ontology as a knowledge base for the information retrieval process. It is not a mere keyword search: it works one layer above what Google or any other search engine retrieves by analyzing just the keywords, since the query is analyzed both syntactically and semantically. The developed system retrieves web results more relevant to the user query through keyword expansion; because the query is analyzed semantically, the accuracy of the results is enhanced. The Google results are re-ranked and optimized to provide the relevant links, using a ranking algorithm that fetches more apt results for the user query. The system will be of great use to developers and researchers who work on the web.
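    The keyword-expansion step can be pictured with a toy sketch, assuming the ontology is reduced to a synonym/subclass map (illustrative only, not SIEU's implementation):

        def expand_query(query_terms, ontology):
            # Each query term is joined by its ontology neighbours
            # (synonyms, subclasses) before retrieval; `ontology` is a
            # toy adjacency map, e.g. {"professor": ["faculty", "lecturer"]}.
            expanded = set(query_terms)
            for term in query_terms:
                expanded.update(ontology.get(term, []))
            return expanded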
  15. Jacquemin, C.: Spotting and discovering terms through natural language processing (2001) 0.02
    0.018436614 = product of:
      0.06760092 = sum of:
        0.018946107 = weight(_text_:web in 119) [ClassicSimilarity], result of:
          0.018946107 = score(doc=119,freq=2.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.18028519 = fieldWeight in 119, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=119)
        0.012258172 = weight(_text_:information in 119) [ClassicSimilarity], result of:
          0.012258172 = score(doc=119,freq=10.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.21684799 = fieldWeight in 119, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=119)
        0.03639664 = weight(_text_:retrieval in 119) [ClassicSimilarity], result of:
          0.03639664 = score(doc=119,freq=10.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.37365708 = fieldWeight in 119, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=119)
      0.27272728 = coord(3/11)
    
    Abstract
    In this book Christian Jacquemin shows how the power of natural language processing (NLP) can be used to advance text indexing and information retrieval (IR). Jacquemin's novel tool is FASTR, a parser that normalizes terms and recognizes term variants. Since there are more meanings in a language than there are words, FASTR uses a metagrammar composed of shallow linguistic transformations that describe the morphological, syntactic, semantic, and pragmatic variations of words and terms. The acquired parsed terms can then be applied for precise retrieval and assembly of information. The use of a corpus-based unification grammar to define, recognize, and combine term variants from their base forms allows for intelligent information access to, or "linguistic data tuning" of, heterogeneous texts. FASTR can be used to do automatic controlled indexing, to carry out content-based Web searches through conceptually related alternative query formulations, to abstract scientific and technical extracts, and even to translate and collect terms from multilingual material. Jacquemin provides a comprehensive account of the method and implementation of this innovative retrieval technique for text processing.
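    One FASTR-style transformation, recognition of insertion variants, can be sketched in a few lines (a toy approximation of ours, not FASTR's metagrammar):

        def is_insertion_variant(term: str, candidate: str) -> bool:
            # True if the candidate contains all the term's words in order,
            # with arbitrary words inserted between them, e.g.
            # is_insertion_variant("information retrieval",
            #                      "information storage and retrieval") -> True
            words = term.lower().split()
            it = iter(candidate.lower().split())
            return all(w in it for w in words)  # subsequence test over an iterator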
    RSWK
    Automatische Indexierung  / Computerlinguistik  / Information Retrieval
    Subject
    Automatische Indexierung  / Computerlinguistik  / Information Retrieval
  16. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.02
    0.017860819 = product of:
      0.065489665 = sum of:
        0.039378747 = weight(_text_:web in 3455) [ClassicSimilarity], result of:
          0.039378747 = score(doc=3455,freq=6.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.37471575 = fieldWeight in 3455, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
        0.0065784254 = weight(_text_:information in 3455) [ClassicSimilarity], result of:
          0.0065784254 = score(doc=3455,freq=2.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.116372846 = fieldWeight in 3455, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
        0.019532489 = weight(_text_:retrieval in 3455) [ClassicSimilarity], result of:
          0.019532489 = score(doc=3455,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.20052543 = fieldWeight in 3455, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
      0.27272728 = coord(3/11)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
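    A skeleton of the five-step pipeline the abstract names, with each step reduced to a trivial stand-in (not the PPR method itself; `search_engine` is any callable returning document strings):

        def answer_question(question, search_engine, k=5):
            query = question.rstrip("?")                        # 1. query modulation
            docs = search_engine(query)[:k]                     # 2. document retrieval
            passages = [p for d in docs for p in d.split("\n")  # 3. passage extraction
                        if any(w in p.lower() for w in query.lower().split())]
            phrases = {w for p in passages for w in p.split()}  # 4. phrase extraction
            return sorted(phrases,                              # 5. answer ranking
                          key=lambda w: -sum(w in p for p in passages))[:3]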
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.6, S.571-583
  17. Sonnenberger, G.: Automatische Wissensakquisition aus Texten : Textparsing (1990) 0.02
    0.017446058 = product of:
      0.19190663 = sum of:
        0.19190663 = weight(_text_:konstanz in 8428) [ClassicSimilarity], result of:
          0.19190663 = score(doc=8428,freq=4.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            1.0570807 = fieldWeight in 8428, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.09375 = fieldNorm(doc=8428)
      0.09090909 = coord(1/11)
    
    Imprint
    Konstanz : Universitätsverlag
    Source
    Pragmatische Aspekte beim Entwurf und Betrieb von Informationssystemen: Proc. 1. Int. Symposiums für Informationswissenschaft, Universität Konstanz, 17.-19.10.1990. Hrsg.: J. Herget u. R. Kuhlen
  18. Clark, M.; Kim, Y.; Kruschwitz, U.; Song, D.; Albakour, D.; Dignum, S.; Beresi, U.C.; Fasli, M.; Roeck, A De: Automatically structuring domain knowledge from text : an overview of current research (2012) 0.02
    0.0166332 = product of:
      0.060988396 = sum of:
        0.032152608 = weight(_text_:web in 2738) [ClassicSimilarity], result of:
          0.032152608 = score(doc=2738,freq=4.0), product of:
            0.10508965 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.032201413 = queryNorm
            0.3059541 = fieldWeight in 2738, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2738)
        0.009303299 = weight(_text_:information in 2738) [ClassicSimilarity], result of:
          0.009303299 = score(doc=2738,freq=4.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.16457605 = fieldWeight in 2738, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2738)
        0.019532489 = weight(_text_:retrieval in 2738) [ClassicSimilarity], result of:
          0.019532489 = score(doc=2738,freq=2.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.20052543 = fieldWeight in 2738, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2738)
      0.27272728 = coord(3/11)
    
    Abstract
    This paper presents an overview of automatic methods for building domain knowledge structures (domain models) from text collections. Applications of domain models have a long history within knowledge engineering and artificial intelligence. In the last couple of decades they have surfaced noticeably as a useful tool within natural language processing, information retrieval and semantic web technology. Inspired by the ubiquitous propagation of domain model structures that are emerging in several research disciplines, we give an overview of the current research landscape and some techniques and approaches. We will also discuss trade-offs between different approaches and point to some recent trends.
    Content
    Contribution to a special issue, "Soft Approaches to IA on the Web". Cf.: doi:10.1016/j.ipm.2011.07.002.
    Source
    Information processing and management. 48(2012) no.3, S.552-568
  19. Natürlichsprachlicher Entwurf von Informationssystemen (1996) 0.02
    0.0164483 = product of:
      0.1809313 = sum of:
        0.1809313 = weight(_text_:konstanz in 722) [ClassicSimilarity], result of:
          0.1809313 = score(doc=722,freq=2.0), product of:
            0.18154396 = queryWeight, product of:
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.032201413 = queryNorm
            0.99662524 = fieldWeight in 722, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.637764 = idf(docFreq=427, maxDocs=44218)
              0.125 = fieldNorm(doc=722)
      0.09090909 = coord(1/11)
    
    Imprint
    Konstanz : Universitätsverlag
  20. Gachot, D.A.; Lange, E.; Yang, J.: The SYSTRAN NLP browser : an application of machine translation technology in cross-language information retrieval (1998) 0.02
    0.016445613 = product of:
      0.09045087 = sum of:
        0.022788335 = weight(_text_:information in 6213) [ClassicSimilarity], result of:
          0.022788335 = score(doc=6213,freq=6.0), product of:
            0.05652887 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.032201413 = queryNorm
            0.40312737 = fieldWeight in 6213, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6213)
        0.06766253 = weight(_text_:retrieval in 6213) [ClassicSimilarity], result of:
          0.06766253 = score(doc=6213,freq=6.0), product of:
            0.09740654 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032201413 = queryNorm
            0.6946405 = fieldWeight in 6213, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=6213)
      0.18181819 = coord(2/11)
    
    Series
    The Kluwer International series on information retrieval
    Source
    Cross-language information retrieval. Ed.: G. Grefenstette

Types

  • a 418
  • m 41
  • el 30
  • s 22
  • x 11
  • p 3
  • d 2
  • b 1
  • r 1
