Search (284 results, page 1 of 15)

  • × theme_ss:"Computerlinguistik"
  • × language_ss:"e"
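Each title line below ends with the entry's relevance score for the current query. The scores come from Lucene's ClassicSimilarity; reconstructed from its explain output, the model is the classic tf-idf formula (a sketch, simplified to one flat sum):

    score(q, d) = coord(q, d) * queryNorm * sum over t in q of  idf(t)^2 * sqrt(freq(t, d)) * fieldNorm(d)
    idf(t)      = 1 + ln(maxDocs / (docFreq(t) + 1))

As a check against the first entry: idf(retrieval) = 1 + ln(44218 / 5837) ≈ 3.025, and sqrt(2) * 3.025^2 * 0.032069 * 0.046875 ≈ 0.0195, the contribution reported for that term.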
  1. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.10
    Abstract
     In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language- and domain-independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the alignment process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization. (A toy sketch of a glue-style association measure follows this entry.)
    Content
     A thesis presented to the University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
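     The thesis's three measures are not reproduced in the abstract; as a loose illustration of the glue family that LocalMaxs-style extraction ranks candidates with, the symmetric conditional probability of a bigram is SCP(x, y) = p(x, y)^2 / (p(x) * p(y)). A minimal sketch (whitespace tokenization and the toy corpus are stand-ins; bigrams only):

       from collections import Counter

       def scp_bigrams(tokens, min_freq=2):
           """Rank adjacent word pairs by symmetric conditional probability,
           SCP(x, y) = p(x, y)^2 / (p(x) * p(y)); higher means stronger glue."""
           n = len(tokens)
           unigrams = Counter(tokens)
           bigrams = Counter(zip(tokens, tokens[1:]))
           scores = {
               (x, y): (f / (n - 1)) ** 2 / ((unigrams[x] / n) * (unigrams[y] / n))
               for (x, y), f in bigrams.items() if f >= min_freq
           }
           return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

       corpus = ("information retrieval systems support information retrieval "
                 "research and information retrieval evaluation").split()
       print(scp_bigrams(corpus)[0])  # ('information', 'retrieval') ranks first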
  2. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.08
    Content
     Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  3. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.05
    Source
     https://arxiv.org/abs/2212.06721
  4. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.03
    Abstract
     The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon, and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions. (A generic similarity-based suggester is sketched after this entry.)
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
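     ChemSpell's internals are not given in the abstract; a generic dictionary-based suggester of the kind described — rank vocabulary entries by string similarity to the query term — can be sketched with the standard library (the vocabulary is a hypothetical stand-in for the SIS-derived word list):

       from difflib import get_close_matches

       VOCAB = ["toxicology", "benzene", "acetaminophen", "chloroform", "arsenic"]  # hypothetical stand-in

       def suggest(term, n=3, cutoff=0.6):
           """Return up to n vocabulary words most similar to a (possibly
           misspelled) query term, scored by difflib's similarity ratio."""
           return get_close_matches(term.lower(), VOCAB, n=n, cutoff=cutoff)

       print(suggest("acetominophen"))  # -> ['acetaminophen']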
  5. Rahmstorf, G.: Concept structures for large vocabularies (1998) 0.03
    Abstract
    A technology is described which supports the acquisition, visualisation and manipulation of large vocabularies with associated structures. It is used for dictionary production, terminology data bases, thesauri, library classification systems etc. Essential features of the technology are a lexicographic user interface, variable word description, unlimited list of word readings, a concept language, automatic transformations of formulas into graphic structures, structure manipulation operations and retransformation into formulas. The concept language includes notations for undefined concepts. The structure of defined concepts can be constructed interactively. The technology supports the generation of large vocabularies with structures representing word senses. Concept structures and ordering systems for indexing and retrieval can be constructed separately and connected by associating relations.
    Date
    30.12.2001 19:01:22
  6. Mauldin, M.L.: Conceptual information retrieval : a case study in adaptive partial parsing (1991) 0.03
    LCSH
    FERRET (Information retrieval system)
    Information storage and retrieval
    RSWK
    Freitextsuche / Information Retrieval
    Information Retrieval / Expertensystem
    Syntaktische Analyse Information Retrieval
    Subject
    Freitextsuche / Information Retrieval
    Information Retrieval / Expertensystem
    Syntaktische Analyse Information Retrieval
    FERRET (Information retrieval system)
    Information storage and retrieval
  7. Sembok, T.M.T.; Rijsbergen, C.J. van: SILOL: a simple logical-linguistic document retrieval system (1990) 0.02
    Abstract
    Describes a system called SILOL which is based on a logical-linguistic model of document retrieval systems. SILOL uses a shallow semantic translation of natural language texts into a first order predicate representation in performing a document indexing and retrieval process. Some preliminary experiments have been carried out to test the retrieval effectiveness of this system. The results obtained show improvements in the level of retrieval effectiveness, which demonstrate that the approach of using a semantic theory of natural language and logic in document retrieval systems is a valid one
  8. Bowker, L.: Information retrieval in translation memory systems : assessment of current limitations and possibilities for future development (2002) 0.02
    Abstract
     A translation memory system is a new type of human language technology (HLT) tool that is gaining popularity among translators. Such tools allow translators to store previously translated texts in a type of aligned bilingual database, and to recycle relevant parts of these texts when producing new translations. Currently, these tools retrieve information from the database using superficial character string matching, which often results in poor precision and recall. This paper explains how translation memory systems work, and it considers some possible ways of introducing more sophisticated information retrieval techniques into such systems by taking syntactic and semantic similarity into account. Some of the suggested techniques are inspired by those used in other areas of HLT, and some by techniques used in information science. (A character-matching sketch follows this entry.)
    Source
    Knowledge organization. 29(2002) nos.3/4, S.198-203
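     A bare-bones version of the superficial character string matching the paper describes — retrieve the stored segment most similar to the new source segment — might look like this (the memory contents are hypothetical):

       from difflib import SequenceMatcher

       MEMORY = {  # source segment -> stored translation (hypothetical entries)
           "The file could not be opened.": "Die Datei konnte nicht geöffnet werden.",
           "Save the file before closing.": "Speichern Sie die Datei vor dem Schließen.",
       }

       def best_match(segment, threshold=0.7):
           """Return the most similar stored (source, translation) pair, or
           None if no stored source clears the fuzzy-match threshold."""
           ratio, src = max((SequenceMatcher(None, segment, s).ratio(), s) for s in MEMORY)
           return (src, MEMORY[src], round(ratio, 2)) if ratio >= threshold else None

       print(best_match("The file cannot be opened."))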
  9. Herrera-Viedma, E.: Modeling the retrieval process for an information retrieval system using an ordinal fuzzy linguistic approach (2001) 0.02
    Abstract
     A linguistic model for an Information Retrieval System (IRS) defined using an ordinal fuzzy linguistic approach is proposed. The ordinal fuzzy linguistic approach is presented, and its use for modeling the imprecision and subjectivity that appear in the user-IRS interaction is studied. The user queries and IRS responses are modeled linguistically using the concept of fuzzy linguistic variables. The system accepts Boolean queries whose terms can be weighted simultaneously by means of ordinal linguistic values according to three possible semantics: a symmetrical threshold semantic, a quantitative semantic, and an importance semantic. The first one identifies a new threshold semantic used to express qualitative restrictions on the documents retrieved for a given term. It is monotone increasing in index term weight for the threshold values that are on the right of the mid-value, and decreasing for the threshold values that are on the left of the mid-value. The second one is a new semantic proposal introduced to express quantitative restrictions on the documents retrieved for a term, i.e., restrictions on the number of documents that must be retrieved containing that term. The last one is the usual semantic of relative importance that has an effect when the term is in a Boolean expression. A bottom-up evaluation mechanism of queries is presented that coherently integrates the use of the three semantics and satisfies the separability property. The advantage of this IRS with respect to others is that users can express linguistically different semantic restrictions on the desired documents simultaneously, incorporating more flexibility in the user-IRS interaction. (A boolean caricature of the symmetric threshold semantic is sketched after this entry.)
    Date
    29. 9.2001 14:00:25
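     The paper's model is graded and formal; as a boolean caricature only (the label scale and mid-point below are assumptions), the symmetric threshold semantic reads a threshold above the mid label as "weight at least this high" and one below it as "weight at most this low":

       LABELS = ["none", "very_low", "low", "medium", "high", "very_high", "perfect"]
       MID = LABELS.index("medium")

       def satisfies(weight, threshold):
           """Symmetric threshold semantic, boolean caricature: thresholds
           right of the mid-value demand index-term weights at least that
           high; those left of it demand weights at most that low."""
           w, t = LABELS.index(weight), LABELS.index(threshold)
           return w >= t if t >= MID else w <= t

       print(satisfies("very_high", "high"))   # True: clears a high threshold
       print(satisfies("medium", "very_low"))  # False: exceeds a low ceiling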
  10. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O.; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.02
    Abstract
     Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system will lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform well when applied to this problem. Detailed results and analyses are included to support our conclusions.
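     The fusion methods under analysis are not named in the abstract; CombMNZ, a standard approach from this literature, sums min-max-normalized scores and multiplies by the number of result sets containing the document. A sketch (the two runs are hypothetical):

       def minmax(run):
           """Min-max normalize one run's {doc: score} mapping to [0, 1]."""
           lo, hi = min(run.values()), max(run.values())
           return {d: (s - lo) / (hi - lo) if hi > lo else 0.0 for d, s in run.items()}

       def comb_mnz(*runs):
           """CombMNZ: summed normalized score times the number of runs
           that retrieved the document."""
           runs = [minmax(r) for r in runs]
           fused = {}
           for d in set().union(*runs):
               hits = [r[d] for r in runs if d in r]
               fused[d] = sum(hits) * len(hits)
           return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

       run_a = {"d1": 12.0, "d2": 7.5, "d3": 3.1}  # hypothetical strategy A
       run_b = {"d2": 0.9, "d3": 0.4, "d4": 0.2}   # hypothetical strategy B
       print(comb_mnz(run_a, run_b))  # d2 gains from appearing in both runs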
  11. Yannakoudakis, E.J.; Daraki, J.J.: Lexical clustering and retrieval of bibliographic records (1994) 0.02
    Abstract
     Presents a new system that enables users to retrieve catalogue entries on the basis of their lexical similarities and to cluster records in a dynamic fashion. Describes the information retrieval system developed by the Department of Informatics, Athens University of Economics and Business, Greece. The system also offers the means for cyclic retrieval of records from each cluster while allowing the user to define the field to be used in each case. The approach is based on logical keys which are derived from pertinent bibliographic fields and are used for all clustering and information retrieval functions. (A crude key-derivation sketch follows this entry.)
    Source
    Information retrieval: new systems and current research. Proceedings of the 15th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Glasgow 1993. Ed.: Ruben Leon
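     The key derivation itself is not specified in the abstract; as a crude stand-in, a logical key can be a normalized prefix of a chosen bibliographic field, with records grouped on equal keys:

       import re
       from collections import defaultdict

       def logical_key(value, length=12):
           """Crude clustering key: lowercase the field, strip non-letters,
           keep a fixed-length prefix."""
           return re.sub(r"[^a-z]", "", value.lower())[:length]

       def cluster(records, field="title"):
           groups = defaultdict(list)
           for rec in records:
               groups[logical_key(rec[field])].append(rec)
           return dict(groups)

       records = [  # hypothetical catalogue entries
           {"title": "Information retrieval systems"},
           {"title": "Information Retrieval: Systems"},
           {"title": "Concept structures"},
       ]
       print({k: len(v) for k, v in cluster(records).items()})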
  12. McCune, B.P.; Tong, R.M.; Dean, J.S.: Rubric: a system for rule-based information retrieval (1985) 0.02
    Footnote
     Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.440-445.
  13. Schwarz, C.: THESYS: Thesaurus Syntax System : a fully automatic thesaurus building aid (1988) 0.02
    Abstract
     THESYS is based on the natural language processing of free-text databases. It yields statistically evaluated correlations between words of the database. These correlations correspond to traditional thesaurus relations. The person who has to build a thesaurus is thus assisted by the proposals made by THESYS. THESYS is being tested on commercial databases under real-world conditions. It is part of a text processing project at Siemens called TINA (Text-Inhalts-Analyse, "text content analysis"). Software from TINA is currently being applied and evaluated by the US Department of Commerce for patent search and indexing (REALIST: REtrieval Aids by Linguistics and STatistics).
    Date
    6. 1.1999 10:22:07
  14. Airio, E.; Kettunen, K.: Does dictionary based bilingual retrieval work in a non-normalized index? (2009) 0.02
    Abstract
     Many operational IR indexes are non-normalized, i.e., no lemmatization, stemming, or similar techniques have been employed in indexing. This poses a challenge for dictionary-based cross-language retrieval (CLIR), because translations are mostly lemmas. In this study, we face the challenge of dictionary-based CLIR in a non-normalized index. We test two optional approaches: FCG (Frequent Case Generation) and s-gramming. The idea of FCG is to automatically generate the most frequent inflected forms for a given lemma. FCG has been tested in monolingual retrieval and has been shown to be a good method for inflected retrieval, especially for highly inflected languages. S-gramming is an approximate string matching technique (an extension of n-gramming). The language pairs in our tests were English-Finnish, English-Swedish, Swedish-Finnish and Finnish-Swedish. Both our approaches performed quite well, but the results varied depending on the language pair. S-gramming and FCG performed roughly equally in all language pairs except Finnish-Swedish, where s-gramming outperformed FCG.
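     S-gram details vary across the literature; in one common formulation, character pairs are formed not only from adjacent characters (skip 0) but also across gaps (skip 1, 2, ...), and two strings are compared by overlap of their s-gram sets. A sketch, assuming skips 0 and 1 with skip-tagged pairs and Jaccard overlap:

       def s_grams(word, skips=(0, 1)):
           """Character pairs over the given skip distances; skip 0 yields
           conventional bigrams, skip 1 pairs characters one apart."""
           return {(word[i], word[i + s + 1], s)
                   for s in skips for i in range(len(word) - s - 1)}

       def s_gram_sim(a, b):
           """Jaccard overlap of two words' s-gram sets."""
           ga, gb = s_grams(a), s_grams(b)
           return len(ga & gb) / len(ga | gb)

       print(round(s_gram_sim("retrieval", "retreival"), 2))  # transposed letters stay close
       print(round(s_gram_sim("retrieval", "thesaurus"), 2))  # unrelated words score low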
  15. Luo, Z.; Yu, Y.; Osborne, M.; Wang, T.: Structuring tweets for improving Twitter search (2015) 0.02
    Abstract
     Spam and wildly varying documents make searching in Twitter challenging. Most Twitter search systems generally treat a Tweet as a plain text when modeling relevance. However, a series of conventions allows users to Tweet in structural ways using a combination of different blocks of texts. These blocks include plain texts, hashtags, links, mentions, etc. Each block encodes a variety of communicative intent, and the sequence of these blocks captures changing discourse. Previous work shows that exploiting the structural information can improve retrieval of structured documents (e.g., web pages). In this study we utilize the structure of Tweets, induced by these blocks, for Twitter retrieval and Twitter opinion retrieval. For Twitter retrieval, a set of features, derived from the blocks of text and their combinations, is used in a learning-to-rank scenario. We show that structuring Tweets can achieve state-of-the-art performance. Our approach does not rely on social media features, but when we do add this additional information, performance improves significantly. For Twitter opinion retrieval, we explore the question of whether structural information derived from the body of Tweets and opinionatedness ratings of Tweets can improve performance. Experimental results show that retrieval using a novel unsupervised opinionatedness feature based on structuring Tweets achieves comparable performance with a supervised method using manually tagged Tweets. Topic-related specific structured Tweet sets are shown to help with query-dependent opinion retrieval.
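     One way to induce the block structure described above is a single regex pass that labels each segment as hashtag, mention, link, or plain text while preserving order (the pattern is an illustrative simplification of Twitter's real conventions):

       import re

       BLOCK = re.compile(r"(?P<link>https?://\S+)|(?P<hashtag>#\w+)|(?P<mention>@\w+)")

       def structure(tweet):
           """Split a tweet into ordered (block_type, text) segments so the
           changing discourse across blocks is preserved."""
           out, pos = [], 0
           for m in BLOCK.finditer(tweet):
               if m.start() > pos:
                   out.append(("text", tweet[pos:m.start()].strip()))
               out.append((m.lastgroup, m.group()))
               pos = m.end()
           if pos < len(tweet):
               out.append(("text", tweet[pos:].strip()))
           return [(kind, txt) for kind, txt in out if txt]

       print(structure("Great #IR paper by @sigir folks: https://example.org #NLProc"))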
  16. Rau, L.F.: Conceptual information extraction and retrieval from natural language input (198) 0.02
    Date
    16. 8.1998 13:29:20
    Footnote
     Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.527-533
  17. Liu, S.; Liu, F.; Yu, C.; Meng, W.: ¬An effective approach to document retrieval via utilizing WordNet and recognizing phrases (2004) 0.02
    Date
    10.10.2005 10:29:08
    Source
     SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, et al.
  18. Jones, I.; Cunliffe, D.; Tudhope, D.: Natural language processing and knowledge organization systems as an aid to retrieval (2004) 0.02
    Abstract
     This paper discusses research that employs methods from Natural Language Processing (NLP) in exploiting the intellectual resources of Knowledge Organization Systems (KOS), particularly in the retrieval of information. A technique for the disambiguation of homographs and nominal compounds in free text, where these are known ambiguous terms in the KOS itself, is described. The use of Roget's Thesaurus as an intermediary in the process is also reported. A short review of the relevant literature in the field is given. Design considerations, results and conclusions are presented from the implementation of a prototype system. The linguistic techniques are applied at two complementary levels, namely on a free text string used as an entry point to the KOS, and on the underlying controlled vocabulary itself.
    Content
     1. Introduction The need for research into the application of linguistic techniques in Information Retrieval (IR) in general, and a similar need in faceted Knowledge Organization Systems (KOS), has been indicated by various authors. Smeaton (1997) points out the inherent limitations of conventional approaches to IR based on "bags of words", mainly difficulties caused by lexical ambiguity in the words concerned, and goes on to suggest the possibility of using Natural Language Processing (NLP) in query formulation. Past experience with a faceted retrieval system highlighted the need for integrating the linguistic perspective in order to fully utilise the potential of a KOS (Tudhope et al., 2002). The present research seeks to address some of these needs in using NLP to improve the efficacy of KOS tools in query and retrieval systems. Syntactic parsing and part-of-speech tagging can substantially reduce lexical ambiguity through homograph disambiguation. Given the two strings "I table the motion" and "I put the motion on the table", for instance, the parser used in this research clearly indicates that 'table' in the first string is a verb, while 'table' in the second string is a noun, a distinction that would be missed in the "bag of words" approach. This syntactic disambiguation enables a more precise matching from free text to the controlled vocabulary of a KOS and vice versa. The use of a general linguistic resource, namely Roget's Thesaurus of English Words and Phrases (RTEWP), as an intermediary in this process, is investigated. The adaptation of the Link parser (Sleator & Temperley, 1993) to the purposes of the research is reported. The design and implementation of the early practical stages of the project are described, and the results of the initial experiments are presented and evaluated. Applications of the techniques developed are foreseen in the areas of query disambiguation, information retrieval and automatic indexing. In the first section of the paper a brief review of the literature and relevant current work in the field is presented. The second section includes reports on the development of algorithms, the construction of data sets, and theoretical and experimental work undertaken to date. The third section evaluates the results obtained, and outlines directions for future research. (A tagger-based illustration of the homograph example follows this entry.)
    Date
    29. 8.2004 19:29:56
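     The prototype itself uses the Link parser; as a stand-in illustration with an off-the-shelf tagger, NLTK's part-of-speech tagging separates the verb and noun readings of 'table' in the example above (assumes the nltk package plus its punkt and averaged_perceptron_tagger data are installed):

       import nltk  # assumes punkt + averaged_perceptron_tagger data downloads

       for sent in ("I table the motion", "I put the motion on the table"):
           tags = nltk.pos_tag(nltk.word_tokenize(sent))
           print(sent, "->", [tag for word, tag in tags if word == "table"])
           # expected: a verb tag (VB*) in the first sentence, NN in the second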
  19. Magennis, M.: Expert rule-based query expansion (1995) 0.02
    Abstract
     Examines how, for term-based free-text retrieval, Interactive Query Expansion (IQE) provides better retrieval performance than Automatic Query Expansion (AQE), although the performance of IQE depends on the strategy employed by the user to select expansion terms. The aim is to build an expert query expansion system using term selection rules based on expert users' strategies. It is expected that such a system will achieve better performance for novice or inexperienced users than either AQE or IQE. The procedure is to discover expert IQE users' term selection strategies through observation and interrogation, to construct a rule-based query expansion (RQE) system based on these, and to compare the resulting retrieval performance with that of comparable AQE and IQE systems. (A generic rule-based selection sketch follows this entry.)
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
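     The expert rules themselves are the point of the research and are not listed in the abstract; a generic rule-based selector over relevance-feedback documents (hypothetical rules: no stopwords, no original query terms, occurs in at least two relevant documents, cap the additions) might look like:

       from collections import Counter

       STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "from", "lets"}

       def expand(query_terms, relevant_docs, max_new=3, min_df=2):
           """Rule-based expansion: keep candidates that clear a document-
           frequency floor in the relevant set, are not stopwords, and are
           not already in the query; most frequent candidates go first."""
           df = Counter(t for doc in relevant_docs for t in set(doc.lower().split()))
           picks = [t for t, n in df.most_common()
                    if n >= min_df and t not in STOPWORDS and t not in query_terms]
           return list(query_terms) + picks[:max_new]

       docs = ["Query expansion improves retrieval recall",        # hypothetical
               "Interactive expansion lets users pick terms",
               "Expansion terms come from feedback documents"]
       print(expand(["query"], docs))  # -> ['query', 'expansion', 'terms']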
  20. Rindflesch, T.C.; Aronson, A.R.: Semantic processing in information retrieval (1993) 0.02
    Abstract
    Intuition suggests that one way to enhance the information retrieval process would be the use of phrases to characterize the contents of text. A number of researchers, however, have noted that phrases alone do not improve retrieval effectiveness. In this paper we briefly review the use of phrases in information retrieval and then suggest extensions to this paradigm using semantic information. We claim that semantic processing, which can be viewed as expressing relations between the concepts represented by phrases, will in fact enhance retrieval effectiveness. The availability of the UMLS® domain model, which we exploit extensively, significantly contributes to the feasibility of this processing.
    Date
    29. 6.2015 14:51:28

Types

  • a 249
  • m 18
  • el 15
  • s 11
  • x 5
  • p 2
  • pat 1
  • r 1