Search (27 results, page 1 of 2)

Savoy, J.: Estimating the probability of an authorship attribution (2016) 0.03

0.0318287 = product of:
  0.0636574 = sum of:
    0.008101207 = weight(_text_:information in 2937) [ClassicSimilarity], result of:
      0.008101207 = score(doc=2937,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.09697737 = fieldWeight in 2937, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2937)
    0.055556197 = sum of:
      0.02331961 = weight(_text_:technology in 2937) [ClassicSimilarity], result of:
        0.02331961 = score(doc=2937,freq=2.0), product of:
          0.1417311 = queryWeight, product of:
            2.978387 = idf(docFreq=6114, maxDocs=44218)
            0.047586527 = queryNorm
          0.16453418 = fieldWeight in 2937, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.978387 = idf(docFreq=6114, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2937)
      0.032236587 = weight(_text_:22 in 2937) [ClassicSimilarity], result of:
        0.032236587 = score(doc=2937,freq=2.0), product of:
          0.16663991 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.047586527 = queryNorm
          0.19345059 = fieldWeight in 2937, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2937)
  0.5 = coord(2/4)

Date: 7. 5.2016 21:22:27
Source: Journal of the Association for Information Science and Technology. 67(2016) no.6, S.1462-1472
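
The per-term weights in the breakdown above follow the classic Lucene TF-IDF pattern: `tf` is the square root of the term frequency, `idf` is `1 + ln(maxDocs / (docFreq + 1))`, and the term score is `queryWeight * fieldWeight`. The sketch below reproduces the `information` weight for doc 2937 from the figures shown in the tree. The function names are illustrative, not part of any library, and `queryNorm` and `fieldNorm` are copied from the output rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as reported in the explain output: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """tf as reported in the explain output: sqrt(termFreq)."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single term in a single document."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the "information" clause for doc 2937.
score = term_score(freq=2.0, doc_freq=20772, max_docs=44218,
                   query_norm=0.047586527, field_norm=0.0390625)
print(f"{score:.7f}")
```

The result should agree with the listed 0.008101207 to about six significant digits. Small differences in the last digits are expected because the search engine computes in single precision.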

Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.02
```
0.015414905 = product of:
  0.03082981 = sum of:
    0.016838044 = weight(_text_:information in 2950) [ClassicSimilarity], result of:
      0.016838044 = score(doc=2950,freq=6.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.20156369 = fieldWeight in 2950, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2950)
    0.013991767 = product of:
      0.027983533 = sum of:
        0.027983533 = weight(_text_:technology in 2950) [ClassicSimilarity], result of:
          0.027983533 = score(doc=2950,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.19744103 = fieldWeight in 2950, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.046875 = fieldNorm(doc=2950)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

It is important in information retrieval (IR), information extraction, or classification tasks that morphologically related forms are conflated under the same stem (using a stemmer) or lemma (using a morphological analyzer). To achieve this for the English language, algorithmic stemming or various morphological analysis approaches have been suggested. Based on Cross-Language Evaluation Forum test collections containing 284 queries and various IR models, this article evaluates these word-normalization proposals. Stemming improves the mean average precision significantly by around 7% while performance differences are not significant when comparing various algorithmic stemmers or algorithmic stemmers and morphological analysis. Accounting for thesaurus class numbers during indexing does not modify overall retrieval performances. Finally, we demonstrate that including a stop word list, even one containing only around 10 terms, might significantly improve retrieval performance, depending on the IR model.

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.8, S.1616-1624

Dolamic, L.; Savoy, J.: Retrieval effectiveness of machine translated queries (2010) 0.01

0.013869986 = product of:
  0.027739972 = sum of:
    0.013748205 = weight(_text_:information in 4102) [ClassicSimilarity], result of:
      0.013748205 = score(doc=4102,freq=4.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.16457605 = fieldWeight in 4102, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4102)
    0.013991767 = product of:
      0.027983533 = sum of:
        0.027983533 = weight(_text_:technology in 4102) [ClassicSimilarity], result of:
          0.027983533 = score(doc=4102,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.19744103 = fieldWeight in 4102, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.046875 = fieldNorm(doc=4102)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Abstract: This article describes and evaluates various information retrieval models used to search document collections written in English through submitting queries written in various other languages, either members of the Indo-European family (English, French, German, and Spanish) or radically different language groups such as Chinese. This evaluation method involves searching a rather large number of topics (around 300) and using two commercial machine translation systems to translate across the language barriers. In this study, mean average precision is used to measure variances in retrieval effectiveness when a query language differs from the document language. Although performance differences are rather large for certain language pairs, this does not mean that bilingual search methods are not commercially viable. Causes of the difficulties incurred when searching or during translation are analyzed and the results of concrete examples are explained.
Source: Journal of the American Society for Information Science and Technology. 61(2010) no.11, S.2266-2273

Dolamic, L.; Savoy, J.: When stopword lists make the difference (2009) 0.01

0.01383271 = product of:
  0.02766542 = sum of:
    0.011341691 = weight(_text_:information in 3319) [ClassicSimilarity], result of:
      0.011341691 = score(doc=3319,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.13576832 = fieldWeight in 3319, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3319)
    0.016323728 = product of:
      0.032647457 = sum of:
        0.032647457 = weight(_text_:technology in 3319) [ClassicSimilarity], result of:
          0.032647457 = score(doc=3319,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.23034787 = fieldWeight in 3319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3319)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Source: Journal of the American Society for Information Science and Technology. 61(2010) no.1, S.200-203

Picard, J.; Savoy, J.: Enhancing retrieval with hyperlinks : a general model based on propositional argumentation systems (2003) 0.01
```
0.012845755 = product of:
  0.02569151 = sum of:
    0.0140317045 = weight(_text_:information in 1427) [ClassicSimilarity], result of:
      0.0140317045 = score(doc=1427,freq=6.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.16796975 = fieldWeight in 1427, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1427)
    0.011659805 = product of:
      0.02331961 = sum of:
        0.02331961 = weight(_text_:technology in 1427) [ClassicSimilarity], result of:
          0.02331961 = score(doc=1427,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.16453418 = fieldWeight in 1427, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1427)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

Fast, effective, and adaptable techniques are needed to automatically organize and retrieve information on the ever-increasing World Wide Web. In that respect, different strategies have been suggested to take hypertext links into account. For example, hyperlinks have been used to (1) enhance document representation, (2) improve document ranking by propagating document score, (3) provide an indicator of popularity, and (4) find hubs and authorities for a given topic. Although the TREC experiments have not demonstrated the usefulness of hyperlinks for retrieval, the hypertext structure is nevertheless an essential aspect of the Web, and as such, should not be ignored. The development of abstract models of the IR task was a key factor in the improvement of search engines. However, at this time conceptual tools for modeling the hypertext retrieval task are lacking, making it difficult to compare, improve, and reason on the existing techniques. This article proposes a general model for using hyperlinks based on Probabilistic Argumentation Systems, in which each of the above-mentioned techniques can be stated. This model will make it possible to discover some inconsistencies in the mentioned techniques, and to take a higher-level and systematic approach to using hyperlinks for retrieval.

Footnote

Beitrag eines Themenheftes: Mathematical, logical, and formal methods in information retrieval

Source

Journal of the American Society for Information Science and Technology. 54(2003) no.4, S.347-355

Kocher, M.; Savoy, J.: A simple and efficient algorithm for authorship verification (2017) 0.01

0.011856608 = product of:
  0.023713216 = sum of:
    0.00972145 = weight(_text_:information in 3330) [ClassicSimilarity], result of:
      0.00972145 = score(doc=3330,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.116372846 = fieldWeight in 3330, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=3330)
    0.013991767 = product of:
      0.027983533 = sum of:
        0.027983533 = weight(_text_:technology in 3330) [ClassicSimilarity], result of:
          0.027983533 = score(doc=3330,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.19744103 = fieldWeight in 3330, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.046875 = fieldNorm(doc=3330)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Source: Journal of the Association for Information Science and Technology. 68(2017) no.1, S.259-269

Dolamic, L.; Savoy, J.: Indexing and searching strategies for the Russian language (2009) 0.01
```
0.011558321 = product of:
  0.023116643 = sum of:
    0.011456838 = weight(_text_:information in 3301) [ClassicSimilarity], result of:
      0.011456838 = score(doc=3301,freq=4.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.13714671 = fieldWeight in 3301, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3301)
    0.011659805 = product of:
      0.02331961 = sum of:
        0.02331961 = weight(_text_:technology in 3301) [ClassicSimilarity], result of:
          0.02331961 = score(doc=3301,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.16453418 = fieldWeight in 3301, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3301)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf-idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf-idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. When applying an n-gram approach, performance differences are usually lower than an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.12, S.2540-2547

Savoy, J.: Text clustering : an application with the 'State of the Union' addresses (2015) 0.01

0.0098805055 = product of:
  0.019761011 = sum of:
    0.008101207 = weight(_text_:information in 2128) [ClassicSimilarity], result of:
      0.008101207 = score(doc=2128,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.09697737 = fieldWeight in 2128, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2128)
    0.011659805 = product of:
      0.02331961 = sum of:
        0.02331961 = weight(_text_:technology in 2128) [ClassicSimilarity], result of:
          0.02331961 = score(doc=2128,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.16453418 = fieldWeight in 2128, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2128)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Source: Journal of the Association for Information Science and Technology. 66(2015) no.8, S.1645-1654

Savoy, J.: Text representation strategies : an example with the State of the union addresses (2016) 0.01

0.0098805055 = product of:
  0.019761011 = sum of:
    0.008101207 = weight(_text_:information in 3042) [ClassicSimilarity], result of:
      0.008101207 = score(doc=3042,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.09697737 = fieldWeight in 3042, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3042)
    0.011659805 = product of:
      0.02331961 = sum of:
        0.02331961 = weight(_text_:technology in 3042) [ClassicSimilarity], result of:
          0.02331961 = score(doc=3042,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.16453418 = fieldWeight in 3042, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3042)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Source: Journal of the Association for Information Science and Technology. 67(2016) no.8, S.1858-1870

Savoy, J.: Authorship of Pauline epistles revisited (2019) 0.01

0.0098805055 = product of:
  0.019761011 = sum of:
    0.008101207 = weight(_text_:information in 5386) [ClassicSimilarity], result of:
      0.008101207 = score(doc=5386,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.09697737 = fieldWeight in 5386, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5386)
    0.011659805 = product of:
      0.02331961 = sum of:
        0.02331961 = weight(_text_:technology in 5386) [ClassicSimilarity], result of:
          0.02331961 = score(doc=5386,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.16453418 = fieldWeight in 5386, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5386)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Source: Journal of the Association for Information Science and Technology. 70(2019) no.10, S.1089-1097

Ikae, C.; Savoy, J.: Gender identification on Twitter (2022) 0.01

0.0098805055 = product of:
  0.019761011 = sum of:
    0.008101207 = weight(_text_:information in 445) [ClassicSimilarity], result of:
      0.008101207 = score(doc=445,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.09697737 = fieldWeight in 445, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=445)
    0.011659805 = product of:
      0.02331961 = sum of:
        0.02331961 = weight(_text_:technology in 445) [ClassicSimilarity], result of:
          0.02331961 = score(doc=445,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.16453418 = fieldWeight in 445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.0390625 = fieldNorm(doc=445)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Source: Journal of the Association for Information Science and Technology. 73(2022) no.1, S.58-69

Savoy, J.; Ndarugendamwo, M.; Vrajitoru, D.: Report on the TREC-4 experiment : combining probabilistic and vector-space schemes (1996) 0.01

0.0069958833 = product of:
  0.027983533 = sum of:
    0.027983533 = product of:
      0.055967066 = sum of:
        0.055967066 = weight(_text_:technology in 7574) [ClassicSimilarity], result of:
          0.055967066 = score(doc=7574,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.39488205 = fieldWeight in 7574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.09375 = fieldNorm(doc=7574)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
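
The `coord(m/n)` lines scale a Boolean query's score by the fraction of its clauses that matched. In the tree above, the single `technology` weight is first halved by the inner `coord(1/2)` and then quartered by the outer `coord(1/4)`. A minimal sketch of that combination, using a hypothetical helper `boolean_score` and the weight copied from the output for doc 7574:

```python
def boolean_score(clause_scores: list[float], total_clauses: int) -> float:
    """Sum the scores of the matching clauses, then scale by coord = matched / total."""
    matched = len(clause_scores)
    return sum(clause_scores) * (matched / total_clauses)


# "technology" weight for doc 7574, copied from the explain tree.
technology_weight = 0.055967066

# Inner group: 1 of 2 clauses matched, so coord(1/2) applies.
inner = boolean_score([technology_weight], total_clauses=2)

# Outer query: 1 of 4 clauses matched, so coord(1/4) applies.
doc_score = boolean_score([inner], total_clauses=4)
print(f"{doc_score:.10f}")
```

This assumes the coordination factor used by older Lucene releases with ClassicSimilarity, which is what the explain output here reports. The same helper applies to the two-clause records earlier in the list by passing both clause scores to the outer call.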

Imprint: Gaithersburg, MD : National Institute of Standards and Technology

Savoy, J.; Calvé, A. le; Vrajitoru, D.: Report on the TREC5 experiment : data fusion and collection fusion (1997) 0.01

0.0069958833 = product of:
  0.027983533 = sum of:
    0.027983533 = product of:
      0.055967066 = sum of:
        0.055967066 = weight(_text_:technology in 3108) [ClassicSimilarity], result of:
          0.055967066 = score(doc=3108,freq=2.0), product of:
            0.1417311 = queryWeight, product of:
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.047586527 = queryNorm
            0.39488205 = fieldWeight in 3108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.978387 = idf(docFreq=6114, maxDocs=44218)
              0.09375 = fieldNorm(doc=3108)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Imprint: Gaithersburg, MD : National Institute of Standards and Technology

Savoy, J.: Stemming of French words based on grammatical categories (1993) 0.01

0.006480966 = product of:
  0.025923865 = sum of:
    0.025923865 = weight(_text_:information in 4650) [ClassicSimilarity], result of:
      0.025923865 = score(doc=4650,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.3103276 = fieldWeight in 4650, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=4650)
  0.25 = coord(1/4)

Source: Journal of the American Society for Information Science. 44(1993) no.1, S.1-9

Savoy, J.: Bayesian inference networks and spreading activation in hypertext systems (1992) 0.01

0.006480966 = product of:
  0.025923865 = sum of:
    0.025923865 = weight(_text_:information in 192) [ClassicSimilarity], result of:
      0.025923865 = score(doc=192,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.3103276 = fieldWeight in 192, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=192)
  0.25 = coord(1/4)

Source: Information processing and management. 28(1992), S.389-405

Savoy, J.: An extended vector-processing scheme for searching information in hypertext systems (1996) 0.01
```
0.006075905 = product of:
  0.02430362 = sum of:
    0.02430362 = weight(_text_:information in 4036) [ClassicSimilarity], result of:
      0.02430362 = score(doc=4036,freq=18.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.2909321 = fieldWeight in 4036, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4036)
  0.25 = coord(1/4)
```
Abstract

When searching information in a hypertext is limited to navigation, it is not an easy task, especially when the number of nodes and/or links becomes very large. A query based access mechanism must therefore be provided to complement the navigational tools inherent in hypertext systems. Most mechanisms currently proposed are based on conventional information retrieval models which consider documents as independent entities, and ignore hypertext links. To promote the use of other information retrieval mechanisms adapted to hypertext systems, responds to the following questions: how can we integrate information given by hypertext links into an information retrieval scheme; are these hypertext links (and link semantics) clues to the enhancement of retrieval effectiveness; if so, how can we use them. 2 solutions are: using a default weight function based on link type or assigning the same strength to all link types; or using a specific weight for each particular link, i.e. the level of association or a similarity measure. Proposes an extended vector processing scheme which extracts additional information from hypertext links to enhance retrieval effectiveness. A hypertext based on 2 medium size collections, the CACM and the CISI collection, has been built. The hypergraph is composed of explicit links (bibliographic references), computed links based on bibliographic information, or on hypertext links established according to document representatives (nearest neighbour).

Source

Information processing and management. 32(1996) no.2, S.155-170

Savoy, J.; Picard, J.: Retrieval effectiveness on the web (2001) 0.01

0.0056708455 = product of:
  0.022683382 = sum of:
    0.022683382 = weight(_text_:information in 775) [ClassicSimilarity], result of:
      0.022683382 = score(doc=775,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.27153665 = fieldWeight in 775, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.109375 = fieldNorm(doc=775)
  0.25 = coord(1/4)

Source: Information processing and management. 37(2001) no.4, S.543-569

Savoy, J.: Searching information in legal hypertext systems (1993/94) 0.01

0.0056126816 = product of:
  0.022450726 = sum of:
    0.022450726 = weight(_text_:information in 757) [ClassicSimilarity], result of:
      0.022450726 = score(doc=757,freq=6.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.2687516 = fieldWeight in 757, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=757)
  0.25 = coord(1/4)

Abstract: Hypertext may represent a new paradigm capable of exploring legal sources within which links are established according to pertinent relationships found between statute texts and case law. However, to discover relevant information in such a network, a browsing mechanism is not enough when faced with a large volume of texts. Describes a new retrieval model where documents are represented according to both their content and relationship with other sources of information.

Savoy, J.: A stemming procedure and stopword list for general French Corpora (1999) 0.00

0.004860725 = product of:
  0.0194429 = sum of:
    0.0194429 = weight(_text_:information in 4314) [ClassicSimilarity], result of:
      0.0194429 = score(doc=4314,freq=2.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.23274569 = fieldWeight in 4314, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=4314)
  0.25 = coord(1/4)

Source: Journal of the American Society for Information Science. 50(1999) no.10, S.944-954

Savoy, J.; Desbois, D.: Information retrieval in hypertext systems (1991) 0.00
```
0.0045827352 = product of:
  0.018330941 = sum of:
    0.018330941 = weight(_text_:information in 4452) [ClassicSimilarity], result of:
      0.018330941 = score(doc=4452,freq=4.0), product of:
        0.083537094 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.047586527 = queryNorm
        0.21943474 = fieldWeight in 4452, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=4452)
  0.25 = coord(1/4)
```
Abstract

The emphasis in most hypertext systems is on the navigational methods, rather than on the global document retrieval mechanisms. When a search mechanism is provided, it is often restricted to simple string matching or to the Boolean model (as an alternate method). Proposes a retrieval mechanism using Bayesian inference networks. The main contribution of this approach is the automatic construction of this network using the expected mutual information measure to build the inference tree, and using Jaccard's formula to define fixed conditional probability relationships.
