Search (27 results, page 1 of 2)

  • author_ss:"Savoy, J."
  1. Ikae, C.; Savoy, J.: Gender identification on Twitter (2022) 0.03
    0.027124483 = product of:
      0.067811206 = sum of:
        0.00436753 = weight(_text_:e in 445) [ClassicSimilarity], result of:
          0.00436753 = score(doc=445,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.07940422 = fieldWeight in 445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0390625 = fieldNorm(doc=445)
        0.063443676 = weight(_text_:69 in 445) [ClassicSimilarity], result of:
          0.063443676 = score(doc=445,freq=2.0), product of:
            0.20963728 = queryWeight, product of:
              5.478287 = idf(docFreq=501, maxDocs=44218)
              0.03826694 = queryNorm
            0.30263546 = fieldWeight in 445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.478287 = idf(docFreq=501, maxDocs=44218)
              0.0390625 = fieldNorm(doc=445)
      0.4 = coord(2/5)
    
    Language
    e
    Source
    Journal of the Association for Information Science and Technology. 73(2022) no.1, S.58-69
  2. Savoy, J.: Text representation strategies : an example with the State of the union addresses (2016) 0.01
    0.009654467 = product of:
      0.024136167 = sum of:
        0.00436753 = weight(_text_:e in 3042) [ClassicSimilarity], result of:
          0.00436753 = score(doc=3042,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.07940422 = fieldWeight in 3042, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3042)
        0.019768637 = product of:
          0.05930591 = sum of:
            0.05930591 = weight(_text_:evolution in 3042) [ClassicSimilarity], result of:
              0.05930591 = score(doc=3042,freq=2.0), product of:
                0.2026858 = queryWeight, product of:
                  5.29663 = idf(docFreq=601, maxDocs=44218)
                  0.03826694 = queryNorm
                0.2926002 = fieldWeight in 3042, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.29663 = idf(docFreq=601, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3042)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Based on State of the Union addresses from 1790 to 2014 (225 speeches delivered by 42 presidents), this paper describes and evaluates different text representation strategies. To determine the most important words of a given text, the term frequencies (tf) or the tf-idf weighting scheme can be applied. Recently, latent Dirichlet allocation (LDA) has been proposed to define the topics included in a corpus. As another strategy, this study proposes to apply a vocabulary specificity measure (Z-score) to determine the most significantly overused word-types or short sequences of them. Our experiments show that the simple term frequency measure is not able to discriminate between specific terms associated with a document or a set of texts. Using the tf-idf or LDA approach, the selection requires some arbitrary decisions. Based on the term-specific measure (Z-score), the term selection has a clear theoretical basis. Moreover, the most significant sentences for each presidency can be determined. As another facet, we can visualize the dynamic evolution of usage of some terms associated with their specificity measures. Finally, this technique can be employed to define the most important lexical leaders introducing terms overused by the k following presidencies.
    Language
    e
  3. Savoy, J.: Estimating the probability of an authorship attribution (2016) 0.01
    0.0069316537 = product of:
      0.017329134 = sum of:
        0.00436753 = weight(_text_:e in 2937) [ClassicSimilarity], result of:
          0.00436753 = score(doc=2937,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.07940422 = fieldWeight in 2937, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2937)
        0.012961605 = product of:
          0.02592321 = sum of:
            0.02592321 = weight(_text_:22 in 2937) [ClassicSimilarity], result of:
              0.02592321 = score(doc=2937,freq=2.0), product of:
                0.1340043 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03826694 = queryNorm
                0.19345059 = fieldWeight in 2937, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2937)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Date
    7. 5.2016 21:22:27
    Language
    e
  4. Savoy, J.: Stemming of French words based on grammatical categories (1993) 0.00
    0.002795219 = product of:
      0.013976094 = sum of:
        0.013976094 = weight(_text_:e in 4650) [ClassicSimilarity], result of:
          0.013976094 = score(doc=4650,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.2540935 = fieldWeight in 4650, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.125 = fieldNorm(doc=4650)
      0.2 = coord(1/5)
    
    Language
    e
  5. Savoy, J.: Bayesian inference networks and spreading activation in hypertext systems (1992) 0.00
    0.002795219 = product of:
      0.013976094 = sum of:
        0.013976094 = weight(_text_:e in 192) [ClassicSimilarity], result of:
          0.013976094 = score(doc=192,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.2540935 = fieldWeight in 192, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.125 = fieldNorm(doc=192)
      0.2 = coord(1/5)
    
    Language
    e
  6. Savoy, J.; Picard, J.: Retrieval effectiveness on the web (2001) 0.00
    0.0024458165 = product of:
      0.012229082 = sum of:
        0.012229082 = weight(_text_:e in 775) [ClassicSimilarity], result of:
          0.012229082 = score(doc=775,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.2223318 = fieldWeight in 775, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.109375 = fieldNorm(doc=775)
      0.2 = coord(1/5)
    
    Language
    e
  7. Savoy, J.; Ndarugendamwo, M.; Vrajitoru, D.: Report on the TREC-4 experiment : combining probabilistic and vector-space schemes (1996) 0.00
    0.0020964143 = product of:
      0.010482071 = sum of:
        0.010482071 = weight(_text_:e in 7574) [ClassicSimilarity], result of:
          0.010482071 = score(doc=7574,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.19057012 = fieldWeight in 7574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.09375 = fieldNorm(doc=7574)
      0.2 = coord(1/5)
    
    Language
    e
  8. Savoy, J.; Calvé, A. le; Vrajitoru, D.: Report on the TREC5 experiment : data fusion and collection fusion (1997) 0.00
    0.0020964143 = product of:
      0.010482071 = sum of:
        0.010482071 = weight(_text_:e in 3108) [ClassicSimilarity], result of:
          0.010482071 = score(doc=3108,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.19057012 = fieldWeight in 3108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.09375 = fieldNorm(doc=3108)
      0.2 = coord(1/5)
    
    Language
    e
  9. Savoy, J.: ¬A stemming procedure and stopword list for general French Corpora (1999) 0.00
    0.0020964143 = product of:
      0.010482071 = sum of:
        0.010482071 = weight(_text_:e in 4314) [ClassicSimilarity], result of:
          0.010482071 = score(doc=4314,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.19057012 = fieldWeight in 4314, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.09375 = fieldNorm(doc=4314)
      0.2 = coord(1/5)
    
    Language
    e
  10. Savoy, J.; Desbois, D.: Information retrieval in hypertext systems (1991) 0.00
    0.0013976095 = product of:
      0.006988047 = sum of:
        0.006988047 = weight(_text_:e in 4452) [ClassicSimilarity], result of:
          0.006988047 = score(doc=4452,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.12704675 = fieldWeight in 4452, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0625 = fieldNorm(doc=4452)
      0.2 = coord(1/5)
    
    Language
    e
  11. Savoy, J.: Effectiveness of information retrieval systems used in a hypertext environment (1993) 0.00
    0.0013976095 = product of:
      0.006988047 = sum of:
        0.006988047 = weight(_text_:e in 6511) [ClassicSimilarity], result of:
          0.006988047 = score(doc=6511,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.12704675 = fieldWeight in 6511, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0625 = fieldNorm(doc=6511)
      0.2 = coord(1/5)
    
    Language
    e
  12. Savoy, J.: ¬A learning scheme for information retrieval in hypertext (1994) 0.00
    0.0013976095 = product of:
      0.006988047 = sum of:
        0.006988047 = weight(_text_:e in 7292) [ClassicSimilarity], result of:
          0.006988047 = score(doc=7292,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.12704675 = fieldWeight in 7292, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0625 = fieldNorm(doc=7292)
      0.2 = coord(1/5)
    
    Language
    e
  13. Savoy, J.: Searching information in legal hypertext systems (1993/94) 0.00
    0.0013976095 = product of:
      0.006988047 = sum of:
        0.006988047 = weight(_text_:e in 757) [ClassicSimilarity], result of:
          0.006988047 = score(doc=757,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.12704675 = fieldWeight in 757, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0625 = fieldNorm(doc=757)
      0.2 = coord(1/5)
    
    Language
    e
  14. Savoy, J.: ¬A new probabilistic scheme for information retrieval in hypertext (1995) 0.00
    0.0012229083 = product of:
      0.006114541 = sum of:
        0.006114541 = weight(_text_:e in 7254) [ClassicSimilarity], result of:
          0.006114541 = score(doc=7254,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.1111659 = fieldWeight in 7254, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7254)
      0.2 = coord(1/5)
    
    Language
    e
  15. Savoy, J.: Bibliographic database access using free-text and controlled vocabulary : an evaluation (2005) 0.00
    0.0012229083 = product of:
      0.006114541 = sum of:
        0.006114541 = weight(_text_:e in 1053) [ClassicSimilarity], result of:
          0.006114541 = score(doc=1053,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.1111659 = fieldWeight in 1053, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1053)
      0.2 = coord(1/5)
    
    Language
    e
  16. Dolamic, L.; Savoy, J.: When stopword lists make the difference (2009) 0.00
    0.0012229083 = product of:
      0.006114541 = sum of:
        0.006114541 = weight(_text_:e in 3319) [ClassicSimilarity], result of:
          0.006114541 = score(doc=3319,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.1111659 = fieldWeight in 3319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3319)
      0.2 = coord(1/5)
    
    Language
    e
  17. Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.00
    0.0010482072 = product of:
      0.0052410355 = sum of:
        0.0052410355 = weight(_text_:e in 393) [ClassicSimilarity], result of:
          0.0052410355 = score(doc=393,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.09528506 = fieldWeight in 393, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.046875 = fieldNorm(doc=393)
      0.2 = coord(1/5)
    
    Language
    e
  18. Savoy, J.: Searching strategies for the Hungarian language (2008) 0.00
    0.0010482072 = product of:
      0.0052410355 = sum of:
        0.0052410355 = weight(_text_:e in 2037) [ClassicSimilarity], result of:
          0.0052410355 = score(doc=2037,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.09528506 = fieldWeight in 2037, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.046875 = fieldNorm(doc=2037)
      0.2 = coord(1/5)
    
    Language
    e
  19. Abdou, S.; Savoy, J.: Searching in Medline : query expansion and manual indexing evaluation (2008) 0.00
    0.0010482072 = product of:
      0.0052410355 = sum of:
        0.0052410355 = weight(_text_:e in 2062) [ClassicSimilarity], result of:
          0.0052410355 = score(doc=2062,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.09528506 = fieldWeight in 2062, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.046875 = fieldNorm(doc=2062)
      0.2 = coord(1/5)
    
    Language
    e
  20. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.00
    0.0010482072 = product of:
      0.0052410355 = sum of:
        0.0052410355 = weight(_text_:e in 2950) [ClassicSimilarity], result of:
          0.0052410355 = score(doc=2950,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.09528506 = fieldWeight in 2950, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.046875 = fieldNorm(doc=2950)
      0.2 = coord(1/5)
    
    Language
    e
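
The score trees above all follow the same arithmetic, which can be sketched in a few lines of Python. This is only an illustrative sketch, not the catalog's own code: it assumes the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), takes queryNorm, fieldNorm, and the coord fractions directly from the listing, and uses hypothetical helper names (`idf`, `term_score`). Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

# Values copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.03826694


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # e.g. sqrt(2.0) = 1.4142135
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Result 1 (doc 445): two of five query clauses match, so coord = 2/5.
e_445 = term_score(freq=2.0, doc_freq=28552, field_norm=0.0390625)
t69_445 = term_score(freq=2.0, doc_freq=501, field_norm=0.0390625)
score_445 = (e_445 + t69_445) * (2 / 5)

# Result 4 (doc 4650): only the term "e" matches, so coord = 1/5.
score_4650 = term_score(freq=2.0, doc_freq=28552, field_norm=0.125) * (1 / 5)

print(f"doc 445:  {score_445:.9f}")
print(f"doc 4650: {score_4650:.9f}")
```

The two printed values should agree with the top-level scores shown for results 1 and 4 to within float rounding. Because every lower-ranked hit matches only the very common term "e", those hits differ from one another solely through fieldNorm.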