Search (2 results, page 1 of 1)

  • author_ss:"Cohen, J.D."
  1. Cohen, J.D.: Highlights: language- and domain-independent automatic indexing terms for abstracting (1995) 0.00
    0.0021674242 = product of:
      0.0043348484 = sum of:
        0.0043348484 = product of:
          0.008669697 = sum of:
            0.008669697 = weight(_text_:a in 1793) [ClassicSimilarity], result of:
              0.008669697 = score(doc=1793,freq=10.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.19940455 = fieldWeight in 1793, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1793)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Presents a model of drawing index terms from text. The approach uses no stop list, stemmer, or other language- or domain-specific component, allowing operation in any language or domain with only trivial modification. The method uses n-gram counts, achieving a function similar to, but more general than, a stemmer. The generated index terms, called 'highlights', are suitable for identifying the topic for perusal and selection. An extension is also described and demonstrated which selects index terms to represent a subset of documents, distinguishing them from the corpus. Presents some experimental results, showing operation in English, Spanish, German, Georgian, Russian and Japanese.
    Type
    a
  2. Cohen, J.D.: Massive query resolution for rapid selective dissemination of information (1999) 0.00
    0.002035109 = product of:
      0.004070218 = sum of:
        0.004070218 = product of:
          0.008140436 = sum of:
            0.008140436 = weight(_text_:a in 3054) [ClassicSimilarity], result of:
              0.008140436 = score(doc=3054,freq=12.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.18723148 = fieldWeight in 3054, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3054)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The tasks of choosing documents from a new collection and categorizing the choices, both on the basis of a body of standing queries, are known variously as selection and routing, selective dissemination of information (SDI), and information filtering. The combined operation of selecting and labeling documents naturally separates into two processes: feature scanning and query resolution. The first process examines a document for features and their locations; the second takes the findings from the first process, looks for satisfaction of combinations specified in the queries, and marks the document accordingly. When the body of queries is large, query resolution can become a significant factor in total processing speed. This paper outlines an efficient approach to performing query resolution on massive Boolean queries, suitable for implementation on a desktop computer. Algorithms are sketched in pseudo-code and experimental results are reported.
    Type
    a
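
The score explanations listed under each result above can be checked by hand. The following is a minimal Python sketch, not the search engine's own code: it assumes the classic TF-IDF formulas that the [ClassicSimilarity] label suggests (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function name classic_term_score and its parameters are hypothetical. The queryNorm, fieldNorm, and coord values are copied from the listings rather than derived.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute one term's contribution as laid out in the explain tree.

    query_norm, field_norm and coords are taken as given from the listing;
    only tf and idf are derived here (assumed classic TF-IDF formulas).
    """
    tf = math.sqrt(freq)                              # tf(freq=...)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # idf(docFreq=..., maxDocs=...)
    query_weight = idf * query_norm                   # queryWeight
    field_weight = tf * idf * field_norm              # fieldWeight
    score = query_weight * field_weight               # weight(_text_:a in doc)
    for coord in coords:                              # each coord(1/2) factor
        score *= coord
    return score


# Values copied from the two explain trees (docs 1793 and 3054).
score_1793 = classic_term_score(10.0, 37942, 44218, 0.037706986, 0.0546875, coords=(0.5, 0.5))
score_3054 = classic_term_score(12.0, 37942, 44218, 0.037706986, 0.046875, coords=(0.5, 0.5))

print(f"doc 1793: {score_1793:.10f}")
print(f"doc 3054: {score_3054:.10f}")
```

Under these assumptions the results agree with the listed scores up to float rounding. The sketch also shows why the second result ranks lower despite its higher term frequency (12 versus 10): its smaller fieldNorm, which in this scoring scheme typically reflects a longer field, outweighs the gain in tf.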