Search (1 result, page 1 of 1)

  • author_ss:"Berlanga-Llavori, R."
  • theme_ss:"Data Mining"
  1. Pons-Porrata, A.; Berlanga-Llavori, R.; Ruiz-Shulcloper, J.: Topic discovery based on text mining techniques (2007) 0.00
    0.0039754473 = product of:
      0.02782813 = sum of:
        0.022816047 = weight(_text_:system in 916) [ClassicSimilarity], result of:
          0.022816047 = score(doc=916,freq=4.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.29527056 = fieldWeight in 916, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=916)
        0.0050120843 = weight(_text_:information in 916) [ClassicSimilarity], result of:
          0.0050120843 = score(doc=916,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.116372846 = fieldWeight in 916, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=916)
      0.14285715 = coord(2/14)
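
The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity (TF-IDF) definitions `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`; the function and constant names are illustrative, and `queryNorm`, `fieldNorm`, and the frequencies are copied from the listing rather than derived.

```python
import math


def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # Per-term score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Values taken from the listing for doc 916 (not computed here).
QUERY_NORM = 0.02453417
FIELD_NORM = 0.046875
MAX_DOCS = 44218

system_score = term_score(4.0, 5152, MAX_DOCS, QUERY_NORM, FIELD_NORM)
information_score = term_score(2.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/14): 2 of the 14 query terms matched this document.
coord = 2 / 14
total = (system_score + information_score) * coord

print(f"system:      {system_score:.9f}")
print(f"information: {information_score:.9f}")
print(f"total:       {total:.10f}")
```

Small differences in the last digits are expected, since Lucene computes these values in single precision while Python uses double precision.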
    
    Abstract
    In this paper, we present a topic discovery system that aims to reveal the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topics and subtopics, where each topic contains the set of documents related to it and a summary extracted from these documents. The summaries built in this way are useful for browsing and selecting topics of interest from the generated hierarchies. Our proposal consists of a new incremental hierarchical clustering algorithm, which combines partitional and agglomerative approaches and takes the main benefits of each. Finally, a new summarization method based on Testor Theory is proposed to build the topic summaries. Experimental results on the TDT2 collection demonstrate its usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool.
    Source
    Information processing and management. 43(2007) no.3, pp. 752-768