Search (3 results, page 1 of 1)

  • theme_ss:"Automatisches Abstracting" (i.e., automatic abstracting)
  • year_i:[2010 TO 2020} (2010 inclusive to 2020 exclusive)
  1. Cai, X.; Li, W.: Enhancing sentence-level clustering with integrated and interactive frameworks for theme-based summarization (2011) 0.01
    0.010069768 = product of:
      0.030209303 = sum of:
        0.030209303 = product of:
          0.09062791 = sum of:
            0.09062791 = weight(_text_:objects in 4770) [ClassicSimilarity], result of:
              0.09062791 = score(doc=4770,freq=2.0), product of:
                0.3086582 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.05807226 = queryNorm
                0.29361898 = fieldWeight in 4770, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4770)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
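    The breakdown above is Lucene's ClassicSimilarity explain output: a square-root term frequency, a log-based inverse document frequency, query and field normalisation, and coord factors for query-term coverage. As a sanity check, the listed 0.01 score for this record can be recomputed from the factors shown; the Python sketch below (the function name and argument list are ours, not part of the retrieval system) reproduces the leaf weight for _text_:objects and the final score:

    import math

    def classic_tfidf_score(freq, doc_freq, max_docs,
                            query_norm, field_norm, coords=()):
        # tf = sqrt(freq): 1.4142135 for freq=2.0
        tf = math.sqrt(freq)
        # idf = 1 + ln(maxDocs / (docFreq + 1)): 5.315071 here
        idf = 1.0 + math.log(max_docs / (doc_freq + 1))
        query_weight = idf * query_norm           # 0.3086582
        field_weight = tf * idf * field_norm      # 0.29361898
        score = query_weight * field_weight       # 0.09062791
        for coord in coords:                      # coord(1/3) applied twice
            score *= coord
        return score

    print(classic_tfidf_score(freq=2.0, doc_freq=590, max_docs=44218,
                              query_norm=0.05807226, field_norm=0.0390625,
                              coords=(1/3, 1/3)))  # ~0.010069768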
    
    Abstract
    Sentence clustering plays a pivotal role in theme-based summarization, which discovers topic themes, defined as clusters of highly related sentences, in order to avoid redundancy and cover more diverse information. Because sentences are short and carry limited content, the bag-of-words cosine similarity traditionally used for document clustering is no longer suitable, and measuring sentence similarity requires special treatment. In this article, we study the sentence-level clustering problem. After exploiting concept- and context-enriched sentence vector representations, we develop two co-clustering frameworks to enhance sentence-level clustering for theme-based summarization: integrated clustering and interactive clustering. Both allow words and documents to play an explicit role in sentence clustering as independent text objects, rather than using words or concepts merely as features of sentences in a document set. In each framework, we experiment with two-level co-clustering (i.e., sentence-word or sentence-document co-clustering) and three-level co-clustering (i.e., document-sentence-word co-clustering). Compared against concept- and context-oriented sentence-representation reformation, co-clustering shows a clear advantage in both intrinsic clustering-quality evaluation and extrinsic summarization evaluation conducted on the Document Understanding Conferences (DUC) datasets.
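    The two-level co-clustering the abstract describes can be illustrated with a generic sentence-word biclustering. The sketch below uses scikit-learn's SpectralCoclustering on a toy sentence-term matrix; it only illustrates the idea of clustering sentences and words jointly as independent objects, and does not reproduce the paper's integrated or interactive frameworks:

    # Minimal sketch of two-level sentence-word co-clustering.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.cluster import SpectralCoclustering

    sentences = [
        "the court ruled on the merger case",
        "judges reviewed the merger ruling",
        "the team won the championship game",
        "fans celebrated the championship win",
    ]

    # Sentence-word count matrix: rows = sentences, columns = words.
    vectorizer = CountVectorizer(stop_words="english")
    X = vectorizer.fit_transform(sentences)

    # Cluster rows (sentences) and columns (words) simultaneously
    # into two themes, rather than treating words only as features.
    model = SpectralCoclustering(n_clusters=2, random_state=0)
    model.fit(X)

    words = vectorizer.get_feature_names_out()
    for theme in range(2):
        sent_ids = [i for i, l in enumerate(model.row_labels_) if l == theme]
        word_ids = [j for j, l in enumerate(model.column_labels_) if l == theme]
        print(f"theme {theme}: sentences {sent_ids}, "
              f"words {[words[j] for j in word_ids]}")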
  2. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.01
    0.007834411 = product of:
      0.023503233 = sum of:
        0.023503233 = product of:
          0.047006465 = sum of:
            0.047006465 = weight(_text_:indexing in 5400) [ClassicSimilarity], result of:
              0.047006465 = score(doc=5400,freq=2.0), product of:
                0.22229293 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.05807226 = queryNorm
                0.21146181 = fieldWeight in 5400, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5400)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) that are most relevant to a query, which becomes harder as the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address the problem in two steps: first, we embed different types of entities into the same semantic space, where similarity can be computed easily; second, we propose a novel non-parametric method that identifies the most relevant entities beyond direct semantic similarity. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are therefore more problematic for a classifier.
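    The embed-first-then-predict idea can be sketched generically: once documents and subject labels share one vector space, subjects for a new document can be read off its nearest neighbours. The toy vectors and the similarity-weighted vote below are illustrative assumptions, not the paper's actual embedding or non-parametric method:

    import numpy as np

    def cosine_sim(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    def predict_subjects(doc_vec, train_vecs, train_subjects, k=3):
        # Rank candidate subjects by summed similarity of the k
        # nearest training documents that carry them.
        sims = np.array([cosine_sim(doc_vec, v) for v in train_vecs])
        top = sims.argsort()[::-1][:k]
        scores = {}
        for i in top:
            for subj in train_subjects[i]:
                scores[subj] = scores.get(subj, 0.0) + sims[i]
        return sorted(scores, key=scores.get, reverse=True)

    # Toy 2-D "embeddings" of three training documents and their subjects.
    train_vecs = [np.array([0.9, 0.1]), np.array([0.8, 0.3]),
                  np.array([0.1, 0.9])]
    train_subjects = [["information retrieval"],
                      ["information retrieval", "indexing"],
                      ["musicology"]]
    print(predict_subjects(np.array([0.85, 0.2]), train_vecs, train_subjects))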
  3. Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.01
    0.006556659 = product of:
      0.019669976 = sum of:
        0.019669976 = product of:
          0.039339952 = sum of:
            0.039339952 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
              0.039339952 = score(doc=2640,freq=2.0), product of:
                0.20335917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05807226 = queryNorm
                0.19345059 = fieldWeight in 2640, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2640)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 1.2016 12:29:41