Search (21 results, page 1 of 2)

  • × theme_ss:"Automatisches Abstracting"
  1. Shen, D.; Yang, Q.; Chen, Z.: Noise reduction through summarization for Web-page classification (2007) 0.05
    0.045006108 = product of:
      0.13501832 = sum of:
        0.13501832 = product of:
          0.27003664 = sum of:
            0.27003664 = weight(_text_:page in 953) [ClassicSimilarity], result of:
              0.27003664 = score(doc=953,freq=14.0), product of:
                0.27565226 = queryWeight, product of:
                  5.5854197 = idf(docFreq=450, maxDocs=44218)
                  0.049352113 = queryNorm
                0.97962785 = fieldWeight in 953, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  5.5854197 = idf(docFreq=450, maxDocs=44218)
                  0.046875 = fieldNorm(doc=953)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Due to a large variety of noisy information embedded in Web pages, Web-page classification is much more difficult than pure-text classification. In this paper, we propose to improve the Web-page classification performance by removing the noise through summarization techniques. We first give empirical evidence that ideal Web-page summaries generated by human editors can indeed improve the performance of Web-page classification algorithms. We then put forward a new Web-page summarization algorithm based on Web-page layout and evaluate it along with several other state-of-the-art text summarization algorithms on the LookSmart Web directory. Experimental results show that the classification algorithms (NB or SVM) augmented by any summarization approach can achieve an improvement by more than 5.0% as compared to pure-text-based classification algorithms. We further introduce an ensemble method to combine the different summarization algorithms. The ensemble summarization method achieves more than 12.0% improvement over pure-text based methods.
  2. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.04
    0.03511544 = product of:
      0.105346315 = sum of:
        0.105346315 = weight(_text_:query in 5089) [ClassicSimilarity], result of:
          0.105346315 = score(doc=5089,freq=10.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.4592845 = fieldWeight in 5089, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.03125 = fieldNorm(doc=5089)
      0.33333334 = coord(1/3)
    
    Abstract
    Sentiment analysis concerns the study of opinions expressed in a text. This paper presents the QMOS method, which employs a combination of sentiment analysis and summarization approaches. It is a lexicon-based method to query-based multi-documents summarization of opinion expressed in reviews. QMOS combines multiple sentiment dictionaries to improve word coverage limit of the individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity of a word presented by a lexicon and the word polarity in a specific context. This is due to the fact that, the polarity of a word depends on the context in which it is being used. Furthermore, the type of a sentence can also affect the performance of a sentiment analysis approach. Therefore, to tackle the aforementioned challenges, QMOS integrates multiple strategies to adjust word prior sentiment orientation while also considers the type of sentence. QMOS also employs the Semantic Sentiment Approach to determine the sentiment score of a word if it is not included in a sentiment lexicon. On the other hand, the most of the existing methods fail to distinguish the meaning of a review sentence and user's query when both of them share the similar bag-of-words; hence there is often a conflict between the extracted opinionated sentences and users' needs. However, the summarization phase of QMOS is able to avoid extracting a review sentence whose similarity with the user's query is high but whose meaning is different. The method also employs the greedy algorithm and query expansion approach to reduce redundancy and bridge the lexical gaps for similar contexts that are expressed using different wording, respectively. Our experiment shows that the QMOS method can significantly improve the performance and make QMOS comparable to other existing methods.
  3. Lam, W.; Chan, K.; Radev, D.; Saggion, H.; Teufel, S.: Context-based generic cross-lingual retrieval of documents and automated summaries (2005) 0.03
    0.03331343 = product of:
      0.09994029 = sum of:
        0.09994029 = weight(_text_:query in 1965) [ClassicSimilarity], result of:
          0.09994029 = score(doc=1965,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.43571556 = fieldWeight in 1965, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.046875 = fieldNorm(doc=1965)
      0.33333334 = coord(1/3)
    
    Abstract
    We develop a context-based generic cross-lingual retrieval model that can deal with different language pairs. Our model considers contexts in the query translation process. Contexts in the query as weIl as in the documents based an co-occurrence statistics from different granularity of passages are exploited. We also investigate cross-lingual retrieval of automatic generic summaries. We have implemented our model for two different cross-lingual settings, namely, retrieving Chinese documents from English queries as weIl as retrieving English documents from Chinese queries. Extensive experiments have been conducted an a large-scale parallel corpus enabling studies an retrieval performance for two different cross-lingual settings of full-length documents as weIl as automated summaries.
  4. Liang, S.-F.; Devlin, S.; Tait, J.: Investigating sentence weighting components for automatic summarisation (2007) 0.03
    0.03331343 = product of:
      0.09994029 = sum of:
        0.09994029 = weight(_text_:query in 899) [ClassicSimilarity], result of:
          0.09994029 = score(doc=899,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.43571556 = fieldWeight in 899, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.046875 = fieldNorm(doc=899)
      0.33333334 = coord(1/3)
    
    Abstract
    The work described here initially formed part of a triangulation exercise to establish the effectiveness of the Query Term Order algorithm. It subsequently proved to be a reliable indicator for summarising English web documents. We utilised the human summaries from the Document Understanding Conference data, and generated queries automatically for testing the QTO algorithm. Six sentence weighting schemes that made use of Query Term Frequency and QTO were constructed to produce system summaries, and this paper explains the process of combining and balancing the weighting components. The summaries produced were evaluated by the ROUGE-1 metric, and the results showed that using QTO in a weighting combination resulted in the best performance. We also found that using a combination of more weighting components always produced improved performance compared to any single weighting component.
  5. Wei, F.; Li, W.; Lu, Q.; He, Y.: Applying two-level reinforcement ranking in query-oriented multidocument summarization (2009) 0.03
    0.027761191 = product of:
      0.08328357 = sum of:
        0.08328357 = weight(_text_:query in 3120) [ClassicSimilarity], result of:
          0.08328357 = score(doc=3120,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.3630963 = fieldWeight in 3120, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3120)
      0.33333334 = coord(1/3)
    
    Abstract
    Sentence ranking is the issue of most concern in document summarization today. While traditional feature-based approaches evaluate sentence significance and rank the sentences relying on the features that are particularly designed to characterize the different aspects of the individual sentences, the newly emerging graph-based ranking algorithms (such as the PageRank-like algorithms) recursively compute sentence significance using the global information in a text graph that links sentences together. In general, the existing PageRank-like algorithms can model well the phenomena that a sentence is important if it is linked by many other important sentences. Or they are capable of modeling the mutual reinforcement among the sentences in the text graph. However, when dealing with multidocument summarization these algorithms often assemble a set of documents into one large file. The document dimension is totally ignored. In this article we present a framework to model the two-level mutual reinforcement among sentences as well as documents. Under this framework we design and develop a novel ranking algorithm such that the document reinforcement is taken into account in the process of sentence ranking. The convergence issue is examined. We also explore an interesting and important property of the proposed algorithm. When evaluated on the DUC 2005 and 2006 query-oriented multidocument summarization datasets, significant results are achieved.
  6. Ouyang, Y.; Li, W.; Li, S.; Lu, Q.: Intertopic information mining for query-based summarization (2010) 0.03
    0.027761191 = product of:
      0.08328357 = sum of:
        0.08328357 = weight(_text_:query in 3459) [ClassicSimilarity], result of:
          0.08328357 = score(doc=3459,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.3630963 = fieldWeight in 3459, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3459)
      0.33333334 = coord(1/3)
    
    Abstract
    In this article, the authors address the problem of sentence ranking in summarization. Although most existing summarization approaches are concerned with the information embodied in a particular topic (including a set of documents and an associated query) for sentence ranking, they propose a novel ranking approach that incorporates intertopic information mining. Intertopic information, in contrast to intratopic information, is able to reveal pairwise topic relationships and thus can be considered as the bridge across different topics. In this article, the intertopic information is used for transferring word importance learned from known topics to unknown topics under a learning-based summarization framework. To mine this information, the authors model the topic relationship by clustering all the words in both known and unknown topics according to various kinds of word conceptual labels, which indicate the roles of the words in the topic. Based on the mined relationships, we develop a probabilistic model using manually generated summaries provided for known topics to predict ranking scores for sentences in unknown topics. A series of experiments have been conducted on the Document Understanding Conference (DUC) 2006 data set. The evaluation results show that intertopic information is indeed effective for sentence ranking and the resultant summarization system performs comparably well to the best-performing DUC participating systems on the same data set.
  7. Xiong, S.; Ji, D.: Query-focused multi-document summarization using hypergraph-based ranking (2016) 0.03
    0.027482178 = product of:
      0.08244653 = sum of:
        0.08244653 = weight(_text_:query in 2972) [ClassicSimilarity], result of:
          0.08244653 = score(doc=2972,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.35944697 = fieldWeight in 2972, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2972)
      0.33333334 = coord(1/3)
    
  8. Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.03
    0.027200304 = product of:
      0.08160091 = sum of:
        0.08160091 = weight(_text_:query in 947) [ClassicSimilarity], result of:
          0.08160091 = score(doc=947,freq=6.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.35576028 = fieldWeight in 947, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.03125 = fieldNorm(doc=947)
      0.33333334 = coord(1/3)
    
    Abstract
    Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel integrated information retrieval system-the Query, Cluster, Summarize (QCS) system-which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of methods in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) as measured by the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence "trimming" and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
  9. Kannan, R.; Ghinea, G.; Swaminathan, S.: What do you wish to see? : A summarization system for movies based on user preferences (2015) 0.02
    0.022208955 = product of:
      0.06662686 = sum of:
        0.06662686 = weight(_text_:query in 2683) [ClassicSimilarity], result of:
          0.06662686 = score(doc=2683,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.29047704 = fieldWeight in 2683, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.03125 = fieldNorm(doc=2683)
      0.33333334 = coord(1/3)
    
    Abstract
    Video summarization aims at producing a compact version of a full-length video while preserving the significant content of the original video. Movie summarization condenses a full-length movie into a summary that still retains the most significant and interesting content of the original movie. In the past, several movie summarization systems have been proposed to generate a movie summary based on low-level video features such as color, motion, texture, etc. However, a generic summary, which is common to everyone and is produced based only on low-level video features will not satisfy every user. As users' preferences for the summary differ vastly for the same movie, there is a need for a personalized movie summarization system nowadays. To address this demand, this paper proposes a novel system to generate semantically meaningful video summaries for the same movie, which are tailored to the preferences and interests of a user. For a given movie, shots and scenes are automatically detected and their high-level features are semi-automatically annotated. Preferences over high-level movie features are explicitly collected from the user using a query interface. The user preferences are generated by means of a stored-query. Movie summaries are generated at shot level and scene level, where shots or scenes are selected for summary skim based on the similarity measured between shots and scenes, and the user's preferences. The proposed movie summarization system is evaluated subjectively using a sample of 20 subjects with eight movies in the English language. The quality of the generated summaries is assessed by informativeness, enjoyability, relevance, and acceptance metrics and Quality of Perception measures. Further, the usability of the proposed summarization system is subjectively evaluated by conducting a questionnaire survey. The experimental results on the performance of the proposed movie summarization approach show the potential of the proposed system.
  10. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.02
    0.019630127 = product of:
      0.05889038 = sum of:
        0.05889038 = weight(_text_:query in 657) [ClassicSimilarity], result of:
          0.05889038 = score(doc=657,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.25674784 = fieldWeight in 657, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=657)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this research is to develop a method for automatic construction of multi-document summaries of sets of news articles that might be retrieved by a web search engine in response to a user query. Design/methodology/approach - Based on the cross-document discourse analysis, an event-based framework is proposed for integrating and organizing information extracted from different news articles. It has a hierarchical structure in which the summarized information is presented at the top level and more detailed information given at the lower levels. A tree-view interface was implemented for displaying a multi-document summary based on the framework. A preliminary user evaluation was performed by comparing the framework-based summaries against the sentence-based summaries. Findings - In a small evaluation, all the human subjects preferred the framework-based summaries to the sentence-based summaries. It indicates that the event-based framework is an effective way to summarize a set of news articles reporting an event or a series of relevant events. Research limitations/implications - Limited to event-based news articles only, not applicable to news critiques and other kinds of news articles. A summarization system based on the event-based framework is being implemented. Practical implications - Multi-document summarization of news articles can adopt the proposed event-based framework. Originality/value - An event-based framework for summarizing sets of news articles was developed and evaluated using a tree-view interface for displaying such summaries.
  11. Ou, S.; Khoo, S.G.; Goh, D.H.: Automatic multidocument summarization of research abstracts : design and user evaluation (2007) 0.02
    0.019630127 = product of:
      0.05889038 = sum of:
        0.05889038 = weight(_text_:query in 522) [ClassicSimilarity], result of:
          0.05889038 = score(doc=522,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.25674784 = fieldWeight in 522, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=522)
      0.33333334 = coord(1/3)
    
    Abstract
    The purpose of this study was to develop a method for automatic construction of multidocument summaries of sets of research abstracts that may be retrieved by a digital library or search engine in response to a user query. Sociology dissertation abstracts were selected as the sample domain in this study. A variable-based framework was proposed for integrating and organizing research concepts and relationships as well as research methods and contextual relations extracted from different dissertation abstracts. Based on the framework, a new summarization method was developed, which parses the discourse structure of abstracts, extracts research concepts and relationships, integrates the information across different abstracts, and organizes and presents them in a Web-based interface. The focus of this article is on the user evaluation that was performed to assess the overall quality and usefulness of the summaries. Two types of variable-based summaries generated using the summarization method-with or without the use of a taxonomy-were compared against a sentence-based summary that lists only the research-objective sentences extracted from each abstract and another sentence-based summary generated using the MEAD system that extracts important sentences. The evaluation results indicate that the majority of sociological researchers (70%) and general users (64%) preferred the variable-based summaries generated with the use of the taxonomy.
  12. Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.02
    0.019630127 = product of:
      0.05889038 = sum of:
        0.05889038 = weight(_text_:query in 2693) [ClassicSimilarity], result of:
          0.05889038 = score(doc=2693,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.25674784 = fieldWeight in 2693, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2693)
      0.33333334 = coord(1/3)
    
    Abstract
    Automatic text summarization has been an active field of research for many years. Several approaches have been proposed, ranging from simple position and word-frequency methods, to learning and graph based algorithms. The advent of human-generated knowledge bases like Wikipedia offer a further possibility in text summarization - they can be used to understand the input text in terms of salient concepts from the knowledge base. In this paper, we study a novel approach that leverages Wikipedia in conjunction with graph-based ranking. Our approach is to first construct a bipartite sentence-concept graph, and then rank the input sentences using iterative updates on this graph. We consider several models for the bipartite graph, and derive convergence properties under each model. Then, we take up personalized and query-focused summarization, where the sentence ranks additionally depend on user interests and queries, respectively. Finally, we present a Wikipedia-based multi-document summarization algorithm. An important feature of the proposed algorithms is that they enable real-time incremental summarization - users can first view an initial summary, and then request additional content if interested. We evaluate the performance of our proposed summarizer using the ROUGE metric, and the results show that leveraging Wikipedia can significantly improve summary quality. We also present results from a user study, which suggests that using incremental summarization can help in better understanding news articles.
  13. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.02
    0.019630127 = product of:
      0.05889038 = sum of:
        0.05889038 = weight(_text_:query in 5400) [ClassicSimilarity], result of:
          0.05889038 = score(doc=5400,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.25674784 = fieldWeight in 5400, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
      0.33333334 = coord(1/3)
    
    Abstract
    Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.
  14. Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.01
    0.008915374 = product of:
      0.026746122 = sum of:
        0.026746122 = product of:
          0.053492244 = sum of:
            0.053492244 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
              0.053492244 = score(doc=6599,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.30952093 = fieldWeight in 6599, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6599)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    26. 2.1997 10:22:43
  15. Robin, J.; McKeown, K.: Empirically designing and evaluating a new revision-based model for summary generation (1996) 0.01
    0.008915374 = product of:
      0.026746122 = sum of:
        0.026746122 = product of:
          0.053492244 = sum of:
            0.053492244 = weight(_text_:22 in 6751) [ClassicSimilarity], result of:
              0.053492244 = score(doc=6751,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.30952093 = fieldWeight in 6751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6751)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    6. 3.1997 16:22:15
  16. Jones, P.A.; Bradbeer, P.V.G.: Discovery of optimal weights in a concept selection system (1996) 0.01
    0.008915374 = product of:
      0.026746122 = sum of:
        0.026746122 = product of:
          0.053492244 = sum of:
            0.053492244 = weight(_text_:22 in 6974) [ClassicSimilarity], result of:
              0.053492244 = score(doc=6974,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.30952093 = fieldWeight in 6974, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6974)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  17. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.01
    0.0066865305 = product of:
      0.020059591 = sum of:
        0.020059591 = product of:
          0.040119182 = sum of:
            0.040119182 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
              0.040119182 = score(doc=948,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.23214069 = fieldWeight in 948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=948)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.
  18. Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.01
    0.0055721086 = product of:
      0.016716326 = sum of:
        0.016716326 = product of:
          0.03343265 = sum of:
            0.03343265 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
              0.03343265 = score(doc=5290,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.19345059 = fieldWeight in 5290, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5290)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 7.2006 17:25:48
  19. Kim, H.H.; Kim, Y.H.: Generic speech summarization of transcribed lecture videos : using tags and their semantic relations (2016) 0.01
    0.0055721086 = product of:
      0.016716326 = sum of:
        0.016716326 = product of:
          0.03343265 = sum of:
            0.03343265 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
              0.03343265 = score(doc=2640,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.19345059 = fieldWeight in 2640, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2640)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 1.2016 12:29:41
  20. Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.01
    0.0055721086 = product of:
      0.016716326 = sum of:
        0.016716326 = product of:
          0.03343265 = sum of:
            0.03343265 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
              0.03343265 = score(doc=889,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.19345059 = fieldWeight in 889, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=889)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 1.2023 18:57:12