Document (#38727)

Author
Aker, A.
Gaizauskas, R.
Title
Generating descriptive multi-document summaries of geo-located entities using entity type models
Source
Journal of the Association for Information Science and Technology. 66(2015) no.4, pp.721-738
Year
2015
Abstract
In this article, we investigate the application of entity type models in extractive multi-document summarization using automatic caption generation for images of geo-located entities (e.g., Westminster Abbey) as an application scenario. Entity type models contain sets of patterns aiming to capture the ways geo-located entities are described in natural language. They are automatically derived from texts about geo-located entities of the same type (e.g., churches, lakes). We integrate entity type models into a multi-document summarizer and use them to address the 2 major tasks in extractive multi-document summarization: sentence scoring and summary composition. We experiment with 3 different representation methods for entity type models: signature words, n-gram language models, and dependency patterns. We evaluate the summarizer with integrated entity type models relative to (a) a summarizer using standard text-related features commonly used in text summarization and (b) the Wikipedia location descriptions. Our results show that entity type models significantly improve the quality of output summaries over that of summaries generated using standard summarization features and Wikipedia summaries. The representation of entity type models using dependency patterns is superior to the representations using signature words and n-gram language models.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23211/abstract.

Similar documents (content)

  1. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.28
    0.28391024 = sum of:
      0.28391024 = product of:
        0.88721955 = sum of:
          0.025118815 = weight(abstract_txt:representation in 563) [ClassicSimilarity], result of:
            0.025118815 = score(doc=563,freq=1.0), product of:
              0.06526887 = queryWeight, product of:
                1.1256503 = boost
                4.926098 = idf(docFreq=871, maxDocs=44218)
                0.011770626 = queryNorm
              0.3848514 = fieldWeight in 563, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.926098 = idf(docFreq=871, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
          0.03223172 = weight(abstract_txt:words in 563) [ClassicSimilarity], result of:
            0.03223172 = score(doc=563,freq=1.0), product of:
              0.07707182 = queryWeight, product of:
                1.2232022 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.011770626 = queryNorm
              0.41820365 = fieldWeight in 563, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
          0.02305476 = weight(abstract_txt:language in 563) [ClassicSimilarity], result of:
            0.02305476 = score(doc=563,freq=1.0), product of:
              0.07056307 = queryWeight, product of:
                1.4334575 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.011770626 = queryNorm
              0.32672557 = fieldWeight in 563, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
          0.047010932 = weight(abstract_txt:document in 563) [ClassicSimilarity], result of:
            0.047010932 = score(doc=563,freq=2.0), product of:
              0.09912257 = queryWeight, product of:
                1.9617864 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.011770626 = queryNorm
              0.4742707 = fieldWeight in 563, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
          0.026182637 = weight(abstract_txt:using in 563) [ClassicSimilarity], result of:
            0.026182637 = score(doc=563,freq=1.0), product of:
              0.09677339 = queryWeight, product of:
                2.3740456 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.011770626 = queryNorm
              0.27055615 = fieldWeight in 563, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
          0.21622254 = weight(abstract_txt:multi in 563) [ClassicSimilarity], result of:
            0.21622254 = score(doc=563,freq=6.0), product of:
              0.19007874 = queryWeight, product of:
                2.716641 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.011770626 = queryNorm
              1.137542 = fieldWeight in 563, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
          0.253271 = weight(abstract_txt:summaries in 563) [ClassicSimilarity], result of:
            0.253271 = score(doc=563,freq=3.0), product of:
              0.26611328 = queryWeight, product of:
                3.2143912 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.011770626 = queryNorm
              0.95174134 = fieldWeight in 563, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
          0.26412717 = weight(abstract_txt:summarization in 563) [ClassicSimilarity], result of:
            0.26412717 = score(doc=563,freq=3.0), product of:
              0.27366439 = queryWeight, product of:
                3.2596772 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.011770626 = queryNorm
              0.96514994 = fieldWeight in 563, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.078125 = fieldNorm(doc=563)
        0.32 = coord(8/25)
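
The score tree above follows the arithmetic of Lucene's ClassicSimilarity (TF-IDF). As a minimal sketch, assuming the standard ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), the weight of the term "representation" in document 563 can be recomputed from the values listed in the tree. The function names are illustrative only and are not part of the Lucene API.

```python
import math


def tf(freq: float) -> float:
    # ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain output.
    idf_value = idf(doc_freq, max_docs)
    query_weight = boost * idf_value * query_norm
    field_weight = tf(freq) * idf_value * field_norm
    return query_weight * field_weight


# Inputs copied from the "representation in 563" branch of the tree.
score = term_score(
    freq=1.0,
    doc_freq=871,
    max_docs=44218,
    boost=1.1256503,
    query_norm=0.011770626,
    field_norm=0.078125,
)
print(score)
```

The result should agree with the listed 0.025118815 up to floating-point rounding, since Lucene computes these values in single precision.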
    
  2. Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.22
    0.22295387 = sum of:
      0.22295387 = product of:
        0.7962638 = sum of:
          0.08277547 = weight(abstract_txt:wikipedia in 2693) [ClassicSimilarity], result of:
            0.08277547 = score(doc=2693,freq=4.0), product of:
              0.105656065 = queryWeight, product of:
                1.4321802 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.011770626 = queryNorm
              0.7834427 = fieldWeight in 2693, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.026593398 = weight(abstract_txt:document in 2693) [ClassicSimilarity], result of:
            0.026593398 = score(doc=2693,freq=1.0), product of:
              0.09912257 = queryWeight, product of:
                1.9617864 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.011770626 = queryNorm
              0.26828802 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.036279723 = weight(abstract_txt:using in 2693) [ClassicSimilarity], result of:
            0.036279723 = score(doc=2693,freq=3.0), product of:
              0.09677339 = queryWeight, product of:
                2.3740456 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.011770626 = queryNorm
              0.37489358 = fieldWeight in 2693, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.07061799 = weight(abstract_txt:multi in 2693) [ClassicSimilarity], result of:
            0.07061799 = score(doc=2693,freq=1.0), product of:
              0.19007874 = queryWeight, product of:
                2.716641 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.011770626 = queryNorm
              0.37151965 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.19711985 = weight(abstract_txt:summarizer in 2693) [ClassicSimilarity], result of:
            0.19711985 = score(doc=2693,freq=1.0), product of:
              0.34237126 = queryWeight, product of:
                3.15751 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.011770626 = queryNorm
              0.5757488 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.2988258 = weight(abstract_txt:summarization in 2693) [ClassicSimilarity], result of:
            0.2988258 = score(doc=2693,freq=6.0), product of:
              0.27366439 = queryWeight, product of:
                3.2596772 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.011770626 = queryNorm
              1.0919425 = fieldWeight in 2693, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
          0.084051594 = weight(abstract_txt:models in 2693) [ClassicSimilarity], result of:
            0.084051594 = score(doc=2693,freq=1.0), product of:
              0.28973478 = queryWeight, product of:
                5.3031726 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.011770626 = queryNorm
              0.2900984 = fieldWeight in 2693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=2693)
        0.28 = coord(7/25)
    
  3. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.19
    0.19404943 = sum of:
      0.19404943 = product of:
        0.6930337 = sum of:
          0.025785375 = weight(abstract_txt:words in 948) [ClassicSimilarity], result of:
            0.025785375 = score(doc=948,freq=1.0), product of:
              0.07707182 = queryWeight, product of:
                1.2232022 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.011770626 = queryNorm
              0.33456293 = fieldWeight in 948, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.026593398 = weight(abstract_txt:document in 948) [ClassicSimilarity], result of:
            0.026593398 = score(doc=948,freq=1.0), product of:
              0.09912257 = queryWeight, product of:
                1.9617864 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.011770626 = queryNorm
              0.26828802 = fieldWeight in 948, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.13517228 = weight(abstract_txt:extractive in 948) [ClassicSimilarity], result of:
            0.13517228 = score(doc=948,freq=1.0), product of:
              0.23257966 = queryWeight, product of:
                2.1248894 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.011770626 = queryNorm
              0.581187 = fieldWeight in 948, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.020946108 = weight(abstract_txt:using in 948) [ClassicSimilarity], result of:
            0.020946108 = score(doc=948,freq=1.0), product of:
              0.09677339 = queryWeight, product of:
                2.3740456 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.011770626 = queryNorm
              0.21644491 = fieldWeight in 948, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.07061799 = weight(abstract_txt:multi in 948) [ClassicSimilarity], result of:
            0.07061799 = score(doc=948,freq=1.0), product of:
              0.19007874 = queryWeight, product of:
                2.716641 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.011770626 = queryNorm
              0.37151965 = fieldWeight in 948, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.20261681 = weight(abstract_txt:summaries in 948) [ClassicSimilarity], result of:
            0.20261681 = score(doc=948,freq=3.0), product of:
              0.26611328 = queryWeight, product of:
                3.2143912 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.011770626 = queryNorm
              0.7613931 = fieldWeight in 948, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
          0.21130173 = weight(abstract_txt:summarization in 948) [ClassicSimilarity], result of:
            0.21130173 = score(doc=948,freq=3.0), product of:
              0.27366439 = queryWeight, product of:
                3.2596772 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.011770626 = queryNorm
              0.77211994 = fieldWeight in 948, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.0625 = fieldNorm(doc=948)
        0.28 = coord(7/25)
    
  4. Kar, M.; Nunes, S.; Ribeiro, C.: Summarization of changes in dynamic text collections using Latent Dirichlet Allocation model (2015) 0.19
    0.19124256 = sum of:
      0.19124256 = product of:
        0.597633 = sum of:
          0.013301697 = weight(abstract_txt:standard in 2676) [ClassicSimilarity], result of:
            0.013301697 = score(doc=2676,freq=1.0), product of:
              0.06005426 = queryWeight, product of:
                1.0797479 = boost
                4.725219 = idf(docFreq=1065, maxDocs=44218)
                0.011770626 = queryNorm
              0.22149463 = fieldWeight in 2676, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.725219 = idf(docFreq=1065, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
          0.019339032 = weight(abstract_txt:words in 2676) [ClassicSimilarity], result of:
            0.019339032 = score(doc=2676,freq=1.0), product of:
              0.07707182 = queryWeight, product of:
                1.2232022 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.011770626 = queryNorm
              0.2509222 = fieldWeight in 2676, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
          0.0310408 = weight(abstract_txt:wikipedia in 2676) [ClassicSimilarity], result of:
            0.0310408 = score(doc=2676,freq=1.0), product of:
              0.105656065 = queryWeight, product of:
                1.4321802 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.011770626 = queryNorm
              0.293791 = fieldWeight in 2676, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
          0.044598486 = weight(abstract_txt:document in 2676) [ClassicSimilarity], result of:
            0.044598486 = score(doc=2676,freq=5.0), product of:
              0.09912257 = queryWeight, product of:
                1.9617864 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.011770626 = queryNorm
              0.4499327 = fieldWeight in 2676, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
          0.10137921 = weight(abstract_txt:extractive in 2676) [ClassicSimilarity], result of:
            0.10137921 = score(doc=2676,freq=1.0), product of:
              0.23257966 = queryWeight, product of:
                2.1248894 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.011770626 = queryNorm
              0.43589026 = fieldWeight in 2676, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
          0.03141916 = weight(abstract_txt:using in 2676) [ClassicSimilarity], result of:
            0.03141916 = score(doc=2676,freq=4.0), product of:
              0.09677339 = queryWeight, product of:
                2.3740456 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.011770626 = queryNorm
              0.32466736 = fieldWeight in 2676, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
          0.15196261 = weight(abstract_txt:summaries in 2676) [ClassicSimilarity], result of:
            0.15196261 = score(doc=2676,freq=3.0), product of:
              0.26611328 = queryWeight, product of:
                3.2143912 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.011770626 = queryNorm
              0.5710448 = fieldWeight in 2676, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
          0.20459203 = weight(abstract_txt:summarization in 2676) [ClassicSimilarity], result of:
            0.20459203 = score(doc=2676,freq=5.0), product of:
              0.27366439 = queryWeight, product of:
                3.2596772 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.011770626 = queryNorm
              0.747602 = fieldWeight in 2676, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.046875 = fieldNorm(doc=2676)
        0.32 = coord(8/25)
    
  5. Liu, X.; Zheng, W.; Fang, H.: An exploration of ranking models and feedback method for related entity finding (2013) 0.18
    0.17963699 = sum of:
      0.17963699 = product of:
        0.7484875 = sum of:
          0.017735595 = weight(abstract_txt:standard in 2714) [ClassicSimilarity], result of:
            0.017735595 = score(doc=2714,freq=1.0), product of:
              0.06005426 = queryWeight, product of:
                1.0797479 = boost
                4.725219 = idf(docFreq=1065, maxDocs=44218)
                0.011770626 = queryNorm
              0.29532617 = fieldWeight in 2714, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.725219 = idf(docFreq=1065, maxDocs=44218)
                0.0625 = fieldNorm(doc=2714)
          0.026593398 = weight(abstract_txt:document in 2714) [ClassicSimilarity], result of:
            0.026593398 = score(doc=2714,freq=1.0), product of:
              0.09912257 = queryWeight, product of:
                1.9617864 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.011770626 = queryNorm
              0.26828802 = fieldWeight in 2714, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2714)
          0.17655993 = weight(abstract_txt:entities in 2714) [ClassicSimilarity], result of:
            0.17655993 = score(doc=2714,freq=7.0), product of:
              0.18304256 = queryWeight, product of:
                2.6658854 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.011770626 = queryNorm
              0.96458405 = fieldWeight in 2714, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.0625 = fieldNorm(doc=2714)
          0.09406327 = weight(abstract_txt:type in 2714) [ClassicSimilarity], result of:
            0.09406327 = score(doc=2714,freq=1.0), product of:
              0.30153024 = queryWeight, product of:
                5.1324196 = boost
                4.991248 = idf(docFreq=816, maxDocs=44218)
                0.011770626 = queryNorm
              0.311953 = fieldWeight in 2714, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.991248 = idf(docFreq=816, maxDocs=44218)
                0.0625 = fieldNorm(doc=2714)
          0.14558162 = weight(abstract_txt:models in 2714) [ClassicSimilarity], result of:
            0.14558162 = score(doc=2714,freq=3.0), product of:
              0.28973478 = queryWeight, product of:
                5.3031726 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.011770626 = queryNorm
              0.5024651 = fieldWeight in 2714, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=2714)
          0.28795367 = weight(abstract_txt:entity in 2714) [ClassicSimilarity], result of:
            0.28795367 = score(doc=2714,freq=3.0), product of:
              0.42381337 = queryWeight, product of:
                5.7367744 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.011770626 = queryNorm
              0.6794351 = fieldWeight in 2714, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0625 = fieldNorm(doc=2714)
        0.24 = coord(6/25)
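
The top-level score of an entry is the sum of its per-term weights multiplied by the coordination factor. In Lucene's ClassicSimilarity, coord(6/25) means that 6 of the 25 query terms matched the document. As a small sketch using the six term weights listed for document 2714 (the variable names are illustrative):

```python
# Per-term weights copied from the explain tree for document 2714.
term_scores = [
    0.017735595,  # standard
    0.026593398,  # document
    0.17655993,   # entities
    0.09406327,   # type
    0.14558162,   # models
    0.28795367,   # entity
]

matched_terms = 6   # query terms found in the document
query_terms = 25    # total terms in the query
coord = matched_terms / query_terms

# Final score = (sum of term weights) * coord
doc_score = sum(term_scores) * coord
print(doc_score)
```

The printed value should agree with the listed 0.17963699 up to rounding, which is then shown as 0.18 in the entry heading.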