Search (3719 results, page 1 of 186)

  1. Alexander, M.: Digitising books, manuscripts and scholarly materials : preparation, handling, scanning, recognition, compression, storage formats (1998) 0.19
    0.19162384 = product of:
      0.38324767 = sum of:
        0.38324767 = sum of:
          0.32980162 = weight(_text_:compression in 3686) [ClassicSimilarity], result of:
            0.32980162 = score(doc=3686,freq=4.0), product of:
              0.36069217 = queryWeight, product of:
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.049309507 = queryNorm
              0.9143576 = fieldWeight in 3686, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.0625 = fieldNorm(doc=3686)
          0.053446062 = weight(_text_:22 in 3686) [ClassicSimilarity], result of:
            0.053446062 = score(doc=3686,freq=2.0), product of:
              0.1726735 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049309507 = queryNorm
              0.30952093 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=3686)
      0.5 = coord(1/2)
    
    Abstract
    The British Library's Initiatives for Access programme (1993-) aims to identify the impact and value of digital and networking technologies on the Library's collections and services. Describes the projects: the Electronic Beowulf, digitisation of ageing microfilm, digital photographic images, and use of the Excalibur retrieval software. Examines the ways in which the issues of preparation, scanning, and storage have been tackled, and problems raised by use of recognition technologies and compression.
    Date
    22. 5.1999 19:00:52
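    The score breakdowns attached to each hit are Lucene ClassicSimilarity (TF-IDF) explanations: tf(freq) = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the hit score is the coord-scaled sum of the per-term products. A short Python sketch (variable names are ours, not Lucene's) reproduces the numbers for the first hit:

      import math

      def idf(doc_freq, max_docs):
          # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          tf = math.sqrt(freq)                 # tf(freq=4.0) = 2.0
          i = idf(doc_freq, max_docs)          # 7.314861 for "compression"
          query_weight = i * query_norm        # 0.36069217
          field_weight = tf * i * field_norm   # 0.9143576
          return query_weight * field_weight   # 0.32980162

      compression = term_score(4.0, 79, 44218, 0.049309507, 0.0625)
      term_22 = term_score(2.0, 3622, 44218, 0.049309507, 0.0625)
      print((compression + term_22) * 0.5)     # coord(1/2) -> 0.19162384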
  2. Wolff, J.G.: Computing, cognition and information compression (1993) 0.15
    0.1457531 = product of:
      0.2915062 = sum of:
        0.2915062 = product of:
          0.5830124 = sum of:
            0.5830124 = weight(_text_:compression in 6712) [ClassicSimilarity], result of:
              0.5830124 = score(doc=6712,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.6163712 = fieldWeight in 6712, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6712)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The storage and processing of information in computers and in brains may often be understood as information compression. Reviews what is meant by information and, in particular, what is meant by redundancy, a concept fundamental to all methods of information compression. Describes principles of information compression.
  3. Dimitrova, N.; Golshani, F.: Motion recovery for video content classification (1995) 0.14
    0.14332551 = product of:
      0.28665102 = sum of:
        0.28665102 = sum of:
          0.23320496 = weight(_text_:compression in 3834) [ClassicSimilarity], result of:
            0.23320496 = score(doc=3834,freq=2.0), product of:
              0.36069217 = queryWeight, product of:
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.049309507 = queryNorm
              0.64654845 = fieldWeight in 3834, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.0625 = fieldNorm(doc=3834)
          0.053446062 = weight(_text_:22 in 3834) [ClassicSimilarity], result of:
            0.053446062 = score(doc=3834,freq=2.0), product of:
              0.1726735 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049309507 = queryNorm
              0.30952093 = fieldWeight in 3834, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=3834)
      0.5 = coord(1/2)
    
    Abstract
    Discusses the analysis of video for the classification of images in order to develop a video database. Covers compression; motion recovery in digital video; low-level motion extraction; single macroblock tracing; intermediate-level motion analysis; high-level motion analysis; spatiotemporal hierarchical representation; information filtering and digital video; content filtering operators; the query language; querying video contents; an architecture for video classification and retrieval; the visual query language VEVA; and implementation of macroblock tracing.
    Date
    8. 4.1996 9:22:36
  4. Guenette, D.R.: Document imaging, CD-ROM, and CD-R : a starting point (1996) 0.14
    0.14332551 = product of:
      0.28665102 = sum of:
        0.28665102 = sum of:
          0.23320496 = weight(_text_:compression in 4986) [ClassicSimilarity], result of:
            0.23320496 = score(doc=4986,freq=2.0), product of:
              0.36069217 = queryWeight, product of:
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.049309507 = queryNorm
              0.64654845 = fieldWeight in 4986, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.0625 = fieldNorm(doc=4986)
          0.053446062 = weight(_text_:22 in 4986) [ClassicSimilarity], result of:
            0.053446062 = score(doc=4986,freq=2.0), product of:
              0.1726735 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049309507 = queryNorm
              0.30952093 = fieldWeight in 4986, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=4986)
      0.5 = coord(1/2)
    
    Abstract
    An introduction to technical solutions for the generation and conversion of digital documents. The availability of affordable scanner devices, document imaging systems and OCR technologies, together with cheap, networkable, high-capacity storage media such as CD-ROM and CD-R, signals the arrival of CD-ROM based document imaging systems. Describes the processes involved, including: the document imaging process; use of scanners to make bitmaps; data compression; advantages of indexing the images; OCR techniques; and document display. Lists some of the companies providing products and services applicable to CD-ROM and CD-R based document imaging systems.
    Date
    6. 9.1996 19:08:22
  5. Cannane, A.; Williams, H.E.: General-purpose compression for efficient retrieval (2001) 0.14
    0.13827354 = product of:
      0.27654707 = sum of:
        0.27654707 = product of:
          0.55309415 = sum of:
            0.55309415 = weight(_text_:compression in 5705) [ClassicSimilarity], result of:
              0.55309415 = score(doc=5705,freq=20.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.5334243 = fieldWeight in 5705, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5705)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Compression of databases not only reduces space requirements but can also reduce overall retrieval times. In text databases, compression of documents based on semistatic modeling with words has been shown to be both practical and fast. Similarly, for specific applications, such as databases of integers or scientific databases, specially designed semistatic compression schemes work well. We propose a scheme for general-purpose compression that can be applied to all types of data stored in large collections. We describe our approach, which we call RAY, in detail, and show experimentally the compression available, compression and decompression costs, and performance as a stream and random-access technique. We show that, in many cases, RAY achieves better compression than an efficient Huffman scheme and popular adaptive compression techniques, and that it can be used as an efficient general-purpose compression scheme.
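    As an illustration of the semistatic word-based modeling mentioned above (a minimal sketch of that general technique, not of RAY itself): count word frequencies in a first pass, fix a Huffman code, then encode the whole text with it in a second pass.

      import heapq
      from collections import Counter

      def huffman_codes(freqs):
          # Semistatic model: the code is fixed after one counting pass,
          # so any part of the text can later be decoded with that model.
          heap = [[f, [sym, ""]] for sym, f in sorted(freqs.items())]
          heapq.heapify(heap)
          while len(heap) > 1:
              lo, hi = heapq.heappop(heap), heapq.heappop(heap)
              for pair in lo[1:]:
                  pair[1] = "0" + pair[1]
              for pair in hi[1:]:
                  pair[1] = "1" + pair[1]
              heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
          return dict(map(tuple, heap[0][1:]))

      words = "to be or not to be that is the question".split()
      code = huffman_codes(Counter(words))     # frequent words get short codes
      bits = "".join(code[w] for w in words)
      print(len(bits), "bits for", len(words), "words")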
  6. Huang, T.; Mehrotra, S.; Ramchandran, K.: Multimedia Access and Retrieval System (MARS) project (1997) 0.13
    0.12540983 = product of:
      0.25081965 = sum of:
        0.25081965 = sum of:
          0.20405436 = weight(_text_:compression in 758) [ClassicSimilarity], result of:
            0.20405436 = score(doc=758,freq=2.0), product of:
              0.36069217 = queryWeight, product of:
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.049309507 = queryNorm
              0.5657299 = fieldWeight in 758, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.314861 = idf(docFreq=79, maxDocs=44218)
                0.0546875 = fieldNorm(doc=758)
          0.0467653 = weight(_text_:22 in 758) [ClassicSimilarity], result of:
            0.0467653 = score(doc=758,freq=2.0), product of:
              0.1726735 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049309507 = queryNorm
              0.2708308 = fieldWeight in 758, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=758)
      0.5 = coord(1/2)
    
    Abstract
    Reports results of the MARS project, conducted at Illinois University, to bring together researchers in the fields of computer vision, compression, information management and database systems with the goal of developing an effective multimedia database management system. Describes the first step, involving the design and implementation of an image retrieval system incorporating novel approaches to image segmentation, representation, browsing and information retrieval supported by the developed system. Points to future directions for the MARS project.
    Date
    22. 9.1997 19:16:05
  7. Brandt, R.: Video compression : the why and the how (1993) 0.12
    0.12367561 = product of:
      0.24735121 = sum of:
        0.24735121 = product of:
          0.49470243 = sum of:
            0.49470243 = weight(_text_:compression in 4286) [ClassicSimilarity], result of:
              0.49470243 = score(doc=4286,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.3715364 = fieldWeight in 4286, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4286)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes the technology of video compression as the key to the practical application of CD-ROM storage to multimedia CD-ROM databases.
  8. Gates, R.; Bang, S.: Compression and archiving (1993) 0.12
    0.11660248 = product of:
      0.23320496 = sum of:
        0.23320496 = product of:
          0.46640992 = sum of:
            0.46640992 = weight(_text_:compression in 4428) [ClassicSimilarity], result of:
              0.46640992 = score(doc=4428,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.2930969 = fieldWeight in 4428, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.125 = fieldNorm(doc=4428)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  9. Pennebaker, W.B.; Mitchell, J.L.: JPEG still image data compression standard (1993) 0.12
    0.11660248 = product of:
      0.23320496 = sum of:
        0.23320496 = product of:
          0.46640992 = sum of:
            0.46640992 = weight(_text_:compression in 5251) [ClassicSimilarity], result of:
              0.46640992 = score(doc=5251,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.2930969 = fieldWeight in 5251, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.125 = fieldNorm(doc=5251)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  10. Jain, A.K.: Image data compression : a review (1981) 0.12
    0.11660248 = product of:
      0.23320496 = sum of:
        0.23320496 = product of:
          0.46640992 = sum of:
            0.46640992 = weight(_text_:compression in 8696) [ClassicSimilarity], result of:
              0.46640992 = score(doc=8696,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.2930969 = fieldWeight in 8696, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.125 = fieldNorm(doc=8696)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  11. Bookstein, A.; Klein, S.T.: Compression, information theory, and grammars : a unified approach (1990) 0.12
    0.11660248 = product of:
      0.23320496 = sum of:
        0.23320496 = product of:
          0.46640992 = sum of:
            0.46640992 = weight(_text_:compression in 2970) [ClassicSimilarity], result of:
              0.46640992 = score(doc=2970,freq=2.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.2930969 = fieldWeight in 2970, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.125 = fieldNorm(doc=2970)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  12. Cheng, K.-S.; Young, G.H.; Wong, K.-F.: A study on word-based and integral-bit Chinese text compression algorithms (1999) 0.12
    0.11660248 = product of:
      0.23320496 = sum of:
        0.23320496 = product of:
          0.46640992 = sum of:
            0.46640992 = weight(_text_:compression in 3056) [ClassicSimilarity], result of:
              0.46640992 = score(doc=3056,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.2930969 = fieldWeight in 3056, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3056)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Experimental results show that a word-based arithmetic coding scheme can achieve higher compression performance for Chinese text. However, arithmetic coding is a fractional-bit compression algorithm which is known to be time-consuming. In this article, we instead study how to cascade the word segmentation model with a faster alternative, an integral-bit compression algorithm. It is shown that the cascaded algorithm is more suitable for practical usage.
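    The first stage of the cascade, word segmentation, can be sketched as greedy longest-match against a lexicon; the resulting word stream would then feed an integral-bit coder such as the word-based Huffman sketch above. The mini-lexicon below is hypothetical.

      LEXICON = {"信息", "压缩", "算法"}   # hypothetical toy lexicon

      def segment(text, lexicon, max_len=4):
          # Greedy longest match; unknown single characters pass through.
          words, i = [], 0
          while i < len(text):
              for l in range(min(max_len, len(text) - i), 0, -1):
                  if text[i:i+l] in lexicon or l == 1:
                      words.append(text[i:i+l])
                      i += l
                      break
          return words

      print(segment("信息压缩算法", LEXICON))   # ['信息', '压缩', '算法']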
  13. Maguire, P.; Maguire, R.: Consciousness is data compression (2010) 0.12
    0.11660248 = product of:
      0.23320496 = sum of:
        0.23320496 = product of:
          0.46640992 = sum of:
            0.46640992 = weight(_text_:compression in 4972) [ClassicSimilarity], result of:
              0.46640992 = score(doc=4972,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.2930969 = fieldWeight in 4972, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4972)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this article we advance the conjecture that conscious awareness is equivalent to data compression. Algorithmic information theory supports the assertion that all forms of understanding are contingent on compression (Chaitin, 2007). Here, we argue that the experience people refer to as consciousness is the particular form of understanding that the brain provides. We therefore propose that the degree of consciousness of a system can be measured in terms of the amount of data compression it carries out.
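    The closing claim, that a system's degree of consciousness could be measured by the amount of data compression it carries out, suggests a toy experiment. The ideal compressor of algorithmic information theory is uncomputable, so the sketch below substitutes zlib as a crude, purely illustrative proxy; it is not the authors' proposed measure.

      import os, zlib

      def compression_ratio(data):
          # Fraction of the input removed by compression; higher means more
          # structure was found. zlib is only a weak stand-in for the ideal
          # compressor that the algorithmic-information argument appeals to.
          return 1.0 - len(zlib.compress(data, 9)) / len(data)

      print(compression_ratio(b"abab" * 250))     # near 1.0: highly structured
      print(compression_ratio(os.urandom(1000)))  # near 0.0, possibly slightly
                                                  # negative: incompressible noise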
  14. Delfino, E.: The Internet toolkit : file compression and archive utilities (1993) 0.10
    0.103063 = product of:
      0.206126 = sum of:
        0.206126 = product of:
          0.412252 = sum of:
            0.412252 = weight(_text_:compression in 6718) [ClassicSimilarity], result of:
              0.412252 = score(doc=6718,freq=4.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.142947 = fieldWeight in 6718, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6718)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    As a result of the combination of high transmission speeds and large data file sizes, many files available over the Internet come in archived and compressed form and need to be decompressed before they can be read. Discusses the techniques available for file compression and extraction and where to find these utilities on the Internet.
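    The utilities surveyed are 1993-era command-line tools, but the workflow described (fetch an archive, recognize its format, decompress and extract it) maps directly onto today's standard libraries. A sketch using Python's stdlib; the archive name in the usage line is hypothetical.

      import gzip, tarfile, zipfile
      from pathlib import Path

      def extract(path, dest="."):
          # Dispatch on the file extension, as users of the old utilities
          # had to do by hand.
          p = Path(path)
          if p.suffix == ".zip":
              with zipfile.ZipFile(p) as zf:
                  zf.extractall(dest)
          elif p.name.endswith((".tar.gz", ".tgz", ".tar")):
              with tarfile.open(p) as tf:
                  tf.extractall(dest)
          elif p.suffix == ".gz":
              Path(dest, p.stem).write_bytes(gzip.open(p).read())
          else:
              raise ValueError(f"unrecognized archive format: {p.name}")

      # extract("downloaded_file.tar.gz")   # hypothetical archive name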
  15. Bell, T.C.; Moffat, A.; Nevill-Manning, C.G.; Witten, I.H.; Zobel, J.: Data compression in full-text retrieval systems (1993) 0.10
    0.10202718 = product of:
      0.20405436 = sum of:
        0.20405436 = product of:
          0.4081087 = sum of:
            0.4081087 = weight(_text_:compression in 5643) [ClassicSimilarity], result of:
              0.4081087 = score(doc=5643,freq=8.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                1.1314598 = fieldWeight in 5643, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5643)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    When data compression is applied to full-text retrieval systems, intricate relationships emerge between the amount of compression, access speed, and computing resources required. We propose compression methods, and explore corresponding tradeoffs, for all components of static full-text systems such as text databases on CD-ROM. These components include lexical indexes and the main text itself. Results are reported on the application of the methods to several substantial full-text databases, and show that a large, unindexed text can be stored, along with indexes that facilitate fast searching, in less than half its original size, at some appreciable cost in primary memory requirements.
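    One way to see the compression/access-speed tradeoff the abstract describes: compress each document separately, giving up cross-document redundancy in exchange for random access to any single document. A minimal zlib sketch of that tradeoff (the article's own methods, built on semistatic word models, are more refined):

      import zlib

      class CompressedStore:
          # Per-document compression: any one document can be fetched and
          # decompressed alone, at some cost in overall compression ratio.
          def __init__(self):
              self.blobs = []

          def add(self, text):
              self.blobs.append(zlib.compress(text.encode(), 9))
              return len(self.blobs) - 1

          def get(self, doc_id):
              return zlib.decompress(self.blobs[doc_id]).decode()

      store = CompressedStore()
      i = store.add("data compression in full-text retrieval systems " * 20)
      print(store.get(i)[:48])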
  16. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.09835884 = sum of:
      0.07831657 = product of:
        0.23494971 = sum of:
          0.23494971 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.23494971 = score(doc=562,freq=2.0), product of:
              0.41804656 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.049309507 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020042272 = product of:
        0.040084545 = sum of:
          0.040084545 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.040084545 = score(doc=562,freq=2.0), product of:
              0.1726735 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049309507 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf
    Date
    8. 1.2013 10:22:32
  17. Witten, I.H.; Bell, T.C.; Nevill, C.G.: Indexing and compressing full-text databases for CD-ROM (1991) 0.09
    0.08835813 = product of:
      0.17671625 = sum of:
        0.17671625 = product of:
          0.3534325 = sum of:
            0.3534325 = weight(_text_:compression in 4828) [ClassicSimilarity], result of:
              0.3534325 = score(doc=4828,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.97987294 = fieldWeight in 4828, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4828)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    CD-ROM is an attractive delivery vehicle for full-text databases. Given its large storage capacity but low access speed, carefully designed indexing structures, including a concordance, are necessary to enable the text to be retrieved efficiently. However, the indexes are sufficiently large that they tax the ability of the main store to hold them when processing queries. The use of compression techniques can substantially increase the volume of text that a disc can accommodate, and substantially decrease the amount of primary storage needed to hold the indexes. Describes a suitable indexing mechanism, and its compression potential using modern compression methods. It is possible to double the amount of text that can be stored on a CD-ROM disc and include a full concordance and indexes as well.
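    A standard index-compression technique from this literature stores each posting list as ascending document numbers, keeps only the gaps between successive numbers, and packs each gap into a variable number of bytes, so the small gaps of frequent terms cost a single byte each. A sketch of such a variable-byte gap code (one common choice; the article itself may use a different code):

      def vbyte_encode(doc_ids):
          # Encode gaps between sorted doc numbers; small gaps -> few bytes.
          out, prev = bytearray(), 0
          for d in doc_ids:
              gap, prev = d - prev, d
              while gap >= 128:
                  out.append(gap & 0x7F)
                  gap >>= 7
              out.append(gap | 0x80)    # high bit marks the final byte
          return bytes(out)

      def vbyte_decode(data):
          ids, n, shift, prev = [], 0, 0, 0
          for b in data:
              n |= (b & 0x7F) << shift
              shift += 7
              if b & 0x80:              # terminator byte: emit one gap
                  prev += n
                  ids.append(prev)
                  n, shift = 0, 0
          return ids

      postings = [3, 7, 8, 150, 152]
      assert vbyte_decode(vbyte_encode(postings)) == postings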
  18. Akman, K.I.: ¬A new text compression technique based on natural language structure (1995) 0.09
    0.08835813 = product of:
      0.17671625 = sum of:
        0.17671625 = product of:
          0.3534325 = sum of:
            0.3534325 = weight(_text_:compression in 1860) [ClassicSimilarity], result of:
              0.3534325 = score(doc=1860,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.97987294 = fieldWeight in 1860, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1860)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes a new data compression technique which utilizes some of the common structural characteristics of languages. The proposed algorithm partitions words into their roots and suffixes, which are then replaced by shorter bit representations. The method uses 3 dictionaries in the form of binary search trees and 1 character array. The first 2 dictionaries are for roots, and the third one is for suffixes. The character array is used both for searching compressible words and for coding incompressible words. The number of bits representing a substring depends on the number of entries in the dictionary in which the substring is found. The proposed algorithm is implemented for the Turkish language and tested using 3 different text groups of different lengths. Results indicate a compression factor of up to 47 per cent.
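    A toy sketch of the root-plus-suffix idea: split each word at the longest known root and replace both parts with dictionary indices. The dictionaries here are hypothetical stand-ins for the article's binary search trees, and the tuples stand in for its bit-level codes.

      ROOTS = ["compress", "retriev", "stor"]       # toy root dictionary
      SUFFIXES = ["", "ion", "al", "age", "ible"]   # toy suffix dictionary

      def encode_word(word):
          # Longest-root match; emit (root index, suffix index) when the
          # word splits cleanly, else flag it as incompressible.
          for root in sorted(ROOTS, key=len, reverse=True):
              if word.startswith(root) and word[len(root):] in SUFFIXES:
                  return ("R", ROOTS.index(root), SUFFIXES.index(word[len(root):]))
          return ("LIT", word)

      for w in ["compression", "storage", "retrieval", "text"]:
          print(w, "->", encode_word(w))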
  19. Zajic, D.; Dorr, B.J.; Lin, J.; Schwartz, R.: Multi-candidate reduction : sentence compression as a tool for document summarization tasks (2007) 0.09
    0.08835813 = product of:
      0.17671625 = sum of:
        0.17671625 = product of:
          0.3534325 = sum of:
            0.3534325 = weight(_text_:compression in 944) [ClassicSimilarity], result of:
              0.3534325 = score(doc=944,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.97987294 = fieldWeight in 944, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=944)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization: a "parse-and-trim" approach and a statistical noisy-channel approach. We introduce the multi-candidate reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates are then selected for inclusion in the final summary based on a combination of static and dynamic features. Evaluations demonstrate that sentence compression is a valuable component of a larger multi-document summarization framework.
  20. Zajic, D.M.; Dorr, B.J.; Lin, J.: Single-document and multi-document summarization techniques for email threads using sentence compression (2008) 0.09
    0.08835813 = product of:
      0.17671625 = sum of:
        0.17671625 = product of:
          0.3534325 = sum of:
            0.3534325 = weight(_text_:compression in 2105) [ClassicSimilarity], result of:
              0.3534325 = score(doc=2105,freq=6.0), product of:
                0.36069217 = queryWeight, product of:
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.049309507 = queryNorm
                0.97987294 = fieldWeight in 2105, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  7.314861 = idf(docFreq=79, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2105)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We present two approaches to email thread summarization: collective message summarization (CMS) applies a multi-document summarization approach, while individual message summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate these ideas on the Enron email collection - a very challenging corpus because of the highly technical language. Experimental results point to two findings: that CMS represents a better approach to email thread summarization, and that current sentence compression techniques do not improve summarization performance in this genre.

Types

  • a 3109
  • m 352
  • el 163
  • s 142
  • b 40
  • x 35
  • i 23
  • r 17
  • ? 8
  • p 4
  • d 3
  • n 3
  • u 2
  • z 2
  • au 1
  • h 1