Document (#22060)

Author
Liu, J.
Wu, Y.
Zhou, L.
Title
¬A hybrid method for abstracting newspaper articles
Source
Journal of the American Society for Information Science. 50(1999) no.13, p.1234-1245
Year
1999
Abstract
This paper introduces a hybrid method for abstracting Chinese text. It integrates the statistical approach with language understanding. Some linguistic heuristics and segmentation are also incorporated into the abstracting process. The prototype system is of a multipurpose type, catering for various users with different requirements. Initial responses show that the proposed method contributes much to the flexibility and accuracy of the automatic Chinese abstracting system. In practice, the present work provides a path to developing an intelligent Chinese system for automating the information
Theme
Automatic abstracting
Form
Newspapers

Similar documents (author)

  1. Zhou, L.: Characteristics of material organization and classification in the Kinsey Institute Library (2003) 4.89
    4.8910537 = sum of:
      4.8910537 = weight(author_txt:zhou in 5639) [ClassicSimilarity], result of:
        4.8910537 = fieldWeight in 5639, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.825686 = idf(docFreq=47, maxDocs=44218)
          0.625 = fieldNorm(doc=5639)
    
  2. Zhou, J.-z.: ¬A new subclass for Library of Congress Classification, QF : Computer science (1998) 3.91
    3.912843 = sum of:
      3.912843 = weight(author_txt:zhou in 2846) [ClassicSimilarity], result of:
        3.912843 = fieldWeight in 2846, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.825686 = idf(docFreq=47, maxDocs=44218)
          0.5 = fieldNorm(doc=2846)
    
  3. Zhou, L.; Zhang, D.: NLPIR: a theoretical framework for applying Natural Language Processing to information retrieval (2003) 3.91
    3.912843 = sum of:
      3.912843 = weight(author_txt:zhou in 5148) [ClassicSimilarity], result of:
        3.912843 = fieldWeight in 5148, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.825686 = idf(docFreq=47, maxDocs=44218)
          0.5 = fieldNorm(doc=5148)
    
  4. Zhou, P.; Leydesdorff, L.: ¬A comparison between the China Scientific and Technical Papers and Citations Database and the Science Citation Index in terms of journal hierarchies and interjournal citation relations (2007) 3.91
    3.912843 = sum of:
      3.912843 = weight(author_txt:zhou in 70) [ClassicSimilarity], result of:
        3.912843 = fieldWeight in 70, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.825686 = idf(docFreq=47, maxDocs=44218)
          0.5 = fieldNorm(doc=70)
    
  5. Zhou, G.D.; Zhang, M.: Extracting relation information from text documents by exploring various types of knowledge (2007) 3.91
    3.912843 = sum of:
      3.912843 = weight(author_txt:zhou in 927) [ClassicSimilarity], result of:
        3.912843 = fieldWeight in 927, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.825686 = idf(docFreq=47, maxDocs=44218)
          0.5 = fieldNorm(doc=927)
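
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical and chosen for illustration only, and fieldNorm is taken as given from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # Values copied from the listing: author_txt:zhou, docFreq=47, maxDocs=44218.
    # fieldNorm is read from the listing (0.625 for doc 5639, 0.5 for doc 2846).
    for doc_id, norm in [(5639, 0.625), (2846, 0.5)]:
        print(doc_id, round(field_weight(1.0, 47, 44218, norm), 4))
```

Since every author match here has the same tf and idf, the ranking in this list is decided by fieldNorm alone, which is why the first hit (0.625) scores higher than the others (0.5).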
    

Similar documents (content)

  1. Yang, C.C.; Li, K.W.: ¬A heuristic method based on a statistical approach for chinese text segmentation (2005) 0.24
    0.23788095 = sum of:
      0.23788095 = product of:
        0.99117064 = sum of:
          0.010706058 = weight(abstract_txt:with in 4580) [ClassicSimilarity], result of:
            0.010706058 = score(doc=4580,freq=3.0), product of:
              0.03956355 = queryWeight, product of:
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015827108 = queryNorm
              0.27060407 = fieldWeight in 4580, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.02775009 = weight(abstract_txt:automatic in 4580) [ClassicSimilarity], result of:
            0.02775009 = score(doc=4580,freq=1.0), product of:
              0.08545724 = queryWeight, product of:
                1.0392303 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.015827108 = queryNorm
              0.32472485 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.047739662 = weight(abstract_txt:statistical in 4580) [ClassicSimilarity], result of:
            0.047739662 = score(doc=4580,freq=2.0), product of:
              0.097382784 = queryWeight, product of:
                1.1093752 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.015827108 = queryNorm
              0.49022692 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.29664207 = weight(abstract_txt:segmentation in 4580) [ClassicSimilarity], result of:
            0.29664207 = score(doc=4580,freq=9.0), product of:
              0.1993641 = queryWeight, product of:
                1.5873066 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.015827108 = queryNorm
              1.4879413 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.16237266 = weight(abstract_txt:method in 4580) [ClassicSimilarity], result of:
            0.16237266 = score(doc=4580,freq=9.0), product of:
              0.19240107 = queryWeight, product of:
                2.7008579 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.015827108 = queryNorm
              0.8439281 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.44596007 = weight(abstract_txt:chinese in 4580) [ClassicSimilarity], result of:
            0.44596007 = score(doc=4580,freq=9.0), product of:
              0.3773371 = queryWeight, product of:
                3.782359 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.015827108 = queryNorm
              1.1818612 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
        0.24 = coord(6/25)
    
  2. Khoo, C.S.G.; Dai, D.; Loh, T.E.: Using statistical and contextual information to identify two- and three-character words in Chinese text (2002) 0.15
    0.15432175 = sum of:
      0.15432175 = product of:
        0.6430073 = sum of:
          0.008741461 = weight(abstract_txt:with in 5206) [ClassicSimilarity], result of:
            0.008741461 = score(doc=5206,freq=2.0), product of:
              0.03956355 = queryWeight, product of:
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015827108 = queryNorm
              0.22094731 = fieldWeight in 5206, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.039244555 = weight(abstract_txt:automatic in 5206) [ClassicSimilarity], result of:
            0.039244555 = score(doc=5206,freq=2.0), product of:
              0.08545724 = queryWeight, product of:
                1.0392303 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.015827108 = queryNorm
              0.45923027 = fieldWeight in 5206, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.03375704 = weight(abstract_txt:statistical in 5206) [ClassicSimilarity], result of:
            0.03375704 = score(doc=5206,freq=1.0), product of:
              0.097382784 = queryWeight, product of:
                1.1093752 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.015827108 = queryNorm
              0.3466428 = fieldWeight in 5206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.0615818 = weight(abstract_txt:incorporated in 5206) [ClassicSimilarity], result of:
            0.0615818 = score(doc=5206,freq=1.0), product of:
              0.14539212 = queryWeight, product of:
                1.3555259 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.015827108 = queryNorm
              0.42355666 = fieldWeight in 5206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.24220726 = weight(abstract_txt:segmentation in 5206) [ClassicSimilarity], result of:
            0.24220726 = score(doc=5206,freq=6.0), product of:
              0.1993641 = queryWeight, product of:
                1.5873066 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.015827108 = queryNorm
              1.2148991 = fieldWeight in 5206, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.25747517 = weight(abstract_txt:chinese in 5206) [ClassicSimilarity], result of:
            0.25747517 = score(doc=5206,freq=3.0), product of:
              0.3773371 = queryWeight, product of:
                3.782359 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.015827108 = queryNorm
              0.6823479 = fieldWeight in 5206, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
        0.24 = coord(6/25)
    
  3. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.14
    0.13526082 = sum of:
      0.13526082 = product of:
        0.6763041 = sum of:
          0.010706058 = weight(abstract_txt:with in 831) [ClassicSimilarity], result of:
            0.010706058 = score(doc=831,freq=3.0), product of:
              0.03956355 = queryWeight, product of:
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015827108 = queryNorm
              0.27060407 = fieldWeight in 831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.03375704 = weight(abstract_txt:statistical in 831) [ClassicSimilarity], result of:
            0.03375704 = score(doc=831,freq=1.0), product of:
              0.097382784 = queryWeight, product of:
                1.1093752 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.015827108 = queryNorm
              0.3466428 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.072920576 = weight(abstract_txt:accuracy in 831) [ClassicSimilarity], result of:
            0.072920576 = score(doc=831,freq=3.0), product of:
              0.112831995 = queryWeight, product of:
                1.1941352 = boost
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.015827108 = queryNorm
              0.6462757 = fieldWeight in 831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.2616137 = weight(abstract_txt:segmentation in 831) [ClassicSimilarity], result of:
            0.2616137 = score(doc=831,freq=7.0), product of:
              0.1993641 = queryWeight, product of:
                1.5873066 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.015827108 = queryNorm
              1.3122408 = fieldWeight in 831, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.29730672 = weight(abstract_txt:chinese in 831) [ClassicSimilarity], result of:
            0.29730672 = score(doc=831,freq=4.0), product of:
              0.3773371 = queryWeight, product of:
                3.782359 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.015827108 = queryNorm
              0.7879075 = fieldWeight in 831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
        0.2 = coord(5/25)
    
  4. Fletcher, G.P.; Hinde, C.J.: Using a neural network as a tool for constructing rule based systems (1995) 0.13
    0.13336818 = sum of:
      0.13336818 = product of:
        0.66684085 = sum of:
          0.010817005 = weight(abstract_txt:with in 3214) [ClassicSimilarity], result of:
            0.010817005 = score(doc=3214,freq=1.0), product of:
              0.03956355 = queryWeight, product of:
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015827108 = queryNorm
              0.27340835 = fieldWeight in 3214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.109375 = fieldNorm(doc=3214)
          0.08471637 = weight(abstract_txt:intelligent in 3214) [ClassicSimilarity], result of:
            0.08471637 = score(doc=3214,freq=1.0), product of:
              0.12383939 = queryWeight, product of:
                1.2510272 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.015827108 = queryNorm
              0.68408257 = fieldWeight in 3214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.109375 = fieldNorm(doc=3214)
          0.039838143 = weight(abstract_txt:system in 3214) [ClassicSimilarity], result of:
            0.039838143 = score(doc=3214,freq=1.0), product of:
              0.10800745 = queryWeight, product of:
                2.0236008 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.015827108 = queryNorm
              0.36884624 = fieldWeight in 3214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.109375 = fieldNorm(doc=3214)
          0.09471739 = weight(abstract_txt:method in 3214) [ClassicSimilarity], result of:
            0.09471739 = score(doc=3214,freq=1.0), product of:
              0.19240107 = queryWeight, product of:
                2.7008579 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.015827108 = queryNorm
              0.4922914 = fieldWeight in 3214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.109375 = fieldNorm(doc=3214)
          0.43675193 = weight(abstract_txt:abstracting in 3214) [ClassicSimilarity], result of:
            0.43675193 = score(doc=3214,freq=1.0), product of:
              0.5866654 = queryWeight, product of:
                5.4458113 = boost
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.015827108 = queryNorm
              0.7444651 = fieldWeight in 3214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.109375 = fieldNorm(doc=3214)
        0.2 = coord(5/25)
    
  5. Wan, T.-L.; Evens, M.; Wan, Y.-W.; Pao, Y.-Y.: Experiments with automatic indexing and a relational thesaurus in a Chinese information retrieval system (1997) 0.12
    0.11624034 = sum of:
      0.11624034 = product of:
        0.5812017 = sum of:
          0.01311219 = weight(abstract_txt:with in 956) [ClassicSimilarity], result of:
            0.01311219 = score(doc=956,freq=2.0), product of:
              0.03956355 = queryWeight, product of:
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015827108 = queryNorm
              0.33142096 = fieldWeight in 956, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.09375 = fieldNorm(doc=956)
          0.07209685 = weight(abstract_txt:automatic in 956) [ClassicSimilarity], result of:
            0.07209685 = score(doc=956,freq=3.0), product of:
              0.08545724 = queryWeight, product of:
                1.0392303 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.015827108 = queryNorm
              0.8436599 = fieldWeight in 956, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.09375 = fieldNorm(doc=956)
          0.05063556 = weight(abstract_txt:statistical in 956) [ClassicSimilarity], result of:
            0.05063556 = score(doc=956,freq=1.0), product of:
              0.097382784 = queryWeight, product of:
                1.1093752 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.015827108 = queryNorm
              0.5199642 = fieldWeight in 956, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.09375 = fieldNorm(doc=956)
          0.059144307 = weight(abstract_txt:system in 956) [ClassicSimilarity], result of:
            0.059144307 = score(doc=956,freq=3.0), product of:
              0.10800745 = queryWeight, product of:
                2.0236008 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.015827108 = queryNorm
              0.54759467 = fieldWeight in 956, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.09375 = fieldNorm(doc=956)
          0.3862128 = weight(abstract_txt:chinese in 956) [ClassicSimilarity], result of:
            0.3862128 = score(doc=956,freq=3.0), product of:
              0.3773371 = queryWeight, product of:
                3.782359 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.015827108 = queryNorm
              1.0235219 = fieldWeight in 956, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.09375 = fieldNorm(doc=956)
        0.2 = coord(5/25)
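
The content scores combine more factors than the author scores. Each term score is queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm). The term scores are summed and multiplied by coord, the fraction of query clauses that matched. The sketch below recomputes the first hit (doc 4580) under those assumptions. The names `term_score`, `document_score`, and `TERMS_4580` are hypothetical, and all numeric inputs are copied from the listing above.

```python
import math

MAX_DOCS = 44218          # maxDocs from the listing
QUERY_NORM = 0.015827108  # queryNorm from the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float,
               boost: float = 1.0, query_norm: float = QUERY_NORM) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, matched: int, total_clauses: int) -> float:
    """Sum of term scores scaled by coord(matched/total_clauses)."""
    return sum(term_scores) * (matched / total_clauses)


# term -> (freq, docFreq, boost) for doc 4580; "with" shows no boost line,
# so a boost of 1.0 is assumed for it.
TERMS_4580 = {
    "with": (3.0, 9868, 1.0),
    "automatic": (1.0, 665, 1.0392303),
    "statistical": (2.0, 468, 1.1093752),
    "segmentation": (9.0, 42, 1.5873066),
    "method": (9.0, 1333, 2.7008579),
    "chinese": (9.0, 219, 3.782359),
}

scores_4580 = [
    term_score(freq, doc_freq, field_norm=0.0625, boost=boost)
    for freq, doc_freq, boost in TERMS_4580.values()
]
total_4580 = document_score(scores_4580, matched=6, total_clauses=25)

if __name__ == "__main__":
    print(round(total_4580, 5))
```

The coord factor explains why the displayed scores are low even for close matches: only 5 or 6 of the 25 query terms occur in each hit, so the summed term scores are scaled down to roughly a fifth or a quarter.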