Document (#34368)

Author
Wu, D.-S.
Liang, T.
Title
Chinese pronominal anaphora resolution using lexical knowledge and entropy-based weight
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.13, pp.2138-2145
Year
2008
Abstract
Pronominal anaphors are commonly observed in written texts. This article addresses effective Chinese pronominal anaphora resolution by using lexical knowledge acquisition and salience measurement. Lexical knowledge acquisition aims to extract additional semantic features, such as gender, number, and collocate compatibility, by employing multiple resources. The presented salience measurement uses entropy-based weighting to select antecedent candidates. The resolution is validated on a real corpus and compared with a rule-based model. Experimental results from five-fold cross-validation show that our approach yields an 82.5% success rate on 1343 anaphoric instances. In comparison with a general rule-based approach, the performance is improved by 7%.
Theme
Computational linguistics

Similar documents (author)

  1. Liang, D.F.: Mathematical journals : an annotated guide (1992) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:liang in 179) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 179, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=179)
    
  2. Liang, L.: R-Sequences : relative indicators for the rhythm of science (2005) 5.21
    5.2059946 = sum of:
      5.2059946 = weight(author_txt:liang in 3877) [ClassicSimilarity], result of:
        5.2059946 = fieldWeight in 3877, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.625 = fieldNorm(doc=3877)
    
  3. Liang, T.-Y.: ¬The basic entity model : a fundamental theoretical model of information and information processing (1994) 4.16
    4.164796 = sum of:
      4.164796 = weight(author_txt:liang in 8468) [ClassicSimilarity], result of:
        4.164796 = fieldWeight in 8468, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.5 = fieldNorm(doc=8468)
    
  4. Liang, T.-Y.: ¬The basic entity model : a theoretical model of information processing, decision making and information systems (1996) 4.16
    4.164796 = sum of:
      4.164796 = weight(author_txt:liang in 5408) [ClassicSimilarity], result of:
        4.164796 = fieldWeight in 5408, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.5 = fieldNorm(doc=5408)
    
  5. Liang, T.-Y.: ¬The basic entity model (1997) 4.16
    4.164796 = sum of:
      4.164796 = weight(author_txt:liang in 7757) [ClassicSimilarity], result of:
        4.164796 = fieldWeight in 7757, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.329592 = idf(docFreq=28, maxDocs=44218)
          0.5 = fieldNorm(doc=7757)
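
The author-match scores above are each a product of three factors: tf, idf and fieldNorm. The following is a minimal sketch that recomputes them, assuming tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). These formulas are inferred from the numbers shown in the listing, and the function names are hypothetical, not part of any library.

```python
import math


def tf(freq: float) -> float:
    """Term-frequency factor, assumed to be sqrt(freq)."""
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed to be 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain output."""
    return tf(freq) * idf(doc_freq, max_docs) * field_norm


# Values copied from the listing: author term "liang", docFreq=28, maxDocs=44218.
print(round(field_weight(1.0, 28, 44218, 0.625), 2))  # entries 1-2, listed as 5.21
print(round(field_weight(1.0, 28, 44218, 0.5), 2))    # entries 3-5, listed as 4.16
```

The only factor that differs between the two groups is fieldNorm (0.625 versus 0.5), which is why entries with longer author fields rank lower despite an identical term match.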
    

Similar documents (content)

  1. Steinberger, J.; Poesio, M.; Kabadjov, M.A.; Jezek, K.: Two uses of anaphora resolution in summarization (2007) 0.32
    0.3158272 = sum of:
      0.3158272 = product of:
        1.97392 = sum of:
          0.017997123 = weight(abstract_txt:approach in 949) [ClassicSimilarity], result of:
            0.017997123 = score(doc=949,freq=1.0), product of:
              0.06150681 = queryWeight, product of:
                1.080168 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015203447 = queryNorm
              0.29260373 = fieldWeight in 949, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=949)
          0.027745908 = weight(abstract_txt:based in 949) [ClassicSimilarity], result of:
            0.027745908 = score(doc=949,freq=1.0), product of:
              0.11140392 = queryWeight, product of:
                2.2985287 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015203447 = queryNorm
              0.24905685 = fieldWeight in 949, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.078125 = fieldNorm(doc=949)
          1.59792 = weight(title_txt:anaphora in 949) [ClassicSimilarity], result of:
            1.59792 = score(doc=949,freq=1.0), product of:
              0.43019333 = queryWeight, product of:
                2.8566797 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.015203447 = queryNorm
              3.7144227 = fieldWeight in 949, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.375 = fieldNorm(doc=949)
          0.33025706 = weight(abstract_txt:resolution in 949) [ClassicSimilarity], result of:
            0.33025706 = score(doc=949,freq=3.0), product of:
              0.3396351 = queryWeight, product of:
                3.10872 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.015203447 = queryNorm
              0.97238785 = fieldWeight in 949, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.078125 = fieldNorm(doc=949)
        0.16 = coord(4/25)
    
  2. Alzahrani, S.; Palade, V.; Salim, N.; Abraham, A.: Using structural information and citation evidence to detect significant plagiarism cases in scientific publications (2012) 0.14
    0.14004755 = sum of:
      0.14004755 = product of:
        0.43764862 = sum of:
          0.057621166 = weight(abstract_txt:weighting in 4982) [ClassicSimilarity], result of:
            0.057621166 = score(doc=4982,freq=2.0), product of:
              0.10676376 = queryWeight, product of:
                1.0062981 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.015203447 = queryNorm
              0.5397072 = fieldWeight in 4982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.06536739 = weight(abstract_txt:validation in 4982) [ClassicSimilarity], result of:
            0.06536739 = score(doc=4982,freq=2.0), product of:
              0.11612968 = queryWeight, product of:
                1.0495094 = boost
                7.2780466 = idf(docFreq=82, maxDocs=44218)
                0.015203447 = queryNorm
              0.5628827 = fieldWeight in 4982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2780466 = idf(docFreq=82, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.096886314 = weight(abstract_txt:weight in 4982) [ClassicSimilarity], result of:
            0.096886314 = score(doc=4982,freq=4.0), product of:
              0.119821325 = queryWeight, product of:
                1.0660603 = boost
                7.3928223 = idf(docFreq=73, maxDocs=44218)
                0.015203447 = queryNorm
              0.80858994 = fieldWeight in 4982, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.3928223 = idf(docFreq=73, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.0125979865 = weight(abstract_txt:approach in 4982) [ClassicSimilarity], result of:
            0.0125979865 = score(doc=4982,freq=1.0), product of:
              0.06150681 = queryWeight, product of:
                1.080168 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015203447 = queryNorm
              0.20482263 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.0112365745 = weight(abstract_txt:with in 4982) [ClassicSimilarity], result of:
            0.0112365745 = score(doc=4982,freq=4.0), product of:
              0.04109814 = queryWeight, product of:
                1.0814002 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015203447 = queryNorm
              0.27340835 = fieldWeight in 4982, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.058403034 = weight(abstract_txt:candidates in 4982) [ClassicSimilarity], result of:
            0.058403034 = score(doc=4982,freq=1.0), product of:
              0.13572799 = queryWeight, product of:
                1.1346173 = boost
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.015203447 = queryNorm
              0.4302947 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.101896025 = weight(abstract_txt:fold in 4982) [ClassicSimilarity], result of:
            0.101896025 = score(doc=4982,freq=2.0), product of:
              0.15612555 = queryWeight, product of:
                1.216891 = boost
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.015203447 = queryNorm
              0.6526544 = fieldWeight in 4982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
          0.033640128 = weight(abstract_txt:based in 4982) [ClassicSimilarity], result of:
            0.033640128 = score(doc=4982,freq=3.0), product of:
              0.11140392 = queryWeight, product of:
                2.2985287 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015203447 = queryNorm
              0.3019654 = fieldWeight in 4982, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4982)
        0.32 = coord(8/25)
    
  3. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.10
    0.099176496 = sum of:
      0.099176496 = product of:
        0.4132354 = sum of:
          0.028795397 = weight(abstract_txt:approach in 831) [ClassicSimilarity], result of:
            0.028795397 = score(doc=831,freq=4.0), product of:
              0.06150681 = queryWeight, product of:
                1.080168 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015203447 = queryNorm
              0.468166 = fieldWeight in 831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.011121324 = weight(abstract_txt:with in 831) [ClassicSimilarity], result of:
            0.011121324 = score(doc=831,freq=3.0), product of:
              0.04109814 = queryWeight, product of:
                1.0814002 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015203447 = queryNorm
              0.27060407 = fieldWeight in 831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.059434507 = weight(abstract_txt:yields in 831) [ClassicSimilarity], result of:
            0.059434507 = score(doc=831,freq=1.0), product of:
              0.12562525 = queryWeight, product of:
                1.091574 = boost
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.015203447 = queryNorm
              0.47310954 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.13726166 = weight(abstract_txt:chinese in 831) [ClassicSimilarity], result of:
            0.13726166 = score(doc=831,freq=4.0), product of:
              0.17421037 = queryWeight, product of:
                1.8178862 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.015203447 = queryNorm
              0.7879075 = fieldWeight in 831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.13817665 = weight(abstract_txt:entropy in 831) [ClassicSimilarity], result of:
            0.13817665 = score(doc=831,freq=1.0), product of:
              0.27776933 = queryWeight, product of:
                2.2954712 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.015203447 = queryNorm
              0.4974511 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.03844586 = weight(abstract_txt:based in 831) [ClassicSimilarity], result of:
            0.03844586 = score(doc=831,freq=3.0), product of:
              0.11140392 = queryWeight, product of:
                2.2985287 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015203447 = queryNorm
              0.3451033 = fieldWeight in 831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
        0.24 = coord(6/25)
    
  4. Brychcín, T.; Konopík, M.: HPS: High precision stemmer (2015) 0.10
    0.09532792 = sum of:
      0.09532792 = product of:
        0.3971997 = sum of:
          0.024937546 = weight(abstract_txt:approach in 2686) [ClassicSimilarity], result of:
            0.024937546 = score(doc=2686,freq=3.0), product of:
              0.06150681 = queryWeight, product of:
                1.080168 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015203447 = queryNorm
              0.40544364 = fieldWeight in 2686, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.009080524 = weight(abstract_txt:with in 2686) [ClassicSimilarity], result of:
            0.009080524 = score(doc=2686,freq=2.0), product of:
              0.04109814 = queryWeight, product of:
                1.0814002 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015203447 = queryNorm
              0.22094731 = fieldWeight in 2686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.079342924 = weight(abstract_txt:rule in 2686) [ClassicSimilarity], result of:
            0.079342924 = score(doc=2686,freq=1.0), product of:
              0.19189632 = queryWeight, product of:
                1.9079326 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.015203447 = queryNorm
              0.41346768 = fieldWeight in 2686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.13817665 = weight(abstract_txt:entropy in 2686) [ClassicSimilarity], result of:
            0.13817665 = score(doc=2686,freq=1.0), product of:
              0.27776933 = queryWeight, product of:
                2.2954712 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.015203447 = queryNorm
              0.4974511 = fieldWeight in 2686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.031390913 = weight(abstract_txt:based in 2686) [ClassicSimilarity], result of:
            0.031390913 = score(doc=2686,freq=2.0), product of:
              0.11140392 = queryWeight, product of:
                2.2985287 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015203447 = queryNorm
              0.28177565 = fieldWeight in 2686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
          0.11427114 = weight(abstract_txt:lexical in 2686) [ClassicSimilarity], result of:
            0.11427114 = score(doc=2686,freq=1.0), product of:
              0.28014484 = queryWeight, product of:
                2.8233626 = boost
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.015203447 = queryNorm
              0.4079002 = fieldWeight in 2686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.0625 = fieldNorm(doc=2686)
        0.24 = coord(6/25)
    
  5. Drexel, G.: Knowledge engineering for intelligent information retrieval (2001) 0.09
    0.08989685 = sum of:
      0.08989685 = product of:
        0.37457022 = sum of:
          0.05983608 = weight(abstract_txt:instances in 4043) [ClassicSimilarity], result of:
            0.05983608 = score(doc=4043,freq=1.0), product of:
              0.10874766 = queryWeight, product of:
                1.0156046 = boost
                7.042927 = idf(docFreq=104, maxDocs=44218)
                0.015203447 = queryNorm
              0.55022866 = fieldWeight in 4043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.042927 = idf(docFreq=104, maxDocs=44218)
                0.078125 = fieldNorm(doc=4043)
          0.025451776 = weight(abstract_txt:approach in 4043) [ClassicSimilarity], result of:
            0.025451776 = score(doc=4043,freq=2.0), product of:
              0.06150681 = queryWeight, product of:
                1.080168 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.015203447 = queryNorm
              0.41380417 = fieldWeight in 4043, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=4043)
          0.008026124 = weight(abstract_txt:with in 4043) [ClassicSimilarity], result of:
            0.008026124 = score(doc=4043,freq=1.0), product of:
              0.04109814 = queryWeight, product of:
                1.0814002 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.015203447 = queryNorm
              0.19529167 = fieldWeight in 4043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.078125 = fieldNorm(doc=4043)
          0.099178664 = weight(abstract_txt:rule in 4043) [ClassicSimilarity], result of:
            0.099178664 = score(doc=4043,freq=1.0), product of:
              0.19189632 = queryWeight, product of:
                1.9079326 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.015203447 = queryNorm
              0.5168346 = fieldWeight in 4043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.078125 = fieldNorm(doc=4043)
          0.039238643 = weight(abstract_txt:based in 4043) [ClassicSimilarity], result of:
            0.039238643 = score(doc=4043,freq=2.0), product of:
              0.11140392 = queryWeight, product of:
                2.2985287 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015203447 = queryNorm
              0.35221958 = fieldWeight in 4043, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.078125 = fieldNorm(doc=4043)
          0.14283894 = weight(abstract_txt:lexical in 4043) [ClassicSimilarity], result of:
            0.14283894 = score(doc=4043,freq=1.0), product of:
              0.28014484 = queryWeight, product of:
                2.8233626 = boost
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.015203447 = queryNorm
              0.5098753 = fieldWeight in 4043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.078125 = fieldNorm(doc=4043)
        0.24 = coord(6/25)
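
The content-based scores above add two more steps: each term score is queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm), and the sum over matched terms is scaled by coord (matched terms / query terms). The sketch below recomputes the score of entry 1 (doc 949) from the values in the listing. The idf and tf formulas are assumptions inferred from those values, and all names are hypothetical.

```python
import math

MAX_DOCS = 44218          # maxDocs from the listing
QUERY_NORM = 0.015203447  # queryNorm from the listing


def idf(doc_freq: int) -> float:
    """Assumed idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one matched term: queryWeight * fieldWeight."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


def document_score(terms, matched: int, query_terms: int) -> float:
    """Sum of term scores scaled by coord = matched / query_terms."""
    return sum(term_score(*t) for t in terms) * (matched / query_terms)


# (boost, freq, docFreq, fieldNorm) for doc 949, copied from the listing.
TERMS_DOC_949 = [
    (1.080168, 1.0, 2839, 0.078125),   # abstract_txt:approach
    (2.2985287, 1.0, 4958, 0.078125),  # abstract_txt:based
    (2.8566797, 1.0, 5, 0.375),        # title_txt:anaphora
    (3.10872, 3.0, 90, 0.078125),      # abstract_txt:resolution
]

print(round(document_score(TERMS_DOC_949, matched=4, query_terms=25), 4))  # listed as 0.3158272
```

Most of the score comes from the single title match on "anaphora", which is rare (docFreq=5) and sits in a short field with a high fieldNorm of 0.375.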