Document (#40267)

Author
Zhao, G.
Wu, J.
Wang, D.
Li, T.
Title
Entity disambiguation to Wikipedia using collective ranking
Source
Information processing and management. 52(2016) no.6, p.1247-1257
Year
2016
Abstract
Entity disambiguation is a fundamental task of semantic Web annotation. Entity Linking (EL) is an essential procedure in entity disambiguation, which aims to link a mention appearing in a plain text to a structured or semi-structured knowledge base, such as Wikipedia. Existing research on EL usually annotates the mentions in a text one by one and treats entities as independent of each other. However, this assumption may not hold in many application scenarios. For example, if two mentions appear in one text, they are likely to have certain intrinsic relationships. In this paper, we first propose a novel query expansion method for candidate generation that utilizes information about the co-occurrence of mentions. We further propose a re-ranking model that can be iteratively adjusted based on the predictions of the previous round. Experiments on real-world data demonstrate the effectiveness of our proposed methods for entity disambiguation.
Content
Cf.: http://www.sciencedirect.com/science/article/pii/S030645731630098X

Similar documents (author)

  1. Wang, X.; High, A.; Wang, X.; Zhao, K.: Predicting users' continued engagement in online health communities from the quantity and quality of received support (2021) 3.62
    3.6204233 = sum of:
      3.6204233 = sum of:
        1.7689424 = weight(author_txt:wang in 242) [ClassicSimilarity], result of:
          1.7689424 = score(doc=242,freq=2.0), product of:
            0.61006033 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.09298157 = queryNorm
            2.8996186 = fieldWeight in 242, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.3125 = fieldNorm(doc=242)
        1.851481 = weight(author_txt:zhao in 242) [ClassicSimilarity], result of:
          1.851481 = score(doc=242,freq=1.0), product of:
            0.792355 = queryWeight, product of:
              1.1396554 = boost
              7.4773793 = idf(docFreq=67, maxDocs=44218)
              0.09298157 = queryNorm
            2.3366811 = fieldWeight in 242, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.4773793 = idf(docFreq=67, maxDocs=44218)
              0.3125 = fieldNorm(doc=242)
    
  2. Wang, C.; Zhao, S.; Kalra, A.; Borcea, C.; Chen, Y.: Predictive models and analysis for webpage depth-level dwell time (2018) 3.10
    3.102312 = sum of:
      3.102312 = sum of:
        1.2508312 = weight(author_txt:wang in 4370) [ClassicSimilarity], result of:
          1.2508312 = score(doc=4370,freq=1.0), product of:
            0.61006033 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.09298157 = queryNorm
            2.0503402 = fieldWeight in 4370, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.3125 = fieldNorm(doc=4370)
        1.851481 = weight(author_txt:zhao in 4370) [ClassicSimilarity], result of:
          1.851481 = score(doc=4370,freq=1.0), product of:
            0.792355 = queryWeight, product of:
              1.1396554 = boost
              7.4773793 = idf(docFreq=67, maxDocs=44218)
              0.09298157 = queryNorm
            2.3366811 = fieldWeight in 4370, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.4773793 = idf(docFreq=67, maxDocs=44218)
              0.3125 = fieldNorm(doc=4370)
    
  3. Wang, X.; Zhang, M.; Fan, W.; Zhao, K.: Understanding the spread of COVID-19 misinformation on social media : the effects of topics and a political leader's nudge (2022) 3.10
    3.102312 = sum of:
      3.102312 = sum of:
        1.2508312 = weight(author_txt:wang in 549) [ClassicSimilarity], result of:
          1.2508312 = score(doc=549,freq=1.0), product of:
            0.61006033 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.09298157 = queryNorm
            2.0503402 = fieldWeight in 549, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.3125 = fieldNorm(doc=549)
        1.851481 = weight(author_txt:zhao in 549) [ClassicSimilarity], result of:
          1.851481 = score(doc=549,freq=1.0), product of:
            0.792355 = queryWeight, product of:
              1.1396554 = boost
              7.4773793 = idf(docFreq=67, maxDocs=44218)
              0.09298157 = queryNorm
            2.3366811 = fieldWeight in 549, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.4773793 = idf(docFreq=67, maxDocs=44218)
              0.3125 = fieldNorm(doc=549)
    
  4. Zhao, L.: Save space for "newcomers" : analyzing problems in book number assignment under the LCC system (2004) 1.85
    1.851481 = sum of:
      1.851481 = product of:
        3.702962 = sum of:
          3.702962 = weight(author_txt:zhao in 3081) [ClassicSimilarity], result of:
            3.702962 = score(doc=3081,freq=1.0), product of:
              0.792355 = queryWeight, product of:
                1.1396554 = boost
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.09298157 = queryNorm
              4.6733623 = fieldWeight in 3081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.625 = fieldNorm(doc=3081)
        0.5 = coord(1/2)
    
  5. Zhao, L.: How librarians used e-resources : an analysis of citations in CCQ (2006) 1.85
    1.851481 = sum of:
      1.851481 = product of:
        3.702962 = sum of:
          3.702962 = weight(author_txt:zhao in 5766) [ClassicSimilarity], result of:
            3.702962 = score(doc=5766,freq=1.0), product of:
              0.792355 = queryWeight, product of:
                1.1396554 = boost
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.09298157 = queryNorm
              4.6733623 = fieldWeight in 5766, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.625 = fieldNorm(doc=5766)
        0.5 = coord(1/2)
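
The per-term numbers in the score trees above can be reproduced by hand. The sketch below assumes the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which agree with the idf and tf values printed in the listing. The function and variable names are illustrative and are not part of the search system itself. It recomputes the two author terms of entry 1 (doc=242).

```python
import math

MAX_DOCS = 44218  # maxDocs as reported in the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm, boost=1.0):
    # score = queryWeight * fieldWeight, as in each "product of:" node.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm          # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight


QUERY_NORM = 0.09298157  # queryNorm shown for the author query

# Entry 1 (doc=242): "wang" occurs twice, "zhao" once and carries a boost.
wang = term_score(freq=2.0, doc_freq=169, field_norm=0.3125,
                  query_norm=QUERY_NORM)
zhao = term_score(freq=1.0, doc_freq=67, field_norm=0.3125,
                  query_norm=QUERY_NORM, boost=1.1396554)
total = wang + zhao

print(wang, zhao, total)
```

Up to floating-point rounding, the three values agree with the 1.7689424, 1.851481 and 3.6204233 reported for entry 1.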
    

Similar documents (content)

  1. Phan, M.C.; Sun, A.: Collective named entity recognition in user comments via parameterized label propagation (2020) 0.23
    0.23310405 = sum of:
      0.23310405 = product of:
        0.83251446 = sum of:
          0.046273995 = weight(abstract_txt:collective in 5815) [ClassicSimilarity], result of:
            0.046273995 = score(doc=5815,freq=1.0), product of:
              0.107156105 = queryWeight, product of:
                1.041467 = boost
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.0148912575 = queryNorm
              0.43183723 = fieldWeight in 5815, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.0625 = fieldNorm(doc=5815)
          0.047858123 = weight(abstract_txt:utilizing in 5815) [ClassicSimilarity], result of:
            0.047858123 = score(doc=5815,freq=1.0), product of:
              0.10958792 = queryWeight, product of:
                1.0532182 = boost
                6.987357 = idf(docFreq=110, maxDocs=44218)
                0.0148912575 = queryNorm
              0.43670982 = fieldWeight in 5815, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.987357 = idf(docFreq=110, maxDocs=44218)
                0.0625 = fieldNorm(doc=5815)
          0.090853274 = weight(abstract_txt:mention in 5815) [ClassicSimilarity], result of:
            0.090853274 = score(doc=5815,freq=2.0), product of:
              0.133355 = queryWeight, product of:
                1.1618276 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0148912575 = queryNorm
              0.68128884 = fieldWeight in 5815, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=5815)
          0.038294073 = weight(abstract_txt:propose in 5815) [ClassicSimilarity], result of:
            0.038294073 = score(doc=5815,freq=1.0), product of:
              0.11900265 = queryWeight, product of:
                1.552138 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0148912575 = queryNorm
              0.32179177 = fieldWeight in 5815, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0625 = fieldNorm(doc=5815)
          0.03935895 = weight(abstract_txt:text in 5815) [ClassicSimilarity], result of:
            0.03935895 = score(doc=5815,freq=2.0), product of:
              0.11011632 = queryWeight, product of:
                1.8286201 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0148912575 = queryNorm
              0.3574307 = fieldWeight in 5815, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=5815)
          0.3246177 = weight(abstract_txt:mentions in 5815) [ClassicSimilarity], result of:
            0.3246177 = score(doc=5815,freq=3.0), product of:
              0.39268145 = queryWeight, product of:
                3.453169 = boost
                7.636444 = idf(docFreq=57, maxDocs=44218)
                0.0148912575 = queryNorm
              0.82666934 = fieldWeight in 5815, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.636444 = idf(docFreq=57, maxDocs=44218)
                0.0625 = fieldNorm(doc=5815)
          0.24525832 = weight(abstract_txt:entity in 5815) [ClassicSimilarity], result of:
            0.24525832 = score(doc=5815,freq=2.0), product of:
              0.4421009 = queryWeight, product of:
                4.730235 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0148912575 = queryNorm
              0.5547564 = fieldWeight in 5815, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0625 = fieldNorm(doc=5815)
        0.28 = coord(7/25)
    
  2. Li, C.; Sun, A.; Datta, A.: TSDW: Two-stage word sense disambiguation using Wikipedia (2013) 0.15
    0.15009339 = sum of:
      0.15009339 = product of:
        0.9380837 = sum of:
          0.03935895 = weight(abstract_txt:text in 956) [ClassicSimilarity], result of:
            0.03935895 = score(doc=956,freq=2.0), product of:
              0.11011632 = queryWeight, product of:
                1.8286201 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0148912575 = queryNorm
              0.3574307 = fieldWeight in 956, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=956)
          0.13815558 = weight(abstract_txt:wikipedia in 956) [ClassicSimilarity], result of:
            0.13815558 = score(doc=956,freq=4.0), product of:
              0.17634422 = queryWeight, product of:
                1.8894379 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.0148912575 = queryNorm
              0.7834427 = fieldWeight in 956, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.0625 = fieldNorm(doc=956)
          0.5871454 = weight(abstract_txt:disambiguation in 956) [ClassicSimilarity], result of:
            0.5871454 = score(doc=956,freq=7.0), product of:
              0.4837378 = queryWeight, product of:
                4.4255986 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0148912575 = queryNorm
              1.2137679 = fieldWeight in 956, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=956)
          0.17342383 = weight(abstract_txt:entity in 956) [ClassicSimilarity], result of:
            0.17342383 = score(doc=956,freq=1.0), product of:
              0.4421009 = queryWeight, product of:
                4.730235 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0148912575 = queryNorm
              0.39227203 = fieldWeight in 956, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0625 = fieldNorm(doc=956)
        0.16 = coord(4/25)
    
  3. Gao, N.; Dredze, M.; Oard, D.W.: Person entity linking in email with NIL detection (2017) 0.12
    0.11997701 = sum of:
      0.11997701 = product of:
        0.74985635 = sum of:
          0.06424297 = weight(abstract_txt:mention in 3830) [ClassicSimilarity], result of:
            0.06424297 = score(doc=3830,freq=1.0), product of:
              0.133355 = queryWeight, product of:
                1.1618276 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0148912575 = queryNorm
              0.48174396 = fieldWeight in 3830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
          0.03935895 = weight(abstract_txt:text in 3830) [ClassicSimilarity], result of:
            0.03935895 = score(doc=3830,freq=2.0), product of:
              0.11011632 = queryWeight, product of:
                1.8286201 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0148912575 = queryNorm
              0.3574307 = fieldWeight in 3830, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
          0.18741812 = weight(abstract_txt:mentions in 3830) [ClassicSimilarity], result of:
            0.18741812 = score(doc=3830,freq=1.0), product of:
              0.39268145 = queryWeight, product of:
                3.453169 = boost
                7.636444 = idf(docFreq=57, maxDocs=44218)
                0.0148912575 = queryNorm
              0.47727776 = fieldWeight in 3830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.636444 = idf(docFreq=57, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
          0.4588363 = weight(abstract_txt:entity in 3830) [ClassicSimilarity], result of:
            0.4588363 = score(doc=3830,freq=7.0), product of:
              0.4421009 = queryWeight, product of:
                4.730235 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0148912575 = queryNorm
              1.0378542 = fieldWeight in 3830, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
        0.16 = coord(4/25)
    
  4. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.10
    0.102798894 = sum of:
      0.102798894 = product of:
        0.51399446 = sum of:
          0.028720554 = weight(abstract_txt:propose in 5820) [ClassicSimilarity], result of:
            0.028720554 = score(doc=5820,freq=1.0), product of:
              0.11900265 = queryWeight, product of:
                1.552138 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0148912575 = queryNorm
              0.24134383 = fieldWeight in 5820, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.046875 = fieldNorm(doc=5820)
          0.047939345 = weight(abstract_txt:structured in 5820) [ClassicSimilarity], result of:
            0.047939345 = score(doc=5820,freq=2.0), product of:
              0.13290648 = queryWeight, product of:
                1.6403068 = boost
                5.4411373 = idf(docFreq=520, maxDocs=44218)
                0.0148912575 = queryNorm
              0.36069983 = fieldWeight in 5820, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4411373 = idf(docFreq=520, maxDocs=44218)
                0.046875 = fieldNorm(doc=5820)
          0.08258116 = weight(abstract_txt:ranking in 5820) [ClassicSimilarity], result of:
            0.08258116 = score(doc=5820,freq=5.0), product of:
              0.14072093 = queryWeight, product of:
                1.6878403 = boost
                5.598813 = idf(docFreq=444, maxDocs=44218)
                0.0148912575 = queryNorm
              0.5868435 = fieldWeight in 5820, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.598813 = idf(docFreq=444, maxDocs=44218)
                0.046875 = fieldNorm(doc=5820)
          0.036153503 = weight(abstract_txt:text in 5820) [ClassicSimilarity], result of:
            0.036153503 = score(doc=5820,freq=3.0), product of:
              0.11011632 = queryWeight, product of:
                1.8286201 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0148912575 = queryNorm
              0.32832104 = fieldWeight in 5820, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.046875 = fieldNorm(doc=5820)
          0.3185999 = weight(abstract_txt:entity in 5820) [ClassicSimilarity], result of:
            0.3185999 = score(doc=5820,freq=6.0), product of:
              0.4421009 = queryWeight, product of:
                4.730235 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0148912575 = queryNorm
              0.7206497 = fieldWeight in 5820, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.046875 = fieldNorm(doc=5820)
        0.2 = coord(5/25)
    
  5. Vechtomova, O.; Robertson, S.E.: A domain-independent approach to finding related entities (2012) 0.09
    0.091127045 = sum of:
      0.091127045 = product of:
        0.7593921 = sum of:
          0.13797916 = weight(abstract_txt:candidate in 2733) [ClassicSimilarity], result of:
            0.13797916 = score(doc=2733,freq=4.0), product of:
              0.12051504 = queryWeight, product of:
                1.1044794 = boost
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.0148912575 = queryNorm
              1.1449124 = fieldWeight in 2733, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.078125 = fieldNorm(doc=2733)
          0.04786759 = weight(abstract_txt:propose in 2733) [ClassicSimilarity], result of:
            0.04786759 = score(doc=2733,freq=1.0), product of:
              0.11900265 = queryWeight, product of:
                1.552138 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0148912575 = queryNorm
              0.4022397 = fieldWeight in 2733, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.078125 = fieldNorm(doc=2733)
          0.57354534 = weight(abstract_txt:entity in 2733) [ClassicSimilarity], result of:
            0.57354534 = score(doc=2733,freq=7.0), product of:
              0.4421009 = queryWeight, product of:
                4.730235 = boost
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.0148912575 = queryNorm
              1.2973177 = fieldWeight in 2733, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.2763524 = idf(docFreq=225, maxDocs=44218)
                0.078125 = fieldNorm(doc=2733)
        0.12 = coord(3/25)
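
The `coord(m/n)` lines scale the summed term scores by the fraction of query terms that matched. A minimal sketch of that last step follows, using the per-term scores already printed in the listing. The helper names are hypothetical, and the sketch assumes coord is simply matched terms divided by total query terms, which is consistent with the 0.5, 0.28, 0.16, 0.2 and 0.12 values shown.

```python
def coord(overlap, max_overlap):
    # Fraction of query terms found in the document (assumed definition).
    return overlap / max_overlap


def combined_score(term_scores, max_overlap):
    # Sum of the matched term scores, scaled by the coordination factor.
    return sum(term_scores) * coord(len(term_scores), max_overlap)


# Content entry 5 (doc=2733): three of 25 query terms matched.
content_score = combined_score([0.13797916, 0.04786759, 0.57354534],
                               max_overlap=25)

# Author entry 4 (doc=3081): one of two query terms matched.
author_score = combined_score([3.702962], max_overlap=2)

print(content_score, author_score)
```

The results agree with the 0.091127045 and 1.851481 totals in the listing, which shows why a document matching few of the 25 abstract terms ranks low even when a single term such as "entity" scores well.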