Document (#40268)

Author
Zhao, G.
Wu, J.
Wang, D.
Li, T.
Title
Entity disambiguation to Wikipedia using collective ranking
Source
Information processing and management. 52(2016) no.6, pp. 1247-1257
Year
2016
Abstract
Entity disambiguation is a fundamental task of semantic Web annotation. Entity Linking (EL) is an essential procedure in entity disambiguation, which aims to link a mention appearing in plain text to a structured or semi-structured knowledge base, such as Wikipedia. Existing research on EL usually annotates the mentions in a text one by one and treats entities as independent of each other. However, this assumption might not hold in many application scenarios. For example, if two mentions appear in one text, they are likely to have certain intrinsic relationships. In this paper, we first propose a novel query expansion method for candidate generation that utilizes the co-occurrence information of mentions. We further propose a re-ranking model which can be iteratively adjusted based on the prediction in the previous round. Experiments on real-world data demonstrate the effectiveness of our proposed methods for entity disambiguation.
Content
Cf.: http://www.sciencedirect.com/science/article/pii/S030645731630098X.

Similar documents (author)

  1. Wang, X.; High, A.; Wang, X.; Zhao, K.: Predicting users' continued engagement in online health communities from the quantity and quality of received support (2021) 3.71
    3.7093225 = sum of:
      3.7093225 = sum of:
        1.8181618 = weight(author_txt:wang in 1244) [ClassicSimilarity], result of:
          1.8181618 = score(doc=1244,freq=2.0), product of:
            0.61165303 = queryWeight, product of:
              6.726085 = idf(docFreq=140, maxDocs=43254)
              0.09093745 = queryNorm
            2.972538 = fieldWeight in 1244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.726085 = idf(docFreq=140, maxDocs=43254)
              0.3125 = fieldNorm(doc=1244)
        1.8911606 = weight(author_txt:zhao in 1244) [ClassicSimilarity], result of:
          1.8911606 = score(doc=1244,freq=1.0), product of:
            0.79112613 = queryWeight, product of:
              1.1372876 = boost
              7.649493 = idf(docFreq=55, maxDocs=43254)
              0.09093745 = queryNorm
            2.3904667 = fieldWeight in 1244, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.649493 = idf(docFreq=55, maxDocs=43254)
              0.3125 = fieldNorm(doc=1244)
    
  2. Wang, C.; Zhao, S.; Kalra, A.; Borcea, C.; Chen, Y.: Predictive models and analysis for webpage depth-level dwell time (2018) 3.18
    3.176795 = sum of:
      3.176795 = sum of:
        1.2856344 = weight(author_txt:wang in 371) [ClassicSimilarity], result of:
          1.2856344 = score(doc=371,freq=1.0), product of:
            0.61165303 = queryWeight, product of:
              6.726085 = idf(docFreq=140, maxDocs=43254)
              0.09093745 = queryNorm
            2.1019015 = fieldWeight in 371, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.726085 = idf(docFreq=140, maxDocs=43254)
              0.3125 = fieldNorm(doc=371)
        1.8911606 = weight(author_txt:zhao in 371) [ClassicSimilarity], result of:
          1.8911606 = score(doc=371,freq=1.0), product of:
            0.79112613 = queryWeight, product of:
              1.1372876 = boost
              7.649493 = idf(docFreq=55, maxDocs=43254)
              0.09093745 = queryNorm
            2.3904667 = fieldWeight in 371, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.649493 = idf(docFreq=55, maxDocs=43254)
              0.3125 = fieldNorm(doc=371)
    
  3. Zhao, L.: Save space for "newcomers" : analyzing problems in book number assignment under the LCC system (2004) 1.89
    1.8911606 = sum of:
      1.8911606 = product of:
        3.7823212 = sum of:
          3.7823212 = weight(author_txt:zhao in 5082) [ClassicSimilarity], result of:
            3.7823212 = score(doc=5082,freq=1.0), product of:
              0.79112613 = queryWeight, product of:
                1.1372876 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.09093745 = queryNorm
              4.7809334 = fieldWeight in 5082, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.625 = fieldNorm(doc=5082)
        0.5 = coord(1/2)
    
  4. Zhao, L.: How librarians used e-resources : an analysis of citations in CCQ (2006) 1.89
    1.8911606 = sum of:
      1.8911606 = product of:
        3.7823212 = sum of:
          3.7823212 = weight(author_txt:zhao in 767) [ClassicSimilarity], result of:
            3.7823212 = score(doc=767,freq=1.0), product of:
              0.79112613 = queryWeight, product of:
                1.1372876 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.09093745 = queryNorm
              4.7809334 = fieldWeight in 767, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.625 = fieldNorm(doc=767)
        0.5 = coord(1/2)
    
  5. Zhao, D.: Challenges of scholarly publications on the Web to the evaluation of science : a comparison of author visibility on the Web and in print journals (2005) 1.89
    1.8911606 = sum of:
      1.8911606 = product of:
        3.7823212 = sum of:
          3.7823212 = weight(author_txt:zhao in 3066) [ClassicSimilarity], result of:
            3.7823212 = score(doc=3066,freq=1.0), product of:
              0.79112613 = queryWeight, product of:
                1.1372876 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.09093745 = queryNorm
              4.7809334 = fieldWeight in 3066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.625 = fieldNorm(doc=3066)
        0.5 = coord(1/2)
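
The score trees above follow the arithmetic of Lucene's ClassicSimilarity (TF-IDF). The sketch below recomputes two of the listed totals from the printed inputs, as a rough check rather than a reimplementation of Lucene. The function names `idf` and `term_score` are illustrative and not part of any library, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is assumed from the values shown. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, as printed in each explain tree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm              # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight

MAX_DOCS = 43254          # maxDocs in every tree above
QUERY_NORM = 0.09093745   # queryNorm of the author query

# Hit 1 (doc 1244): both author terms match, so the two term scores are summed.
wang = term_score(2.0, 140, MAX_DOCS, QUERY_NORM, 0.3125)
zhao = term_score(1.0, 55, MAX_DOCS, QUERY_NORM, 0.3125, boost=1.1372876)
hit1 = wang + zhao        # listed as 3.7093225

# Hit 3 (doc 5082): only "zhao" matches, so coord(1/2) halves the sum.
zhao_only = term_score(1.0, 55, MAX_DOCS, QUERY_NORM, 0.625, boost=1.1372876)
hit3 = zhao_only * (1 / 2)  # listed as 1.8911606

print(hit1, hit3)
```

Hits 3 to 5 tie at 1.89 because they share the same single matching term, the same fieldNorm, and the same coord factor.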
    

Similar documents (content)

  1. Phan, M.C.; Sun, A.: Collective named entity recognition in user comments via parameterized label propagation (2020) 0.24
    0.23526847 = sum of:
      0.23526847 = product of:
        0.84024453 = sum of:
          0.04652851 = weight(abstract_txt:collective in 816) [ClassicSimilarity], result of:
            0.04652851 = score(doc=816,freq=1.0), product of:
              0.10729127 = queryWeight, product of:
                1.0442837 = boost
                6.9386463 = idf(docFreq=113, maxDocs=43254)
                0.014807138 = queryNorm
              0.4336654 = fieldWeight in 816, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9386463 = idf(docFreq=113, maxDocs=43254)
                0.0625 = fieldNorm(doc=816)
          0.048399962 = weight(abstract_txt:utilizing in 816) [ClassicSimilarity], result of:
            0.048399962 = score(doc=816,freq=1.0), product of:
              0.11014927 = queryWeight, product of:
                1.058101 = boost
                7.030454 = idf(docFreq=103, maxDocs=43254)
                0.014807138 = queryNorm
              0.43940338 = fieldWeight in 816, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.030454 = idf(docFreq=103, maxDocs=43254)
                0.0625 = fieldNorm(doc=816)
          0.09008495 = weight(abstract_txt:mention in 816) [ClassicSimilarity], result of:
            0.09008495 = score(doc=816,freq=2.0), product of:
              0.13228475 = queryWeight, product of:
                1.1595546 = boost
                7.704553 = idf(docFreq=52, maxDocs=43254)
                0.014807138 = queryNorm
              0.6809927 = fieldWeight in 816, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.704553 = idf(docFreq=52, maxDocs=43254)
                0.0625 = fieldNorm(doc=816)
          0.038981907 = weight(abstract_txt:propose in 816) [ClassicSimilarity], result of:
            0.038981907 = score(doc=816,freq=1.0), product of:
              0.12013522 = queryWeight, product of:
                1.5627391 = boost
                5.1917377 = idf(docFreq=653, maxDocs=43254)
                0.014807138 = queryNorm
              0.3244836 = fieldWeight in 816, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1917377 = idf(docFreq=653, maxDocs=43254)
                0.0625 = fieldNorm(doc=816)
          0.039247494 = weight(abstract_txt:text in 816) [ClassicSimilarity], result of:
            0.039247494 = score(doc=816,freq=2.0), product of:
              0.10964529 = queryWeight, product of:
                1.8284873 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.014807138 = queryNorm
              0.35794964 = fieldWeight in 816, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0625 = fieldNorm(doc=816)
          0.32859 = weight(abstract_txt:mentions in 816) [ClassicSimilarity], result of:
            0.32859 = score(doc=816,freq=3.0), product of:
              0.39493096 = queryWeight, product of:
                3.4702241 = boost
                7.685861 = idf(docFreq=53, maxDocs=43254)
                0.014807138 = queryNorm
              0.83201885 = fieldWeight in 816, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.685861 = idf(docFreq=53, maxDocs=43254)
                0.0625 = fieldNorm(doc=816)
          0.24841173 = weight(abstract_txt:entity in 816) [ClassicSimilarity], result of:
            0.24841173 = score(doc=816,freq=2.0), product of:
              0.44481528 = queryWeight, product of:
                4.754569 = boost
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.014807138 = queryNorm
              0.5584604 = fieldWeight in 816, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.0625 = fieldNorm(doc=816)
        0.28 = coord(7/25)
    
  2. Li, C.; Sun, A.; Datta, A.: TSDW: Two-stage word sense disambiguation using Wikipedia (2013) 0.15
    0.15040149 = sum of:
      0.15040149 = product of:
        0.9400093 = sum of:
          0.039247494 = weight(abstract_txt:text in 2421) [ClassicSimilarity], result of:
            0.039247494 = score(doc=2421,freq=2.0), product of:
              0.10964529 = queryWeight, product of:
                1.8284873 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.014807138 = queryNorm
              0.35794964 = fieldWeight in 2421, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0625 = fieldNorm(doc=2421)
          0.13806587 = weight(abstract_txt:wikipedia in 2421) [ClassicSimilarity], result of:
            0.13806587 = score(doc=2421,freq=4.0), product of:
              0.17584601 = queryWeight, product of:
                1.890678 = boost
                6.2812176 = idf(docFreq=219, maxDocs=43254)
                0.014807138 = queryNorm
              0.7851522 = fieldWeight in 2421, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2812176 = idf(docFreq=219, maxDocs=43254)
                0.0625 = fieldNorm(doc=2421)
          0.5870423 = weight(abstract_txt:disambiguation in 2421) [ClassicSimilarity], result of:
            0.5870423 = score(doc=2421,freq=7.0), product of:
              0.48252356 = queryWeight, product of:
                4.429203 = boost
                7.357357 = idf(docFreq=74, maxDocs=43254)
                0.014807138 = queryNorm
              1.2166085 = fieldWeight in 2421, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.357357 = idf(docFreq=74, maxDocs=43254)
                0.0625 = fieldNorm(doc=2421)
          0.17565362 = weight(abstract_txt:entity in 2421) [ClassicSimilarity], result of:
            0.17565362 = score(doc=2421,freq=1.0), product of:
              0.44481528 = queryWeight, product of:
                4.754569 = boost
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.014807138 = queryNorm
              0.39489117 = fieldWeight in 2421, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.0625 = fieldNorm(doc=2421)
        0.16 = coord(4/25)
    
  3. Gao, N.; Dredze, M.; Oard, D.W.: Person entity linking in email with NIL detection (2017) 0.12
    0.12118312 = sum of:
      0.12118312 = product of:
        0.7573945 = sum of:
          0.06369968 = weight(abstract_txt:mention in 5295) [ClassicSimilarity], result of:
            0.06369968 = score(doc=5295,freq=1.0), product of:
              0.13228475 = queryWeight, product of:
                1.1595546 = boost
                7.704553 = idf(docFreq=52, maxDocs=43254)
                0.014807138 = queryNorm
              0.48153457 = fieldWeight in 5295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.704553 = idf(docFreq=52, maxDocs=43254)
                0.0625 = fieldNorm(doc=5295)
          0.039247494 = weight(abstract_txt:text in 5295) [ClassicSimilarity], result of:
            0.039247494 = score(doc=5295,freq=2.0), product of:
              0.10964529 = queryWeight, product of:
                1.8284873 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.014807138 = queryNorm
              0.35794964 = fieldWeight in 5295, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0625 = fieldNorm(doc=5295)
          0.18971153 = weight(abstract_txt:mentions in 5295) [ClassicSimilarity], result of:
            0.18971153 = score(doc=5295,freq=1.0), product of:
              0.39493096 = queryWeight, product of:
                3.4702241 = boost
                7.685861 = idf(docFreq=53, maxDocs=43254)
                0.014807138 = queryNorm
              0.48036632 = fieldWeight in 5295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.685861 = idf(docFreq=53, maxDocs=43254)
                0.0625 = fieldNorm(doc=5295)
          0.4647358 = weight(abstract_txt:entity in 5295) [ClassicSimilarity], result of:
            0.4647358 = score(doc=5295,freq=7.0), product of:
              0.44481528 = queryWeight, product of:
                4.754569 = boost
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.014807138 = queryNorm
              1.0447838 = fieldWeight in 5295, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.0625 = fieldNorm(doc=5295)
        0.16 = coord(4/25)
    
  4. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.10
    0.10357785 = sum of:
      0.10357785 = product of:
        0.51788926 = sum of:
          0.02923643 = weight(abstract_txt:propose in 821) [ClassicSimilarity], result of:
            0.02923643 = score(doc=821,freq=1.0), product of:
              0.12013522 = queryWeight, product of:
                1.5627391 = boost
                5.1917377 = idf(docFreq=653, maxDocs=43254)
                0.014807138 = queryNorm
              0.2433627 = fieldWeight in 821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1917377 = idf(docFreq=653, maxDocs=43254)
                0.046875 = fieldNorm(doc=821)
          0.047784407 = weight(abstract_txt:structured in 821) [ClassicSimilarity], result of:
            0.047784407 = score(doc=821,freq=2.0), product of:
              0.13230255 = queryWeight, product of:
                1.6399683 = boost
                5.4483085 = idf(docFreq=505, maxDocs=43254)
                0.014807138 = queryNorm
              0.36117524 = fieldWeight in 821, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4483085 = idf(docFreq=505, maxDocs=43254)
                0.046875 = fieldNorm(doc=821)
          0.082120955 = weight(abstract_txt:ranking in 821) [ClassicSimilarity], result of:
            0.082120955 = score(doc=821,freq=5.0), product of:
              0.13986212 = queryWeight, product of:
                1.6861701 = boost
                5.6018004 = idf(docFreq=433, maxDocs=43254)
                0.014807138 = queryNorm
              0.58715653 = fieldWeight in 821, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.6018004 = idf(docFreq=433, maxDocs=43254)
                0.046875 = fieldNorm(doc=821)
          0.036051128 = weight(abstract_txt:text in 821) [ClassicSimilarity], result of:
            0.036051128 = score(doc=821,freq=3.0), product of:
              0.10964529 = queryWeight, product of:
                1.8284873 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.014807138 = queryNorm
              0.32879776 = fieldWeight in 821, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.046875 = fieldNorm(doc=821)
          0.32269636 = weight(abstract_txt:entity in 821) [ClassicSimilarity], result of:
            0.32269636 = score(doc=821,freq=6.0), product of:
              0.44481528 = queryWeight, product of:
                4.754569 = boost
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.014807138 = queryNorm
              0.7254615 = fieldWeight in 821, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.046875 = fieldNorm(doc=821)
        0.2 = coord(5/25)
    
  5. Vechtomova, O.; Robertson, S.E.: A domain-independent approach to finding related entities (2012) 0.09
    0.09193402 = sum of:
      0.09193402 = product of:
        0.76611686 = sum of:
          0.13646974 = weight(abstract_txt:candidate in 4198) [ClassicSimilarity], result of:
            0.13646974 = score(doc=4198,freq=4.0), product of:
              0.11934819 = queryWeight, product of:
                1.1013979 = boost
                7.318136 = idf(docFreq=77, maxDocs=43254)
                0.014807138 = queryNorm
              1.1434588 = fieldWeight in 4198, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.318136 = idf(docFreq=77, maxDocs=43254)
                0.078125 = fieldNorm(doc=4198)
          0.048727386 = weight(abstract_txt:propose in 4198) [ClassicSimilarity], result of:
            0.048727386 = score(doc=4198,freq=1.0), product of:
              0.12013522 = queryWeight, product of:
                1.5627391 = boost
                5.1917377 = idf(docFreq=653, maxDocs=43254)
                0.014807138 = queryNorm
              0.4056045 = fieldWeight in 4198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1917377 = idf(docFreq=653, maxDocs=43254)
                0.078125 = fieldNorm(doc=4198)
          0.58091974 = weight(abstract_txt:entity in 4198) [ClassicSimilarity], result of:
            0.58091974 = score(doc=4198,freq=7.0), product of:
              0.44481528 = queryWeight, product of:
                4.754569 = boost
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.014807138 = queryNorm
              1.3059797 = fieldWeight in 4198, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.318259 = idf(docFreq=211, maxDocs=43254)
                0.078125 = fieldNorm(doc=4198)
        0.12 = coord(3/25)
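
For the content-based list, each total is the sum of the matching terms' scores scaled by the fraction of the 25 query terms that matched. The worked equation below restates that structure for hit 5 (doc 4198), using only numbers printed in its tree. The symbol b_t for the per-term boost is notation introduced here, not taken from the output.

```latex
\begin{align*}
\mathrm{score}(q,d) &= \mathrm{coord}(q,d)\sum_{t \in q \cap d}
  \underbrace{b_t \,\mathrm{idf}(t)\,\mathrm{queryNorm}}_{\text{queryWeight}}
  \cdot
  \underbrace{\sqrt{\mathrm{freq}(t,d)}\;\mathrm{idf}(t)\,\mathrm{fieldNorm}(d)}_{\text{fieldWeight}} \\
\mathrm{score}(q, d_{4198}) &= \tfrac{3}{25}\,\bigl(0.13646974 + 0.048727386 + 0.58091974\bigr) \\
  &= 0.12 \times 0.76611686 \approx 0.09193402
\end{align*}
```

The coord factor explains why hit 1, with 7 of 25 terms matched, ranks above hits whose unscaled sums are larger.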