Search (1 results, page 1 of 1)

  • × author_ss:"Xiong, C."
  1. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.05
    0.054245643 = product of:
      0.08136846 = sum of:
        0.055122763 = product of:
          0.16536829 = sum of:
            0.16536829 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
              0.16536829 = score(doc=5820,freq=2.0), product of:
                0.44136027 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.052059412 = queryNorm
                0.3746787 = fieldWeight in 5820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5820)
          0.33333334 = coord(1/3)
        0.026245698 = weight(_text_:the in 5820) [ClassicSimilarity], result of:
          0.026245698 = score(doc=5820,freq=42.0), product of:
            0.08213748 = queryWeight, product of:
              1.5777643 = idf(docFreq=24812, maxDocs=44218)
              0.052059412 = queryNorm
            0.31953377 = fieldWeight in 5820, product of:
              6.4807405 = tf(freq=42.0), with freq of:
                42.0 = termFreq=42.0
              1.5777643 = idf(docFreq=24812, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
      0.6666667 = coord(2/3)
    
    Abstract
    The successes of information retrieval (IR) in recent decades were built upon bag-of-words representations. Effective as it is, bag-of-words is only a shallow text understanding; there is a limited amount of information for document ranking in the word space. This dissertation goes beyond words and builds knowledge based text representations, which embed the external and carefully curated information from knowledge bases, and provide richer and structured evidence for more advanced information retrieval systems. This thesis research first builds query representations with entities associated with the query. Entities' descriptions are used by query expansion techniques that enrich the query with explanation terms. Then we present a general framework that represents a query with entities that appear in the query, are retrieved by the query, or frequently show up in the top retrieved documents. A latent space model is developed to jointly learn the connections from query to entities and the ranking of documents, modeling the external evidence from knowledge bases and internal ranking features cooperatively. To further improve the quality of relevant entities, a defining factor of our query representations, we introduce learning to rank to entity search and retrieve better entities from knowledge bases. In the document representation part, this thesis research also moves one step forward with a bag-of-entities model, in which documents are represented by their automatic entity annotations, and the ranking is performed in the entity space.
    This proposal includes plans to improve the quality of relevant entities with a co-learning framework that learns from both entity labels and document labels. We also plan to develop a hybrid ranking system that combines word based and entity based representations together with their uncertainties considered. At last, we plan to enrich the text representations with connections between entities. We propose several ways to infer entity graph representations for texts, and to rank documents using their structure representations. This dissertation overcomes the limitation of word based representations with external and carefully curated information from knowledge bases. We believe this thesis research is a solid start towards the new generation of intelligent, semantic, and structured information retrieval.
    Content
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Vgl.: https%3A%2F%2Fwww.cs.cmu.edu%2F~cx%2Fpapers%2Fknowledge_based_text_representation.pdf&usg=AOvVaw0SaTSvhWLTh__Uz_HtOtl3.