Search (3 results, page 1 of 1)

  • theme_ss:"Wissensrepräsentation"
  1. Zeng, Q.; Yu, M.; Yu, W.; Xiong, J.; Shi, Y.; Jiang, M.: Faceted hierarchy : a new graph type to organize scientific concepts and a construction method (2019) 0.07
    
    Abstract
    In a scientific concept hierarchy, a parent concept may have several attributes, each of which takes multiple values that form a group of child concepts. We call these attributes facets: classification, for example, has facets such as application (e.g., face recognition), model (e.g., SVM, kNN), and metric (e.g., precision). In this work, we aim to build faceted concept hierarchies from scientific literature. Hierarchy construction methods rely heavily on hypernym detection; however, faceted relations are direct parent-to-child links, whereas the hypernym relation is a multi-hop, i.e., ancestor-to-descendant, link with the specific facet "type-of". We use information extraction techniques to find synonyms, sibling concepts, and ancestor-descendant relations in a data science corpus, and we propose a hierarchy growth algorithm that infers parent-child links from these three types of relations, resolving conflicts so as to maintain the acyclic structure of the hierarchy.
    Content
    Cf.: https://aclanthology.org/D19-5317.pdf. (An illustrative sketch of the hierarchy-growth idea follows this entry.)
    Source
    Graph-Based Methods for Natural Language Processing - proceedings of the Thirteenth Workshop (TextGraphs-13): November 4, 2019, Hong Kong : EMNLP-IJCNLP 2019. Ed.: Dmitry Ustalov
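
A minimal sketch of the hierarchy-growth idea from the abstract above, assuming synonym, sibling, and ancestor-descendant pairs have already been extracted. The concept names, relation lists, and the simple cycle check are hypothetical and do not reproduce the authors' algorithm.

from collections import defaultdict


def creates_cycle(children, parent, child):
    """Return True if adding the edge parent -> child would close a cycle."""
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return False


def grow_hierarchy(ancestor_pairs, sibling_pairs, synonym_pairs):
    """Infer direct parent -> child links from three extracted relation types."""
    # Collapse synonyms onto one canonical concept name.
    canon = {}
    for a, b in synonym_pairs:
        canon[b] = canon.get(a, a)

    def norm(concept):
        return canon.get(concept, concept)

    children = defaultdict(list)
    # Ancestor-descendant evidence proposes candidate parent links.
    for parent, child in ((norm(p), norm(c)) for p, c in ancestor_pairs):
        if parent != child and not creates_cycle(children, parent, child):
            children[parent].append(child)
    # A sibling inherits the parent of its sibling if that parent is known.
    for a, b in ((norm(p), norm(c)) for p, c in sibling_pairs):
        for parent, kids in list(children.items()):
            if a in kids and b not in kids and not creates_cycle(children, parent, b):
                kids.append(b)
    return dict(children)


if __name__ == "__main__":
    print(grow_hierarchy(
        ancestor_pairs=[("classification", "svm"), ("classification", "knn")],
        sibling_pairs=[("svm", "decision tree")],
        synonym_pairs=[("svm", "support vector machine")],
    ))
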
  2. Stojanovic, N.: Ontology-based Information Retrieval : methods and tools for cooperative query answering (2005) 0.06
    
    Abstract
    With the explosion of possibilities for ubiquitous content production, the information overload problem has reached a level of complexity that can no longer be managed by traditional modelling approaches. Because of their purely syntactical nature, traditional information retrieval approaches have not succeeded in treating the content itself (i.e. its meaning, not merely its representation). This leads to retrieval results of very limited usefulness for the user's task at hand. In the last ten years, ontologies have evolved from an interesting conceptualisation paradigm into a very promising (semantic) modelling technology, especially in the context of the Semantic Web. From the information retrieval point of view, ontologies enable a machine-understandable form of content description, so that the retrieval process can be driven by the meaning of the content. However, the retrieval process is inherently ambiguous: a user who is unfamiliar with the underlying repository and/or query syntax only approximates his information need in a query. This makes it necessary to involve the user more actively in the retrieval process in order to close the gap between the meaning of the content and the meaning of the user's query (i.e. his information need). This thesis lays the foundation for such an ontology-based interactive retrieval process, in which the retrieval system interacts with the user in order to interpret the meaning of his query conceptually, while the underlying domain ontology drives the conceptualisation process. In this way the retrieval process evolves from a query evaluation process into a highly interactive cooperation between the user and the retrieval system, in which the system tries to anticipate the user's information need and to deliver the relevant content proactively. Moreover, the notion of the relevance of content for a user's query evolves from a content-dependent artefact into a multidimensional, context-dependent structure that is strongly influenced by the user's preferences. This cooperation process is realized as the so-called Librarian Agent Query Refinement Process. In order to clarify the impact of an ontology on the retrieval process (regarding its complexity and quality), a set of methods and tools for different levels of content and query formalisation is developed, ranging from pure ontology-based inferencing to keyword-based querying in which semantics emerges automatically from the results. Our evaluation studies have shown that the ability to conceptualise a user's information need correctly and to interpret the retrieval results accordingly is a key issue in realising much more meaningful information retrieval systems.
    Content
    Cf.: http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/1627. (An illustrative sketch of ontology-based query refinement follows below.)
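
A toy sketch of the ontology-based query refinement idea from the abstract above, assuming a small domain ontology with lexical labels. The ontology, the labels, and the disambiguation strategy are invented for illustration and are not the thesis's Librarian Agent implementation.

TOY_ONTOLOGY = {
    # concept: (direct subconcepts, lexical labels that users might type)
    "Bank": (["FinancialBank", "RiverBank"], ["bank"]),
    "FinancialBank": (["CentralBank"], ["bank", "credit institution"]),
    "RiverBank": ([], ["bank", "riverside"]),
    "CentralBank": ([], ["central bank"]),
}


def concepts_for(term):
    """All concepts whose lexical labels match the ambiguous query term."""
    term = term.lower()
    return [c for c, (_, labels) in TOY_ONTOLOGY.items()
            if any(term in label for label in labels)]


def refinement_candidates(term):
    """Subconcepts the system could offer the user to narrow the query."""
    candidates = set()
    for concept in concepts_for(term):
        subconcepts, _ = TOY_ONTOLOGY[concept]
        candidates.update(subconcepts)
    return sorted(candidates)


if __name__ == "__main__":
    print(concepts_for("bank"))            # several conceptual readings
    print(refinement_candidates("bank"))   # refinements shown to the user
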
  3. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.05
    
    Abstract
    The successes of information retrieval (IR) in recent decades were built upon bag-of-words representations. Effective as it is, bag-of-words offers only a shallow understanding of text; there is a limited amount of information for document ranking in the word space. This dissertation goes beyond words and builds knowledge-based text representations, which embed external, carefully curated information from knowledge bases and provide richer, structured evidence for more advanced information retrieval systems. This thesis research first builds query representations with entities associated with the query. Entities' descriptions are used by query expansion techniques that enrich the query with explanation terms. Then we present a general framework that represents a query with entities that appear in the query, are retrieved by the query, or frequently show up in the top retrieved documents. A latent space model is developed to jointly learn the connections from query to entities and the ranking of documents, modeling the external evidence from knowledge bases and internal ranking features cooperatively. To further improve the quality of relevant entities, a defining factor of our query representations, we apply learning to rank to entity search and retrieve better entities from knowledge bases. In the document representation part, this thesis research also moves one step forward with a bag-of-entities model, in which documents are represented by their automatic entity annotations and the ranking is performed in the entity space.
    This proposal includes plans to improve the quality of relevant entities with a co-learning framework that learns from both entity labels and document labels. We also plan to develop a hybrid ranking system that combines word-based and entity-based representations while taking their uncertainties into account. Finally, we plan to enrich the text representations with connections between entities. We propose several ways to infer entity graph representations for texts and to rank documents using these structured representations. This dissertation overcomes the limitation of word-based representations with external, carefully curated information from knowledge bases. We believe this thesis research is a solid start towards a new generation of intelligent, semantic, and structured information retrieval.
    Content
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Cf.: https://www.cs.cmu.edu/~cx/papers/knowledge_based_text_representation.pdf. (An illustrative sketch of the bag-of-entities idea follows below.)
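
A minimal sketch of the bag-of-entities idea from the abstract above, assuming documents and queries have already been annotated with entities. The annotations are hypothetical placeholders for an entity linker's output, and plain cosine similarity stands in for the learned ranking models described in the abstract.

import math
from collections import Counter


def bag_of_entities(annotations):
    """Turn a list of linked entity IDs into a sparse frequency vector."""
    return Counter(annotations)


def cosine(q, d):
    """Cosine similarity between two sparse entity-frequency vectors."""
    dot = sum(q[e] * d.get(e, 0) for e in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def rank(query_entities, docs):
    """Order documents by entity-space similarity to the query."""
    q = bag_of_entities(query_entities)
    scored = [(cosine(q, bag_of_entities(ents)), doc_id) for doc_id, ents in docs]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    docs = [
        ("d1", ["Information_retrieval", "Knowledge_base", "Entity_linking"]),
        ("d2", ["Bag-of-words_model", "Text_mining"]),
    ]
    print(rank(["Knowledge_base", "Information_retrieval"], docs))
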
