Search (7 results, page 1 of 1)

Chen, L.; Zeng, J.; Tokuda, N.: ¬A "stereo" document representation for textual information retrieval (2006) 0.02

0.02047082 = product of:
  0.04094164 = sum of:
    0.04094164 = sum of:
      0.007295696 = weight(_text_:a in 5292) [ClassicSimilarity], result of:
        0.007295696 = score(doc=5292,freq=8.0), product of:
          0.04772363 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.041389145 = queryNorm
          0.15287387 = fieldWeight in 5292, product of:
            2.828427 = tf(freq=8.0), with freq of:
              8.0 = termFreq=8.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.046875 = fieldNorm(doc=5292)
      0.033645947 = weight(_text_:22 in 5292) [ClassicSimilarity], result of:
        0.033645947 = score(doc=5292,freq=2.0), product of:
          0.14493774 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.041389145 = queryNorm
          0.23214069 = fieldWeight in 5292, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=5292)
  0.5 = coord(1/2)
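
The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function and variable names are illustrative and do not come from the search system. Lucene computes in single precision, so the last digits may differ slightly from the tree.

```python
import math


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight, as in the explain tree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm             # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain tree for doc 5292.
MAX_DOCS = 44218
QUERY_NORM = 0.041389145
FIELD_NORM = 0.046875

score_a = term_score(8.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "a"
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "22"

# coord(1/2): only one of the two top-level query clauses matched.
total = 0.5 * (score_a + score_22)

print(f"a: {score_a:.9f}  22: {score_22:.9f}  total: {total:.8f}")
```

The printed values should agree with the tree (0.007295696, 0.033645947 and 0.02047082) to about six decimal places.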

Abstract: A new document representation model is presented in this paper. This model is based on the idea of representing a document by two or more pictures of the document taken from different perspectives. It is shown that by applying the stereo representation model, enhanced textual retrieval performance is achieved because the new model improves the capability of capturing individual features of the document. Experiments have been conducted on two standard corpora, TIME and ADI, using the standard term vector method and the latent semantic indexing (LSI) method based upon both the stereo representation model and the traditional representation model. Statistical t-tests on the experimental results have convincingly illustrated that these methods achieve significant improvements in retrieval performances with the stereo representation model over those with the traditional representation model.
Date: 22. 7.2006 17:33:43
Type: a

Tang, X.; Chen, L.; Cui, J.; Wei, B.: Knowledge representation learning with entity descriptions, hierarchical types, and textual relations (2019) 0.02
```
0.02047082 = product of:
  0.04094164 = sum of:
    0.04094164 = sum of:
      0.007295696 = weight(_text_:a in 5101) [ClassicSimilarity], result of:
        0.007295696 = score(doc=5101,freq=8.0), product of:
          0.04772363 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.041389145 = queryNorm
          0.15287387 = fieldWeight in 5101, product of:
            2.828427 = tf(freq=8.0), with freq of:
              8.0 = termFreq=8.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.046875 = fieldNorm(doc=5101)
      0.033645947 = weight(_text_:22 in 5101) [ClassicSimilarity], result of:
        0.033645947 = score(doc=5101,freq=2.0), product of:
          0.14493774 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.041389145 = queryNorm
          0.23214069 = fieldWeight in 5101, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=5101)
  0.5 = coord(1/2)
```
Abstract

Knowledge representation learning methods usually only utilize triple facts, or just consider one kind of extra information. In this paper, we propose a multi-source knowledge representation learning (MKRL) model, which can combine entity descriptions, hierarchical types, and textual relations with triple facts. Specifically, for entity descriptions, a convolutional neural network is used to get representations. For hierarchical types, weighted hierarchy encoders are used to construct the projection matrices of hierarchical types, and the projection matrix of an entity combines all hierarchical type projection matrices of the entity with the relation-specific type constraints. For textual relations, a sentence-level attention mechanism is employed to get representations. We evaluate the MKRL model on the knowledge graph completion task with the FB15k-237 dataset, and experimental results demonstrate that our model outperforms the state-of-the-art methods, which indicates the effectiveness of multi-source information for knowledge representation.

Date

17. 3.2019 13:22:53

Type

a
Han, B.; Chen, L.; Tian, X.: Knowledge based collection selection for distributed information retrieval (2018) 0.00
```
0.002279905 = product of:
  0.00455981 = sum of:
    0.00455981 = product of:
      0.00911962 = sum of:
        0.00911962 = weight(_text_:a in 3289) [ClassicSimilarity], result of:
          0.00911962 = score(doc=3289,freq=18.0), product of:
            0.04772363 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.041389145 = queryNorm
            0.19109234 = fieldWeight in 3289, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3289)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
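
For single-term matches such as this one, the tree nests two coord(1/2) factors around the term weight. As a worked check that only restates the numbers shown in the tree for doc 3289, and assuming the usual ClassicSimilarity relation tf = sqrt(freq):

```latex
\begin{aligned}
\mathrm{tf} &= \sqrt{18} \approx 4.2426405 \\
\mathrm{queryWeight} &= \mathrm{idf} \times \mathrm{queryNorm}
  = 1.153047 \times 0.041389145 \approx 0.04772363 \\
\mathrm{fieldWeight} &= \mathrm{tf} \times \mathrm{idf} \times \mathrm{fieldNorm}
  = 4.2426405 \times 1.153047 \times 0.0390625 \approx 0.19109234 \\
\mathrm{score} &= \mathrm{coord}(1/2) \times \mathrm{coord}(1/2) \times \mathrm{queryWeight} \times \mathrm{fieldWeight} \\
  &= 0.5 \times 0.5 \times 0.04772363 \times 0.19109234 \approx 0.002279905
\end{aligned}
```
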
Abstract

Recent years have seen a great deal of work on collection selection. Most collection selection methods use a central sample index (CSI), which consists of some documents sampled from each collection, as the collection description. The limitations of these methods are the usage of 'flat' meaning representations that ignore structure and relationships among words in the CSI, and the calculation of a query-collection similarity metric that ignores the semantic distance between query words and indexed words. In this paper, we propose a knowledge based collection selection method (KBCS) to improve collection representation and the query-collection similarity metric. KBCS models a collection as a weighted entity set and applies a novel query-collection similarity metric to select highly scored collections. Specifically, in the part of collection representation, context- and structure-based measures are employed to weight the semantic distance between two entities extracted from the sampled documents of a collection. In addition, the novel query-collection similarity metric takes the entity weight, collection size, and other factors into account. To enrich concepts contained in a query, DBpedia based query expansion is integrated. Finally, extensive experiments were conducted on a large webpage dataset, and DBpedia was chosen as the graph knowledge base. Experimental results demonstrate the effectiveness of KBCS.

Type

a
Chen, L.; Holsapple, C.W.; Hsiao, S.-H.; Ke, Z.; Oh, J.-Y.; Yang, Z.: Knowledge-dissemination channels : analytics of stature evaluation (2017) 0.00
```
0.0020106873 = product of:
  0.0040213745 = sum of:
    0.0040213745 = product of:
      0.008042749 = sum of:
        0.008042749 = weight(_text_:a in 3531) [ClassicSimilarity], result of:
          0.008042749 = score(doc=3531,freq=14.0), product of:
            0.04772363 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.041389145 = queryNorm
            0.1685276 = fieldWeight in 3531, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3531)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Understanding relative statures of channels for disseminating knowledge is of practical interest to both generators and consumers of knowledge flows. For generators, stature can influence attractiveness of alternative dissemination routes and deliberations of those who assess generator performance. For knowledge consumers, channel stature may influence knowledge content to which they are exposed. This study introduces a novel approach to conceptualizing and measuring stature of knowledge-dissemination channels: the power-impact (PI) technique. It is a flexible technique having 3 complementary variants, giving holistic insights about channel stature by accounting for both attraction of knowledge generators to a distribution channel and degree to which knowledge consumers choose to use a channel's knowledge content. Each PI variant is expressed in terms of multiple parameters, permitting customization of stature evaluation to suit its user's preferences. In the spirit of analytics, each PI variant is driven by objective evidence of actual behaviors. The PI technique is based on 2 building blocks: (a) power that channels have for attracting results of generators' knowledge work, and (b) impact that channel contents exhibit on prospective recipients. Feasibility and functionality of the PI-technique design are demonstrated by applying it to solve a problem of journal stature evaluation for the information-systems discipline.

Type

a
Chen, L.; Ding, J.; Larivière, V.: Measuring the citation context of national self-references : how a web journal club is used (2022) 0.00
```
0.0020106873 = product of:
  0.0040213745 = sum of:
    0.0040213745 = product of:
      0.008042749 = sum of:
        0.008042749 = weight(_text_:a in 545) [ClassicSimilarity], result of:
          0.008042749 = score(doc=545,freq=14.0), product of:
            0.04772363 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.041389145 = queryNorm
            0.1685276 = fieldWeight in 545, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=545)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The emphasis on research evaluation has brought scrutiny to the role of self-citations in the scholarly communication process. While author self-citations have been studied at length, little is known on national-level self-references (SRs). This paper analyses the citation context of national SRs, using the full-text of 184,859 papers published in PLOS journals. It investigates the differences between national SRs and nonself-references (NSRs) in terms of their in-text mention, presence in enumerations, and location features. For all countries, national SRs exhibit a higher level of engagement than NSRs. NSRs are more often found in enumerative citances than SRs, which suggests that researchers pay more attention to domestic than foreign studies. There are more mentions of national research in the methods section, which provides evidence that methodologies developed in a nation are more likely to be used by other researchers from the same nation. Publications from the United States are cited at a higher rate in each of the sections, indicating that the country still maintains a dominant position in science. On the whole, this paper contributes to a better understanding of the role of national SRs in the scholarly communication system, and how it varies across countries and over time.

Type

a
Chen, L.; Fang, H.: ¬An automatic method for extracting innovative ideas based on the Scopus® database (2019) 0.00
```
0.0018615347 = product of:
  0.0037230693 = sum of:
    0.0037230693 = product of:
      0.0074461387 = sum of:
        0.0074461387 = weight(_text_:a in 5310) [ClassicSimilarity], result of:
          0.0074461387 = score(doc=5310,freq=12.0), product of:
            0.04772363 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.041389145 = queryNorm
            0.15602624 = fieldWeight in 5310, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5310)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The novelty of knowledge claims in a research paper can be considered an evaluation criterion for papers to supplement citations. To provide a foundation for research evaluation from the perspective of innovativeness, we propose an automatic approach for extracting innovative ideas from the abstracts of technology and engineering papers. The approach extracts N-grams as candidates based on part-of-speech tagging and determines whether they are novel by checking the Scopus® database to determine whether they had ever been presented previously. Moreover, we discussed the distributions of innovative ideas in different abstract structures. To improve the performance by excluding noisy N-grams, a list of stopwords and a list of research description characteristics were developed. We selected abstracts of articles published from 2011 to 2017 with the topic of semantic analysis as the experimental texts. Excluding noisy N-grams, considering the distribution of innovative ideas in abstracts, and suitably combining N-grams can effectively improve the performance of automatic innovative idea extraction. Unlike co-word and co-citation analysis, innovative-idea extraction aims to identify the differences in a paper from all previously published papers.

Type

a
Xie, H.; Li, X.; Wang, T.; Lau, R.Y.K.; Wong, T.-L.; Chen, L.; Wang, F.L.; Li, Q.: Incorporating sentiment into tag-based user profiles and resource profiles for personalized search in folksonomy (2016) 0.00
```
0.0012159493 = product of:
  0.0024318986 = sum of:
    0.0024318986 = product of:
      0.004863797 = sum of:
        0.004863797 = weight(_text_:a in 2671) [ClassicSimilarity], result of:
          0.004863797 = score(doc=2671,freq=8.0), product of:
            0.04772363 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.041389145 = queryNorm
            0.10191591 = fieldWeight in 2671, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03125 = fieldNorm(doc=2671)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In recent years, there has been a rapid growth of user-generated data in collaborative tagging (a.k.a. folksonomy-based) systems due to the prevalence of Web 2.0 communities. To effectively assist users to find their desired resources, it is critical to understand user behaviors and preferences. Tag-based profile techniques, which model users and resources by a vector of relevant tags, are widely employed in folksonomy-based systems. This is mainly because personalized search and recommendations can be facilitated by measuring relevance between user profiles and resource profiles. However, conventional measurements neglect the sentiment aspect of user-generated tags. In fact, tags can be very emotional and subjective, as users usually express their perceptions and feelings about the resources by tags. Therefore, it is necessary to take sentiment relevance into account in measurements. In this paper, we present a novel generic framework SenticRank to incorporate various sentiment information to various sentiment-based information for personalized search by user profiles and resource profiles. In this framework, content-based sentiment ranking and collaborative sentiment ranking methods are proposed to obtain sentiment-based personalized ranking. To the best of our knowledge, this is the first work to integrate sentiment information to address the problem of personalized tag-based search in collaborative tagging systems. Moreover, we compare the proposed sentiment-based personalized search with baselines in the experiments, the results of which have verified the effectiveness of the proposed framework. In addition, we study the influence of popular sentiment dictionaries, and SenticNet is the most prominent knowledge base to boost the performance of personalized search in folksonomy.

Type

a
