Search (6 results, page 1 of 1)

Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.02
```
0.020948619 = product of:
  0.041897237 = sum of:
    0.041897237 = sum of:
      0.010696997 = weight(_text_:a in 5290) [ClassicSimilarity], result of:
        0.010696997 = score(doc=5290,freq=20.0), product of:
          0.053105544 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.046056706 = queryNorm
          0.20142901 = fieldWeight in 5290, product of:
            4.472136 = tf(freq=20.0), with freq of:
              20.0 = termFreq=20.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5290)
      0.03120024 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
        0.03120024 = score(doc=5290,freq=2.0), product of:
          0.16128273 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046056706 = queryNorm
          0.19345059 = fieldWeight in 5290, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5290)
  0.5 = coord(1/2)
```
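The explain tree above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the formulas implied by the numbers shown (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1)), and the final score scaled by the coord factor). The function names `idf` and `term_score` are illustrative and not part of any library; the constants are copied from the explain output for doc 5290.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency as implied by the explain output:
    # 1 + ln(maxDocs / (docFreq + 1)). Assumed, not taken from library source.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, following the explain tree.
    tf = math.sqrt(freq)                          # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm          # idf * queryNorm
    field_weight = tf * term_idf * field_norm     # tf * idf * fieldNorm
    return query_weight * field_weight


# Constants copied from the explain output for doc 5290.
MAX_DOCS = 44218
QUERY_NORM = 0.046056706
FIELD_NORM = 0.0390625

score_a = term_score(20.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(1/2) = 0.5 scales the summed term scores.
total = 0.5 * (score_a + score_22)
print(f"term 'a':  {score_a:.9f}")
print(f"term '22': {score_22:.9f}")
print(f"total:     {total:.9f}")
```

The result should land close to the 0.020948619 reported above; small differences in the last digits are expected because the search engine works in single-precision floats while this sketch uses double precision.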
Abstract

Document keyphrases provide a concise summary of a document's content and serve as semantic metadata for it. They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have keyphrases assigned by authors, and it is time-consuming and costly to manually assign keyphrases to documents, it is necessary to develop an algorithm to automatically generate keyphrases for documents. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human-identified phrases to assign weights to the candidate keyphrases. The logic of our algorithm is: the more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding newly identified keyphrases to the database. KIP's personalization feature will let the user build a glossary database specifically suitable for the area of his/her interest. The evaluation results show that KIP's performance is better than that of the systems we compared it with and that the learning function is effective.

Date

22. 7.2006 17:25:48

Type

a
Li, Q.; Wu, Y.-f.B.: People search : searching people sharing similar interests from the Web (2008) 0.00
```
0.0028703054 = product of:
  0.005740611 = sum of:
    0.005740611 = product of:
      0.011481222 = sum of:
        0.011481222 = weight(_text_:a in 1344) [ClassicSimilarity], result of:
          0.011481222 = score(doc=1344,freq=16.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.2161963 = fieldWeight in 1344, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=1344)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

On the Web, there are limited ways of finding people who share similar interests with a given person. The current methods are either ineffective or time-consuming. In this paper, we present a new approach for searching for people sharing similar interests from the Web. Given a person, to find similar people from the Web, there are two major research issues: person representation and matching persons. In this study, we propose a person representation method which uses a person's website to represent this person. Our design of the matching process takes person representation into consideration to allow the same representation to be used when composing the query. Under this person representation method, the proposed algorithm integrates textual content and hyperlink information of all the pages belonging to a personal website to represent a person and match persons. Other algorithms are also explored and compared to the proposed algorithm. Experimental results are presented.

Type

a
Zhang, Z.; Li, Q.; Zeng, D.; Ga, H.: Extracting evolutionary communities in community question answering (2014) 0.00
```
0.0016913437 = product of:
  0.0033826875 = sum of:
    0.0033826875 = product of:
      0.006765375 = sum of:
        0.006765375 = weight(_text_:a in 1286) [ClassicSimilarity], result of:
          0.006765375 = score(doc=1286,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12739488 = fieldWeight in 1286, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1286)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

With the rapid growth of Web 2.0, community question answering (CQA) has become a prevalent information seeking channel, in which users form interactive communities by posting questions and providing answers. Communities may evolve over time, because of changes in users' interests, activities, and new users joining the network. To better understand user interactions in CQA communities, it is necessary to analyze the community structures and track community evolution over time. Existing work in CQA focuses on question searching or content quality detection, and the important problems of community extraction and evolutionary pattern detection have not been studied. In this article, we propose a probabilistic community model (PCM) to extract overlapping community structures and capture their evolution patterns in CQA. The empirical results show that our algorithm appears to improve the community extraction quality. We show empirically, using the iPhone data set, that interesting community evolution patterns can be discovered, with each evolution pattern reflecting the variation of users' interests over time. Our analysis suggests that individual users could benefit by gaining comprehensive information from tracking the transition of products. We also show that the communities provide a decision-making basis for business.

Type

a
Li, Q.; Chen, Y.P.; Myaeng, S.-H.; Jin, Y.; Kang, B.-Y.: Concept unification of terms in different languages via web mining for Information Retrieval (2009) 0.00
```
0.0014647468 = product of:
  0.0029294936 = sum of:
    0.0029294936 = product of:
      0.005858987 = sum of:
        0.005858987 = weight(_text_:a in 4215) [ClassicSimilarity], result of:
          0.005858987 = score(doc=4215,freq=6.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.11032722 = fieldWeight in 4215, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4215)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalences in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalences in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries that cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.

Type

a
Xie, H.; Li, X.; Wang, T.; Lau, R.Y.K.; Wong, T.-L.; Chen, L.; Wang, F.L.; Li, Q.: Incorporating sentiment into tag-based user profiles and resource profiles for personalized search in folksonomy (2016) 0.00
```
0.001353075 = product of:
  0.00270615 = sum of:
    0.00270615 = product of:
      0.0054123 = sum of:
        0.0054123 = weight(_text_:a in 2671) [ClassicSimilarity], result of:
          0.0054123 = score(doc=2671,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.10191591 = fieldWeight in 2671, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03125 = fieldNorm(doc=2671)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In recent years, there has been a rapid growth of user-generated data in collaborative tagging (a.k.a. folksonomy-based) systems due to the prevalence of Web 2.0 communities. To effectively assist users in finding their desired resources, it is critical to understand user behaviors and preferences. Tag-based profile techniques, which model users and resources by a vector of relevant tags, are widely employed in folksonomy-based systems. This is mainly because personalized search and recommendations can be facilitated by measuring relevance between user profiles and resource profiles. However, conventional measurements neglect the sentiment aspect of user-generated tags. In fact, tags can be very emotional and subjective, as users usually express their perceptions and feelings about the resources by tags. Therefore, it is necessary to take sentiment relevance into account in measurements. In this paper, we present a novel generic framework, SenticRank, to incorporate various kinds of sentiment information into personalized search based on user profiles and resource profiles. In this framework, content-based sentiment ranking and collaborative sentiment ranking methods are proposed to obtain sentiment-based personalized ranking. To the best of our knowledge, this is the first work to integrate sentiment information to address the problem of personalized tag-based search in collaborative tagging systems. Moreover, we compare the proposed sentiment-based personalized search with baselines in the experiments, the results of which have verified the effectiveness of the proposed framework. In addition, we study the influence of popular sentiment dictionaries and find that SenticNet is the most prominent knowledge base for boosting the performance of personalized search in folksonomy.

Type

a

Miao, Q.; Li, Q.; Zeng, D.: Fine-grained opinion mining by integrating multiple review sources (2010) 0.00
```
0.0011839407 = product of:
  0.0023678814 = sum of:
    0.0023678814 = product of:
      0.0047357627 = sum of:
        0.0047357627 = weight(_text_:a in 4104) [ClassicSimilarity], result of:
          0.0047357627 = score(doc=4104,freq=2.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.089176424 = fieldWeight in 4104, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4104)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Type

a
