Search (14 results, page 1 of 1)

  • author_ss:"Li, X."
  1. Yan, X.; Li, X.; Song, D.: A correlation analysis on LSA and HAL semantic space models (2004) 0.06
    0.058677156 = product of:
      0.14669289 = sum of:
        0.028076671 = weight(_text_:retrieval in 2152) [ClassicSimilarity], result of:
          0.028076671 = score(doc=2152,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.20052543 = fieldWeight in 2152, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2152)
        0.118616216 = weight(_text_:semantic in 2152) [ClassicSimilarity], result of:
          0.118616216 = score(doc=2152,freq=10.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.616327 = fieldWeight in 2152, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=2152)
      0.4 = coord(2/5)
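    This breakdown is Lucene's ClassicSimilarity (TF-IDF) explain output: each term's weight is queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = sqrt(freq) × idf × fieldNorm, and the final score multiplies the summed term weights by the coordination factor coord(matched terms / query terms). A minimal Python sketch reproducing the score above from the values shown (the helper name is ours, not Lucene's):

        import math

        QUERY_NORM = 0.04628742  # queryNorm from the explain tree

        def term_weight(freq, idf, field_norm, query_norm=QUERY_NORM):
            """ClassicSimilarity term weight: queryWeight * fieldWeight,
            i.e. (idf * queryNorm) * (sqrt(freq) * idf * fieldNorm)."""
            query_weight = idf * query_norm          # e.g. 3.024915 * 0.04628742 -> 0.14001551
            field_weight = math.sqrt(freq) * idf * field_norm
            return query_weight * field_weight

        # Values copied from the explain output for doc 2152 above.
        w_retrieval = term_weight(freq=2.0,  idf=3.024915,  field_norm=0.046875)
        w_semantic  = term_weight(freq=10.0, idf=4.1578603, field_norm=0.046875)

        score = (2 / 5) * (w_retrieval + w_semantic)  # coord(2/5): 2 of 5 query terms matched
        print(score)  # ~0.058677, matching 0.058677156 up to 32-bit float rounding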
    
    Abstract
    In this paper, we compare a well-known semantic space model, Latent Semantic Analysis (LSA), with another model, Hyperspace Analogue to Language (HAL), which is widely used in different areas, especially in automatic query refinement. We conduct this comparative analysis to test our hypothesis that, with respect to the ability to extract lexical information from a corpus of text, LSA is quite similar to HAL. We regard HAL and LSA as black boxes. Through a Pearson correlation analysis of the outputs of these two black boxes, we conclude that LSA correlates highly with HAL, and thus there is justification that LSA and HAL can potentially play a similar role in facilitating automatic query refinement. This paper evaluates LSA in a new application area and contributes an effective way to compare different semantic space models.
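    The comparison method reduces to a Pearson correlation between the two models' outputs. A minimal sketch of that step with scipy (the score vectors are hypothetical; in the paper each black-box model scores the same items over a common corpus):

        from scipy.stats import pearsonr

        # Hypothetical outputs: similarity scores the two "black box" models
        # assign to the same list of word pairs.
        lsa_scores = [0.81, 0.12, 0.45, 0.66, 0.09, 0.73]
        hal_scores = [0.78, 0.20, 0.40, 0.70, 0.15, 0.69]

        r, p_value = pearsonr(lsa_scores, hal_scores)
        print(f"Pearson r = {r:.3f} (p = {p_value:.4f})")
        # A high r would support the hypothesis that LSA and HAL extract
        # similar lexical information from the same corpus.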
    Object
    Latent Semantic Analysis
    Theme
    Semantic environment in indexing and retrieval
  2. Li, X.: Designing an interactive Web tutorial with cross-browser dynamic HTML (2000) 0.02
    0.020597784 = product of:
      0.10298892 = sum of:
        0.10298892 = sum of:
          0.06536108 = weight(_text_:web in 4897) [ClassicSimilarity], result of:
            0.06536108 = score(doc=4897,freq=8.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.43268442 = fieldWeight in 4897, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.046875 = fieldNorm(doc=4897)
          0.03762784 = weight(_text_:22 in 4897) [ClassicSimilarity], result of:
            0.03762784 = score(doc=4897,freq=2.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.23214069 = fieldWeight in 4897, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=4897)
      0.2 = coord(1/5)
    
    Abstract
    Texas A&M University Libraries developed a Web-based training (WBT) application for LandView III, a federal depository CD-ROM publication, using cross-browser dynamic HTML (DHTML) and other Web technologies. The interactive, self-paced tutorial demonstrates the major features of the CD-ROM and shows how to navigate its programs. The tutorial features dynamic HTML techniques such as hiding, showing, and moving layers; dragging objects; and Windows-style drop-down menus. It also integrates interactive forms, the common gateway interface (CGI), frames, and animated GIF images into the design of the WBT. After describing the design and implementation of the tutorial project, the article evaluates usage statistics and user feedback, assesses the tutorial's strengths and weaknesses, and compares it with other common types of training methods. The article describes an innovative approach to CD-ROM training that uses advanced Web technologies such as dynamic HTML to simulate and demonstrate the interactive use of the CD-ROM, as well as the actual search process of a database.
    Date
    28. 1.2006 19:21:22
  3. Xu, G.; Cao, Y.; Ren, Y.; Li, X.; Feng, Z.: Network security situation awareness based on semantic ontology and user-defined rules for Internet of Things (2017) 0.02
    0.015313287 = product of:
      0.076566435 = sum of:
        0.076566435 = weight(_text_:semantic in 306) [ClassicSimilarity], result of:
          0.076566435 = score(doc=306,freq=6.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.39783734 = fieldWeight in 306, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0390625 = fieldNorm(doc=306)
      0.2 = coord(1/5)
    
    Abstract
    The Internet of Things (IoT) brings the third development wave of the global information industry, making users, networks and perception devices cooperate more closely. However, if the IoT has security problems, it may cause a variety of damage and even threaten human lives and property. To improve the ability to monitor, provide emergency response and predict the development trend of IoT security, a new paradigm called network security situation awareness (NSSA) has been proposed. However, it is limited in its ability to mine and evaluate security situation elements from multi-source heterogeneous network security information. To solve this problem, this paper proposes an IoT network security situation awareness model that uses a situation reasoning method based on semantic ontology and user-defined rules. Ontology technology can provide a unified and formalized description to solve the problem of semantic heterogeneity in the IoT security domain. In this paper, four key sub-domains are proposed to reflect an IoT security situation: context, attack, vulnerability and network flow. Further, user-defined rules can compensate for the limited descriptive ability of the ontology and hence enhance the reasoning ability of the proposed ontology model. Examples in real IoT scenarios show that network security situation awareness adopting our situation reasoning method is more comprehensive and has more powerful reasoning abilities than traditional NSSA methods. [http://ieeexplore.ieee.org/abstract/document/7999187/]
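    The modeling idea, an ontology over the four sub-domains plus user-defined rules that reason over its facts, can be sketched with rdflib; the namespace, class names, and the rule itself are illustrative inventions, not the paper's actual ontology:

        from rdflib import Graph, Namespace, RDF

        SEC = Namespace("http://example.org/iot-security#")
        g = Graph()

        # Facts spanning the four proposed sub-domains.
        g.add((SEC.cam01, RDF.type, SEC.Device))
        g.add((SEC.cam01, SEC.hasVulnerability, SEC.CVE_2017_XXXX))
        g.add((SEC.flow42, RDF.type, SEC.NetworkFlow))
        g.add((SEC.flow42, SEC.targets, SEC.cam01))
        g.add((SEC.flow42, SEC.matchesSignature, SEC.PortScan))

        # A user-defined rule layered on the ontology: a flow matching an attack
        # signature and targeting a vulnerable device raises the situation level.
        def rule_vulnerable_target(graph):
            alerts = []
            for flow in graph.subjects(RDF.type, SEC.NetworkFlow):
                for device in graph.objects(flow, SEC.targets):
                    if ((flow, SEC.matchesSignature, None) in graph
                            and (device, SEC.hasVulnerability, None) in graph):
                        alerts.append((flow, device, "HIGH"))
            return alerts

        print(rule_vulnerable_target(g))  # [(flow42, cam01, 'HIGH')]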
  4. Li, X.; Schijvenaars, B.J.A.; Rijke, M.de: Investigating queries and search failures in academic search (2017) 0.01
    0.011844519 = product of:
      0.029611295 = sum of:
        0.01871778 = weight(_text_:retrieval in 5033) [ClassicSimilarity], result of:
          0.01871778 = score(doc=5033,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.13368362 = fieldWeight in 5033, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=5033)
        0.010893514 = product of:
          0.021787029 = sum of:
            0.021787029 = weight(_text_:web in 5033) [ClassicSimilarity], result of:
              0.021787029 = score(doc=5033,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.14422815 = fieldWeight in 5033, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5033)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Academic search concerns the retrieval and profiling of information objects in the domain of academic research. In this paper we report important observations about academic search queries and provide an algorithmic solution to one type of failure during search sessions: null queries. We start by providing a general characterization of academic search queries, based on a large-scale transaction log of a leading academic search engine. Unlike previous small-scale analyses of academic search queries, we find important differences from the query characteristics known from web search. For example, academic search has a substantially larger proportion of entity queries and a heavier tail in the query length distribution. We then focus on search failures and, in particular, on null queries that lead to an empty search engine result page, on null sessions that contain such null queries, and on users who are prone to issuing null queries. In academic search approximately 1 in 10 queries is a null query, and 25% of sessions contain a null query. They appear in different types of search sessions and prevent users from achieving their search goals. To address the high rate of null queries in academic search, we consider the task of providing query suggestions. Specifically, we focus on a highly frequent query type: non-Boolean informational queries. To this end we need to overcome query sparsity and make effective use of session information. We find that using entities helps to surface more relevant query suggestions in the face of query sparsity. We also find that query suggestions are more effective when conditioned on the type of session in which they are offered. After casting the session classification problem as a multi-label classification problem, we generate session-conditional query suggestions based on the predicted session type. We find that this session-conditional method leads to significant improvements over a generic query suggestion method. Personalization yields very little further improvement over session-conditional query suggestions.
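    The log-analysis step behind the reported rates amounts to counting null queries and the sessions containing them. A minimal sketch under an assumed log schema of (session_id, query, result_count); the rows and field layout are illustrative, not the paper's actual data:

        from collections import defaultdict

        # Hypothetical transaction-log rows: (session_id, query, result_count).
        log = [
            ("s1", "entity linking survey", 14),
            ("s1", "entity linkin survey site:acm", 0),   # null query
            ("s2", "transformer retrieval", 213),
            ("s3", "p53 AND NOT review 2016 exact", 0),   # null query
            ("s3", "p53 review", 887),
        ]

        null_queries = [row for row in log if row[2] == 0]
        sessions = defaultdict(list)
        for sid, _, n in log:
            sessions[sid].append(n)
        null_sessions = [sid for sid, counts in sessions.items() if 0 in counts]

        print(f"null-query rate:   {len(null_queries) / len(log):.0%}")       # ~1 in 10 in the paper
        print(f"null-session rate: {len(null_sessions) / len(sessions):.0%}") # ~25% in the paper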
  5. Li, X.: A new robust relevance model in the language model framework (2008) 0.01
    0.008105038 = product of:
      0.040525187 = sum of:
        0.040525187 = weight(_text_:retrieval in 2076) [ClassicSimilarity], result of:
          0.040525187 = score(doc=2076,freq=6.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.28943354 = fieldWeight in 2076, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2076)
      0.2 = coord(1/5)
    
    Abstract
    In this paper, a new robust relevance model is proposed that can be applied to both pseudo and true relevance feedback in the language-modeling framework for document retrieval. There are at least three main differences between our new relevance model and other relevance models. First, the proposed model brings the original query back into the relevance model by treating it as a short, special document, in addition to a number of top-ranked documents returned from the first-round retrieval for pseudo feedback, or a number of relevant documents for true relevance feedback. Second, instead of using a uniform prior as in the original relevance model proposed by Lavrenko and Croft, documents are assigned different priors according to their lengths (in terms) and their ranks in the first-round retrieval. Third, the probability of a term in the relevance model is further adjusted by its probability in a background language model. In both the pseudo and true relevance cases, we have compared the performance of our model to that of two baselines: the original relevance model and a linear combination model. Our experimental results show that the proposed model outperforms both baselines in terms of mean average precision.
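    The three modifications map directly onto a standard relevance-model estimator, P(w|R) proportional to the prior-weighted sum of P(w|D) over feedback documents: the query joins the document set as a short pseudo-document, the document prior becomes length- and rank-dependent, and the result is discounted against a background model. A minimal sketch, with the prior and background adjustment as simplified stand-ins for the paper's actual formulas:

        from collections import Counter

        def lm(doc_tokens):
            """Maximum-likelihood unigram language model of a token list."""
            counts = Counter(doc_tokens)
            total = sum(counts.values())
            return {w: c / total for w, c in counts.items()}

        def relevance_model(query, feedback_docs, background, rank_decay=0.8):
            docs = [query] + feedback_docs        # difference 1: query joins the set
            p_rel = Counter()
            for rank, doc in enumerate(docs):
                # difference 2: non-uniform prior from length and rank (illustrative)
                prior = (len(doc) ** 0.5) * (rank_decay ** rank)
                for w, p in lm(doc).items():
                    p_rel[w] += prior * p
            # difference 3: discount terms that are likely under the background model
            adjusted = {w: p / (1.0 + background.get(w, 1e-6)) for w, p in p_rel.items()}
            z = sum(adjusted.values())
            return {w: p / z for w, p in adjusted.items()}

        query = ["robust", "relevance", "model"]
        fb = [["relevance", "feedback", "improves", "retrieval"],
              ["language", "model", "smoothing", "relevance"]]
        bg = {"the": 0.05, "relevance": 0.001, "model": 0.002}
        print(sorted(relevance_model(query, fb, bg).items(), key=lambda kv: -kv[1])[:3])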
  6. Li, X.; Fullerton, J.P.: Create, edit, and manage Web database content using active server pages (2002) 0.01
    0.0073075923 = product of:
      0.03653796 = sum of:
        0.03653796 = product of:
          0.07307592 = sum of:
            0.07307592 = weight(_text_:web in 4793) [ClassicSimilarity], result of:
              0.07307592 = score(doc=4793,freq=10.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.48375595 = fieldWeight in 4793, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4793)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    For the past five years, libraries have been integrating active server pages (ASP) with Web-based databases to search and retrieve electronic information; however, a literature review reveals that a more complete description of modifying data through the Web interface is needed. At the Texas A&M University Libraries, a Web database of Internet links was developed using ASP, Microsoft Access, and Microsoft Internet Information Server (IIS) to facilitate use of online resources. The implementation of the Internet Links database is described with a focus on its data management functions. Also described are other library applications of ASP technology. The project explores a more complete approach to library Web database applications than was found in the current literature and should serve to facilitate reference service.
  7. Thelwall, M.; Li, X.; Barjak, F.; Robinson, S.: Assessing the international web connectivity of research groups (2008) 0.01
    0.00608966 = product of:
      0.030448299 = sum of:
        0.030448299 = product of:
          0.060896598 = sum of:
            0.060896598 = weight(_text_:web in 1401) [ClassicSimilarity], result of:
              0.060896598 = score(doc=1401,freq=10.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.40312994 = fieldWeight in 1401, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1401)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Purpose - The purpose of this paper is to argue that it is useful to assess the web connectivity of research groups, to describe hyperlink-based techniques for achieving this, and to present brief details of European life sciences research groups as a case study.
    Design/methodology/approach - A commercial search engine was harnessed to deliver hyperlink data via its automatic query submission interface. A special-purpose link analysis tool, LexiURL, then summarised and graphed the link data in appropriate ways.
    Findings - Webometrics can provide a wide range of descriptive information about the international connectivity of research groups.
    Research limitations/implications - Only one field was analysed, data were taken from only one search engine, and the results were not validated.
    Practical implications - Web connectivity seems to be particularly important for attracting overseas job applicants and for promoting research achievements and capabilities; hence we contend that it can be useful for national and international governments to use webometrics to ensure that the web is being used effectively by research groups.
    Originality/value - This is the first paper to make a case for the value of using a range of webometric techniques to evaluate the web presences of research groups within a field, and possibly the first "applied" webometrics study produced for an external contract.
  8. Li, X.; Crane, N.: Electronic styles : a handbook for citing electronic information (1996) 0.01
    0.0054467577 = product of:
      0.027233787 = sum of:
        0.027233787 = product of:
          0.054467574 = sum of:
            0.054467574 = weight(_text_:web in 2075) [ClassicSimilarity], result of:
              0.054467574 = score(doc=2075,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.36057037 = fieldWeight in 2075, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2075)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    The second edition of the best-selling guide to referencing electronic information covers citing the complete range of electronic formats, including text-based information, electronic journals and discussion lists, Web sites, CD-ROM and multimedia products, and commercial online documents.
  9. Barjak, F.; Li, X.; Thelwall, M.: Which factors explain the Web impact of scientists' personal homepages? (2007) 0.01
    0.0053367107 = product of:
      0.026683552 = sum of:
        0.026683552 = product of:
          0.053367104 = sum of:
            0.053367104 = weight(_text_:web in 73) [ClassicSimilarity], result of:
              0.053367104 = score(doc=73,freq=12.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.35328537 = fieldWeight in 73, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=73)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    In recent years, a considerable body of Webometric research has used hyperlinks to generate indicators for the impact of Web documents and the organizations that created them. The relationship between this Web impact and other, offline impact indicators has been explored for entire universities, departments, countries, and scientific journals, but not yet for individual scientists, an important omission. The present research closes this gap by investigating factors that may influence the Web impact (i.e., inlink counts) of scientists' personal homepages. Data concerning 456 scientists from five scientific disciplines in six European countries were analyzed, showing that both homepage content and personal and institutional characteristics of the homepage owners had significant relationships with inlink counts. A multivariate statistical analysis confirmed that full-text articles are the most linked-to content in homepages. At the individual homepage level, hyperlinks are related to several offline characteristics. Notable differences regarding total inlinks to scientists' homepages exist between the scientific disciplines and the countries in the sample. There also are both gender and age effects: fewer external inlinks (i.e., links from other Web domains) to the homepages of female and of older scientists. There is only a weak relationship between a scientist's recognition and homepage inlinks and, surprisingly, no relationship between research productivity and inlink counts. Contrary to expectations, the size of collaboration networks is negatively related to hyperlink counts. Some of the relationships between hyperlinks to homepages and the properties of their owners can be explained by the content that the homepage owners put on their homepage and their level of Internet use; however, the findings about productivity and collaborations do not seem to have a simple, intuitive explanation. Overall, the results emphasize the complexity of the phenomenon of Web linking, when analyzed at the level of individual pages.
  10. Lu, W.; Li, X.; Liu, Z.; Cheng, Q.: How do author-selected keywords function semantically in scientific manuscripts? (2019) 0.00
    0.004679445 = product of:
      0.023397226 = sum of:
        0.023397226 = weight(_text_:retrieval in 5453) [ClassicSimilarity], result of:
          0.023397226 = score(doc=5453,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.16710453 = fieldWeight in 5453, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5453)
      0.2 = coord(1/5)
    
    Abstract
    Author-selected keywords have been widely utilized for indexing, information retrieval, bibliometrics and knowledge organization in previous studies. However, few studies exist concerning how author-selected keywords function semantically in scientific manuscripts. In this paper, we investigated this problem from the perspective of term function (TF) by devising indicators of the diversity and symmetry of keyword term functions in papers, as well as the intensity of individual term functions in papers. The data, obtained from the whole Journal of Informetrics (JOI), were manually processed using an annotation scheme of keyword term functions, including "research topic," "research method," "research object," "research area," "data" and "others," based on empirical work in content analysis. The results show, quantitatively, that the diversity of keyword term functions decreases, and the irregularity increases, with the number of author-selected keywords in a paper. Moreover, the distribution of the intensity of individual keyword term functions indicated that no significant difference exists in the ranking of the five term functions as the number of author-selected keywords increases (i.e., "research topic" > "research method" > "research object" > "research area" > "data"). The findings indicate that precise keyword-related research must take into account the distinct types of author-selected keywords.
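    The diversity indicator invites a compact illustration. A minimal sketch, assuming Shannon entropy over the term-function labels of a paper's keywords (the paper defines its own indicators, so entropy is only a stand-in, and the annotated papers are hypothetical):

        import math
        from collections import Counter

        FUNCTIONS = ["research topic", "research method", "research object",
                     "research area", "data", "others"]

        def tf_diversity(keyword_functions):
            """Shannon entropy of term-function labels -- a stand-in diversity indicator."""
            counts = Counter(keyword_functions)
            n = len(keyword_functions)
            return -sum((c / n) * math.log2(c / n) for c in counts.values())

        # One paper's keywords, each annotated with a term function (hypothetical).
        paper_a = ["research topic", "research topic", "research method"]
        paper_b = ["research topic", "research method", "research object", "data"]
        print(tf_diversity(paper_a))  # lower: functions concentrated on few types
        print(tf_diversity(paper_b))  # higher: functions spread across types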
  11. Zhu, L.; Xu, A.; Deng, S.; Heng, G.; Li, X.: Entity management using Wikidata for cultural heritage information (2024) 0.00
    0.0038127303 = product of:
      0.019063652 = sum of:
        0.019063652 = product of:
          0.038127303 = sum of:
            0.038127303 = weight(_text_:web in 975) [ClassicSimilarity], result of:
              0.038127303 = score(doc=975,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.25239927 = fieldWeight in 975, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=975)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Entity management in a Linked Open Data (LOD) environment is a process of associating a unique, persistent, and dereferenceable Uniform Resource Identifier (URI) with a single entity. It allows data from various sources to be reused and connected to the Web. It can help improve data quality and enable more efficient workflows. This article describes a semi-automated entity management project conducted by the "Wikidata: WikiProject Chinese Culture and Heritage Group," explores the challenges and opportunities in describing Chinese women poets and historical places in Wikidata, the largest crowdsourcing LOD platform in the world, and discusses lessons learned and future opportunities.
  12. Li, X.; Thelwall, M.; Kousha, K.: The role of arXiv, RePEc, SSRN and PMC in formal scholarly communication (2015) 0.00
    0.0031356532 = product of:
      0.015678266 = sum of:
        0.015678266 = product of:
          0.031356532 = sum of:
            0.031356532 = weight(_text_:22 in 2593) [ClassicSimilarity], result of:
              0.031356532 = score(doc=2593,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.19345059 = fieldWeight in 2593, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2593)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Date
    20. 1.2015 18:30:22
  13. Wang, P.; Li, X.: Assessing the quality of information on Wikipedia : a deep-learning approach (2020) 0.00
    0.0027233788 = product of:
      0.013616893 = sum of:
        0.013616893 = product of:
          0.027233787 = sum of:
            0.027233787 = weight(_text_:web in 5505) [ClassicSimilarity], result of:
              0.027233787 = score(doc=5505,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.18028519 = fieldWeight in 5505, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5505)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Currently, web document repositories are collaboratively created and edited. One of these repositories, Wikipedia, faces an important problem: assessing the quality of Wikipedia articles. Existing approaches exploit techniques such as statistical models or machine learning algorithms to assess Wikipedia article quality. However, existing models do not provide satisfactory results, and they fail to adopt a comprehensive feature framework. In this article, we conduct an extensive survey of previous studies and summarize a comprehensive feature framework, including text statistics, writing style, readability, article structure, network, and editing history. Selected state-of-the-art deep-learning models, including the convolutional neural network (CNN), deep neural network (DNN), long short-term memory (LSTM) networks, CNN-LSTMs, bidirectional LSTMs, and stacked LSTMs, are applied to assess the quality of Wikipedia. A detailed comparison of the deep-learning models is conducted with regard to two aspects: classification performance and training performance. We include an importance analysis of different features and feature sets to determine which features or feature sets are most effective in distinguishing Wikipedia article quality. This extensive experiment validates the effectiveness of the proposed model.
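    Among the models compared, stacked LSTMs are representative. A minimal Keras sketch of such a quality classifier; the vocabulary size, sequence length, and layer sizes are placeholder assumptions rather than the paper's configuration, and the six output classes follow Wikipedia's common assessment grades (FA, GA, B, C, Start, Stub):

        import tensorflow as tf

        VOCAB_SIZE, SEQ_LEN, NUM_CLASSES = 20000, 500, 6  # placeholders

        # Stacked LSTMs: every layer except the last must return full sequences.
        model = tf.keras.Sequential([
            tf.keras.layers.Embedding(VOCAB_SIZE, 128),
            tf.keras.layers.LSTM(64, return_sequences=True),
            tf.keras.layers.LSTM(64),
            tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
        ])
        model.build(input_shape=(None, SEQ_LEN))
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        model.summary()
        # Training would look like model.fit(token_ids, quality_labels, ...),
        # with articles encoded as padded integer token-id sequences.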
  14. Xie, H.; Li, X.; Wang, T.; Lau, R.Y.K.; Wong, T.-L.; Chen, L.; Wang, F.L.; Li, Q.: Incorporating sentiment into tag-based user profiles and resource profiles for personalized search in folksonomy (2016) 0.00
    0.002178703 = product of:
      0.010893514 = sum of:
        0.010893514 = product of:
          0.021787029 = sum of:
            0.021787029 = weight(_text_:web in 2671) [ClassicSimilarity], result of:
              0.021787029 = score(doc=2671,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.14422815 = fieldWeight in 2671, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2671)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    In recent years, there has been rapid growth of user-generated data in collaborative tagging (a.k.a. folksonomy-based) systems due to the prevalence of Web 2.0 communities. To effectively assist users in finding their desired resources, it is critical to understand user behaviors and preferences. Tag-based profile techniques, which model users and resources by a vector of relevant tags, are widely employed in folksonomy-based systems, mainly because personalized search and recommendations can be facilitated by measuring the relevance between user profiles and resource profiles. However, conventional measurements neglect the sentiment aspect of user-generated tags. In fact, tags can be very emotional and subjective, as users usually express their perceptions and feelings about resources through tags. Therefore, it is necessary to take sentiment relevance into account in these measurements. In this paper, we present SenticRank, a novel generic framework that incorporates various kinds of sentiment information into personalized search based on user profiles and resource profiles. In this framework, content-based and collaborative sentiment ranking methods are proposed to obtain sentiment-based personalized rankings. To the best of our knowledge, this is the first work to integrate sentiment information into the problem of personalized tag-based search in collaborative tagging systems. Moreover, we compare the proposed sentiment-based personalized search with baselines in experiments whose results verify the effectiveness of the proposed framework. In addition, we study the influence of popular sentiment dictionaries and find that SenticNet is the most effective knowledge base for boosting the performance of personalized search in folksonomy.
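    The core matching step, measuring relevance between a tag-based user profile and a resource profile with sentiment folded in, can be sketched as a sentiment-weighted cosine similarity; the weighting scheme here is an illustrative guess, not SenticRank's actual ranking function:

        import math

        def sentiment_cosine(user_profile, resource_profile, sentiment):
            """Cosine similarity of tag vectors, with each shared tag's contribution
            scaled by a sentiment score in [0, 1] (illustrative weighting)."""
            shared = set(user_profile) & set(resource_profile)
            dot = sum(user_profile[t] * resource_profile[t] * sentiment.get(t, 0.5)
                      for t in shared)
            nu = math.sqrt(sum(v * v for v in user_profile.values()))
            nr = math.sqrt(sum(v * v for v in resource_profile.values()))
            return dot / (nu * nr) if nu and nr else 0.0

        user     = {"jazz": 3, "relaxing": 2, "vinyl": 1}   # tag -> frequency
        resource = {"jazz": 5, "relaxing": 1, "live": 2}
        senti    = {"relaxing": 0.9, "jazz": 0.6}           # e.g. scores from SenticNet
        print(round(sentiment_cosine(user, resource, senti), 3))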