Search (15 results, page 1 of 1)

  • × author_ss:"Li, X."
  1. Li, X.; Zhang, A.; Li, C.; Ouyang, J.; Cai, Y.: Exploring coherent topics by topic modeling with term weighting (2018) 0.02
    0.023530604 = product of:
      0.04706121 = sum of:
        0.02586502 = weight(_text_:data in 5045) [ClassicSimilarity], result of:
          0.02586502 = score(doc=5045,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.17468026 = fieldWeight in 5045, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5045)
        0.021196188 = product of:
          0.042392377 = sum of:
            0.042392377 = weight(_text_:processing in 5045) [ClassicSimilarity], result of:
              0.042392377 = score(doc=5045,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.22363065 = fieldWeight in 5045, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5045)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
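
The explanation tree above can be reproduced with a short sketch, assuming the classic Lucene TF-IDF formulas that the `[ClassicSimilarity]` label suggests: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names are illustrative and do not appear in the listing; the numeric inputs are copied from the tree for doc 5045. The sketch uses double precision, so the trailing digits may differ slightly from the single-precision values shown.

```python
import math

# Values copied from the explain tree for doc 5045.
QUERY_NORM = 0.046827413
MAX_DOCS = 44218
FIELD_NORM = 0.0390625


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity form."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM               # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# "data": freq=2.0, docFreq=5088
data_score = term_score(2.0, 5088, FIELD_NORM)

# "processing": freq=2.0, docFreq=2097, inside a nested clause with coord(1/2)
processing_score = term_score(2.0, 2097, FIELD_NORM) * 0.5

# Outer sum scaled by coord(2/4): two of four query clauses matched.
total = (data_score + processing_score) * 0.5

print(f"data={data_score:.9f} processing={processing_score:.9f} total={total:.9f}")
```

The printed total can be compared with the 0.023530604 reported at the top of the tree. The same function applies to the other results by swapping in their freq, docFreq, fieldNorm, and coord values.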
    
    Abstract
    Topic models often produce unexplainable topics that are filled with noisy words. The reason is that words in topic modeling have equal weights. High frequency words dominate the top topic word lists, but most of them are meaningless words, e.g., domain-specific stopwords. To address this issue, in this paper we aim to investigate how to weight words, and then develop a straightforward but effective term weighting scheme, namely entropy weighting (EW). The proposed EW scheme is based on conditional entropy measured by word co-occurrences. Compared with existing term weighting schemes, the highlight of EW is that it can automatically reward informative words. For more robust word weight, we further suggest a combination form of EW (CEW) with two existing weighting schemes. Basically, our CEW assigns meaningless words lower weights and informative words higher weights, leading to more coherent topics during topic modeling inference. We apply CEW to Dirichlet multinomial mixture and latent Dirichlet allocation, and evaluate it by topic quality, document clustering and classification tasks on 8 real world data sets. Experimental results show that weighting words can effectively improve the topic modeling performance over both short texts and normal long texts. More importantly, the proposed CEW significantly outperforms the existing term weighting schemes, since it further considers which words are informative.
    Source
    Information processing and management. 54(2018) no.6, S.1345-1358
  2. Xie, H.; Li, X.; Wang, T.; Lau, R.Y.K.; Wong, T.-L.; Chen, L.; Wang, F.L.; Li, Q.: Incorporating sentiment into tag-based user profiles and resource profiles for personalized search in folksonomy (2016) 0.02
    0.018824484 = product of:
      0.03764897 = sum of:
        0.020692015 = weight(_text_:data in 2671) [ClassicSimilarity], result of:
          0.020692015 = score(doc=2671,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.1397442 = fieldWeight in 2671, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=2671)
        0.016956951 = product of:
          0.033913903 = sum of:
            0.033913903 = weight(_text_:processing in 2671) [ClassicSimilarity], result of:
              0.033913903 = score(doc=2671,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.17890452 = fieldWeight in 2671, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2671)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In recent years, there has been a rapid growth of user-generated data in collaborative tagging (a.k.a. folksonomy-based) systems due to the prevalence of Web 2.0 communities. To effectively assist users to find their desired resources, it is critical to understand user behaviors and preferences. Tag-based profile techniques, which model users and resources by a vector of relevant tags, are widely employed in folksonomy-based systems. This is mainly because personalized search and recommendations can be facilitated by measuring relevance between user profiles and resource profiles. However, conventional measurements neglect the sentiment aspect of user-generated tags. In fact, tags can be very emotional and subjective, as users usually express their perceptions and feelings about the resources by tags. Therefore, it is necessary to take sentiment relevance into account in measurements. In this paper, we present a novel generic framework SenticRank to incorporate various sentiment information to various sentiment-based information for personalized search by user profiles and resource profiles. In this framework, content-based sentiment ranking and collaborative sentiment ranking methods are proposed to obtain sentiment-based personalized ranking. To the best of our knowledge, this is the first work integrating sentiment information to address the problem of personalized tag-based search in collaborative tagging systems. Moreover, we compare the proposed sentiment-based personalized search with baselines in the experiments, the results of which have verified the effectiveness of the proposed framework. In addition, we study the influence of popular sentiment dictionaries, and SenticNet is the most prominent knowledge base to boost the performance of personalized search in folksonomy.
    Source
    Information processing and management. 52(2016) no.1, S.61-72
  3. Zhu, L.; Xu, A.; Deng, S.; Heng, G.; Li, X.: Entity management using Wikidata for cultural heritage information (2024) 0.02
    0.015679834 = product of:
      0.06271934 = sum of:
        0.06271934 = weight(_text_:data in 975) [ClassicSimilarity], result of:
          0.06271934 = score(doc=975,freq=6.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.42357713 = fieldWeight in 975, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=975)
      0.25 = coord(1/4)
    
    Abstract
    Entity management in a Linked Open Data (LOD) environment is a process of associating a unique, persistent, and dereferenceable Uniform Resource Identifier (URI) with a single entity. It allows data from various sources to be reused and connected to the Web. It can help improve data quality and enable more efficient workflows. This article describes a semi-automated entity management project conducted by the "Wikidata: WikiProject Chinese Culture and Heritage Group," explores the challenges and opportunities in describing Chinese women poets and historical places in Wikidata, the largest crowdsourcing LOD platform in the world, and discusses lessons learned and future opportunities.
  4. Thelwall, M.; Li, X.; Barjak, F.; Robinson, S.: Assessing the international web connectivity of research groups (2008) 0.01
    0.011199882 = product of:
      0.04479953 = sum of:
        0.04479953 = weight(_text_:data in 1401) [ClassicSimilarity], result of:
          0.04479953 = score(doc=1401,freq=6.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.30255508 = fieldWeight in 1401, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1401)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The purpose of this paper is to claim that it is useful to assess the web connectivity of research groups, describe hyperlink-based techniques to achieve this and present brief details of European life sciences research groups as a case study. Design/methodology/approach - A commercial search engine was harnessed to deliver hyperlink data via its automatic query submission interface. A special purpose link analysis tool, LexiURL, then summarised and graphed the link data in appropriate ways. Findings - Webometrics can provide a wide range of descriptive information about the international connectivity of research groups. Research limitations/implications - Only one field was analysed, data was taken from only one search engine, and the results were not validated. Practical implications - Web connectivity seems to be particularly important for attracting overseas job applicants and to promote research achievements and capabilities, and hence we contend that it can be useful for national and international governments to use webometrics to ensure that the web is being used effectively by research groups. Originality/value - This is the first paper to make a case for the value of using a range of webometric techniques to evaluate the web presences of research groups within a field, and possibly the first "applied" webometrics study produced for an external contract.
  5. Lu, W.; Li, X.; Liu, Z.; Cheng, Q.: How do author-selected keywords function semantically in scientific manuscripts? (2019) 0.01
    0.011199882 = product of:
      0.04479953 = sum of:
        0.04479953 = weight(_text_:data in 5453) [ClassicSimilarity], result of:
          0.04479953 = score(doc=5453,freq=6.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.30255508 = fieldWeight in 5453, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5453)
      0.25 = coord(1/4)
    
    Abstract
    Author-selected keywords have been widely utilized for indexing, information retrieval, bibliometrics and knowledge organization in previous studies. However, few studies exist concerning how author-selected keywords function semantically in scientific manuscripts. In this paper, we investigated this problem from the perspective of term function (TF) by devising indicators of the diversity and symmetry of keyword term functions in papers, as well as the intensity of individual term functions in papers. The data obtained from the whole Journal of Informetrics (JOI) were manually processed by an annotation scheme of keyword term functions, including "research topic," "research method," "research object," "research area," "data" and "others," based on empirical work in content analysis. The results show, quantitatively, that the diversity of keyword term function decreases, and the irregularity increases with the number of author-selected keywords in a paper. Moreover, the distribution of the intensity of individual keyword term function indicated that no significant difference exists between the ranking of the five term functions with the increase of the number of author-selected keywords (i.e., "research topic" > "research method" > "research object" > "research area" > "data"). The findings indicate that precise keyword related research must take into account the distinct types of author-selected keywords.
  6. Li, X.; Fullerton, J.P.: Create, edit, and manage Web database content using active server pages (2002) 0.01
    0.010973599 = product of:
      0.043894395 = sum of:
        0.043894395 = weight(_text_:data in 4793) [ClassicSimilarity], result of:
          0.043894395 = score(doc=4793,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.29644224 = fieldWeight in 4793, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4793)
      0.25 = coord(1/4)
    
    Abstract
    Libraries have been integrating active server pages (ASP) with Web-based databases for searching and retrieving electronic information for the past five years; however, a literature review reveals that a more complete description of modifying data through the Web interface is needed. At the Texas A&M University Libraries, a Web database of Internet links was developed using ASP, Microsoft Access, and Microsoft Internet Information Server (IIS) to facilitate use of online resources. The implementation of the Internet Links database is described with focus on its data management functions. Also described are other library applications of ASP technology. The project explores a more complete approach to library Web database applications than was found in the current literature and should serve to facilitate reference service.
  7. Li, X.; Cox, A.; Ford, N.; Creaser, C.; Fry, J.; Willett, P.: Knowledge construction by users : a content analysis framework and a knowledge construction process model for virtual product user communities (2017) 0.01
    0.009144665 = product of:
      0.03657866 = sum of:
        0.03657866 = weight(_text_:data in 3574) [ClassicSimilarity], result of:
          0.03657866 = score(doc=3574,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24703519 = fieldWeight in 3574, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3574)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The purpose of this paper is to develop a content analysis framework and from that derive a process model of knowledge construction in the context of virtual product user communities, organization sponsored online forums where product users collaboratively construct knowledge to solve their technical problems. Design/methodology/approach - The study is based on a deductive and qualitative content analysis of discussion threads about solving technical problems selected from a series of virtual product user communities. Data are complemented with thematic analysis of interviews with forum members. Findings - The research develops a content analysis framework for knowledge construction. It is based on a combination of existing codes derived from frameworks developed for computer-supported collaborative learning and new categories identified from the data. Analysis using this framework allows the authors to propose a knowledge construction process model showing how these elements are organized around a typical "trial and error" knowledge construction strategy. Practical implications - The research makes suggestions about organizations' management of knowledge activities in virtual product user communities, including moderators' roles in facilitation. Originality/value - The paper outlines a new framework for analysing knowledge activities where there is a low level of critical thinking and a model of knowledge construction by trial and error. The new framework and model can be applied in other similar contexts.
  8. Su, S.; Li, X.; Cheng, X.; Sun, C.: Location-aware targeted influence maximization in social networks (2018) 0.01
    0.006466255 = product of:
      0.02586502 = sum of:
        0.02586502 = weight(_text_:data in 4034) [ClassicSimilarity], result of:
          0.02586502 = score(doc=4034,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.17468026 = fieldWeight in 4034, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4034)
      0.25 = coord(1/4)
    
    Abstract
    In this paper, we study the location-aware targeted influence maximization problem in social networks, which finds a seed set to maximize the influence spread over the targeted users. In particular, we consider those users who have both topic and geographical preferences on promotion products as targeted users. To efficiently solve this problem, one challenge is how to find the targeted users and compute their preferences efficiently for given requests. To address this challenge, we devise a TR-tree index structure, where each tree node stores users' topic and geographical preferences. By traversing the TR-tree in depth-first order, we can efficiently find the targeted users. Another challenge of the problem is to devise algorithms for efficient seeds selection. We solve this challenge from two complementary directions. In one direction, we adopt the maximum influence arborescence (MIA) model to approximate the influence spread, and propose two efficient approximation algorithms with math formula approximation ratio, which prune some candidate seeds with small influences by precomputing users' initial influences offline and estimating the upper bound of their marginal influences online. In the other direction, we propose a fast heuristic algorithm to improve efficiency. Experiments conducted on real-world data sets demonstrate the effectiveness and efficiency of our proposed algorithms.
  9. Yang, X.; Li, X.; Hu, D.; Wang, H.J.: Differential impacts of social influence on initial and sustained participation in open source software projects (2021) 0.01
    0.006466255 = product of:
      0.02586502 = sum of:
        0.02586502 = weight(_text_:data in 332) [ClassicSimilarity], result of:
          0.02586502 = score(doc=332,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.17468026 = fieldWeight in 332, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=332)
      0.25 = coord(1/4)
    
    Abstract
    Social networking tools and visible information about developer activities on open source software (OSS) development platforms can leverage developers' social influence to attract more participation from their peers. However, the differential impacts of such social influence on developers' initial and sustained participation behaviors were largely overlooked in previous research. We empirically studied the impacts of two social influence mechanisms, word-of-mouth (WOM) and observational learning (OL), on these two types of participation, using data collected from a large OSS development platform called Open Hub. We found that action (OL) speaks louder than words (WOM) with regard to sustained participation. Moreover, project age positively moderates the impacts of social influence on both types of participation. For projects with a higher average workload, the impacts of OL are reduced on initial participation but are increased on sustained participation. Our study provides a better understanding of how social influence affects OSS developers' participation behaviors. It also offers important practical implications for designing software development platforms that can leverage social influence to attract more initial and sustained participation.
  10. Li, X.: A new robust relevance model in the language model framework (2008) 0.01
    0.005299047 = product of:
      0.021196188 = sum of:
        0.021196188 = product of:
          0.042392377 = sum of:
            0.042392377 = weight(_text_:processing in 2076) [ClassicSimilarity], result of:
              0.042392377 = score(doc=2076,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.22363065 = fieldWeight in 2076, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2076)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 44(2008) no.3, S.991-1007
  11. Li, X.; Rijke, M.de: Characterizing and predicting downloads in academic search (2019) 0.01
    0.005299047 = product of:
      0.021196188 = sum of:
        0.021196188 = product of:
          0.042392377 = sum of:
            0.042392377 = weight(_text_:processing in 5103) [ClassicSimilarity], result of:
              0.042392377 = score(doc=5103,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.22363065 = fieldWeight in 5103, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5103)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 56(2019) no.3, S.394-407
  12. Barjak, F.; Li, X.; Thelwall, M.: Which factors explain the Web impact of scientists' personal homepages? (2007) 0.01
    0.0051730038 = product of:
      0.020692015 = sum of:
        0.020692015 = weight(_text_:data in 73) [ClassicSimilarity], result of:
          0.020692015 = score(doc=73,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.1397442 = fieldWeight in 73, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=73)
      0.25 = coord(1/4)
    
    Abstract
    In recent years, a considerable body of Webometric research has used hyperlinks to generate indicators for the impact of Web documents and the organizations that created them. The relationship between this Web impact and other, offline impact indicators has been explored for entire universities, departments, countries, and scientific journals, but not yet for individual scientists, an important omission. The present research closes this gap by investigating factors that may influence the Web impact (i.e., inlink counts) of scientists' personal homepages. Data concerning 456 scientists from five scientific disciplines in six European countries were analyzed, showing that both homepage content and personal and institutional characteristics of the homepage owners had significant relationships with inlink counts. A multivariate statistical analysis confirmed that full-text articles are the most linked-to content in homepages. At the individual homepage level, hyperlinks are related to several offline characteristics. Notable differences regarding total inlinks to scientists' homepages exist between the scientific disciplines and the countries in the sample. There also are both gender and age effects: fewer external inlinks (i.e., links from other Web domains) to the homepages of female and of older scientists. There is only a weak relationship between a scientist's recognition and homepage inlinks and, surprisingly, no relationship between research productivity and inlink counts. Contrary to expectations, the size of collaboration networks is negatively related to hyperlink counts. Some of the relationships between hyperlinks to homepages and the properties of their owners can be explained by the content that the homepage owners put on their homepage and their level of Internet use; however, the findings about productivity and collaborations do not seem to have a simple, intuitive explanation. Overall, the results emphasize the complexity of the phenomenon of Web linking, when analyzed at the level of individual pages.
  13. Li, X.: Designing an interactive Web tutorial with cross-browser dynamic HTML (2000) 0.00
    0.0047583506 = product of:
      0.019033402 = sum of:
        0.019033402 = product of:
          0.038066804 = sum of:
            0.038066804 = weight(_text_:22 in 4897) [ClassicSimilarity], result of:
              0.038066804 = score(doc=4897,freq=2.0), product of:
                0.16398162 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046827413 = queryNorm
                0.23214069 = fieldWeight in 4897, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4897)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    28. 1.2006 19:21:22
  14. Li, X.; Schijvenaars, B.J.A.; Rijke, M.de: Investigating queries and search failures in academic search (2017) 0.00
    0.004239238 = product of:
      0.016956951 = sum of:
        0.016956951 = product of:
          0.033913903 = sum of:
            0.033913903 = weight(_text_:processing in 5033) [ClassicSimilarity], result of:
              0.033913903 = score(doc=5033,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.17890452 = fieldWeight in 5033, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5033)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 53(2017) no.3, S.666-683
  15. Li, X.; Thelwall, M.; Kousha, K.: The role of arXiv, RePEc, SSRN and PMC in formal scholarly communication (2015) 0.00
    0.0039652926 = product of:
      0.01586117 = sum of:
        0.01586117 = product of:
          0.03172234 = sum of:
            0.03172234 = weight(_text_:22 in 2593) [ClassicSimilarity], result of:
              0.03172234 = score(doc=2593,freq=2.0), product of:
                0.16398162 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046827413 = queryNorm
                0.19345059 = fieldWeight in 2593, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2593)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    20. 1.2015 18:30:22