Search (12 results, page 1 of 1)

  • × author_ss:"Liu, X."
  1. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.02
    0.017412834 = product of:
      0.0522385 = sum of:
        0.016011827 = product of:
          0.032023653 = sum of:
            0.032023653 = weight(_text_:29 in 3464) [ClassicSimilarity], result of:
              0.032023653 = score(doc=3464,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.23319192 = fieldWeight in 3464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.5 = coord(1/2)
        0.036226675 = product of:
          0.07245335 = sum of:
            0.07245335 = weight(_text_:methods in 3464) [ClassicSimilarity], result of:
              0.07245335 = score(doc=3464,freq=6.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.4616232 = fieldWeight in 3464, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    We propose a new hybrid clustering framework to incorporate text mining with bibliometrics in journal set analysis. The framework integrates two different approaches: clustering ensemble and kernel-fusion clustering. To improve the flexibility and the efficiency of processing large-scale data, we propose an information-based weighting scheme to leverage the effect of multiple data sources in hybrid clustering. Three different algorithms are extended by the proposed weighting scheme and they are employed on a large journal set retrieved from the Web of Science (WoS) database. The clustering performance of the proposed algorithms is systematically evaluated using multiple evaluation methods, and they were cross-compared with alternative methods. Experimental results demonstrate that the proposed weighted hybrid clustering strategy is superior to other methods in clustering performance and efficiency. The proposed approach also provides a more refined structural mapping of journal sets, which is useful for monitoring and detecting new trends in different scientific fields.
    Date
    1. 6.2010 9:29:57
  2. Chen, M.; Liu, X.; Qin, J.: Semantic relation extraction from socially-generated tags : a methodology for metadata generation (2008) 0.01
    0.00885545 = product of:
      0.026566349 = sum of:
        0.01334319 = product of:
          0.02668638 = sum of:
            0.02668638 = weight(_text_:29 in 2648) [ClassicSimilarity], result of:
              0.02668638 = score(doc=2648,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.19432661 = fieldWeight in 2648, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2648)
          0.5 = coord(1/2)
        0.013223159 = product of:
          0.026446318 = sum of:
            0.026446318 = weight(_text_:22 in 2648) [ClassicSimilarity], result of:
              0.026446318 = score(doc=2648,freq=2.0), product of:
                0.1367084 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03903913 = queryNorm
                0.19345059 = fieldWeight in 2648, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2648)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Date
    20. 2.2009 10:29:07
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  3. Yang, Y.; Liu, X.: ¬A re-examination of text categorization methods (1999) 0.01
    0.008133798 = product of:
      0.048802786 = sum of:
        0.048802786 = product of:
          0.09760557 = sum of:
            0.09760557 = weight(_text_:methods in 3386) [ClassicSimilarity], result of:
              0.09760557 = score(doc=3386,freq=8.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.62187594 = fieldWeight in 3386, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3386)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    This paper reports a controlled study with statistical significance tests an five text categorization methods: the Support Vector Machines (SVM), a k-Nearest Neighbor (kNN) classifier, a neural network (NNet) approach, the Linear Leastsquares Fit (LLSF) mapping and a Naive Bayes (NB) classifier. We focus an the robustness of these methods in dealing with a skewed category distribution, and their performance as function of the training-set category frequency. Our results show that SVM, kNN and LLSF significantly outperform NNet and NB when the number of positive training instances per category are small (less than ten, and that all the methods perform comparably when the categories are sufficiently common (over 300 instances).
  4. Liu, X.; Turtle, H.: Real-time user interest modeling for real-time ranking (2013) 0.00
    0.004929826 = product of:
      0.029578956 = sum of:
        0.029578956 = product of:
          0.05915791 = sum of:
            0.05915791 = weight(_text_:methods in 1035) [ClassicSimilarity], result of:
              0.05915791 = score(doc=1035,freq=4.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.37691376 = fieldWeight in 1035, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1035)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    User interest as a very dynamic information need is often ignored in most existing information retrieval systems. In this research, we present the results of experiments designed to evaluate the performance of a real-time interest model (RIM) that attempts to identify the dynamic and changing query level interests regarding social media outputs. Unlike most existing ranking methods, our ranking approach targets calculation of the probability that user interest in the content of the document is subject to very dynamic user interest change. We describe 2 formulations of the model (real-time interest vector space and real-time interest language model) stemming from classical relevance ranking methods and develop a novel methodology for evaluating the performance of RIM using Amazon Mechanical Turk to collect (interest-based) relevance judgments on a daily basis. Our results show that the model usually, although not always, performs better than baseline results obtained from commercial web search engines. We identify factors that affect RIM performance and outline plans for future research.
  5. Zhang, X.; Fang, Y.; He, W.; Zhang, Y.; Liu, X.: Epistemic motivation, task reflexivity, and knowledge contribution behavior on team wikis : a cross-level moderation model (2019) 0.00
    0.0037292899 = product of:
      0.022375738 = sum of:
        0.022375738 = product of:
          0.044751476 = sum of:
            0.044751476 = weight(_text_:theory in 5245) [ClassicSimilarity], result of:
              0.044751476 = score(doc=5245,freq=2.0), product of:
                0.16234003 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03903913 = queryNorm
                0.27566507 = fieldWeight in 5245, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5245)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    A cross-level model based on the information processing perspective and trait activation theory was developed and tested in order to investigate the effects of individual-level epistemic motivation and team-level task reflexivity on three different individual contribution behaviors (i.e., adding, deleting, and revising) in the process of knowledge creation on team wikis. Using the Hierarchical Linear Modeling software package and the 2-wave data from 166 individuals in 51 wiki-based teams, we found cross-level interaction effects between individual epistemic motivation and team task reflexivity on different knowledge contribution behaviors on wikis. Epistemic motivation exerted a positive effect on adding, which was strengthened by team task reflexivity. The effect of epistemic motivation on deleting was positive only when task reflexivity was high. In addition, epistemic motivation was strongly positively related to revising, regardless of the level of task reflexivity involved.
  6. Liu, X.; Guo, C.; Zhang, L.: Scholar metadata and knowledge generation with human and artificial intelligence (2014) 0.00
    0.0034859132 = product of:
      0.020915478 = sum of:
        0.020915478 = product of:
          0.041830957 = sum of:
            0.041830957 = weight(_text_:methods in 1287) [ClassicSimilarity], result of:
              0.041830957 = score(doc=1287,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.26651827 = fieldWeight in 1287, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1287)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    Scholar metadata have traditionally centered on descriptive representations, which have been used as a foundation for scholarly publication repositories and academic information retrieval systems. In this article, we propose innovative and economic methods of generating knowledge-based structural metadata (structural keywords) using a combination of natural language processing-based machine-learning techniques and human intelligence. By allowing low-barrier participation through a social media system, scholars (both as authors and users) can participate in the metadata editing and enhancing process and benefit from more accurate and effective information retrieval. Our experimental web system ScholarWiki uses machine learning techniques, which automatically produce increasingly refined metadata by learning from the structural metadata contributed by scholars. The cumulated structural metadata add intelligence and automatically enhance and update recursively the quality of metadata, wiki pages, and the machine-learning model.
  7. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.00
    0.0031077415 = product of:
      0.018646449 = sum of:
        0.018646449 = product of:
          0.037292898 = sum of:
            0.037292898 = weight(_text_:theory in 4277) [ClassicSimilarity], result of:
              0.037292898 = score(doc=4277,freq=2.0), product of:
                0.16234003 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03903913 = queryNorm
                0.2297209 = fieldWeight in 4277, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4277)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of works decays as a Power function of each works rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their crossentropy an texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
  8. Liu, X.; Hu, M.; Xiao, B.S.; Shao, J.: Is my doctor around me? : Investigating the impact of doctors' presence on patients' review behaviors on an online health platform (2022) 0.00
    0.0031077415 = product of:
      0.018646449 = sum of:
        0.018646449 = product of:
          0.037292898 = sum of:
            0.037292898 = weight(_text_:theory in 650) [ClassicSimilarity], result of:
              0.037292898 = score(doc=650,freq=2.0), product of:
                0.16234003 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03903913 = queryNorm
                0.2297209 = fieldWeight in 650, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=650)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    Patient-generated online reviews are well-established as an important source of information for people to evaluate doctors' quality and improve health outcomes. However, how such reviews are generated in the first place is not well examined. This study examines a hitherto unexplored social driver of online review generation-doctors' presence on online health platforms, which results in the reviewers (i.e., patients) and the reviewees (i.e., doctors) coexisting in the same medium. Drawing on the Stimulus-Organism-Response theory as an overarching framework, we advance hypotheses about the impact of doctors' presence on their patients' review behaviors, including review volume, review effort, and emotional expression. To achieve causal identification, we conduct a quasi-experiment on a large online health platform and employ propensity score matching and difference-in-difference estimation. Our findings show that doctors' presence increases their patients' review volume. Furthermore, doctors' presence motivates their patients to exert greater effort and express more positive emotions in the review text. The results also show that the presence of doctors with higher professional titles has a stronger effect on review volume than the presence of doctors with lower professional titles. Our findings offer important implications both for research and practice.
  9. Liu, X.; Kaza, S.; Zhang, P.; Chen, H.: Determining inventor status and its effect on knowledge diffusion : a study on nanotechnology literature from China, Russia, and India (2011) 0.00
    0.0029049278 = product of:
      0.017429566 = sum of:
        0.017429566 = product of:
          0.034859132 = sum of:
            0.034859132 = weight(_text_:methods in 4468) [ClassicSimilarity], result of:
              0.034859132 = score(doc=4468,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.22209854 = fieldWeight in 4468, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4468)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    In an increasingly global research landscape, it is important to identify the most prolific researchers in various institutions and their influence on the diffusion of knowledge. Knowledge diffusion within institutions is influenced by not just the status of individual researchers but also the collaborative culture that determines status. There are various methods to measure individual status, but few studies have compared them or explored the possible effects of different cultures on the status measures. In this article, we examine knowledge diffusion within science and technology-oriented research organizations. Using social network analysis metrics to measure individual status in large-scale coauthorship networks, we studied an individual's impact on the recombination of knowledge to produce innovation in nanotechnology. Data from the most productive and high-impact institutions in China (Chinese Academy of Sciences), Russia (Russian Academy of Sciences), and India (Indian Institutes of Technology) were used. We found that boundary-spanning individuals influenced knowledge diffusion in all countries. However, our results also indicate that cultural and institutional differences may influence knowledge diffusion.
  10. Liu, X.: ¬The standardization of Chinese library classification (1993) 0.00
    0.0026686378 = product of:
      0.016011827 = sum of:
        0.016011827 = product of:
          0.032023653 = sum of:
            0.032023653 = weight(_text_:29 in 5588) [ClassicSimilarity], result of:
              0.032023653 = score(doc=5588,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.23319192 = fieldWeight in 5588, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5588)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Date
    8.10.2000 14:29:26
  11. Clewley, N.; Chen, S.Y.; Liu, X.: Cognitive styles and search engine preferences : field dependence/independence vs holism/serialism (2010) 0.00
    0.0022238651 = product of:
      0.01334319 = sum of:
        0.01334319 = product of:
          0.02668638 = sum of:
            0.02668638 = weight(_text_:29 in 3961) [ClassicSimilarity], result of:
              0.02668638 = score(doc=3961,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.19432661 = fieldWeight in 3961, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3961)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Date
    29. 8.2010 13:11:47
  12. Jiang, Z.; Liu, X.; Chen, Y.: Recovering uncaptured citations in a scholarly network : a two-step citation analysis to estimate publication importance (2016) 0.00
    0.0022238651 = product of:
      0.01334319 = sum of:
        0.01334319 = product of:
          0.02668638 = sum of:
            0.02668638 = weight(_text_:29 in 3018) [ClassicSimilarity], result of:
              0.02668638 = score(doc=3018,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.19432661 = fieldWeight in 3018, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3018)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Date
    12. 6.2016 20:31:29