Search (11 results, page 1 of 1)

  • author_ss:"Liu, X."
  1. Chen, M.; Liu, X.; Qin, J.: Semantic relation extraction from socially-generated tags : a methodology for metadata generation (2008) 0.03
    Abstract
    The growing predominance of social semantics in the form of tagging presents the metadata community with both opportunities and challenges in leveraging this new form of information content representation for retrieval. One key challenge is the absence of contextual information associated with these tags. This paper presents an experiment working with Flickr tags as an example of utilizing social semantics sources for enriching subject metadata. The procedure included four steps: 1) collecting a sample of Flickr tags, 2) calculating co-occurrences between tags through mutual information, 3) tracing contextual information of tag pairs via Google search results, and 4) applying natural language processing and machine learning techniques to extract semantic relations between tags. The experiment helped us build a context sentence collection from the Google search results, which was then processed by natural language processing and machine learning algorithms. This new approach achieved a reasonably good rate of accuracy in assigning semantic relations to tag pairs. The paper also explores the implications of this approach for using social semantics to enrich subject metadata.
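    Step 2 of the procedure (scoring tag co-occurrence via mutual information) can be sketched as follows. This is a minimal illustration using pointwise mutual information over toy photo/tag data; the PMI formulation, the `tag_pair_pmi` helper, and the sample photos are assumptions for illustration, not the authors' exact implementation.

    ```python
    import math
    from collections import Counter
    from itertools import combinations

    def tag_pair_pmi(photos):
        """Score tag pairs by pointwise mutual information.

        `photos` is a list of tag sets, one per photo. Returns a dict
        {(tag_a, tag_b): pmi}; higher values mean the tags co-occur
        more often than chance would predict.
        """
        n = len(photos)
        tag_counts = Counter(t for tags in photos for t in set(tags))
        pair_counts = Counter(
            pair for tags in photos
            for pair in combinations(sorted(set(tags)), 2)
        )
        pmi = {}
        for (a, b), c_ab in pair_counts.items():
            p_ab = c_ab / n                      # joint probability
            p_a = tag_counts[a] / n              # marginal probabilities
            p_b = tag_counts[b] / n
            pmi[(a, b)] = math.log(p_ab / (p_a * p_b))
        return pmi

    # Hypothetical sample of tagged photos
    photos = [{"beach", "sunset", "sea"}, {"beach", "sea"}, {"city", "night"}]
    scores = tag_pair_pmi(photos)
    ```

    High-PMI pairs such as ("beach", "sea") would then be the candidates whose contextual sentences are traced in steps 3 and 4.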
    Date
    20. 2.2009 10:29:07
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  2. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.01
    Abstract
    This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century, when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
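    The core idea described here (scoring a document by the likelihood that its language model generated the query) can be sketched with a Dirichlet-smoothed unigram model. The smoothing choice, the `mu` value, and the toy tokenized documents are illustrative assumptions, not specifics from the chapter.

    ```python
    import math
    from collections import Counter

    def query_log_likelihood(query, doc, collection, mu=2000.0):
        """log P(query | document language model), Dirichlet smoothing.

        `query`, `doc`, and `collection` are token lists; the collection
        supplies the background model that smooths unseen terms.
        """
        doc_tf = Counter(doc)
        coll_tf = Counter(collection)
        doc_len, coll_len = len(doc), len(collection)
        score = 0.0
        for term in query:
            p_coll = coll_tf[term] / coll_len                 # background P(term)
            p_term = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
            score += math.log(p_term) if p_term > 0 else float("-inf")
        return score

    # Hypothetical two-document collection
    d1 = "statistical language model for retrieval".split()
    d2 = "automatic speech recognition system".split()
    collection = d1 + d2
    q = ["language", "model"]
    ```

    Ranking documents by this score is the query-likelihood approach the chapter surveys; a language model of the query would extend this to model-comparison methods such as KL-divergence ranking.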
  3. Liu, X.; Croft, W.B.: Cluster-based retrieval using language models (2004) 0.01
  4. Liu, X.; Zhang, J.; Guo, C.: Full-text citation analysis : a new method to enhance scholarly networks (2013) 0.00
    Abstract
    In this article, we use innovative full-text citation analysis along with supervised topic modeling and network-analysis algorithms to enhance classical bibliometric analysis and publication/author/venue ranking. By utilizing citation contexts extracted from a large number of full-text publications, each citation or publication is represented by a probability distribution over a set of predefined topics, where each topic is labeled by an author-contributed keyword. We then used publication/citation topic distributions to generate a citation graph with vertex prior and edge transitioning probability distributions. The publication importance score for each given topic is calculated by PageRank with edge and vertex prior distributions. To evaluate this work, we sampled 104 topics (labeled with keywords) in review papers. The cited publications of each review paper are assumed to be "important publications" for the target topic (keyword), and we use these cited publications to validate our topic-ranking results and to compare different publication-ranking lists. Evaluation results show that full-text citation and publication content prior topic distributions, along with the classical PageRank algorithm, can significantly enhance bibliometric analysis and scientific publication ranking performance, compared with term frequency-inverse document frequency (tf-idf), the language model, BM25, PageRank, and PageRank + language model (p < .001), for academic information retrieval (IR) systems.
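    PageRank with a vertex prior, the ranking component described above, can be sketched as a power iteration whose teleport distribution is the topic prior. The tiny citation graph and uniform prior below are assumptions for illustration; the paper's priors come from topic distributions over full-text citation contexts.

    ```python
    def topic_pagerank(adj, prior, damping=0.85, iters=100):
        """Power-iteration PageRank with a topic-specific vertex prior.

        `adj` maps each publication to the publications it cites;
        `prior` is a normalized teleport distribution over nodes.
        Dangling nodes redistribute their mass according to the prior.
        """
        nodes = list(adj)
        rank = dict(prior)
        for _ in range(iters):
            # teleport step, weighted by the topic prior
            new = {n: (1 - damping) * prior[n] for n in nodes}
            for n in nodes:
                out = adj[n]
                if out:
                    share = damping * rank[n] / len(out)
                    for m in out:
                        new[m] += share
                else:  # dangling node: spread mass by the prior
                    for m in nodes:
                        new[m] += damping * rank[n] * prior[m]
            rank = new
        return rank

    # Hypothetical citation graph: a and b both cite c
    citations = {"a": ["c"], "b": ["c"], "c": []}
    prior = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
    rank = topic_pagerank(citations, prior)
    ```

    With a topic-skewed prior instead of the uniform one, the same iteration yields the per-topic importance scores the evaluation compares against tf-idf, BM25, and language-model baselines.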
  5. Liu, X.; Jia, H.: Answering academic questions for education by recommending cyberlearning resources (2013) 0.00
    Abstract
    In this study, we design an innovative method for answering students' or scholars' academic questions (for a specific scientific publication) by automatically recommending e-learning resources in a cyber-infrastructure-enabled learning environment to enhance the learning experiences of students and scholars. By using information retrieval and metasearch methodologies, different types of referential metadata (related Wikipedia pages, data sets, source code, video lectures, presentation slides, and online tutorials) for an assortment of publications and scientific topics will be automatically retrieved, associated, and ranked (via the language model and the inference network model) to provide easily understandable cyberlearning resources to answer students' questions. We also designed an experimental system to automatically answer students' questions for a specific academic publication and then evaluated the quality of the answers (the recommended resources) using mean reciprocal rank and normalized discounted cumulative gain. After examining preliminary evaluation results and student feedback, we found that cyberlearning resources can provide high-quality and straightforward answers for students' and scholars' questions concerning the content of academic publications.
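    The two evaluation measures named above, mean reciprocal rank and normalized discounted cumulative gain, can be sketched directly; the helper names and the binary/graded toy judgments are illustrative, not the study's data.

    ```python
    import math

    def mean_reciprocal_rank(ranked_relevance):
        """MRR over queries; each element is a list of 0/1 relevance
        flags for one query's results, in rank order."""
        total = 0.0
        for flags in ranked_relevance:
            rr = 0.0
            for i, rel in enumerate(flags, start=1):
                if rel:
                    rr = 1.0 / i  # reciprocal rank of first relevant hit
                    break
            total += rr
        return total / len(ranked_relevance)

    def ndcg(gains, k=None):
        """nDCG for one ranked list of graded relevance gains."""
        gains = gains[:k] if k else gains
        dcg = sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))
        ideal = sorted(gains, reverse=True)
        idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
        return dcg / idcg if idcg > 0 else 0.0
    ```

    MRR rewards placing the first useful resource near the top, while nDCG credits the full graded ordering, which is why the two are often reported together for recommendation tasks like this one.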
  6. Liu, X.; Turtle, H.: Real-time user interest modeling for real-time ranking (2013) 0.00
    Abstract
    User interest as a very dynamic information need is often ignored in most existing information retrieval systems. In this research, we present the results of experiments designed to evaluate the performance of a real-time interest model (RIM) that attempts to identify the dynamic and changing query level interests regarding social media outputs. Unlike most existing ranking methods, our ranking approach targets calculation of the probability that user interest in the content of the document is subject to very dynamic user interest change. We describe 2 formulations of the model (real-time interest vector space and real-time interest language model) stemming from classical relevance ranking methods and develop a novel methodology for evaluating the performance of RIM using Amazon Mechanical Turk to collect (interest-based) relevance judgments on a daily basis. Our results show that the model usually, although not always, performs better than baseline results obtained from commercial web search engines. We identify factors that affect RIM performance and outline plans for future research.
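    The real-time interest vector space formulation mentioned above can be sketched as an exponentially decayed term vector compared to documents by cosine similarity. The decay scheme, the `decayed_interest` helper, and the sample events are assumptions for illustration, not the RIM model's actual estimator.

    ```python
    import math
    from collections import Counter

    def cosine(u, v):
        """Cosine similarity between two sparse term-weight vectors."""
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    def decayed_interest(events, decay=0.5):
        """Build a user-interest vector from (age, term) events,
        down-weighting older events exponentially (age 0 = newest)."""
        interest = Counter()
        for age, term in events:
            interest[term] += decay ** age
        return interest

    # Hypothetical recent user activity: current-events terms are fresh,
    # "football" is three time steps old
    events = [(0, "election"), (0, "debate"), (3, "football")]
    interest = decayed_interest(events)
    ```

    Re-estimating the interest vector as new events arrive, and re-ranking candidate documents by `cosine`, captures the "very dynamic" interest change the abstract emphasizes.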
  7. Liu, X.; Guo, C.; Zhang, L.: Scholar metadata and knowledge generation with human and artificial intelligence (2014) 0.00
    Abstract
    Scholar metadata have traditionally centered on descriptive representations, which have been used as a foundation for scholarly publication repositories and academic information retrieval systems. In this article, we propose innovative and economical methods of generating knowledge-based structural metadata (structural keywords) using a combination of natural language processing-based machine-learning techniques and human intelligence. By allowing low-barrier participation through a social media system, scholars (both as authors and users) can participate in the metadata editing and enhancing process and benefit from more accurate and effective information retrieval. Our experimental web system ScholarWiki uses machine learning techniques, which automatically produce increasingly refined metadata by learning from the structural metadata contributed by scholars. The accumulated structural metadata add intelligence and recursively enhance and update the quality of the metadata, wiki pages, and the machine-learning model.
  8. Liu, X.: The standardization of Chinese library classification (1993) 0.00
    Date
    8.10.2000 14:29:26
  9. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.00
    Date
    1. 6.2010 9:29:57
  10. Clewley, N.; Chen, S.Y.; Liu, X.: Cognitive styles and search engine preferences : field dependence/independence vs holism/serialism (2010) 0.00
    Date
    29. 8.2010 13:11:47
  11. Jiang, Z.; Liu, X.; Chen, Y.: Recovering uncaptured citations in a scholarly network : a two-step citation analysis to estimate publication importance (2016) 0.00
    Date
    12. 6.2016 20:31:29