Search (9 results, page 1 of 1)

  • Active filter: author_ss:"Liu, J."
  1. Zhou, D.; Lawless, S.; Wu, X.; Zhao, W.; Liu, J.: ¬A study of user profile representation for personalized cross-language information retrieval (2016) 0.01
    0.0116941 = product of:
      0.040929347 = sum of:
        0.02770021 = weight(_text_:based in 3167) [ClassicSimilarity], result of:
          0.02770021 = score(doc=3167,freq=4.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.23539014 = fieldWeight in 3167, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3167)
        0.013229139 = product of:
          0.026458278 = sum of:
            0.026458278 = weight(_text_:22 in 3167) [ClassicSimilarity], result of:
              0.026458278 = score(doc=3167,freq=2.0), product of:
                0.13677022 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03905679 = queryNorm
                0.19345059 = fieldWeight in 3167, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3167)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
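    The relevance values shown for each result are Lucene "explain" trees produced by the catalog's ClassicSimilarity (tf-idf) ranking. As a sanity check, here is a minimal Python sketch that re-derives this entry's 0.0116941 score; every constant is copied from the tree above, and only the arithmetic is illustrative.

```python
import math

# Constants copied from the explain tree of result 1 (doc 3167).
QUERY_NORM = 0.03905679
FIELD_NORM = 0.0390625

def idf(doc_freq, max_docs):
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1)).
    # Reproduces 3.0129938 for "based" and 3.5018296 for "22".
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def clause_score(freq, term_idf, coord=1.0):
    # One weight(...) clause: queryWeight * fieldWeight * coord.
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    query_weight = term_idf * QUERY_NORM       # idf * queryNorm
    field_weight = tf * term_idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight * coord

based = clause_score(4.0, idf(5906, 44218))           # -> 0.02770021
t22 = clause_score(2.0, idf(3622, 44218), coord=0.5)  # -> 0.013229139
total = (based + t22) * (2.0 / 7.0)                   # top-level coord(2/7)
print(round(total, 7))                                # -> 0.0116941
```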
    
    Abstract
    Purpose - With the increase in multilingual content on the World Wide Web, users often strive to access information provided in a language of which they are not native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and to investigate their use in personalized cross-language information retrieval (CLIR) systems by means of personalized query expansion. Design/methodology/approach - The user profiles consist of weighted terms computed using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. The paper also proposes an automatic evaluation method for comparing user profile generation techniques and query expansion methods. Findings - Experimental results suggest that latent semantic user profile representations are superior to frequency-based methods and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained at the cross-lingual level outperformed models trained at the monolingual level. Originality/value - Previous studies of personalized information retrieval systems have primarily investigated user profiles and personalization strategies at the monolingual level; the effect of utilizing such monolingual profiles for personalized CLIR remained unclear. The current study fills this gap with a comprehensive study of user profile representation for personalized CLIR and a novel evaluation methodology that ensures repeatable, controlled experiments.
    Date
    20. 1.2015 18:30:22
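    For illustration of the frequency-based baseline described above (not the authors' implementation; the function names and the whitespace tokenization are hypothetical), a minimal sketch of a tf-idf-weighted user profile applied to personalized query expansion:

```python
import math
from collections import Counter

def build_profile(history_docs, top_n=10):
    """Hypothetical tf-idf user profile: weight each term across the
    user's past documents and keep the top-N terms."""
    tokenized = [doc.lower().split() for doc in history_docs]
    n_docs = len(tokenized)
    df = Counter(term for doc in tokenized for term in set(doc))
    weights = Counter()
    for doc in tokenized:
        for term, freq in Counter(doc).items():
            weights[term] += (1 + math.log(freq)) * math.log(n_docs / df[term])
    return [term for term, _ in weights.most_common(top_n)]

def expand_query(query, profile, k=3):
    """Personalized query expansion: append the top-k profile terms
    not already present in the query."""
    extra = [t for t in profile if t not in query.lower().split()][:k]
    return query + " " + " ".join(extra)

profile = build_profile([
    "cross language information retrieval with comparable corpora",
    "latent semantic models for personalized retrieval",
])
print(expand_query("information retrieval", profile))
```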
  2. Zhang, Y.; Liu, J.; Song, S.: ¬The design and evaluation of a nudge-based interface to facilitate consumers' evaluation of online health information credibility (2023) 0.01
    0.0116941 = product of:
      0.040929347 = sum of:
        0.02770021 = weight(_text_:based in 993) [ClassicSimilarity], result of:
          0.02770021 = score(doc=993,freq=4.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.23539014 = fieldWeight in 993, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=993)
        0.013229139 = product of:
          0.026458278 = sum of:
            0.026458278 = weight(_text_:22 in 993) [ClassicSimilarity], result of:
              0.026458278 = score(doc=993,freq=2.0), product of:
                0.13677022 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03905679 = queryNorm
                0.19345059 = fieldWeight in 993, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=993)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Evaluating the quality of online health information (OHI) is a major challenge facing consumers. We designed PageGraph, an interface that displays quality indicators and their values for a webpage, drawing on credibility evaluation models, nudge theory, and existing empirical research on how professionals and consumers evaluate OHI quality. A qualitative evaluation of the interface with 16 participants revealed that PageGraph rendered the information and presentation nudges as intended: it gave participants easier access to quality indicators, encouraged fresh angles for assessing information credibility, provided an evaluation framework, and encouraged validation of initial judgments. We then conducted a quantitative evaluation with 60 participants using a between-subjects experimental design. The control group used a regular web browser to evaluate the credibility of 12 preselected webpages, whereas the experimental group evaluated the same webpages with the assistance of PageGraph. PageGraph did not significantly influence participants' evaluation results, which may be attributable to the insufficient salience and structure of the implemented nudges and to the webpage stimuli's insensitivity to the intervention. Future directions for applying nudges to support OHI evaluation are discussed.
    Date
    22. 6.2023 18:18:34
  3. Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.01
    0.009376042 = product of:
      0.032816146 = sum of:
        0.019587006 = weight(_text_:based in 1012) [ClassicSimilarity], result of:
          0.019587006 = score(doc=1012,freq=2.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.16644597 = fieldWeight in 1012, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1012)
        0.013229139 = product of:
          0.026458278 = sum of:
            0.026458278 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
              0.026458278 = score(doc=1012,freq=2.0), product of:
                0.13677022 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03905679 = queryNorm
                0.19345059 = fieldWeight in 1012, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1012)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has become an active research area. However, these statistically important phrases contribute increasingly little to the related tasks, because end-to-end learning enables models to learn the important semantic information of a text directly. Similarly, keyphrases are of little help to readers trying to quickly grasp a paper's main idea, because the relationship between a keyphrase and the paper is not explicit to them. We therefore propose to generate keyphrases with specific functions for readers, bridging the semantic gap between them and the information producers, and we verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented on top of Transformer, BART, and T5, respectively. For the Computer Science domain, the macro-averages of the three evaluation metrics on the Paper with Code dataset reach up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.
    Date
    22. 6.2023 14:55:20
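    The control-code mechanism described above can be pictured as prefixing the input with the desired keyphrase function before sequence-to-sequence generation. A minimal sketch using Hugging Face BART follows; this is not the authors' CKPG code, the control token is hypothetical, and a model would need fine-tuning on function-labeled keyphrase data before the prefix has any steering effect.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def generate_keyphrases(abstract: str, function_code: str) -> str:
    # Prepend a (hypothetical) keyphrase-function control code so the
    # decoder is conditioned on the desired keyphrase category.
    inputs = tokenizer(f"<{function_code}> {abstract}",
                       return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, num_beams=4, max_length=32)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_keyphrases("We propose a controllable keyphrase "
                          "generation framework ...", "method"))
```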
  4. Jiang, X.; Liu, J.: Extracting the evolutionary backbone of scientific domains : the semantic main path network analysis approach based on citation context analysis (2023) 0.00
    0.0048465277 = product of:
      0.033925693 = sum of:
        0.033925693 = weight(_text_:based in 948) [ClassicSimilarity], result of:
          0.033925693 = score(doc=948,freq=6.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.28829288 = fieldWeight in 948, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=948)
      0.14285715 = coord(1/7)
    
    Abstract
    Main path analysis is a popular method for extracting the scientific backbone from the citation network of a research domain. Existing approaches ignored the semantic relationships between citing and cited publications, resulting in several adverse issues in terms of the coherence of main paths and the coverage of significant studies. This paper advocates a semantic main path network analysis approach that alleviates these issues through citation function analysis. A wide variety of SciBERT-based deep learning models were designed to identify citation functions. Semantic citation networks were built by either including important citations, for example extension, motivation, usage, and similarity, or excluding incidental citations such as background and future work. A semantic main path network was built by merging the top-K main paths extracted from various time slices of the semantic citation network. In addition, a three-way framework was proposed for the quantitative evaluation of main path analysis results. Both qualitative and quantitative analyses of three research areas of computational linguistics demonstrated that, compared to semantics-agnostic counterparts, different types of semantic main path networks provide complementary views of scientific knowledge flows. Combining them, we obtained a more precise and comprehensive picture of domain evolution and uncovered more coherent development pathways between scientific ideas.
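    As background for the method described above, the classic, semantics-agnostic building block can be sketched compactly: Search Path Count (SPC) weights over a citation DAG, followed by greedy main path extraction. The sketch below (using networkx) is a simplified baseline, not the authors' pipeline, which additionally filters edges by SciBERT-predicted citation function.

```python
import networkx as nx

def spc_weights(g: nx.DiGraph):
    """Search Path Count: for edge (u, v), the number of
    source-to-sink paths in the DAG that traverse it."""
    order = list(nx.topological_sort(g))
    from_source = {v: 1 if g.in_degree(v) == 0 else 0 for v in order}
    for v in order:
        for u in g.predecessors(v):
            from_source[v] += from_source[u]
    to_sink = {v: 1 if g.out_degree(v) == 0 else 0 for v in order}
    for v in reversed(order):
        for w in g.successors(v):
            to_sink[v] += to_sink[w]
    return {(u, v): from_source[u] * to_sink[v] for u, v in g.edges}

def greedy_main_path(g, weights):
    """Simplified greedy variant: start at the globally heaviest edge
    and keep following the heaviest outgoing edge."""
    path = [max(weights, key=weights.get)]
    while g.out_degree(path[-1][1]) > 0:
        path.append(max(g.out_edges(path[-1][1]), key=lambda e: weights[e]))
    return path

g = nx.DiGraph([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")])
print(greedy_main_path(g, spc_weights(g)))  # e.g. [('A', 'B'), ('B', 'D')]
```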
  5. Liu, J.; Li, Y.; Hastings, S.K.: Simplified scheme of search task difficulty reasons (2019) 0.00
    0.0033577727 = product of:
      0.023504408 = sum of:
        0.023504408 = weight(_text_:based in 5224) [ClassicSimilarity], result of:
          0.023504408 = score(doc=5224,freq=2.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.19973516 = fieldWeight in 5224, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=5224)
      0.14285715 = coord(1/7)
    
    Abstract
    This article reports on a study that aimed to simplify a search task difficulty reason scheme. Liu, Kim, and Creel (2015) (denoted LKC15) developed a 21-item search task difficulty reason scheme in a controlled laboratory experiment. The current study simplified the scheme through another experiment that followed the same design as LKC15 and involved 32 university students, with one added questionnaire item that presented the 21 difficulty reasons in multiple-choice format. By comparing the current study with LKC15, a concept of primary top difficulty reasons was proposed, which reasonably simplifies the 21-item scheme to an 8-item list of top reasons. This limited number of reasons is more manageable and makes it feasible for search systems to predict task difficulty reasons from observable user behaviors, which lays the basis for systems to improve user satisfaction based on the predicted reasons.
  6. Liu, J.; Liu, C.: Personalization in text information retrieval : a survey (2020) 0.00
    0.0033577727 = product of:
      0.023504408 = sum of:
        0.023504408 = weight(_text_:based in 5761) [ClassicSimilarity], result of:
          0.023504408 = score(doc=5761,freq=2.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.19973516 = fieldWeight in 5761, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=5761)
      0.14285715 = coord(1/7)
    
    Abstract
    Personalization of information retrieval (PIR) is aimed at tailoring a search toward individual users and user groups by taking account of additional information about users besides their queries. In the past two decades or so, PIR has received extensive attention in both academia and industry. This article surveys the literature of personalization in text retrieval, following a framework for aspects or factors that can be used for personalization. The framework consists of additional information about users that can be explicitly obtained by asking users for their preferences, or implicitly inferred from users' search behaviors. Users' characteristics and contextual factors such as tasks, time, location, etc., can be helpful for personalization. This article also addresses various issues including when to personalize, the evaluation of PIR, privacy, usability, etc. Based on the extensive review, challenges are discussed and directions for future effort are suggested.
  7. Zhang, X.; Liu, J.; Cole, M.; Belkin, N.: Predicting users' domain knowledge in information retrieval using multiple regression analysis of search behaviors (2015) 0.00
    0.002798144 = product of:
      0.019587006 = sum of:
        0.019587006 = weight(_text_:based in 1822) [ClassicSimilarity], result of:
          0.019587006 = score(doc=1822,freq=2.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.16644597 = fieldWeight in 1822, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1822)
      0.14285715 = coord(1/7)
    
    Abstract
    User domain knowledge affects search behaviors and search success. Predicting a user's knowledge level from implicit evidence such as search behaviors could allow an adaptive information retrieval system to better personalize its interaction with users. This study examines whether user domain knowledge can be predicted from search behaviors by applying regression modeling. We identify the behavioral features that contribute most to a successful prediction model. A user experiment was conducted with 40 participants searching on task topics in the domain of genomics. Participants' domain knowledge level was assessed based on their familiarity with and expertise in the search topics and their knowledge of the MeSH (Medical Subject Headings) terms in the categories corresponding to the search topics. Search behaviors were captured by logging software and include querying behaviors, document selection behaviors, and general task interaction behaviors. Multiple regression analysis was run on the behavioral data using different variable selection methods. Four successful predictive models were identified, each involving a slightly different set of behavioral variables. The models were compared on model fit, model significance, and the contributions of individual predictors. Each model was validated using the split-sampling method. The final model highlights three behavioral variables as predictors of domain knowledge level: the number of documents saved, the average query length, and the average ranking position of the documents opened. The results are discussed, study limitations are addressed, and future research directions are suggested.
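    As a sketch of the modeling setup (with synthetic stand-in data; the study used logged behaviors from its 40 participants), a multiple regression over the three predictors in the final model, validated on a held-out split:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Synthetic stand-ins for the three final predictors: documents saved,
# average query length, average ranking position of opened documents.
X = rng.random((40, 3))
knowledge = (0.5 * X[:, 0] + 0.3 * X[:, 1] - 0.2 * X[:, 2]
             + rng.normal(0.0, 0.05, 40))

# Split-sample validation of the fitted model.
X_tr, X_te, y_tr, y_te = train_test_split(X, knowledge, test_size=0.25,
                                          random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
print("coefficients:", model.coef_)
print("held-out R^2:", model.score(X_te, y_te))
```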
  8. Liu, J.; Zhou, Z.; Gao, M.; Tang, J.; Fan, W.: Aspect sentiment mining of short bullet screen comments from online TV series (2023) 0.00
    0.002798144 = product of:
      0.019587006 = sum of:
        0.019587006 = weight(_text_:based in 1018) [ClassicSimilarity], result of:
          0.019587006 = score(doc=1018,freq=2.0), product of:
            0.11767787 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.03905679 = queryNorm
            0.16644597 = fieldWeight in 1018, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1018)
      0.14285715 = coord(1/7)
    
    Abstract
    Bullet screen comments (BSCs) are user-generated short comments that appear as real-time overlays on many video platforms, expressing the audience's opinions and emotions about different aspects of the ongoing video. Unlike traditional long comments posted after a show, BSCs are often incomplete, ambiguous in context, and correlated over time. Current studies on sentiment analysis of BSCs rarely address these challenges, motivating us to develop an aspect-level sentiment analysis framework. Our framework, BSCNET, is a pre-trained language encoder-based deep neural classifier designed to enhance semantic understanding. A novel neighbor context construction method is proposed to uncover latent contextual correlations among BSCs over time, and we also incorporate semi-supervised learning to reduce labeling costs. The framework increases F1 (Macro) and accuracy by up to 10% and 10.2%, respectively. Additionally, we developed two novel downstream tasks. The first is noisy BSC identification, which reached an F1 (Macro) of 90.1% and an accuracy of 98.3% through fine-tuning of BSCNET. The second is the prediction of future episode popularity, where the MAPE is reduced by 11%-19.0% when sentiment features are incorporated. Overall, this study provides a methodological reference for aspect-level sentiment analysis of BSCs and highlights its potential for optimizing the viewing experience and forthcoming content.
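    The neighbor context construction idea can be illustrated with a small sketch; the data layout and the [SEP] joining below are assumptions, since the abstract does not specify BSCNET's actual construction. Each bullet-screen comment is paired with comments posted within a short time window to supply the context it lacks on its own.

```python
def neighbor_context(comments, i, window=2.0):
    """Gather the comments posted within `window` seconds of comment i
    (hypothetical layout: {"time": seconds_into_video, "text": str})."""
    t = comments[i]["time"]
    ctx = [c["text"] for c in sorted(comments, key=lambda c: c["time"])
           if abs(c["time"] - t) <= window]
    return " [SEP] ".join(ctx)

bscs = [{"time": 10.0, "text": "plot twist!"},
        {"time": 11.2, "text": "saw it coming"},
        {"time": 95.0, "text": "great soundtrack"}]
print(neighbor_context(bscs, 0))  # -> "plot twist! [SEP] saw it coming"
```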
  9. Liu, J.: CIP in China : the development and status quo (1996) 0.00
    0.0022678524 = product of:
      0.015874967 = sum of:
        0.015874967 = product of:
          0.031749934 = sum of:
            0.031749934 = weight(_text_:22 in 5528) [ClassicSimilarity], result of:
              0.031749934 = score(doc=5528,freq=2.0), product of:
                0.13677022 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03905679 = queryNorm
                0.23214069 = fieldWeight in 5528, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5528)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Source
    Cataloging and classification quarterly. 22(1996) no.1, p.69-76