Search (13 results, page 1 of 1)

  • author_ss:"Zhang, X."
  1. Jiang, Y.; Zhang, X.; Tang, Y.; Nie, R.: Feature-based approaches to semantic similarity assessment of concepts using Wikipedia (2015) 0.05
    0.047309753 = product of:
      0.094619505 = sum of:
        0.01557172 = product of:
          0.06228688 = sum of:
            0.06228688 = weight(_text_:based in 2682) [ClassicSimilarity], result of:
              0.06228688 = score(doc=2682,freq=14.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.44037464 = fieldWeight in 2682, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2682)
          0.25 = coord(1/4)
        0.079047784 = product of:
          0.15809557 = sum of:
            0.15809557 = weight(_text_:assessment in 2682) [ClassicSimilarity], result of:
              0.15809557 = score(doc=2682,freq=8.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.60999227 = fieldWeight in 2682, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2682)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
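The nested breakdown above is Lucene ClassicSimilarity explain output: each leaf weight is fieldWeight (tf × idf × fieldNorm) multiplied by queryWeight (idf × queryNorm), and the coord factors down-weight documents that match only some of the query's clauses. The arithmetic can be checked directly; the sketch below recomputes this first result's score from the constants quoted in the tree.

```python
import math

# Constants copied from the explain tree above.
MAX_DOCS = 44218
QUERY_NORM = 0.04694356  # depends on the whole query, so taken as given

def idf(doc_freq):
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(freq, doc_freq, field_norm):
    query_weight = idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight

based = term_score(freq=14.0, doc_freq=5906, field_norm=0.0390625)     # ~0.06228688
assessment = term_score(freq=8.0, doc_freq=480, field_norm=0.0390625)  # ~0.15809557
# coord factors from the tree: 1/4 and 1/2 inside the clauses, 2/4 outside
doc_score = (based * 0.25 + assessment * 0.5) * 0.5                    # ~0.047309753
print(doc_score)
```

queryNorm is reused as printed because it is computed over the full query, not a single term; tiny deviations from the tree's values come from Lucene's single-precision floats.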
    
    Abstract
    Semantic similarity assessment between concepts is an important task in many language-related applications. In the past, several approaches have been proposed that assess similarity by evaluating the knowledge modeled in one or more ontologies. However, the existing measures have some limitations, such as relying on predefined ontologies and fitting only non-dynamic domains. Wikipedia provides a very large domain-independent encyclopedic repository and semantic network for computing the semantic similarity of concepts, with more coverage than usual ontologies. In this paper, we propose some novel feature-based similarity assessment methods that depend entirely on Wikipedia and avoid most of the limitations and drawbacks introduced above. To implement feature-based similarity assessment using Wikipedia, we first present a formal representation of Wikipedia concepts. We then give a framework for feature-based similarity built on this formal representation. Lastly, we investigate several feature-based approaches to semantic similarity measures resulting from instantiations of the framework. The evaluation, based on several widely used benchmarks and a benchmark developed by ourselves, sustains the intuitions with respect to human judgments. Overall, several methods proposed in this paper correlate well with human judgments and constitute effective ways of determining similarity between Wikipedia concepts.
  2. Zhang, X.; Chignell, M.: Assessment of the effects of user characteristics on mental models of information retrieval systems (2001) 0.03
    0.027245669 = product of:
      0.054491337 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 5753) [ClassicSimilarity], result of:
              0.028250674 = score(doc=5753,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 5753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5753)
          0.25 = coord(1/4)
        0.047428668 = product of:
          0.094857335 = sum of:
            0.094857335 = weight(_text_:assessment in 5753) [ClassicSimilarity], result of:
              0.094857335 = score(doc=5753,freq=2.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.36599535 = fieldWeight in 5753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5753)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This article reports the results of a study that investigated the effects of four user characteristics on users' mental models of information retrieval systems: educational and professional status, first language, academic background, and computer experience. The repertory grid technique was used in the study. Using this method, important components of information retrieval systems were represented by nine concepts, based on four IR experts' judgments. Users' mental models were represented by factor scores derived from users' matrices of concept ratings on different attributes of the concepts. The study found that educational and professional status, academic background, and computer experience had significant effects in differentiating users on their factor scores. First language had a borderline effect that did not reach significance at the α = 0.05 level. Specific differences in views regarding IR systems among different groups of users are described and discussed. Implications of the study for information science and IR system design are suggested.
  3. Zhang, X.; Wang, D.; Tang, Y.; Xiao, Q.: How question type influences knowledge withholding in social Q&A community (2023) 0.01
    0.013071639 = product of:
      0.026143279 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 1067) [ClassicSimilarity], result of:
              0.028250674 = score(doc=1067,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 1067, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1067)
          0.25 = coord(1/4)
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 1067) [ClassicSimilarity], result of:
              0.038161222 = score(doc=1067,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 1067, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1067)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Social question-and-answer (Q&A) communities are becoming increasingly important for knowledge acquisition. However, some users withhold knowledge, which can hinder the effectiveness of these platforms. Based on social exchange theory, the study investigates how different types of questions influence knowledge withholding, with question difficulty and user anonymity as boundary conditions. Two experiments were conducted to test hypotheses. Results indicate that informational questions are more likely to lead to knowledge withholding than conversational ones, as they elicit more fear of negative evaluation and fear of exploitation. The study also examines the interplay of question difficulty and user anonymity with question type. Overall, this study significantly extends the existing literature on counterproductive knowledge behavior by exploring the antecedents of knowledge withholding in social Q&A communities.
    Date
    22. 9.2023 13:51:47
  4. Yang, F.; Zhang, X.: Focal fields in literature on the information divide : the USA, China, UK and India (2020) 0.00
    0.003975128 = product of:
      0.015900511 = sum of:
        0.015900511 = product of:
          0.031801023 = sum of:
            0.031801023 = weight(_text_:22 in 5835) [ClassicSimilarity], result of:
              0.031801023 = score(doc=5835,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19345059 = fieldWeight in 5835, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5835)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    13. 2.2020 18:22:13
  5. Tay, W.; Zhang, X.; Karimi, S.: Beyond mean rating : probabilistic aggregation of star ratings based on helpfulness (2020) 0.00
    0.003058225 = product of:
      0.0122329 = sum of:
        0.0122329 = product of:
          0.0489316 = sum of:
            0.0489316 = weight(_text_:based in 5917) [ClassicSimilarity], result of:
              0.0489316 = score(doc=5917,freq=6.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.34595144 = fieldWeight in 5917, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5917)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    The star-rating mechanism of customer reviews is used universally by the online population to compare and select merchants, movies, products, and services. The consensus opinion from the aggregation of star ratings is used as a proxy for item quality. Online reviews are noisy, and effectively aggregating star ratings to accurately reflect the "true quality" of products and services is challenging. The widely used mean-rating aggregation model, like other proposed aggregation models, relies on a large number of reviews to tolerate noise; however, many products have few or no reviews. We propose probabilistic aggregation models for review ratings based on the Dirichlet distribution to combat data sparsity in reviews. We further propose to exploit "helpfulness" social information and time to filter noisy reviews and effectively aggregate ratings to compute the consensus opinion. Our experiments on an Amazon data set show that our probabilistic aggregation models based on "helpfulness" achieve better performance than the statistical and heuristic baseline approaches.
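The abstract names the core idea, Dirichlet smoothing against rating sparsity, without the paper's details. Purely as a hedged illustration (a symmetric prior I am assuming, not the authors' model, and ignoring the helpfulness weighting they add), a posterior-mean aggregate might look like:

```python
def dirichlet_mean_rating(star_counts, alpha=1.0):
    """Posterior-mean rating under an assumed symmetric Dirichlet(alpha)
    prior over the five star levels; alpha acts as pseudo-counts that
    pull sparsely reviewed items toward the uniform prior's mean of 3.0."""
    n = sum(star_counts)
    probs = [(c + alpha) / (n + 5 * alpha) for c in star_counts]
    return sum(p * stars for p, stars in zip(probs, range(1, 6)))

# A product with a single 5-star review is pulled toward the prior,
# unlike its raw mean rating of 5.0; a well-reviewed product barely moves.
sparse = dirichlet_mean_rating([0, 0, 0, 0, 1])   # counts for 1..5 stars
dense = dirichlet_mean_rating([0, 0, 0, 10, 90])
print(round(sparse, 3), round(dense, 3))  # 3.333 4.81
```

This shows only why a Dirichlet prior combats sparsity; the paper's actual models and its time/helpfulness filtering are not reproduced here.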
  6. Zhang, X.; Han, H.: An empirical testing of user stereotypes of information retrieval systems (2005) 0.00
    0.002548521 = product of:
      0.010194084 = sum of:
        0.010194084 = product of:
          0.040776335 = sum of:
            0.040776335 = weight(_text_:based in 1031) [ClassicSimilarity], result of:
              0.040776335 = score(doc=1031,freq=6.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28829288 = fieldWeight in 1031, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1031)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    Stereotyping is a technique used in many information systems to represent user groups and/or to generate initial individual user models. However, there has been a lack of evidence on the accuracy of their use in representing users. We propose a formal evaluation method to test the accuracy, or homogeneity, of stereotypes that are based on users' explicit characteristics. Using this method, the results of an empirical test of 11 common user stereotypes of information retrieval (IR) systems are reported. The participants' memberships in the stereotypes were predicted using discriminant analysis, based on their IR knowledge. The actual membership and the predicted membership of each stereotype were compared. The data show that "librarians/IR professionals" is an accurate stereotype in representing its members, while some others, such as "undergraduate students" and "social sciences/humanities" users, are not accurate stereotypes. The data also demonstrate that, based on the user's IR knowledge, a stereotype can be made more accurate or homogeneous. The results show promise that our method can help better detect the differences among stereotype members and support better stereotype design and user modeling. We assume that accurate stereotypes lead to better performance in user modeling and thus better system performance. Limitations and future directions of the study are discussed.
  7. Jiang, Y.; Bai, W.; Zhang, X.; Hu, J.: Wikipedia-based information content and semantic similarity computation (2017) 0.00
    0.002548521 = product of:
      0.010194084 = sum of:
        0.010194084 = product of:
          0.040776335 = sum of:
            0.040776335 = weight(_text_:based in 2877) [ClassicSimilarity], result of:
              0.040776335 = score(doc=2877,freq=6.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28829288 = fieldWeight in 2877, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2877)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    The Information Content (IC) of a concept is a fundamental dimension in computational linguistics that enables a better understanding of the concept's semantics. In the past, several approaches to computing the IC of a concept have been proposed. However, the existing methods have some limitations, such as relying on corpus availability, manual tagging, or predefined ontologies, and fitting only non-dynamic domains. Wikipedia provides a very large domain-independent encyclopedic repository and semantic network for computing the IC of concepts, with more coverage than usual ontologies. In this paper, we propose some novel methods for IC computation of a concept that address the shortcomings of existing approaches. The presented methods focus on the IC computation of a concept (i.e., a Wikipedia category) drawn from the Wikipedia category structure. We propose several new IC-based measures to compute the semantic similarity between concepts. The evaluation, based on several widely used benchmarks and a benchmark developed by ourselves, sustains the intuitions with respect to human judgments. Overall, some methods proposed in this paper correlate well with human judgments and constitute effective ways of determining IC values for concepts and the semantic similarity between concepts.
  8. Zhang, X.: Collaborative relevance judgment : a group consensus method for evaluating user search performance (2002) 0.00
    0.0024970302 = product of:
      0.009988121 = sum of:
        0.009988121 = product of:
          0.039952483 = sum of:
            0.039952483 = weight(_text_:based in 250) [ClassicSimilarity], result of:
              0.039952483 = score(doc=250,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28246817 = fieldWeight in 250, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=250)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    Relevance judgment has traditionally been considered a personal and subjective matter: a user's search and the search result are treated as an isolated event. To account for the collaborative nature of information retrieval (IR) in a group, organizational, or even societal context, this article proposes a method that measures relevance based on group/peer consensus. The method can be used in IR experiments. In this method, the relevance of a document is decided by group consensus, or more specifically, by the number of users (or experiment participants) who retrieve it for the same search question. The more users who retrieve it, the more relevant the document is considered. A user's search performance can be measured by a relevance score based on this notion. The article reports the results of an experiment using this method to compare the search performance of different types of users. Related issues with the method and future directions are also discussed.
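The consensus notion described above is straightforward to operationalize. As a minimal sketch (the variable names and the unnormalized sum are mine for illustration, not the article's exact scoring), document relevance is the count of users who retrieved it, and a user's score sums those counts over their own result set:

```python
from collections import Counter

def consensus_relevance(runs):
    """Relevance of a document = number of users who retrieved it
    for the same search question (the group-consensus notion above)."""
    return Counter(doc for docs in runs.values() for doc in set(docs))

def user_score(user, runs):
    """A user's search performance: sum of consensus relevance over
    the documents that user retrieved (unnormalized, for illustration)."""
    relevance = consensus_relevance(runs)
    return sum(relevance[doc] for doc in set(runs[user]))

# Three participants' result sets for one search question.
runs = {"u1": ["d1", "d2", "d3"], "u2": ["d2", "d3"], "u3": ["d3", "d4"]}
print(user_score("u1", runs))  # d1:1 + d2:2 + d3:3 = 6
```

Note the circularity the method embraces: a user scores well exactly when their retrieved documents overlap with the group's; any normalization for result-set size in the article may differ from this sketch.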
  9. Zhang, X.: Concept integration of document databases using different indexing languages (2006) 0.00
    0.0024970302 = product of:
      0.009988121 = sum of:
        0.009988121 = product of:
          0.039952483 = sum of:
            0.039952483 = weight(_text_:based in 962) [ClassicSimilarity], result of:
              0.039952483 = score(doc=962,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28246817 = fieldWeight in 962, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=962)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    An integrated information retrieval system generally contains multiple databases that are inconsistent in terms of their content and indexing. This paper proposes a rough-set-based transfer (RST) model for integrating the concepts of document databases that use various indexing languages, so that users can search across the multiple databases using any of the current indexing languages. The RST model aims to create meaningful transfer relations between the terms of two indexing languages, provided a number of documents are indexed with them in parallel. In our experiment, the indexing concepts of two databases, respectively using the Thesaurus of Social Science (IZ) and the Schlagwortnormdatei (SWD), are integrated by means of the RST model. Finally, this paper compares the results achieved with a cross-concordance method, a conditional-probability-based method, and the RST model.
  10. Ho, S.M.; Bieber, M.; Song, M.; Zhang, X.: Seeking beyond with IntegraL : a user study of sense-making enabled by anchor-based virtual integration of library systems (2013) 0.00
    0.0024970302 = product of:
      0.009988121 = sum of:
        0.009988121 = product of:
          0.039952483 = sum of:
            0.039952483 = weight(_text_:based in 1037) [ClassicSimilarity], result of:
              0.039952483 = score(doc=1037,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28246817 = fieldWeight in 1037, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1037)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    This article presents a user study showing the effectiveness of a link-based virtual integration infrastructure that gives users access to relevant online resources, empowering them to design an information-seeking path that is specifically relevant to their context. IntegraL provides a lightweight approach to improving and augmenting search functionality by dynamically generating context-focused "anchors" for recognized elements of interest generated by library services. This article describes how IntegraL's design supports users' information-seeking behavior. A full user study, with both objective and subjective measures of IntegraL and hypothesis testing regarding its effect on the user's information-seeking experience, is described, along with data analysis, implications arising from this kind of virtual integration, and possible future directions.
  11. Zhang, X.; Fang, Y.; He, W.; Zhang, Y.; Liu, X.: Epistemic motivation, task reflexivity, and knowledge contribution behavior on team wikis : a cross-level moderation model (2019) 0.00
    0.0024970302 = product of:
      0.009988121 = sum of:
        0.009988121 = product of:
          0.039952483 = sum of:
            0.039952483 = weight(_text_:based in 5245) [ClassicSimilarity], result of:
              0.039952483 = score(doc=5245,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28246817 = fieldWeight in 5245, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5245)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    A cross-level model based on the information processing perspective and trait activation theory was developed and tested in order to investigate the effects of individual-level epistemic motivation and team-level task reflexivity on three different individual contribution behaviors (i.e., adding, deleting, and revising) in the process of knowledge creation on team wikis. Using the Hierarchical Linear Modeling software package and the 2-wave data from 166 individuals in 51 wiki-based teams, we found cross-level interaction effects between individual epistemic motivation and team task reflexivity on different knowledge contribution behaviors on wikis. Epistemic motivation exerted a positive effect on adding, which was strengthened by team task reflexivity. The effect of epistemic motivation on deleting was positive only when task reflexivity was high. In addition, epistemic motivation was strongly positively related to revising, regardless of the level of task reflexivity involved.
  12. Zhang, X.; Liu, J.; Cole, M.; Belkin, N.: Predicting users' domain knowledge in information retrieval using multiple regression analysis of search behaviors (2015) 0.00
    0.0014713892 = product of:
      0.005885557 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 1822) [ClassicSimilarity], result of:
              0.023542227 = score(doc=1822,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 1822, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1822)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    User domain knowledge affects search behaviors and search success. Predicting a user's knowledge level from implicit evidence such as search behaviors could allow an adaptive information retrieval system to better personalize its interaction with users. This study examines whether user domain knowledge can be predicted from search behaviors by applying a regression modeling analysis method. We identify the behavioral features that contribute most to a successful prediction model. A user experiment was conducted with 40 participants searching on task topics in the domain of genomics. Participants' domain knowledge levels were assessed based on their familiarity with and expertise in the search topics and their knowledge of the MeSH (Medical Subject Headings) terms in the categories that corresponded to the search topics. Users' search behaviors, including querying behaviors, document selection behaviors, and general task interaction behaviors, were captured by logging software. Multiple regression analysis was run on the behavioral data using different variable selection methods. Four successful predictive models were identified, each involving a slightly different set of behavioral variables. The models were compared on model fit, model significance, and the contributions of individual predictors in each model. Each model was validated using the split-sampling method. The final model highlights three behavioral variables as domain knowledge level predictors: the number of documents saved, the average query length, and the average ranking position of the documents opened. The results are discussed, study limitations are addressed, and future research directions are suggested.
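The modeling step described above is ordinary multiple regression over behavioral features. Purely as an illustration on synthetic data (the coefficients, value ranges, and noise level below are invented, not the study's), fitting the three predictors named in the final model might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40  # the study had 40 participants

# Synthetic behavioral features: documents saved, average query length,
# and average ranking position of the documents opened.
X = np.column_stack([
    rng.integers(0, 15, n).astype(float),
    rng.uniform(1.0, 6.0, n),
    rng.uniform(1.0, 10.0, n),
])
true_beta = np.array([0.3, 0.5, -0.2])  # invented effect directions
y = 2.0 + X @ true_beta + rng.normal(0.0, 0.5, n)  # simulated knowledge score

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(beta_hat.round(2))  # intercept plus three fitted coefficients
```

With only 40 observations and noisy labels, as in the study, coefficient estimates carry wide intervals, which is why the authors validate each candidate model on split samples.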
  13. Cui, Y.; Wang, Y.; Liu, X.; Wang, X.; Zhang, X.: Multidimensional scholarly citations : characterizing and understanding scholars' citation behaviors (2023) 0.00
    0.0014713892 = product of:
      0.005885557 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 847) [ClassicSimilarity], result of:
              0.023542227 = score(doc=847,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 847, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=847)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    This study investigates scholars' citation behaviors from a fine-grained perspective. Specifically, each scholarly citation is considered multidimensional rather than logically unidimensional (i.e., present or absent). Thirty million articles from PubMed were accessed for use in the empirical research, in which a total of 15 interpretable features of scholarly citations were constructed and grouped into three main categories. Each category corresponds to one aspect of the reasons and motivations behind scholars' citation decision-making during academic writing. Using about 500,000 pairs of actual and randomly generated scholarly citations, a series of Random Forest-based classification experiments was conducted to quantitatively evaluate the correlation between each constructed citation feature and the citation decisions made by scholars. Our experimental results indicate that citation proximity is the category most relevant to scholars' citation decision-making, followed by citation authority and citation inertia. However, big-name scholars whose h-indexes rank among the top 1% exhibit a unique pattern of citation behaviors: their citation decision-making correlates most closely with citation inertia, with the correlation nearly three times as strong as that of their ordinary counterparts. Hopefully, the empirical findings presented in this paper can bring us closer to characterizing and understanding the complex process of generating scholarly citations in academia.