Search (11 results, page 1 of 1)

  • × author_ss:"Li, Y."
  • × year_i:[2010 TO 2020}
  1. Crespo, J.A.; Herranz, N.; Li, Y.; Ruiz-Castillo, J.: The effect on citation inequality of differences in citation practices at the web of science subject category level (2014) 0.03
    0.025444586 = product of:
      0.05088917 = sum of:
        0.05088917 = sum of:
          0.006765375 = weight(_text_:a in 1291) [ClassicSimilarity], result of:
            0.006765375 = score(doc=1291,freq=8.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.12739488 = fieldWeight in 1291, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1291)
          0.0441238 = weight(_text_:22 in 1291) [ClassicSimilarity], result of:
            0.0441238 = score(doc=1291,freq=4.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.27358043 = fieldWeight in 1291, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1291)
      0.5 = coord(1/2)
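
The explain tree above can be recomputed by hand. The following is a minimal Python sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any library; queryNorm, fieldNorm, and the coord factor are copied from the listing rather than derived.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by the classic TF-IDF similarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq)
    term_idf = idf(doc_freq, max_docs)          # idf(docFreq, maxDocs)
    query_weight = term_idf * query_norm        # queryWeight
    field_weight = tf * term_idf * field_norm   # fieldWeight
    return query_weight * field_weight

# Values copied from the explain output for doc 1291.
MAX_DOCS = 44218
QUERY_NORM = 0.046056706
FIELD_NORM = 0.0390625
COORD = 0.5  # coord(1/2), taken from the listing

score_a = term_score(8.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(4.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)
total = (score_a + score_22) * COORD

print(f"_text_:a  -> {score_a:.9f}")
print(f"_text_:22 -> {score_22:.9f}")
print(f"total     -> {total:.9f}")
```

The printed values should agree with the listing up to rounding, since the search engine works in single-precision floats while Python uses double precision.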
    
    Abstract
    This article studies the impact of differences in citation practices at the subfield, or Web of Science subject category level, using the model introduced in Crespo, Li, and Ruiz-Castillo (2013a), according to which the number of citations received by an article depends on its underlying scientific influence and the field to which it belongs. We use the same Thomson Reuters data set of about 4.4 million articles used in Crespo et al. (2013a) to analyze 22 broad fields. The main results are the following: First, when the classification system goes from 22 fields to 219 subfields the effect on citation inequality of differences in citation practices increases from approximately 14% at the field level to 18% at the subfield level. Second, we estimate a set of exchange rates (ERs) over a wide [660, 978] citation quantile interval to express the citation counts of articles into the equivalent counts in the all-sciences case. In the fractional case, for example, we find that in 187 of 219 subfields the ERs are reliable in the sense that the coefficient of variation is smaller than or equal to 0.10. Third, in the fractional case the normalization of the raw data using the ERs (or subfield mean citations) as normalization factors reduces the importance of the differences in citation practices from 18% to 3.8% (3.4%) of overall citation inequality. Fourth, the results in the fractional case are essentially replicated when we adopt a multiplicative approach.
    Type
    a
  2. Li, Y.; Xu, S.; Luo, X.; Lin, S.: A new algorithm for product image search based on salient edge characterization (2014) 0.00
    0.0028047764 = product of:
      0.005609553 = sum of:
        0.005609553 = product of:
          0.011219106 = sum of:
            0.011219106 = weight(_text_:a in 1552) [ClassicSimilarity], result of:
              0.011219106 = score(doc=1552,freq=22.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.21126054 = fieldWeight in 1552, product of:
                  4.690416 = tf(freq=22.0), with freq of:
                    22.0 = termFreq=22.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1552)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
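
The nesting of "product of" and "sum of" lines in this explain tree can be checked mechanically. The sketch below is illustrative only: the tuple-based tree representation and the `evaluate` helper are assumptions made for this example, not a format produced by the search engine. The leaf values are copied from the listing for doc 1552.

```python
import math

def evaluate(node):
    """Evaluate a leaf number or an (operator, children) node recursively."""
    if isinstance(node, (int, float)):
        return float(node)
    op, children = node
    values = [evaluate(child) for child in children]
    if op == "sum":
        return sum(values)
    if op == "product":
        return math.prod(values)
    raise ValueError(f"unknown operator: {op}")

TERM_SCORE = 0.011219106  # weight(_text_:a in 1552), from the listing
COORD = 0.5               # coord(1/2), applied at two nesting levels

# Mirrors the indentation of the explain output, outermost node first.
tree = ("product", [
    ("sum", [
        ("product", [
            ("sum", [TERM_SCORE]),
            COORD,
        ]),
    ]),
    COORD,
])

result_2_score = evaluate(tree)
print(f"{result_2_score:.10f}")
```

Because both coord factors are 0.5, the single matching term contributes only a quarter of its raw weight to the final score.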
    
    Abstract
    Visually assisted product image search has gained increasing popularity because of its capability to greatly improve end users' e-commerce shopping experiences. Different from general-purpose content-based image retrieval (CBIR) applications, the specific goal of product image search is to retrieve and rank relevant products from a large-scale product database to visually assist a user's online shopping experience. In this paper, we explore the problem of product image search through salient edge characterization and analysis, for which we propose a novel image search method coupled with an interactive user region-of-interest indication function. Given a product image, the proposed approach first extracts an edge map, based on which contour curves are further extracted. We then segment the extracted contours into fragments according to the detected contour corners. After that, a set of salient edge elements is extracted from each product image. Based on salient edge elements matching and similarity evaluation, the method derives a new pairwise image similarity estimate. Using the new image similarity, we can then retrieve product images. To evaluate the performance of our algorithm, we conducted 120 sessions of querying experiments on a data set comprised of around 13k product images collected from multiple, real-world e-commerce websites. We compared the performance of the proposed method with that of a bag-of-words method (Philbin, Chum, Isard, Sivic, & Zisserman, 2008) and a Pyramid Histogram of Orientated Gradients (PHOG) method (Bosch, Zisserman, & Munoz, 2007). Experimental results demonstrate that the proposed method improves the performance of example-based product image retrieval.
    Type
    a
  3. Liu, J.; Li, Y.; Hastings, S.K.: Simplified scheme of search task difficulty reasons (2019) 0.00
    0.0026849252 = product of:
      0.0053698504 = sum of:
        0.0053698504 = product of:
          0.010739701 = sum of:
            0.010739701 = weight(_text_:a in 5224) [ClassicSimilarity], result of:
              0.010739701 = score(doc=5224,freq=14.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.20223314 = fieldWeight in 5224, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5224)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article reports on a study that aimed at simplifying a search task difficulty reason scheme. Liu, Kim, and Creel (2015) (denoted LKC15) developed a 21-item search task difficulty reason scheme using a controlled laboratory experiment. The current study simplified the scheme through another experiment that followed the same design as LKC15 and involved 32 university students. The study had one added questionnaire item that provided a list of the 21 difficulty reasons in the multiple-choice format. By comparing the current study with LKC15, a concept of primary top difficulty reasons was proposed, which reasonably simplified the 21-item scheme to an 8-item top reason list. This limited number of reasons is more manageable and makes it feasible for search systems to predict task difficulty reasons from observable user behaviors, which builds the basis for systems to improve user satisfaction based on predicted search difficulty reasons.
    Type
    a
  4. Arora, S.K.; Li, Y.; Youtie, J.; Shapira, P.: Using the wayback machine to mine websites in the social sciences : a methodological resource (2016) 0.00
    0.0025370158 = product of:
      0.0050740317 = sum of:
        0.0050740317 = product of:
          0.010148063 = sum of:
            0.010148063 = weight(_text_:a in 3050) [ClassicSimilarity], result of:
              0.010148063 = score(doc=3050,freq=18.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.19109234 = fieldWeight in 3050, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3050)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
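
Results 2 to 4 match only the term `a`, so idf, queryNorm, and the coord factors are identical for all three and the ordering depends on sqrt(freq) * fieldNorm alone. The following sketch, with frequencies and field norms copied from the listing and a hypothetical `rank_key` helper, shows why doc 5224 (freq=14) outranks doc 3050 (freq=18): its larger fieldNorm, which in the classic similarity typically indicates a shorter field, outweighs the lower term frequency.

```python
import math

# doc id -> (termFreq, fieldNorm), copied from the explain output above.
hits = {
    1552: (22.0, 0.0390625),
    5224: (14.0, 0.046875),
    3050: (18.0, 0.0390625),
}

def rank_key(freq: float, field_norm: float) -> float:
    """The only part of the score that varies between these hits."""
    return math.sqrt(freq) * field_norm

# Sort doc ids by the varying factor, highest first.
ranked = sorted(hits, key=lambda doc: rank_key(*hits[doc]), reverse=True)

for doc in ranked:
    print(doc, round(rank_key(*hits[doc]), 6))
```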
    
    Abstract
    Websites offer an unobtrusive data source for developing and analyzing information about various types of social science phenomena. In this paper, we provide a methodological resource for social scientists looking to expand their toolkit using unstructured web-based text, and in particular, with the Wayback Machine, to access historical website data. After providing a literature review of existing research that uses the Wayback Machine, we put forward a step-by-step description of how the analyst can design a research project using archived websites. We draw on the example of a project that analyzes indicators of innovation activities and strategies in 300 U.S. small- and medium-sized enterprises in green goods industries. We present six steps to access historical Wayback website data: (a) sampling, (b) organizing and defining the boundaries of the web crawl, (c) crawling, (d) website variable operationalization, (e) integration with other data sources, and (f) analysis. Although our examples draw on specific types of firms in green goods industries, the method can be generalized to other areas of research. In discussing the limitations and benefits of using the Wayback Machine, we note that both machine and human effort are essential to developing a high-quality data set from archived web information.
    Type
    a
  5. Cao, Q.; Lu, Y.; Dong, D.; Tang, Z.; Li, Y.: The roles of bridging and bonding in social media communities (2013) 0.00
    0.002269176 = product of:
      0.004538352 = sum of:
        0.004538352 = product of:
          0.009076704 = sum of:
            0.009076704 = weight(_text_:a in 1009) [ClassicSimilarity], result of:
              0.009076704 = score(doc=1009,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1709182 = fieldWeight in 1009, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1009)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Social media communities have emerged recently as open and free communication platforms to support real-time information sharing among members. Drawing on social capital theories, we develop a theoretical model to investigate how the two types of social capital (bonding and bridging) contribute to the individual and collective well-being of virtual communities through information exchange. Research hypotheses were tested through survey instruments and computer archive data of 475 members of a large social network site during the Wenchuan earthquake (2008) in China. We find that bonding has a positive and significant impact on bridging. Both bonding and bridging have positive and significant impacts on information quality, but not on information quantity. Results also suggest that information quality is more critical to individuals and collective well-being than information quantity after a disaster.
    Type
    a
  6. Li, Y.; Belkin, N.J.: An exploration of the relationships between work task and interactive information search behavior (2010) 0.00
    0.0020714647 = product of:
      0.0041429293 = sum of:
        0.0041429293 = product of:
          0.008285859 = sum of:
            0.008285859 = weight(_text_:a in 3980) [ClassicSimilarity], result of:
              0.008285859 = score(doc=3980,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15602624 = fieldWeight in 3980, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3980)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This study explores the relationships between work task and interactive information search behavior. Work task was conceptualized based on a faceted classification of task. An experiment was conducted with six work-task types and simulated work-task situations assigned to 24 participants. The results indicate that users present different behavior patterns to approach useful information for different work tasks: They select information systems to search based on the work tasks at hand, different work tasks motivate different types of search tasks, and different facets controlled in the study play different roles in shaping users' interactive information search behavior. The results provide empirical evidence to support the view that work tasks and search tasks play different roles in a user's interaction with information systems and that work task should be considered as a multifaceted variable. The findings provide a possibility to make predictions of a user's information search behavior from his or her work task, and vice versa. Thus, this study sheds light on task-based information seeking and search, and has implications in adaptive information retrieval (IR) and personalization of IR.
    Type
    a
  7. Song, J.; Huang, Y.; Qi, X.; Li, Y.; Li, F.; Fu, K.; Huang, T.: Discovering hierarchical topic evolution in time-stamped documents (2016) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 2853) [ClassicSimilarity], result of:
              0.008118451 = score(doc=2853,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 2853, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2853)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The objective of this paper is to propose a hierarchical topic evolution model (HTEM) that can organize time-varying topics in a hierarchy and discover their evolutions with multiple timescales. In the proposed HTEM, topics near the root of the hierarchy are more abstract and also evolve in the longer timescales than those near the leaves. To achieve this goal, the distance-dependent Chinese restaurant process (ddCRP) is extended to a new nested process that is able to simultaneously model the dependencies among data and the relationship between clusters. The HTEM is proposed based on the new process for time-stamped documents, in which the timestamp is utilized to measure the dependencies among documents. Moreover, an efficient Gibbs sampler is developed for the proposed HTEM. Our experimental results on two popular real-world data sets verify that the proposed HTEM can capture coherent topics and discover their hierarchical evolutions. It also outperforms the baseline model in terms of likelihood on held-out data.
    Type
    a
  8. Luo, P.; Chen, K.; Wu, C.; Li, Y.: Exploring the social influence of multichannel access in an online health community (2018) 0.00
    0.001757696 = product of:
      0.003515392 = sum of:
        0.003515392 = product of:
          0.007030784 = sum of:
            0.007030784 = weight(_text_:a in 4033) [ClassicSimilarity], result of:
              0.007030784 = score(doc=4033,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13239266 = fieldWeight in 4033, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4033)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Social influence has a great impact on human behavior, which has been widely investigated in various research fields. Even so, it has rarely been investigated in the online health community. In this paper, we focus on the multichannel access in online health communities, defining social influence as the average degree of multichannel access to a physician's colleagues. Based on the multinomial logistic regression model, we examined the direct effects of social influence and patients' rating to multichannel access. In addition, we explored the moderating effect of social influence on the relationship between patients' rating and multichannel access in online health communities. The results of the experiment and robustness testing support the propositions that social influence and patients' rating significantly and positively affect multichannel access in an online health community. The moderating effect of social influence is negative and significantly influences the accessible channels provided by the focal physician. This research contributes to the literature concerning online health communities, social influence, and multichannel access; it also has practical implications.
    Type
    a
  9. Yang, M.; Kiang, M.; Chen, H.; Li, Y.: Artificial immune system for illicit content identification in social media (2012) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 4980) [ClassicSimilarity], result of:
              0.006765375 = score(doc=4980,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 4980, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4980)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Social media is frequently used as a platform for the exchange of information and opinions as well as propaganda dissemination. But online content can be misused for the distribution of illicit information, such as violent postings in web forums. Illicit content is highly distributed in social media, while non-illicit content is unspecific and topically diverse. It is costly and time consuming to label a large amount of illicit content (positive examples) and non-illicit content (negative examples) to train classification systems. Nevertheless, it is relatively easy to obtain large volumes of unlabeled content in social media. In this article, an artificial immune system-based technique is presented to address the difficulties in the illicit content identification in social media. Inspired by the positive selection principle in the immune system, we designed a novel labeling heuristic based on partially supervised learning to extract high-quality positive and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance.
    Type
    a
  10. Shen, J.; Yao, L.; Li, Y.; Clarke, M.; Wang, L.; Li, D.: Visualizing the history of evidence-based medicine : a bibliometric analysis (2013) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 1090) [ClassicSimilarity], result of:
              0.006765375 = score(doc=1090,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 1090, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1090)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The aim of this paper is to visualize the history of evidence-based medicine (EBM) and to examine the characteristics of EBM development in China and the West. We searched the Web of Science and the Chinese National Knowledge Infrastructure database for papers related to EBM. We applied information visualization techniques, citation analysis, cocitation analysis, cocitation cluster analysis, and network analysis to construct historiographies, themes networks, and chronological theme maps regarding EBM in China and the West. EBM appeared to develop in 4 stages: incubation (1972-1992 in the West vs. 1982-1999 in China), initiation (1992-1993 vs. 1999-2000), rapid development (1993-2000 vs. 2000-2004), and stable distribution (2000 onwards vs. 2004 onwards). Although there was a lag in EBM initiation in China compared with the West, the pace of development appeared similar. Our study shows that important differences exist in research themes, domain structures, and development depth, and in the speed of adoption between China and the West. In the West, efforts in EBM have shifted from education to practice, and from the quality of evidence to its translation. In China, there was a similar shift from education to practice, and from production of evidence to its translation. In addition, this concept has diffused to other healthcare areas, leading to the development of evidence-based traditional Chinese medicine, evidence-based nursing, and evidence-based policy making.
    Type
    a
  11. Xiao, D.; Ji, Y.; Li, Y.; Zhuang, F.; Shi, C.: Coupled matrix factorization and topic modeling for aspect mining (2018) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 5042) [ClassicSimilarity], result of:
              0.006765375 = score(doc=5042,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 5042, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5042)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Aspect mining, which aims to extract ad hoc aspects from online reviews and predict rating or opinion on each aspect, can satisfy the personalized needs for evaluation of specific aspect on product quality. Recently, with the increase of related research, how to effectively integrate rating and review information has become the key issue for addressing this problem. Considering that matrix factorization is an effective tool for rating prediction and topic modeling is widely used for review processing, it is a natural idea to combine matrix factorization and topic modeling for aspect mining (or called aspect rating prediction). However, this idea faces several challenges on how to address suitable sharing factors, scale mismatch, and dependency relation of rating and review information. In this paper, we propose a novel model to effectively integrate Matrix factorization and Topic modeling for Aspect rating prediction (MaToAsp). To overcome the above challenges and ensure the performance, MaToAsp employs items as the sharing factors to combine matrix factorization and topic modeling, and introduces an interpretive preference probability to eliminate scale mismatch. In the hybrid model, we establish a dependency relation from ratings to sentiment terms in phrases. The experiments on two real datasets including Chinese Dianping and English Tripadvisor prove that MaToAsp not only obtains reasonable aspect identification but also achieves the best aspect rating prediction performance, compared to recent representative baselines.
    Type
    a