Search (12 results, page 1 of 1)

  • author_ss:"Li, Y."
  1. Yang, M.; Kiang, M.; Chen, H.; Li, Y.: Artificial immune system for illicit content identification in social media (2012) 0.01
    0.013026592 = product of:
      0.06513296 = sum of:
        0.008049765 = product of:
          0.01609953 = sum of:
            0.01609953 = weight(_text_:online in 4980) [ClassicSimilarity], result of:
              0.01609953 = score(doc=4980,freq=2.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.16765618 = fieldWeight in 4980, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4980)
          0.5 = coord(1/2)
        0.030755727 = weight(_text_:evaluation in 4980) [ClassicSimilarity], result of:
          0.030755727 = score(doc=4980,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.23172665 = fieldWeight in 4980, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4980)
        0.026327467 = weight(_text_:web in 4980) [ClassicSimilarity], result of:
          0.026327467 = score(doc=4980,freq=4.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.25496176 = fieldWeight in 4980, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4980)
      0.2 = coord(3/15)
    
    Abstract
    Social media is frequently used as a platform for the exchange of information and opinions as well as propaganda dissemination. But online content can be misused for the distribution of illicit information, such as violent postings in web forums. Illicit content is highly distributed in social media, while non-illicit content is unspecific and topically diverse. It is costly and time consuming to label a large amount of illicit content (positive examples) and non-illicit content (negative examples) to train classification systems. Nevertheless, it is relatively easy to obtain large volumes of unlabeled content in social media. In this article, an artificial immune system-based technique is presented to address the difficulties in the illicit content identification in social media. Inspired by the positive selection principle in the immune system, we designed a novel labeling heuristic based on partially supervised learning to extract high-quality positive and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance.
  2. Zhang, Y.; Li, Y.: ¬A user-centered functional metadata evaluation of moving image collections (2008) 0.01
    0.0073172343 = product of:
      0.054879256 = sum of:
        0.011384088 = product of:
          0.022768175 = sum of:
            0.022768175 = weight(_text_:online in 1884) [ClassicSimilarity], result of:
              0.022768175 = score(doc=1884,freq=4.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.23710167 = fieldWeight in 1884, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1884)
          0.5 = coord(1/2)
        0.043495167 = weight(_text_:evaluation in 1884) [ClassicSimilarity], result of:
          0.043495167 = score(doc=1884,freq=4.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.327711 = fieldWeight in 1884, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1884)
      0.13333334 = coord(2/15)
    
    Abstract
    In this article, the authors report a series of evaluations of two metadata schemes developed for Moving Image Collections (MIC), an integrated online catalog of moving images. Through two online surveys and one experiment spanning various stages of metadata implementation, the MIC evaluation team explored a user-centered approach in which the four generic user tasks suggested by IFLA FRBR (International Federation of Library Associations and Institutions, Functional Requirements for Bibliographic Records) were embedded in data collection and analyses. Diverse groups of users rated the usefulness of individual metadata fields for finding, identifying, selecting, and obtaining moving images. The results demonstrate a consistency across these evaluations with respect to (a) identification of a set of useful metadata fields highly rated by target users for each of the FRBR generic tasks, and (b) indication of a significant interaction between MIC metadata fields and the FRBR generic tasks. The findings provide timely feedback for the MIC implementation specifically, and valuable suggestions to other similar metadata application settings in general. They also suggest the feasibility of using the four IFLA FRBR generic tasks as a framework for user-centered functional metadata evaluations.
  3. Crespo, J.A.; Herranz, N.; Li, Y.; Ruiz-Castillo, J.: ¬The effect on citation inequality of differences in citation practices at the web of science subject category level (2014) 0.01
    0.0063201245 = product of:
      0.047400933 = sum of:
        0.03224443 = weight(_text_:web in 1291) [ClassicSimilarity], result of:
          0.03224443 = score(doc=1291,freq=6.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.3122631 = fieldWeight in 1291, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1291)
        0.015156505 = product of:
          0.03031301 = sum of:
            0.03031301 = weight(_text_:22 in 1291) [ClassicSimilarity], result of:
              0.03031301 = score(doc=1291,freq=4.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.27358043 = fieldWeight in 1291, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1291)
          0.5 = coord(1/2)
      0.13333334 = coord(2/15)
    
    Abstract
    This article studies the impact of differences in citation practices at the subfield, or Web of Science subject category level, using the model introduced in Crespo, Li, and Ruiz-Castillo (2013a), according to which the number of citations received by an article depends on its underlying scientific influence and the field to which it belongs. We use the same Thomson Reuters data set of about 4.4 million articles used in Crespo et al. (2013a) to analyze 22 broad fields. The main results are the following: First, when the classification system goes from 22 fields to 219 subfields the effect on citation inequality of differences in citation practices increases from approximately 14% at the field level to 18% at the subfield level. Second, we estimate a set of exchange rates (ERs) over a wide [660, 978] citation quantile interval to express the citation counts of articles into the equivalent counts in the all-sciences case. In the fractional case, for example, we find that in 187 of 219 subfields the ERs are reliable in the sense that the coefficient of variation is smaller than or equal to 0.10. Third, in the fractional case the normalization of the raw data using the ERs (or subfield mean citations) as normalization factors reduces the importance of the differences in citation practices from 18% to 3.8% (3.4%) of overall citation inequality. Fourth, the results in the fractional case are essentially replicated when we adopt a multiplicative approach.
    Object
    Web of Science
  4. Li, Y.; Xu, S.; Luo, X.; Lin, S.: ¬A new algorithm for product image search based on salient edge characterization (2014) 0.01
    0.005174066 = product of:
      0.038805492 = sum of:
        0.008049765 = product of:
          0.01609953 = sum of:
            0.01609953 = weight(_text_:online in 1552) [ClassicSimilarity], result of:
              0.01609953 = score(doc=1552,freq=2.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.16765618 = fieldWeight in 1552, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1552)
          0.5 = coord(1/2)
        0.030755727 = weight(_text_:evaluation in 1552) [ClassicSimilarity], result of:
          0.030755727 = score(doc=1552,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.23172665 = fieldWeight in 1552, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1552)
      0.13333334 = coord(2/15)
    
    Abstract
    Visually assisted product image search has gained increasing popularity because of its capability to greatly improve end users' e-commerce shopping experiences. Different from general-purpose content-based image retrieval (CBIR) applications, the specific goal of product image search is to retrieve and rank relevant products from a large-scale product database to visually assist a user's online shopping experience. In this paper, we explore the problem of product image search through salient edge characterization and analysis, for which we propose a novel image search method coupled with an interactive user region-of-interest indication function. Given a product image, the proposed approach first extracts an edge map, based on which contour curves are further extracted. We then segment the extracted contours into fragments according to the detected contour corners. After that, a set of salient edge elements is extracted from each product image. Based on salient edge elements matching and similarity evaluation, the method derives a new pairwise image similarity estimate. Using the new image similarity, we can then retrieve product images. To evaluate the performance of our algorithm, we conducted 120 sessions of querying experiments on a data set comprised of around 13k product images collected from multiple, real-world e-commerce websites. We compared the performance of the proposed method with that of a bag-of-words method (Philbin, Chum, Isard, Sivic, & Zisserman, 2008) and a Pyramid Histogram of Orientated Gradients (PHOG) method (Bosch, Zisserman, & Munoz, 2007). Experimental results demonstrate that the proposed method improves the performance of example-based product image retrieval.
  5. Xiao, D.; Ji, Y.; Li, Y.; Zhuang, F.; Shi, C.: Coupled matrix factorization and topic modeling for aspect mining (2018) 0.01
    0.005174066 = product of:
      0.038805492 = sum of:
        0.008049765 = product of:
          0.01609953 = sum of:
            0.01609953 = weight(_text_:online in 5042) [ClassicSimilarity], result of:
              0.01609953 = score(doc=5042,freq=2.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.16765618 = fieldWeight in 5042, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5042)
          0.5 = coord(1/2)
        0.030755727 = weight(_text_:evaluation in 5042) [ClassicSimilarity], result of:
          0.030755727 = score(doc=5042,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.23172665 = fieldWeight in 5042, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5042)
      0.13333334 = coord(2/15)
    
    Abstract
    Aspect mining, which aims to extract ad hoc aspects from online reviews and predict the rating or opinion on each aspect, can satisfy personalized needs for evaluating specific aspects of product quality. Recently, with the increase of related research, how to effectively integrate rating and review information has become the key issue for addressing this problem. Considering that matrix factorization is an effective tool for rating prediction and topic modeling is widely used for review processing, it is a natural idea to combine matrix factorization and topic modeling for aspect mining (also called aspect rating prediction). However, this idea faces several challenges on how to address suitable sharing factors, scale mismatch, and dependency relation of rating and review information. In this paper, we propose a novel model to effectively integrate Matrix factorization and Topic modeling for Aspect rating prediction (MaToAsp). To overcome the above challenges and ensure the performance, MaToAsp employs items as the sharing factors to combine matrix factorization and topic modeling, and introduces an interpretive preference probability to eliminate scale mismatch. In the hybrid model, we establish a dependency relation from ratings to sentiment terms in phrases. The experiments on two real datasets including Chinese Dianping and English Tripadvisor prove that MaToAsp not only obtains reasonable aspect identification but also achieves the best aspect rating prediction performance, compared to recent representative baselines.
  6. Cao, Q.; Lu, Y.; Dong, D.; Tang, Z.; Li, Y.: ¬The roles of bridging and bonding in social media communities (2013) 0.00
    0.0042213076 = product of:
      0.06331961 = sum of:
        0.06331961 = weight(_text_:site in 1009) [ClassicSimilarity], result of:
          0.06331961 = score(doc=1009,freq=2.0), product of:
            0.1738463 = queryWeight, product of:
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.031640913 = queryNorm
            0.3642275 = fieldWeight in 1009, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.046875 = fieldNorm(doc=1009)
      0.06666667 = coord(1/15)
    
    Abstract
    Social media communities have emerged recently as open and free communication platforms to support real-time information sharing among members. Drawing on social capital theories, we develop a theoretical model to investigate how the two types of social capital (bonding and bridging) contribute to the individual and collective well-being of virtual communities through information exchange. Research hypotheses were tested through survey instruments and computer archive data of 475 members of a large social network site during the Wenchuan earthquake (2008) in China. We find that bonding has a positive and significant impact on bridging. Both bonding and bridging have positive and significant impacts on information quality, but not on information quantity. Results also suggest that information quality is more critical to individuals and collective well-being than information quantity after a disaster.
  7. Xianghao, G.; Yixin, Z.; Li, Y.: ¬A new method of news text understanding and abstracting based on speech acts theory (1998) 0.00
    0.002587875 = product of:
      0.03881812 = sum of:
        0.03881812 = product of:
          0.07763624 = sum of:
            0.07763624 = weight(_text_:analyse in 3532) [ClassicSimilarity], result of:
              0.07763624 = score(doc=3532,freq=2.0), product of:
                0.16670908 = queryWeight, product of:
                  5.268782 = idf(docFreq=618, maxDocs=44218)
                  0.031640913 = queryNorm
                0.46569893 = fieldWeight in 3532, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.268782 = idf(docFreq=618, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3532)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    Presents a method for the automated analysis and comprehension of foreign affairs news produced by a Chinese news agency. Notes that the development of the method was preceded by a study of the structuring rules of the news. Describes how an abstract of the news story is produced automatically from the analysis. Stresses the main aim of the work, which is to use speech act theory to analyse and classify sentences.
  8. Li, Y.: Consistency versus inconsistency : issues in Chinese cataloging in OCLC (2004) 0.00
    0.0025675474 = product of:
      0.03851321 = sum of:
        0.03851321 = weight(_text_:software in 5657) [ClassicSimilarity], result of:
          0.03851321 = score(doc=5657,freq=2.0), product of:
            0.12552431 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031640913 = queryNorm
            0.30681872 = fieldWeight in 5657, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5657)
      0.06666667 = coord(1/15)
    
    Abstract
    This article addresses some unresolved cataloging issues related to pinyin Romanization, vernacular application, field coding, and other aspects of Chinese cataloging in OCLC. These issues lead to inconsistencies in the way Chinese materials are cataloged, even though cataloging standards and Romanization rules have been established and projects such as Pinyin Conversion, Manual Review, and Pinyin Clean-Up have been completed. In this article, eight of the most commonly encountered issues and inconsistent practices in Chinese cataloging are discussed. Examples from Chinese records created with OCLC CJK software in WorldCat are used to demonstrate the problems they raise. It is hoped that this discussion will help these inconsistent practices to be recognized and avoided in the future.
  9. Arora, S.K.; Li, Y.; Youtie, J.; Shapira, P.: Using the wayback machine to mine websites in the social sciences : a methodological resource (2016) 0.00
    0.0021496287 = product of:
      0.03224443 = sum of:
        0.03224443 = weight(_text_:web in 3050) [ClassicSimilarity], result of:
          0.03224443 = score(doc=3050,freq=6.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.3122631 = fieldWeight in 3050, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3050)
      0.06666667 = coord(1/15)
    
    Abstract
    Websites offer an unobtrusive data source for developing and analyzing information about various types of social science phenomena. In this paper, we provide a methodological resource for social scientists looking to expand their toolkit using unstructured web-based text, and in particular, with the Wayback Machine, to access historical website data. After providing a literature review of existing research that uses the Wayback Machine, we put forward a step-by-step description of how the analyst can design a research project using archived websites. We draw on the example of a project that analyzes indicators of innovation activities and strategies in 300 U.S. small- and medium-sized enterprises in green goods industries. We present six steps to access historical Wayback website data: (a) sampling, (b) organizing and defining the boundaries of the web crawl, (c) crawling, (d) website variable operationalization, (e) integration with other data sources, and (f) analysis. Although our examples draw on specific types of firms in green goods industries, the method can be generalized to other areas of research. In discussing the limitations and benefits of using the Wayback Machine, we note that both machine and human effort are essential to developing a high-quality data set from archived web information.
  10. Li, Y.; Crescenzi, A.; Ward, A.R.; Capra, R.: Thinking inside the box : an evaluation of a novel search-assisting tool for supporting (meta)cognition during exploratory search (2023) 0.00
    0.0020503819 = product of:
      0.030755727 = sum of:
        0.030755727 = weight(_text_:evaluation in 1040) [ClassicSimilarity], result of:
          0.030755727 = score(doc=1040,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.23172665 = fieldWeight in 1040, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1040)
      0.06666667 = coord(1/15)
    
  11. Luo, P.; Chen, K.; Wu, C.; Li, Y.: Exploring the social influence of multichannel access in an online health community (2018) 0.00
    0.0015774255 = product of:
      0.02366138 = sum of:
        0.02366138 = product of:
          0.04732276 = sum of:
            0.04732276 = weight(_text_:online in 4033) [ClassicSimilarity], result of:
              0.04732276 = score(doc=4033,freq=12.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.49280655 = fieldWeight in 4033, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4033)
          0.5 = coord(1/2)
      0.06666667 = coord(1/15)
    
    Abstract
    Social influence has a great impact on human behavior, which has been widely investigated in various research fields. Even so, it has rarely been investigated in the online health community. In this paper, we focus on the multichannel access in online health communities, defining social influence as the average degree of multichannel access to a physician's colleagues. Based on the multinomial logistic regression model, we examined the direct effects of social influence and patients' rating to multichannel access. In addition, we explored the moderating effect of social influence on the relationship between patients' rating and multichannel access in online health communities. The results of the experiment and robustness testing support the propositions that social influence and patients' rating significantly and positively affect multichannel access in an online health community. The moderating effect of social influence is negative and significantly influences the accessible channels provided by the focal physician. This research contributes to the literature concerning online health communities, social influence, and multichannel access; it also has practical implications.
  12. Shen, J.; Yao, L.; Li, Y.; Clarke, M.; Wang, L.; Li, D.: Visualizing the history of evidence-based medicine : a bibliometric analysis (2013) 0.00
    0.0012410887 = product of:
      0.01861633 = sum of:
        0.01861633 = weight(_text_:web in 1090) [ClassicSimilarity], result of:
          0.01861633 = score(doc=1090,freq=2.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.18028519 = fieldWeight in 1090, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1090)
      0.06666667 = coord(1/15)
    
    Abstract
    The aim of this paper is to visualize the history of evidence-based medicine (EBM) and to examine the characteristics of EBM development in China and the West. We searched the Web of Science and the Chinese National Knowledge Infrastructure database for papers related to EBM. We applied information visualization techniques, citation analysis, cocitation analysis, cocitation cluster analysis, and network analysis to construct historiographies, themes networks, and chronological theme maps regarding EBM in China and the West. EBM appeared to develop in 4 stages: incubation (1972-1992 in the West vs. 1982-1999 in China), initiation (1992-1993 vs. 1999-2000), rapid development (1993-2000 vs. 2000-2004), and stable distribution (2000 onwards vs. 2004 onwards). Although there was a lag in EBM initiation in China compared with the West, the pace of development appeared similar. Our study shows that important differences exist in research themes, domain structures, and development depth, and in the speed of adoption between China and the West. In the West, efforts in EBM have shifted from education to practice, and from the quality of evidence to its translation. In China, there was a similar shift from education to practice, and from production of evidence to its translation. In addition, this concept has diffused to other healthcare areas, leading to the development of evidence-based traditional Chinese medicine, evidence-based nursing, and evidence-based policy making.
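
Note on the score breakdowns: each result above carries a [ClassicSimilarity] explanation tree. As a rough illustration of how those numbers fit together, the sketch below recomputes the total for the first result (doc=4980). It assumes the standard Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the tf and idf values printed in the listing. The function names are illustrative only and are not part of any library; queryNorm, fieldNorm, and the coord factors are copied from the listing rather than derived. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math

# Constants copied from the explanation tree of result 1 (doc=4980).
MAX_DOCS = 44218
QUERY_NORM = 0.031640913
FIELD_NORM = 0.0390625


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """score = queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM               # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


# "online" sits in a nested clause with coord(1/2); the other two terms do not.
online = term_score(freq=2.0, doc_freq=5778) * 0.5
evaluation = term_score(freq=2.0, doc_freq=1811)
web = term_score(freq=4.0, doc_freq=4597)

# The outer coord(3/15) scales the sum of the three matching clauses.
total = (online + evaluation + web) * (3 / 15)
print(f"{total:.9f}")
```

The same function can be reused for the other results by substituting each entry's freq, docFreq, fieldNorm, and coord values. The printed total should be close to the 0.013026592 shown for result 1.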