Search (7 results, page 1 of 1)

  • author_ss:"Chen, X."
  1. Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.02
    0.020948619 = product of:
      0.041897237 = sum of:
        0.041897237 = sum of:
          0.010696997 = weight(_text_:a in 5290) [ClassicSimilarity], result of:
            0.010696997 = score(doc=5290,freq=20.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.20142901 = fieldWeight in 5290, product of:
                4.472136 = tf(freq=20.0), with freq of:
                  20.0 = termFreq=20.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5290)
          0.03120024 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
            0.03120024 = score(doc=5290,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.19345059 = fieldWeight in 5290, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5290)
      0.5 = coord(1/2)
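
    The breakdown above is Lucene's ClassicSimilarity explain output: each matching term contributes queryWeight * fieldWeight, where queryWeight = idf * queryNorm and fieldWeight = tf * idf * fieldNorm, and coord(1/2) scales the sum because only one of the two top-level query clauses matched. A minimal sketch reproducing this result's score from the values shown (the function name is illustrative):

    ```python
    import math

    # Recompute result 1's score from the explain values above
    # (ClassicSimilarity: score = coord * sum of queryWeight * fieldWeight).

    def term_score(freq, idf, query_norm, field_norm):
        query_weight = idf * query_norm        # idf(t) * queryNorm
        tf = math.sqrt(freq)                   # ClassicSimilarity tf = sqrt(freq)
        field_weight = tf * idf * field_norm   # tf * idf * fieldNorm
        return query_weight * field_weight

    query_norm = 0.046056706
    w_a  = term_score(freq=20.0, idf=1.153047,  query_norm=query_norm, field_norm=0.0390625)
    w_22 = term_score(freq=2.0,  idf=3.5018296, query_norm=query_norm, field_norm=0.0390625)

    # coord(1/2) = 0.5: only one of the two top-level query clauses matched.
    print(0.5 * (w_a + w_22))  # ~0.020948619, the reported score
    ```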
    
    Abstract
    Document keyphrases provide a concise summary of a document's content, serving as semantic metadata for the document. They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have author-assigned keyphrases, and manually assigning keyphrases is time-consuming and costly, it is necessary to develop an algorithm that generates keyphrases for documents automatically. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human-identified phrases to assign weights to candidate keyphrases. The logic of the algorithm is: the more keywords a candidate keyphrase contains, and the more significant these keywords are, the more likely the candidate is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding newly identified keyphrases to it. KIP's personalization feature lets users build a glossary database tailored to their area of interest. The evaluation results show that KIP outperforms the systems it was compared with and that the learning function is effective.
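    The scoring logic described above lends itself to a simple additive sketch: assuming a glossary that maps known keywords to significance weights learned from human-assigned keyphrases, a candidate scores higher the more glossary keywords it contains and the heavier those keywords are. GLOSSARY, its weights, and score_candidate below are illustrative, not KIP's actual data structures:

    ```python
    # Hypothetical glossary of keyword -> significance weight.
    GLOSSARY = {"machine": 2.0, "learning": 3.0, "text": 1.0, "mining": 2.5}

    def score_candidate(phrase: str) -> float:
        # More glossary keywords, and more significant ones, raise the score.
        return sum(GLOSSARY.get(word, 0.0) for word in phrase.lower().split())

    candidates = ["machine learning", "text mining", "finding nuggets"]
    for c in sorted(candidates, key=score_candidate, reverse=True):
        print(c, score_candidate(c))  # machine learning 5.0, text mining 3.5, ...
    ```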
    Date
    22.7.2006 17:25:48
    Type
    a
  2. Chen, X.: Fair use of electronic sources in libraries (1996) 0.00
    0.0023919214 = product of:
      0.0047838427 = sum of:
        0.0047838427 = product of:
          0.009567685 = sum of:
            0.009567685 = weight(_text_:a in 5856) [ClassicSimilarity], result of:
              0.009567685 = score(doc=5856,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.18016359 = fieldWeight in 5856, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5856)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article explores some of the issues concerning the fair use doctrine, in particular the fair use of electronic sources in a library setting. It reviews the purpose and application of the doctrine as embodied in the copyright law of the United States.
    Type
    a
  3. Liu, X.; Chen, X.: Authors' noninstitutional emails and their correlation with retraction (2021) 0.00
    0.001913537 = product of:
      0.003827074 = sum of:
        0.003827074 = product of:
          0.007654148 = sum of:
            0.007654148 = weight(_text_:a in 152) [ClassicSimilarity], result of:
              0.007654148 = score(doc=152,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14413087 = fieldWeight in 152, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=152)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We collected research articles from the Retraction Watch database, Scopus, and a major retraction announcement by Springer to identify the emails used by authors. Authors' emails can be institutional or noninstitutional. The data suggest that retracted articles are more likely to use noninstitutional emails, although it is difficult to generalize. The study puts particular focus on authors from China.
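    The institutional/noninstitutional distinction presumably reduces to the address's domain. A hedged sketch of that kind of check, treating addresses at well-known free providers as noninstitutional; the provider list is illustrative, not the authors' actual criterion:

    ```python
    # Assumed (illustrative) list of free email providers.
    FREE_PROVIDERS = {"gmail.com", "yahoo.com", "hotmail.com", "163.com", "qq.com"}

    def is_institutional(email: str) -> bool:
        # Everything not on the free-provider list counts as institutional here.
        domain = email.rsplit("@", 1)[-1].lower()
        return domain not in FREE_PROVIDERS

    print(is_institutional("author@tsinghua.edu.cn"))  # True
    print(is_institutional("author@qq.com"))           # False
    ```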
    Type
    a
  4. Chen, X.: The influence of existing consistency measures on the relationship between indexing consistency and exhaustivity (2008) 0.00
    0.0018909799 = product of:
      0.0037819599 = sum of:
        0.0037819599 = product of:
          0.0075639198 = sum of:
            0.0075639198 = weight(_text_:a in 2502) [ClassicSimilarity], result of:
              0.0075639198 = score(doc=2502,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14243183 = fieldWeight in 2502, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2502)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
    Consistency studies have discussed the relationship between indexing consistency and exhaustivity, and it is commonly accepted that higher exhaustivity results in lower indexing consistency. However, this issue has been oversimplified, and previous studies contain significant misinterpretations. The aim of this study is to investigate the relationship between consistency and exhaustivity based on a large sample and to analyse the misinterpretations in earlier studies. A sample of 3,307 monographs, i.e. 6,614 records, was drawn from two Chinese bibliographic catalogues. Indexing consistency was measured using two formulae that were popular in previous indexing consistency studies. A relatively high level of consistency was found (64.21% according to the first formula, 70.71% according to the second). Regarding the relationship between consistency and exhaustivity, it was found that when two indexers worked at identical levels of exhaustivity, indexing consistency was substantially high; when their levels of exhaustivity differed, consistency was significantly low. Given the two formulae used, this outcome was inevitable. Moreover, a detailed discussion analyses the misinterpretations in previous studies.
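    The abstract does not name its two formulae; a pair that was popular in earlier indexing consistency studies is Hooper's and Rolling's measures, sketched here under that assumption:

    ```python
    # Hooper's and Rolling's consistency measures for two indexers' term sets.
    def hooper(a: set, b: set) -> float:
        common = len(a & b)
        return common / (len(a) + len(b) - common)

    def rolling(a: set, b: set) -> float:
        return 2 * len(a & b) / (len(a) + len(b))

    a = {"indexing", "consistency", "exhaustivity"}
    b = {"indexing", "consistency", "catalogues"}
    print(hooper(a, b), rolling(a, b))  # 0.5 0.666...
    ```

    Either measure makes the reported effect of differing exhaustivity unavoidable: if one indexer assigns 2 terms and the other 8, Rolling's measure can never exceed 2*2/(2+8) = 0.4, even when the smaller set is fully contained in the larger one.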
    Type
    a
  5. Xu, C.; Ma, B.; Chen, X.; Ma, F.: Social tagging in the scholarly world (2013) 0.00
    0.0018909799 = product of:
      0.0037819599 = sum of:
        0.0037819599 = product of:
          0.0075639198 = sum of:
            0.0075639198 = weight(_text_:a in 1091) [ClassicSimilarity], result of:
              0.0075639198 = score(doc=1091,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14243183 = fieldWeight in 1091, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1091)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The number of research studies on social tagging has increased rapidly in recent years, but few of them highlight the characteristics and research trends of social tagging. A set of 862 academic documents relating to social tagging and published from 2005 to 2011 was therefore examined using bibliometric analysis as well as social network analysis techniques. The results show that social tagging, as a research area, is developing rapidly and attracting an increasing number of new entrants. No key authors, publication sources, or research groups dominate the research domain of social tagging. Research on social tagging appears to focus mainly on three aspects: (a) components and functions of social tagging (e.g., tags, tagging objects, and the tagging network), (b) taggers' behaviors and interface design, and (c) the organization and use of tags in social tagging. The trend suggests that more researchers are turning to the latter two, integrated with human-computer interaction and information retrieval, although the first aspect is the fundamental one in social tagging. More studies of social tagging are also paying attention to multimedia tagging objects rather than text alone. Previous research on social tagging was limited to a few subject domains such as information science and computer science. As an interdisciplinary research area, social tagging is anticipated to attract more researchers from different disciplines. More practical applications, especially in high-tech companies, are an encouraging research trend in social tagging.
    Type
    a
  6. Bose, I.; Chen, X.: A method for extension of generative topographic mapping for fuzzy clustering (2009) 0.00
    0.001757696 = product of:
      0.003515392 = sum of:
        0.003515392 = product of:
          0.007030784 = sum of:
            0.007030784 = weight(_text_:a in 2711) [ClassicSimilarity], result of:
              0.007030784 = score(doc=2711,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13239266 = fieldWeight in 2711, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2711)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this paper, a new method for fuzzy clustering is proposed that combines generative topographic mapping (GTM) and fuzzy c-means (FCM) clustering. GTM is used to generate latent variables and their posterior probabilities; together these describe the distribution of the input data in the latent space. FCM determines the seeds of the clusters, as well as the resultant clusters and the corresponding membership functions of the input data, based on the latent variables obtained from GTM. Experiments are conducted to compare the results obtained using FCM and the Gustafson-Kessel (GK) algorithm with those of the proposed method in terms of four cluster-validity indexes. Using simulated and benchmark data sets, it is observed that the hybrid method (GTMFCM) performs better than the FCM and GK algorithms in terms of these indexes. It is also found that the superiority of GTMFCM over the FCM and GK algorithms becomes more pronounced as the dimensionality of the input data set increases.
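    A full GTM implementation is beyond a short sketch, but the FCM half of the hybrid is compact: given latent coordinates that GTM is assumed to have already produced, fuzzy c-means alternates between weighted center updates and membership updates with fuzzifier m. Names and data below are illustrative, not the paper's code:

    ```python
    import numpy as np

    def fcm(latent: np.ndarray, c: int, m: float = 2.0, iters: int = 100, seed: int = 0):
        """Fuzzy c-means on (assumed) GTM latent coordinates, shape (n, d)."""
        rng = np.random.default_rng(seed)
        u = rng.random((c, len(latent)))
        u /= u.sum(axis=0)  # memberships sum to 1 for each point
        for _ in range(iters):
            w = u ** m
            centers = (w @ latent) / w.sum(axis=1, keepdims=True)
            d = np.linalg.norm(latent[None, :, :] - centers[:, None, :], axis=2)
            inv = np.fmax(d, 1e-12) ** (-2.0 / (m - 1.0))
            u = inv / inv.sum(axis=0)  # u_ij = d_ij^(-2/(m-1)) / sum_k d_kj^(-2/(m-1))
        return centers, u

    rng = np.random.default_rng(42)
    latent = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(1.0, 0.1, (20, 2))])
    centers, memberships = fcm(latent, c=2)
    print(np.round(centers, 2))  # two centers, near (0, 0) and (1, 1)
    ```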
    Type
    a
  7. Zhou, H.; Xiao, L.; Liu, Y.; Chen, X.: The effect of prediscussion note-taking in hidden profile tasks (2018) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 4184) [ClassicSimilarity], result of:
              0.006765375 = score(doc=4184,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 4184, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4184)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Prior research has found that groups tend to discuss shared information while failing to discuss unique information in decision-making processes. In our study, we conducted a lab experiment to examine the effect of prediscussion note-taking on this phenomenon. The experiment used a murder-mystery hidden profile task. In all, 192 undergraduate students were recruited and randomly assigned to 48 four-person groups, with gender as the matching variable (i.e., each group consisted of four same-gender participants). During the decision-making process, some groups were asked to take notes while reading the task materials and had their notes available in the subsequent group discussion, while the other groups were not given this opportunity. Our analysis suggests that (a) the presence of an information piece in group members' notes positively correlates with its appearance in the subsequent discussion, and note-taking positively affects the group's information repetition rate; (b) group decision quality positively correlates with the group's information sampling rate and negatively correlates with the group's information sampling/repetition bias; and (c) gender has no statistically significant moderating effect on the relationship between note-taking and information sharing. These results imply that prediscussion note-taking can facilitate information sharing but cannot alleviate the biased information pooling in hidden profile tasks.
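    The discussion measures are not formally defined in the abstract; a sketch under assumed definitions (sampling rate = share of available information pieces mentioned at least once; repetition rate = share of mentions beyond each piece's first):

    ```python
    from collections import Counter

    def sampling_rate(mentions: list[str], available: set[str]) -> float:
        # Fraction of the available information pieces mentioned at least once.
        return len(set(mentions) & available) / len(available)

    def repetition_rate(mentions: list[str]) -> float:
        # Fraction of mentions that repeat an already-mentioned piece.
        counts = Counter(mentions)
        return sum(n - 1 for n in counts.values()) / len(mentions) if mentions else 0.0

    discussion = ["clue1", "clue1", "clue3", "clue1", "clue4"]
    print(sampling_rate(discussion, {"clue1", "clue2", "clue3", "clue4"}))  # 0.75
    print(repetition_rate(discussion))  # 0.4
    ```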
    Type
    a