Search (12 results, page 1 of 1)

  • author_ss:"Chen, H."
  1. Yang, M.; Kiang, M.; Chen, H.; Li, Y.: Artificial immune system for illicit content identification in social media (2012) 0.03
    0.034768227 = product of:
      0.069536455 = sum of:
        0.069536455 = product of:
          0.13907291 = sum of:
            0.13907291 = weight(_text_:media in 4980) [ClassicSimilarity], result of:
              0.13907291 = score(doc=4980,freq=10.0), product of:
                0.24036849 = queryWeight, product of:
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.051318336 = queryNorm
                0.5785821 = fieldWeight in 4980, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4980)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
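    The nested breakdown above is Lucene/Solr ClassicSimilarity "explain" output: the matching term's weight is tf x idf x fieldNorm, scaled by the queryWeight (idf x queryNorm) and then halved twice by the two coord(1/2) factors. As a minimal sketch, the reported score for this result can be reproduced from the constants shown in the tree (all numbers are copied from the explanation; nothing here is taken from the search application's own code):

      # Minimal sketch: reproduce the ClassicSimilarity arithmetic from the
      # explain tree for result 1 (field _text_:media, doc 4980). All constants
      # are copied from the tree above.
      import math

      tf = math.sqrt(10.0)          # tf(freq=10.0) = sqrt(term frequency)
      idf = 4.6838713               # idf(docFreq=1110, maxDocs=44218)
      query_norm = 0.051318336      # queryNorm
      field_norm = 0.0390625        # fieldNorm(doc=4980)

      field_weight = tf * idf * field_norm        # ~0.5785821
      query_weight = idf * query_norm             # ~0.24036849
      term_weight = query_weight * field_weight   # ~0.13907291

      # The two coord(1/2) factors halve the weight twice before it is reported.
      final_score = term_weight * 0.5 * 0.5       # ~0.034768227
      print(round(final_score, 9))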
    
    Abstract
    Social media is frequently used as a platform for the exchange of information and opinions as well as propaganda dissemination. But online content can be misused for the distribution of illicit information, such as violent postings in web forums. Illicit content is highly distributed in social media, while non-illicit content is unspecific and topically diverse. It is costly and time-consuming to label a large amount of illicit content (positive examples) and non-illicit content (negative examples) to train classification systems. Nevertheless, it is relatively easy to obtain large volumes of unlabeled content in social media. In this article, an artificial immune system-based technique is presented to address the difficulties of illicit content identification in social media. Inspired by the positive selection principle in the immune system, we designed a novel labeling heuristic based on partially supervised learning to extract high-quality positive and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance.
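    The labeling step summarized in the abstract above can be gestured at with a generic partially supervised (PU-learning) routine: treat unlabeled posts as provisional negatives, score them against the known positives, and keep the least positive-looking ones as reliable negatives. The sketch below is illustrative only and is not the paper's positive-selection heuristic; the toy corpus, the TF-IDF/logistic-regression scorer, and the cut-off of two documents are all assumptions.

      # Illustrative PU-learning style labeling step (not the paper's heuristic):
      # score unlabeled posts against known positives and keep the lowest-scoring
      # ones as reliable negatives for later classifier training.
      import numpy as np
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression

      positives = ["violent threat posted in forum", "another violent threat post"]
      unlabeled = ["harmless chat about sports", "threat of violence in a posting",
                   "recipe discussion thread", "weather small talk"]

      vec = TfidfVectorizer()
      X = vec.fit_transform(positives + unlabeled)
      y = np.array([1] * len(positives) + [0] * len(unlabeled))  # unlabeled as provisional 0

      clf = LogisticRegression().fit(X, y)
      p_pos = clf.predict_proba(X[len(positives):])[:, 1]        # P(illicit) for unlabeled posts

      # The unlabeled posts least similar to the positives become reliable negatives.
      reliable_negatives = [unlabeled[i] for i in np.argsort(p_pos)[:2]]
      print(reliable_negatives)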
  2. Chen, H.: Semantic research for digital libraries (1999) 0.03
    0.026387228 = product of:
      0.052774455 = sum of:
        0.052774455 = product of:
          0.10554891 = sum of:
            0.10554891 = weight(_text_:media in 1247) [ClassicSimilarity], result of:
              0.10554891 = score(doc=1247,freq=4.0), product of:
                0.24036849 = queryWeight, product of:
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.051318336 = queryNorm
                0.43911293 = fieldWeight in 1247, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1247)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this era of the Internet and distributed, multimedia computing, new and emerging classes of information systems applications have swept into the lives of office workers and people in general. From digital libraries, multimedia systems, geographic information systems, and collaborative computing to electronic commerce, virtual reality, and electronic video arts and games, these applications have created tremendous opportunities for information and computer science researchers and practitioners. As applications become more pervasive, pressing, and diverse, several well-known information retrieval (IR) problems have become even more urgent. Information overload, a result of the ease of information creation and transmission via the Internet and WWW, has become more troublesome (e.g., even stockbrokers and elementary school students, heavily exposed to various WWW search engines, are versed in such IR terminology as recall and precision). Significant variations in database formats and structures, the richness of information media (text, audio, and video), and an abundance of multilingual information content also have created severe information interoperability problems -- structural interoperability, media interoperability, and multilingual interoperability.
  3. Ku, Y.; Chiu, C.; Zhang, Y.; Chen, H.; Su, H.: Text mining self-disclosing health information for public health service (2014) 0.03
    0.026387228 = product of:
      0.052774455 = sum of:
        0.052774455 = product of:
          0.10554891 = sum of:
            0.10554891 = weight(_text_:media in 1262) [ClassicSimilarity], result of:
              0.10554891 = score(doc=1262,freq=4.0), product of:
                0.24036849 = queryWeight, product of:
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.051318336 = queryNorm
                0.43911293 = fieldWeight in 1262, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1262)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Understanding specific patterns or knowledge of self-disclosing health information could support public health surveillance and healthcare. This study aimed to develop an analytical framework to identify self-disclosing health information with unusual messages on web forums by leveraging advanced text-mining techniques. To demonstrate the performance of the proposed analytical framework, we conducted an experimental study on 2 major human immunodeficiency virus (HIV)/acquired immune deficiency syndrome (AIDS) forums in Taiwan. The experimental results show that the classification accuracy increased significantly (up to 83.83%) when using features selected by the information gain technique. The results also show the importance of adopting domain-specific features in analyzing unusual messages on web forums. This study has practical implications for the prevention and support of HIV/AIDS healthcare. For example, public health agencies can re-allocate resources and deliver services to people who need help via social media sites. In addition, individuals can also join a social media site to get better suggestions and support from each other.
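    The information gain feature selection mentioned in the abstract can be sketched in a few lines. The toy documents, the binary term-presence representation, and the top-3 cut-off below are assumptions for illustration, not the study's actual setup.

      # Illustrative information-gain ranking of candidate term features,
      # assuming binary term presence and binary message labels.
      import math
      from collections import Counter

      def entropy(labels):
          total = len(labels)
          return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

      def information_gain(docs, labels, term):
          with_term = [lab for d, lab in zip(docs, labels) if term in d]
          without_term = [lab for d, lab in zip(docs, labels) if term not in d]
          if not with_term or not without_term:
              return 0.0                      # the term does not split the collection
          split = (len(with_term) * entropy(with_term)
                   + len(without_term) * entropy(without_term)) / len(labels)
          return entropy(labels) - split

      docs = [{"hiv", "test", "worried"}, {"weather", "sunny"},
              {"hiv", "symptom", "anonymous"}, {"sports", "game"}]
      labels = ["unusual", "normal", "unusual", "normal"]

      vocab = set().union(*docs)
      ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)
      print(ranked[:3])   # highest-information-gain terms become classifier features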
  4. Ramsey, M.C.; Chen, H.; Zhu, B.; Schatz, B.R.: A collection of visual thesauri for browsing large collections of geographic images (1999) 0.02
    0.021768354 = product of:
      0.043536708 = sum of:
        0.043536708 = product of:
          0.087073416 = sum of:
            0.087073416 = weight(_text_:media in 3922) [ClassicSimilarity], result of:
              0.087073416 = score(doc=3922,freq=2.0), product of:
                0.24036849 = queryWeight, product of:
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.051318336 = queryNorm
                0.3622497 = fieldWeight in 3922, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3922)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Digital libraries of geo-spatial multimedia content are currently deficient in providing fuzzy, concept-based retrieval mechanisms to users. The main challenge is that indexing and thesaurus creation are extremely labor-intensive processes for text documents and especially for images. Recently, 800,000 declassified satellite photographs were made available by the US Geological Survey. Additionally, millions of satellite and aerial photographs are archived in national and local map libraries. Such enormous collections make human indexing and thesaurus generation methods impossible to utilize. In this article we propose a scalable method to automatically generate visual thesauri of large collections of geo-spatial media using fuzzy, unsupervised machine-learning techniques.
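    As a loose illustration of building a browsable visual thesaurus from image features, the sketch below clusters feature vectors and treats each cluster as one concept node. The random vectors stand in for real texture/color descriptors, and plain k-means stands in for the fuzzy, unsupervised techniques the abstract refers to; this is not the method used in the paper.

      # Illustrative stand-in: cluster image feature vectors so each cluster can
      # serve as a browsable "visual thesaurus" concept node.
      import numpy as np
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(0)
      features = rng.random((200, 16))    # placeholder descriptors, one row per image

      kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(features)

      # Each cluster's members form one concept node; the centroid is its prototype.
      thesaurus = {c: np.where(kmeans.labels_ == c)[0].tolist() for c in range(8)}
      print({c: len(members) for c, members in thesaurus.items()})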
  5. Hu, P.J.-H.; Lin, C.; Chen, H.: User acceptance of intelligence and security informatics technology : a study of COPLINK (2005) 0.02
    0.018658588 = product of:
      0.037317175 = sum of:
        0.037317175 = product of:
          0.07463435 = sum of:
            0.07463435 = weight(_text_:media in 3233) [ClassicSimilarity], result of:
              0.07463435 = score(doc=3233,freq=2.0), product of:
                0.24036849 = queryWeight, product of:
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.051318336 = queryNorm
                0.31049973 = fieldWeight in 3233, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3233)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The importance of Intelligence and Security Informatics (ISI) has significantly increased with the rapid and large-scale migration of local/national security information from physical media to electronic platforms, including the Internet and information systems. Motivated by the significance of ISI in law enforcement (particularly in the digital government context) and the limited investigations of officers' technology-acceptance decision making, we developed and empirically tested a factor model for explaining law-enforcement officers' technology acceptance. Specifically, our empirical examination targeted the COPLINK technology and involved more than 280 police officers. Overall, our model shows a good fit to the data collected and exhibits satisfactory power for explaining law-enforcement officers' technology-acceptance decisions. Our findings have several implications for research and technology management practices in law enforcement, which are also discussed.
  6. Fu, T.; Abbasi, A.; Chen, H.: A hybrid approach to Web forum interactional coherence analysis (2008) 0.02
    0.018658588 = product of:
      0.037317175 = sum of:
        0.037317175 = product of:
          0.07463435 = sum of:
            0.07463435 = weight(_text_:media in 1872) [ClassicSimilarity], result of:
              0.07463435 = score(doc=1872,freq=2.0), product of:
                0.24036849 = queryWeight, product of:
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.051318336 = queryNorm
                0.31049973 = fieldWeight in 1872, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1872)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Despite the rapid growth of text-based computer-mediated communication (CMC), its limitations have rendered the media highly incoherent. This poses problems for content analysis of online discourse archives. Interactional coherence analysis (ICA) attempts to accurately identify and construct CMC interaction networks. In this study, we propose the Hybrid Interactional Coherence (HIC) algorithm for identification of web forum interaction. HIC utilizes a bevy of system and linguistic features, including message header information, quotations, direct address, and lexical relations. Furthermore, several similarity-based methods, including a Lexical Match Algorithm (LMA) and a sliding window method, are utilized to account for interactional idiosyncrasies. Experimental results on two web forums revealed that the proposed HIC algorithm significantly outperformed comparison techniques in terms of precision, recall, and F-measure at both the forum and thread levels. Additionally, an example was used to illustrate how the improved ICA results can facilitate enhanced social network and role analysis capabilities.
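    A much-simplified illustration of linking forum posts into an interaction network is sketched below: check for direct address first, then fall back to lexical overlap within a sliding window. The window size, the toy posts, and the overlap fallback are assumptions; this only gestures at the idea and is not the HIC algorithm.

      # Toy reply-link identification: direct address, then word overlap within a
      # sliding window of recent posts. Illustrative only, not the HIC algorithm.
      WINDOW = 3          # assumed sliding-window size

      posts = [
          {"id": 1, "author": "alice", "text": "Has anyone tried the new firmware?"},
          {"id": 2, "author": "bob",   "text": "alice, yes, the new firmware works fine."},
          {"id": 3, "author": "carol", "text": "My firmware update failed twice."},
      ]

      def link_target(i):
          current = posts[i]
          window = posts[max(0, i - WINDOW):i]
          # 1) Direct address: the post names an earlier author explicitly.
          for earlier in reversed(window):
              if earlier["author"] in current["text"].lower():
                  return earlier["id"]
          # 2) Fallback: greatest word overlap with a recent post.
          words = set(current["text"].lower().split())
          best = max(window, key=lambda p: len(words & set(p["text"].lower().split())),
                     default=None)
          return best["id"] if best else None

      edges = [(p["id"], link_target(i)) for i, p in enumerate(posts) if i > 0]
      print(edges)   # [(2, 1), (3, 2)] for the toy posts above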
  7. Chung, W.; Chen, H.; Reid, E.: Business stakeholder analyzer : an experiment of classifying stakeholders on the Web (2009) 0.02
    0.015548823 = product of:
      0.031097647 = sum of:
        0.031097647 = product of:
          0.062195294 = sum of:
            0.062195294 = weight(_text_:media in 2699) [ClassicSimilarity], result of:
              0.062195294 = score(doc=2699,freq=2.0), product of:
                0.24036849 = queryWeight, product of:
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.051318336 = queryNorm
                0.25874978 = fieldWeight in 2699, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.6838713 = idf(docFreq=1110, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2699)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    As the Web is used increasingly to share and disseminate information, business analysts and managers are challenged to understand stakeholder relationships. Traditional stakeholder theories and frameworks employ a manual approach to analysis and do not scale up to accommodate the rapid growth of the Web. Unfortunately, existing business intelligence (BI) tools lack analysis capability, and research on BI systems is sparse. This research proposes a framework for designing BI systems to identify and to classify stakeholders on the Web, incorporating human knowledge and machine-learned information from Web pages. Based on the framework, we have developed a prototype called Business Stakeholder Analyzer (BSA) that helps managers and analysts to identify and to classify their stakeholders on the Web. Results from our experiment involving algorithm comparison, feature comparison, and a user study showed that the system achieved better within-class accuracies in widespread stakeholder types such as partner/sponsor/supplier and media/reviewer, and was more efficient than human classification. The student and practitioner subjects in our user study strongly agreed that such a system would save analysts' time and help to identify and classify stakeholders. This research contributes to a better understanding of how to integrate information technology with stakeholder theory, and enriches the knowledge base of BI system design.
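    As a bare-bones stand-in for classifying Web page text into stakeholder types, the sketch below trains a naive Bayes model on a few hand-labeled snippets. The snippets, the two stakeholder labels, and the model choice are illustrative assumptions; this is not the BSA system.

      # Illustrative text classifier assigning page snippets to stakeholder types.
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.pipeline import make_pipeline

      pages = ["our company proudly sponsors the annual conference",
               "supplier of industrial components and parts",
               "read our review of the latest product release",
               "press coverage and media kit for journalists"]
      types = ["partner/sponsor/supplier", "partner/sponsor/supplier",
               "media/reviewer", "media/reviewer"]

      model = make_pipeline(CountVectorizer(), MultinomialNB()).fit(pages, types)
      print(model.predict(["independent review of the new device"]))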
  8. Chung, W.; Chen, H.: Browsing the underdeveloped Web : an experiment on the Arabic Medical Web Directory (2009) 0.01
    0.010429389 = product of:
      0.020858778 = sum of:
        0.020858778 = product of:
          0.041717555 = sum of:
            0.041717555 = weight(_text_:22 in 2733) [ClassicSimilarity], result of:
              0.041717555 = score(doc=2733,freq=2.0), product of:
                0.17970806 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051318336 = queryNorm
                0.23214069 = fieldWeight in 2733, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2733)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 17:57:50
  9. Carmel, E.; Crawford, S.; Chen, H.: Browsing in hypertext : a cognitive study (1992) 0.01
    0.008691157 = product of:
      0.017382314 = sum of:
        0.017382314 = product of:
          0.03476463 = sum of:
            0.03476463 = weight(_text_:22 in 7469) [ClassicSimilarity], result of:
              0.03476463 = score(doc=7469,freq=2.0), product of:
                0.17970806 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051318336 = queryNorm
                0.19345059 = fieldWeight in 7469, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=7469)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    IEEE transactions on systems, man and cybernetics. 22(1992) no.5, S.865-884
  10. Leroy, G.; Chen, H.: Genescene: an ontology-enhanced integration of linguistic and co-occurrence based relations in biomedical texts (2005) 0.01
    0.008691157 = product of:
      0.017382314 = sum of:
        0.017382314 = product of:
          0.03476463 = sum of:
            0.03476463 = weight(_text_:22 in 5259) [ClassicSimilarity], result of:
              0.03476463 = score(doc=5259,freq=2.0), product of:
                0.17970806 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051318336 = queryNorm
                0.19345059 = fieldWeight in 5259, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5259)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 14:26:01
  11. Zheng, R.; Li, J.; Chen, H.; Huang, Z.: A framework for authorship identification of online messages : writing-style features and classification techniques (2006) 0.01
    0.008691157 = product of:
      0.017382314 = sum of:
        0.017382314 = product of:
          0.03476463 = sum of:
            0.03476463 = weight(_text_:22 in 5276) [ClassicSimilarity], result of:
              0.03476463 = score(doc=5276,freq=2.0), product of:
                0.17970806 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051318336 = queryNorm
                0.19345059 = fieldWeight in 5276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5276)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:14:37
  12. Hu, D.; Kaza, S.; Chen, H.: Identifying significant facilitators of dark network evolution (2009) 0.01
    0.008691157 = product of:
      0.017382314 = sum of:
        0.017382314 = product of:
          0.03476463 = sum of:
            0.03476463 = weight(_text_:22 in 2753) [ClassicSimilarity], result of:
              0.03476463 = score(doc=2753,freq=2.0), product of:
                0.17970806 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051318336 = queryNorm
                0.19345059 = fieldWeight in 2753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2753)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 18:50:30