Search (12 results, page 1 of 1)

  • author_ss:"Chen, H."
  1. Leroy, G.; Chen, H.: Genescene: an ontology-enhanced integration of linguistic and co-occurrence based relations in biomedical texts (2005) 0.07
    0.06717849 = product of:
      0.13435698 = sum of:
        0.13435698 = sum of:
          0.096571654 = weight(_text_:literature in 5259) [ClassicSimilarity], result of:
            0.096571654 = score(doc=5259,freq=6.0), product of:
              0.23726627 = queryWeight, product of:
                4.253809 = idf(docFreq=1707, maxDocs=44218)
                0.055777367 = queryNorm
              0.40701804 = fieldWeight in 5259, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.253809 = idf(docFreq=1707, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5259)
          0.037785318 = weight(_text_:22 in 5259) [ClassicSimilarity], result of:
            0.037785318 = score(doc=5259,freq=2.0), product of:
              0.19532284 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.055777367 = queryNorm
              0.19345059 = fieldWeight in 5259, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5259)
      0.5 = coord(1/2)
    
    Abstract
    The increasing amount of publicly available literature and experimental data in biomedicine makes it hard for biomedical researchers to stay up-to-date. Genescene is a toolkit that will help alleviate this problem by providing an overview of published literature content. We combined a linguistic parser with Concept Space, a co-occurrence based semantic net. Both techniques extract complementary biomedical relations between noun phrases from MEDLINE abstracts. The parser extracts precise and semantically rich relations from individual abstracts. Concept Space extracts relations that hold true for the collection of abstracts. The Gene Ontology, the Human Genome Nomenclature, and the Unified Medical Language System are also integrated in Genescene. Currently, they are used to facilitate the integration of the two relation types and to select the more interesting and high-quality relations for presentation. A user study focusing on p53 literature is discussed. All MEDLINE abstracts discussing p53 were processed in Genescene. Two researchers evaluated the terms and relations from several abstracts of interest to them. The results show that the terms were precise (precision 93%) and relevant, as were the parser relations (precision 95%). The Concept Space relations were more precise when selected with ontological knowledge (precision 78%) than without (60%).
    Date
    22. 7.2006 14:26:01
  2. Dang, Y.; Zhang, Y.; Chen, H.; Hu, P.J.-H.; Brown, S.A.; Larson, C.: Arizona Literature Mapper : an integrated approach to monitor and analyze global bioterrorism research literature (2009) 0.04
    0.044078726 = product of:
      0.08815745 = sum of:
        0.08815745 = product of:
          0.1763149 = sum of:
            0.1763149 = weight(_text_:literature in 2943) [ClassicSimilarity], result of:
              0.1763149 = score(doc=2943,freq=20.0), product of:
                0.23726627 = queryWeight, product of:
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.055777367 = queryNorm
                0.7431099 = fieldWeight in 2943, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2943)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Biomedical research is critical to biodefense, which is drawing increasing attention from governments globally as well as from various research communities. The U.S. government has been closely monitoring and regulating biomedical research activities, particularly those studying or involving bioterrorism agents or diseases. Effective surveillance requires comprehensive understanding of extant biomedical research and timely detection of new developments or emerging trends. The rapid knowledge expansion, technical breakthroughs, and spiraling collaboration networks demand greater support for literature search and sharing, which cannot be effectively supported by conventional literature search mechanisms or systems. In this study, we propose an integrated approach that combines advanced techniques for content analysis, network analysis, and information visualization. We design and implement Arizona Literature Mapper, a Web-based portal that allows users to gain timely, comprehensive understanding of bioterrorism research, including leading scientists, research groups, and institutions, as well as insights about current mainstream interests or emerging trends. We conduct two user studies to evaluate Arizona Literature Mapper and include a well-known system for benchmarking purposes. According to our results, Arizona Literature Mapper is significantly more effective for supporting users' search of bioterrorism publications than PubMed. Users consider Arizona Literature Mapper more useful and easier to use than PubMed. Users are also more satisfied with Arizona Literature Mapper and show stronger intentions to use it in the future. Assessments of Arizona Literature Mapper's analysis functions are also positive, as our subjects consider them useful, easy to use, and satisfactory. Our results have important implications that are also discussed in the article.
  3. Suakkaphong, N.; Zhang, Z.; Chen, H.: Disease named entity recognition using semisupervised learning and conditional random fields (2011) 0.02
    0.024142914 = product of:
      0.048285827 = sum of:
        0.048285827 = product of:
          0.096571654 = sum of:
            0.096571654 = weight(_text_:literature in 4367) [ClassicSimilarity], result of:
              0.096571654 = score(doc=4367,freq=6.0), product of:
                0.23726627 = queryWeight, product of:
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.055777367 = queryNorm
                0.40701804 = fieldWeight in 4367, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4367)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Information extraction is an important text-mining task that aims at extracting prespecified types of information from large text collections and making them available in structured representations such as databases. In the biomedical domain, information extraction can be applied to help biologists make the most use of their digital-literature archives. Currently, there are large amounts of biomedical literature that contain rich information about biomedical substances. Extracting such knowledge requires a good named entity recognition technique. In this article, we combine conditional random fields (CRFs), a state-of-the-art sequence-labeling algorithm, with two semisupervised learning techniques, bootstrapping and feature sampling, to recognize disease names from biomedical literature. Two data-processing strategies for each technique also were analyzed: one sequentially processing unlabeled data partitions and another one processing unlabeled data partitions in a round-robin fashion. The experimental results showed the advantage of semisupervised learning techniques given limited labeled training data. Specifically, CRFs with bootstrapping implemented in sequential fashion outperformed strictly supervised CRFs for disease name recognition. The project was supported by NIH/NLM Grant R33 LM07299-01, 2002-2005.
  4. Orwig, R.E.; Chen, H.; Nunamaker, J.F.: A graphical, self-organizing approach to classifying electronic meeting output (1997) 0.02
    0.019514484 = product of:
      0.03902897 = sum of:
        0.03902897 = product of:
          0.07805794 = sum of:
            0.07805794 = weight(_text_:literature in 6928) [ClassicSimilarity], result of:
              0.07805794 = score(doc=6928,freq=2.0), product of:
                0.23726627 = queryWeight, product of:
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.055777367 = queryNorm
                0.32898876 = fieldWeight in 6928, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6928)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes research in the application of a Kohonen Self-Organizing Map (SOM) to the problem of classification of electronic brainstorming output and an evaluation of the results. Describes an electronic meeting system and the classification problem that exists in the group problem solving process. Surveys the literature concerning classification. Describes the application of the Kohonen SOM to the meeting output classification problem. Describes an experiment that evaluated the classification performed by the Kohonen SOM by comparing it with those of a human expert and a Hopfield neural network. Discusses conclusions and directions for future research.
  5. Li, J.; Zhang, Z.; Li, X.; Chen, H.: Kernel-based learning for biomedical relation extraction (2008) 0.02
    0.0167267 = product of:
      0.0334534 = sum of:
        0.0334534 = product of:
          0.0669068 = sum of:
            0.0669068 = weight(_text_:literature in 1611) [ClassicSimilarity], result of:
              0.0669068 = score(doc=1611,freq=2.0), product of:
                0.23726627 = queryWeight, product of:
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.055777367 = queryNorm
                0.28199035 = fieldWeight in 1611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1611)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Relation extraction is the process of scanning text for relationships between named entities. Recently, significant studies have focused on automatically extracting relations from biomedical corpora. Most existing biomedical relation extractors require manual creation of biomedical lexicons or parsing templates based on domain knowledge. In this study, we propose to use kernel-based learning methods to automatically extract biomedical relations from literature text. We develop a framework of kernel-based learning for biomedical relation extraction. In particular, we modified the standard tree kernel function by incorporating a trace kernel to capture richer contextual information. In our experiments on a biomedical corpus, we compare different kernel functions for biomedical relation detection and classification. The experimental results show that a tree kernel outperforms word and sequence kernels for relation detection, our trace-tree kernel outperforms the standard tree kernel, and a composite kernel outperforms individual kernels for relation extraction.
  6. Chen, H.; Yim, T.; Fye, D.: Automatic thesaurus generation for an electronic community system (1995) 0.01
    0.013938917 = product of:
      0.027877834 = sum of:
        0.027877834 = product of:
          0.055755667 = sum of:
            0.055755667 = weight(_text_:literature in 2918) [ClassicSimilarity], result of:
              0.055755667 = score(doc=2918,freq=2.0), product of:
                0.23726627 = queryWeight, product of:
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.055777367 = queryNorm
                0.23499197 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2918)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Reports an algorithmic approach to the automatic generation of thesauri for electronic community systems. The techniques used included term filtering, automatic indexing, and cluster analysis. The testbed for the research was the Worm Community System, which contains a comprehensive library of specialized community data and literature, currently in use by molecular biologists who study the nematode worm. The resulting worm thesaurus included 2709 researchers' names, 798 gene names, 20 experimental methods, and 4302 subject descriptors. On average, each term had about 90 weighted neighbouring terms indicating relevant concepts. The thesaurus was developed as an online search aid. Tests the worm thesaurus in an experiment with 6 worm researchers of varying degrees of expertise and background. The experiment showed that the thesaurus was an excellent 'memory jogging' device and that it supported learning and serendipitous browsing. Despite some occurrences of obvious noise, the system was useful in suggesting relevant concepts for the researchers' queries and it helped improve concept recall. With a simple browsing interface, an automatic thesaurus can become a useful tool for online search and can assist researchers in exploring and traversing a dynamic and complex electronic community system.
  7. Chen, H.; Ng, T.D.; Martinez, J.; Schatz, B.R.: A concept space approach to addressing the vocabulary problem in scientific information retrieval : an experiment on the Worm Community System (1997) 0.01
    0.013938917 = product of:
      0.027877834 = sum of:
        0.027877834 = product of:
          0.055755667 = sum of:
            0.055755667 = weight(_text_:literature in 6492) [ClassicSimilarity], result of:
              0.055755667 = score(doc=6492,freq=2.0), product of:
                0.23726627 = queryWeight, product of:
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.055777367 = queryNorm
                0.23499197 = fieldWeight in 6492, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6492)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This research presents an algorithmic approach to addressing the vocabulary problem in scientific information retrieval and information sharing, using the molecular biology domain as an example. We first present a literature review of cognitive studies related to the vocabulary problem and vocabulary-based search aids (thesauri) and then discuss techniques for building robust and domain-specific thesauri to assist in cross-domain scientific information retrieval. Using a variation of the automatic thesaurus generation techniques, which we refer to as the concept space approach, we recently conducted an experiment in the molecular biology domain in which we created a C. elegans worm thesaurus of 7,657 worm-specific terms and a Drosophila fly thesaurus of 15,626 terms. About 30% of these terms overlapped, which created vocabulary paths from one subject domain to the other. Based on a cognitive study of term association involving 4 biologists, we found that a large percentage (59.6-85.6%) of the terms suggested by the subjects were identified in the conjoined fly-worm thesaurus. However, we found only a small percentage (8.4-18.1%) of the associations suggested by the subjects in the thesaurus.
  8. Liu, X.; Kaza, S.; Zhang, P.; Chen, H.: Determining inventor status and its effect on knowledge diffusion : a study on nanotechnology literature from China, Russia, and India (2011) 0.01
    0.013938917 = product of:
      0.027877834 = sum of:
        0.027877834 = product of:
          0.055755667 = sum of:
            0.055755667 = weight(_text_:literature in 4468) [ClassicSimilarity], result of:
              0.055755667 = score(doc=4468,freq=2.0), product of:
                0.23726627 = queryWeight, product of:
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.055777367 = queryNorm
                0.23499197 = fieldWeight in 4468, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.253809 = idf(docFreq=1707, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4468)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  9. Chung, W.; Chen, H.: Browsing the underdeveloped Web : an experiment on the Arabic Medical Web Directory (2009) 0.01
    0.011335595 = product of:
      0.02267119 = sum of:
        0.02267119 = product of:
          0.04534238 = sum of:
            0.04534238 = weight(_text_:22 in 2733) [ClassicSimilarity], result of:
              0.04534238 = score(doc=2733,freq=2.0), product of:
                0.19532284 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.055777367 = queryNorm
                0.23214069 = fieldWeight in 2733, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2733)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 17:57:50
  10. Carmel, E.; Crawford, S.; Chen, H.: Browsing in hypertext : a cognitive study (1992) 0.01
    0.009446329 = product of:
      0.018892659 = sum of:
        0.018892659 = product of:
          0.037785318 = sum of:
            0.037785318 = weight(_text_:22 in 7469) [ClassicSimilarity], result of:
              0.037785318 = score(doc=7469,freq=2.0), product of:
                0.19532284 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.055777367 = queryNorm
                0.19345059 = fieldWeight in 7469, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=7469)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    IEEE transactions on systems, man and cybernetics. 22(1992) no.5, pp. 865-884
  11. Zheng, R.; Li, J.; Chen, H.; Huang, Z.: A framework for authorship identification of online messages : writing-style features and classification techniques (2006) 0.01
    0.009446329 = product of:
      0.018892659 = sum of:
        0.018892659 = product of:
          0.037785318 = sum of:
            0.037785318 = weight(_text_:22 in 5276) [ClassicSimilarity], result of:
              0.037785318 = score(doc=5276,freq=2.0), product of:
                0.19532284 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.055777367 = queryNorm
                0.19345059 = fieldWeight in 5276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5276)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:14:37
  12. Hu, D.; Kaza, S.; Chen, H.: Identifying significant facilitators of dark network evolution (2009) 0.01
    0.009446329 = product of:
      0.018892659 = sum of:
        0.018892659 = product of:
          0.037785318 = sum of:
            0.037785318 = weight(_text_:22 in 2753) [ClassicSimilarity], result of:
              0.037785318 = score(doc=2753,freq=2.0), product of:
                0.19532284 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.055777367 = queryNorm
                0.19345059 = fieldWeight in 2753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2753)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 18:50:30
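
Note on the score breakdowns: the numbers in each tree can be recomputed by hand. The sketch below does this for result 1 (doc=5259). It assumes the relations the listed values appear to follow: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord. The function and variable names are illustrative only, and queryNorm and fieldNorm are copied from the listing rather than derived. This is not the search engine's own code.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.055777367


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, in the form the listed idf values match."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Result 1 (doc=5259) matches both query terms.
literature = term_score(freq=6.0, doc_freq=1707, field_norm=0.0390625)
twenty_two = term_score(freq=2.0, doc_freq=3622, field_norm=0.0390625)

# The outer coord(1/2) factor halves the sum of the term scores.
total = (literature + twenty_two) * 0.5

print(f"literature: {literature:.6f}")
print(f"22:         {twenty_two:.6f}")
print(f"total:      {total:.6f}")
```

The results should agree with the listed 0.096571654, 0.037785318 and 0.06717849 to several decimal places. Small differences in the last digits are expected, because the engine works in single precision and Python uses double precision. The other results follow the same pattern, except that they match one query term and carry a second, nested coord(1/2) factor.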