Search (2 results, page 1 of 1)

  • author_ss:"Cui, H."
  1. Cui, H.: CharaParser for fine-grained semantic annotation of organism morphological descriptions (2012) 0.01
    0.011246576 = product of:
      0.022493152 = sum of:
        0.022493152 = product of:
          0.044986304 = sum of:
            0.044986304 = weight(_text_:software in 45) [ClassicSimilarity], result of:
              0.044986304 = score(doc=45,freq=2.0), product of:
                0.20527047 = queryWeight, product of:
                  3.9671519 = idf(docFreq=2274, maxDocs=44218)
                  0.051742528 = queryNorm
                0.21915624 = fieldWeight in 45, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9671519 = idf(docFreq=2274, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=45)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Biodiversity information organization is looking beyond the traditional document-level metadata approach and has started to look into factual content in textual documents to support more intelligent and semantic-based access. This article reports the development and evaluation of CharaParser, a software application for semantic annotation of morphological descriptions. CharaParser annotates semistructured morphological descriptions in such a detailed manner that all stated morphological characters of an organ are marked up in Extensible Markup Language format. Using an unsupervised machine learning algorithm and a general purpose syntactic parser as its key annotation tools, CharaParser requires minimal additional knowledge engineering work and seems to perform well across different description collections and/or taxon groups. The system has been formally evaluated on over 1,000 sentences randomly selected from Volume 19 of Flora of North America and Part H of Treatise on Invertebrate Paleontology. CharaParser reaches and exceeds 90% in sentence-wise recall and precision, exceeding other similar systems reported in the literature. It also significantly outperforms a heuristic rule-based system we developed earlier. Early evidence that enriching the lexicon of a syntactic parser with domain terms alone may be sufficient to adapt the parser for the biodiversity domain is also observed and may have significant implications.
  2. Cui, H.: Competency evaluation of plant character ontologies against domain literature (2010) 0.01
    0.008762998 = product of:
      0.017525995 = sum of:
        0.017525995 = product of:
          0.03505199 = sum of:
            0.03505199 = weight(_text_:22 in 3466) [ClassicSimilarity], result of:
              0.03505199 = score(doc=3466,freq=2.0), product of:
                0.18119352 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051742528 = queryNorm
                0.19345059 = fieldWeight in 3466, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3466)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 6.2010 9:55:22
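
The score explanations above can be re-derived by hand. The following is a minimal sketch, not the search engine's own code. It assumes the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), chosen here because they reproduce the tf and idf values printed in the listing. The function names are hypothetical, and queryNorm and fieldNorm are copied from the listing rather than recomputed.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf formula; it matches the idf values shown in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute one term's contribution, following the explain tree bottom-up."""
    tf = math.sqrt(freq)                   # 1.4142135 for freq=2.0
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm        # "queryWeight" node
    field_weight = tf * idf * field_norm   # "fieldWeight" node
    score = query_weight * field_weight    # "weight(...)" node
    for coord in coords:                   # each coord(1/2) halves the score
        score *= coord
    return score


# Values copied from the listing (shared by both results).
QUERY_NORM = 0.051742528
FIELD_NORM = 0.0390625
MAX_DOCS = 44218

# Result 1: term "software" in doc 45, docFreq=2274; listed score 0.011246576.
score_45 = classic_score(2.0, 2274, MAX_DOCS, QUERY_NORM, FIELD_NORM, (0.5, 0.5))

# Result 2: term "22" in doc 3466, docFreq=3622; listed score 0.008762998.
score_3466 = classic_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM, (0.5, 0.5))

print(f"doc 45:   {score_45:.9f}")
print(f"doc 3466: {score_3466:.9f}")
```

Both results share the same tf, queryNorm, fieldNorm and coord factors, so the difference in rank comes only from idf: "software" occurs in fewer documents (2274) than "22" (3622) and therefore weighs more. The engine likely computes in single precision, so the last digits may differ slightly from this double-precision sketch.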