Search (74 results, page 4 of 4)

  • × language_ss:"e"
  • × theme_ss:"Theorie verbaler Dokumentationssprachen"
  • × type_ss:"a"
  1. Schmitz-Esser, W.: Formalizing terminology-based knowledge for an ontology independently of a particular language (2008) 0.00
    0.001757696 = product of:
      0.003515392 = sum of:
        0.003515392 = product of:
          0.007030784 = sum of:
            0.007030784 = weight(_text_:a in 1680) [ClassicSimilarity], result of:
              0.007030784 = score(doc=1680,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13239266 = fieldWeight in 1680, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1680)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
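The explain tree above is Lucene's ClassicSimilarity (TF-IDF) breakdown. Its arithmetic can be checked with a few lines of Python; the constants below are taken directly from the tree for doc 1680, and nothing else is assumed:

```python
import math

def classic_similarity(freq, idf, query_norm, field_norm):
    """Raw term score per Lucene ClassicSimilarity:
    queryWeight * fieldWeight, with queryWeight = idf * queryNorm
    and fieldWeight = sqrt(freq) * idf * fieldNorm."""
    tf = math.sqrt(freq)                  # 2.4494898 for freq=6.0
    query_weight = idf * query_norm       # 0.053105544
    field_weight = tf * idf * field_norm  # 0.13239266
    return query_weight * field_weight

# Constants from the explain tree for doc 1680:
raw = classic_similarity(freq=6.0, idf=1.153047,
                         query_norm=0.046056706, field_norm=0.046875)
# The two coord(1/2) factors each halve the raw score:
final = raw * 0.5 * 0.5  # matches the listed 0.001757696
```

The differing fieldNorm values in the trees below (0.046875, 0.0234375, 0.078125) reflect field length and are what separate otherwise similar scores.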
    
    Abstract
    Last word ontological thought and practice is exemplified by an axiomatic framework [a model for an Integrative Cross-Language Ontology (ICLO), cf. Poli, R., Schmitz-Esser, W., forthcoming 2007] that is highly general, based on natural language, multilingual, can be implemented as topic maps, and may be openly enhanced by software available for particular languages. Basics of ontological modelling, conditions for construction and maintenance, and the most salient points in application are addressed, such as cross-language text mining and knowledge generation. The rationale is to open eyes to the tremendous potential of terminology-based ontologies for principled Knowledge Organization and the interchange and reuse of formalized knowledge.
    Type
    a
  2. Khoo, S.G.; Na, J.-C.: Semantic relations in information science (2006) 0.00
    0.001757696 = product of:
      0.003515392 = sum of:
        0.003515392 = product of:
          0.007030784 = sum of:
            0.007030784 = weight(_text_:a in 1978) [ClassicSimilarity], result of:
              0.007030784 = score(doc=1978,freq=24.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13239266 = fieldWeight in 1978, product of:
                  4.8989797 = tf(freq=24.0), with freq of:
                    24.0 = termFreq=24.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1978)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This chapter examines the nature of semantic relations and their main applications in information science. The nature and types of semantic relations are discussed from the perspectives of linguistics and psychology. An overview of the semantic relations used in knowledge structures such as thesauri and ontologies is provided, as well as the main techniques used in the automatic extraction of semantic relations from text. The chapter then reviews the use of semantic relations in information extraction, information retrieval, question-answering, and automatic text summarization applications. Concepts and relations are the foundation of knowledge and thought. When we look at the world, we perceive not a mass of colors but objects to which we automatically assign category labels. Our perceptual system automatically segments the world into concepts and categories. Concepts are the building blocks of knowledge; relations act as the cement that links concepts into knowledge structures. We spend much of our lives identifying regular associations and relations between objects, events, and processes so that the world has an understandable structure and predictability. Our lives and work depend on the accuracy and richness of this knowledge structure and its web of relations. Relations are needed for reasoning and inferencing. Chaffin and Herrmann (1988b, p. 290) noted that "relations between ideas have long been viewed as basic to thought, language, comprehension, and memory." Aristotle's Metaphysics (Aristotle, 1961; McKeon) expounded on several types of relations. The majority of the 30 entries in a section of the Metaphysics known today as the Philosophical Lexicon referred to relations and attributes, including cause, part-whole, same and opposite, quality (i.e., attribute) and kind-of, and defined different types of each relation.
Hume (1955) pointed out that there is a connection between successive ideas in our minds, even in our dreams, and that the introduction of an idea in our mind automatically recalls an associated idea. He argued that all the objects of human reasoning are divided into relations of ideas and matters of fact and that factual reasoning is founded on the cause-effect relation. His Treatise of Human Nature identified seven kinds of relations: resemblance, identity, relations of time and place, proportion in quantity or number, degrees in quality, contrariety, and causation. Mill (1974, pp. 989-1004) discoursed on several types of relations, claiming that all things are either feelings, substances, or attributes, and that attributes can be a quality (which belongs to one object) or a relation to other objects.
    Linguists in the structuralist tradition (e.g., Lyons, 1977; Saussure, 1959) have asserted that concepts cannot be defined on their own but only in relation to other concepts. Semantic relations appear to reflect a logical structure in the fundamental nature of thought (Caplan & Herrmann, 1993). Green, Bean, and Myaeng (2002) noted that semantic relations play a critical role in how we represent knowledge psychologically, linguistically, and computationally, and that many systems of knowledge representation start with a basic distinction between entities and relations. Green (2001, p. 3) said that "relationships are involved as we combine simple entities to form more complex entities, as we compare entities, as we group entities, as one entity performs a process on another entity, and so forth. Indeed, many things that we might initially regard as basic and elemental are revealed upon further examination to involve internal structure, or in other words, internal relationships." Concepts and relations are often expressed in language and text. Language is used not just for communicating concepts and relations, but also for representing, storing, and reasoning with concepts and relations. We shall examine the nature of semantic relations from a linguistic and psychological perspective, with an emphasis on relations expressed in text. The usefulness of semantic relations in information science, especially in ontology construction, information extraction, information retrieval, question-answering, and text summarization is discussed. Research and development in information science have focused on concepts and terms, but the focus will increasingly shift to the identification, processing, and management of relations to achieve greater effectiveness and refinement in information science techniques. 
Previous chapters in ARIST on natural language processing (Chowdhury, 2003), text mining (Trybula, 1999), information retrieval and the philosophy of language (Blair, 2003), and query expansion (Efthimiadis, 1996) provide a background for this discussion, as semantic relations are an important part of these applications.
    Type
    a
  3. Fugmann, R.: ¬The complementarity of natural and controlled languages in indexing (1995) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 1634) [ClassicSimilarity], result of:
              0.006765375 = score(doc=1634,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 1634, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1634)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  4. Casagrande, J.B.; Hale, K.L.: Semantic relations in Papago folk definitions (1967) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 1194) [ClassicSimilarity], result of:
              0.006765375 = score(doc=1194,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 1194, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1194)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  5. Fugmann, R.: ¬The complementarity of natural and index language in the field of information supply : an overview of their specific capabilities and limitations (2002) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 1412) [ClassicSimilarity], result of:
              0.006765375 = score(doc=1412,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 1412, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1412)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Natural text phrasing is an indeterminate process and, thus, inherently lacks representational predictability. This holds true in particular in the case of general concepts and of their syntactical connectivity. Hence, natural language query phrasing and searching is an unending adventure of trial and error and, in most cases, has an unsatisfactory outcome with respect to the recall and precision ratios of the responses. Human indexing is based on knowledgeable document interpretation and aims - among other things - at introducing predictability into the representation of documents. Due to the indeterminacy of natural language text phrasing and image construction, any adequate indexing is also indeterminate in nature and therefore inherently defies any satisfactory algorithmization. But human indexing suffers from a different set of deficiencies which are absent in the processing of non-interpreted natural language. An optimally effective information system combines both types of language in such a manner that their specific strengths are preserved and their weaknesses are avoided. If the goal is a large and enduring information system for more than merely known-item searches, the expenditure for an advanced index language and its knowledgeable and careful employment is unavoidable.
    Type
    a
  6. Farradane, J.E.L.: Fundamental fallacies and new needs in classification (1985) 0.00
    0.0016828659 = product of:
      0.0033657318 = sum of:
        0.0033657318 = product of:
          0.0067314636 = sum of:
            0.0067314636 = weight(_text_:a in 3642) [ClassicSimilarity], result of:
              0.0067314636 = score(doc=3642,freq=22.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12675633 = fieldWeight in 3642, product of:
                  4.690416 = tf(freq=22.0), with freq of:
                    22.0 = termFreq=22.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=3642)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This chapter from The Sayers Memorial Volume summarizes Farradane's earlier work in which he developed his major themes by drawing in part upon research in psychology, and particularly those discoveries called "cognitive" which now form part of cognitive science. Farradane, a chemist by training who later became an information scientist and Director of the Center for Information Science, City University, London, from 1958 to 1973, defines the various types of methods used to achieve classification systems - philosophic, scientific, and synthetic. Early on he distinguishes the view that classification is "some part of external 'reality' waiting to be discovered" from that view which considers it "an intellectual operation upon mental entities and concepts." Classification, therefore, is to be treated as a mental construct and not as something "out there" to be discovered as, say, in astronomy or botany. His approach could be termed, somewhat facetiously, an "in there" one, meaning found by utilizing the human brain as the key tool. This is not to say that discoveries in astronomy or botany do not require the use of the brain as a key tool. It is merely that the "material" worked upon by this tool is presented to it for observation by "that inward eye," by memory and by inference rather than by planned physical observation, memory, and inference. This distinction could be refined or clarified by considering the initial "observation" as a specific kind of mental set required in each case. Farradane then proceeds to demolish the notion of main classes as "fictitious," partly because the various category-defining methodologies used in library classification are "randomly mixed." The implication, probably correct, is that this results in mixed metaphorical concepts.
It is an interesting contrast to the approach of Julia Pettee (q.v.), who began with indexing terms and, in studying relationships between terms, discovered hidden hierarchies both between the terms themselves and between the cross-references leading from one term or set of terms to another. One is tempted to ask two questions: "Is hierarchy innate but misinterpreted?" and "Is it possible to have meaningful terms which have only categorical relationships (that have no see also or equivalent relationships to other, out-of-category terms)?" Partly as a result of the rejection of existing general library classification systems, the Classification Research Group - of which Farradane was a charter member - decided to adopt the principles of Ranganathan's faceted classification system, while rejecting his limit on the number of fundamental categories. The advantage of the faceted method is that it is created by inductive, rather than deductive, methods. It can be altered more readily to keep up with changes in and additions to the knowledge base in a subject without having to re-do the major schedules. In 1961, when Farradane's paper appeared, the computer was beginning to be viewed as a tool for solving all information retrieval problems. He tartly remarks:
    The basic fallacy of mechanised information retrieval systems seems to be the often unconscious but apparently implied assumption that the machine can inject meaning into a group of juxtaposed terms although no methods of conceptual analysis and re-synthesis have been programmed (p. 203). As an example, he suggests considering the slight but vital differences in the meaning of the word "of" in selected examples: "swarm of bees," "house of the mayor," "House of Lords," "spectrum of the sun," "basket of fish," "meeting of councillors," "cooking of meat," "book of the film." Farradane's distinctive contribution is his matrix of basic relationships. The rows concern time and memory, in degree of happenstance: coincidentally, occasionally, or always. The columns represent degree of the "powers of discrimination": occurring together, linked by common elements only, or standing alone. To make these relationships easily managed, he used symbols for each of the nine kinds - "symbols found on every typewriter": /O (Theta) /* /; /= /+ /( /) /_ /: Farradane has maintained his basic insights to the present day. Though he has gone on to do other kinds of research in classification, his work indicates that he still believes that "the primary task ... is that of establishing satisfactory and enduring principles of subject analysis, or classification" (p. 208).
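Farradane's 3x3 matrix of relational operators can be put in machine-readable form. A sketch; the row-major assignment of the nine symbols to cells simply follows the order in which the text lists them, which is an assumption:

```python
# Rows: degree of happenstance in time/memory.
# Columns: degree of the "powers of discrimination".
rows = ["coincidentally", "occasionally", "always"]
cols = ["occurring together", "linked by common elements only", "standing alone"]
# The nine typewriter symbols, in the order the text gives them:
symbols = ["/O", "/*", "/;", "/=", "/+", "/(", "/)", "/_", "/:"]

# Map each (row, column) cell to its operator symbol, row by row.
matrix = {
    (r, c): symbols[i * 3 + j]
    for i, r in enumerate(rows)
    for j, c in enumerate(cols)
}
```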
    Source
    Theory of subject analysis: a sourcebook. Ed.: L.M. Chan, et al
    Type
    a
  7. Lopes, M.I.: Principles underlying subject heading languages : an international approach (1996) 0.00
    0.001674345 = product of:
      0.00334869 = sum of:
        0.00334869 = product of:
          0.00669738 = sum of:
            0.00669738 = weight(_text_:a in 5608) [ClassicSimilarity], result of:
              0.00669738 = score(doc=5608,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12611452 = fieldWeight in 5608, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5608)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Discusses the problems in establishing commonly accepted principles for subject retrieval between different bibliographic systems. The Working Group on Principles Underlying Subject Heading Languages was established to devise general principles for any subject retrieval system and to review existing real systems in the light of such principles and compare them in order to evaluate the extent of their coverage and their application in current practices. Provides a background and history of the Working Group. Discusses the principles underlying subject headings and their purposes and the state of the work and major findings
    Type
    a
  8. Bean, C.: ¬The semantics of hierarchy : explicit parent-child relationships in MeSH tree structures (1998) 0.00
    0.001674345 = product of:
      0.00334869 = sum of:
        0.00334869 = product of:
          0.00669738 = sum of:
            0.00669738 = weight(_text_:a in 42) [ClassicSimilarity], result of:
              0.00669738 = score(doc=42,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12611452 = fieldWeight in 42, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=42)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Parent-Child relationships in MeSH trees were surveyed and described, and their patterns in the relational structure were determined for selected broad subject categories and subcategories. Is-a relationships dominated and were more prevalent overall than previously reported; however, an additional 67 different relationships were also seen, most of them nonhierarchical. Relational profiles were found to vary both within and among subject subdomains, but tended to display characteristic domain patterns. The implications for inferential reasoning and other cognitive and computational operations on hierarchical structures are considered
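The explicit parent-child links surveyed here follow from MeSH tree numbers themselves: a child's tree number is its parent's number plus one dotted segment. A minimal sketch (the example tree numbers are illustrative):

```python
def mesh_parent(tree_number: str):
    """Return the parent MeSH tree number, or None for a top-of-tree node.
    The parent is obtained by dropping the last dotted segment."""
    head, sep, _tail = tree_number.rpartition(".")
    return head if sep else None

parent = mesh_parent("C14.280.067")  # "C14.280"
top = mesh_parent("C14")             # None: no dotted segment to drop
```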
    Type
    a
  9. Green, R.: Relationships in the organization of knowledge : an overview (2001) 0.00
    0.001674345 = product of:
      0.00334869 = sum of:
        0.00334869 = product of:
          0.00669738 = sum of:
            0.00669738 = weight(_text_:a in 1142) [ClassicSimilarity], result of:
              0.00669738 = score(doc=1142,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12611452 = fieldWeight in 1142, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1142)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Relationships are specified by simultaneously identifying a semantic relationship and the set of participants involved in it, pairing each participant with its role in the relationship. Properties pertaining to the participant set and the nature of the relationship are explored. Relationships in the organization of knowledge are surveyed, encompassing relationships between units of recorded knowledge based on descriptions of those units; intratextual and intertextual relationships, including relationships based on text structure, citation relationships, and hypertext links; subject relationships in thesauri and other classificatory structures, including relationships for literature-based knowledge discovery; and relevance relationships.
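Green's characterization of a relationship - a semantic relation plus a participant set, each participant paired with its role - translates directly into a small data structure. A sketch; the example relation and role names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    """A semantic relation together with its participants,
    each participant paired with its role in the relation."""
    relation: str
    participants: tuple  # of (role, entity) pairs

# Hypothetical instance: a lending relation with three role-bearing participants.
borrows = Relationship(
    relation="borrowing",
    participants=(
        ("borrower", "patron"),
        ("lent item", "book"),
        ("lender", "library"),
    ),
)
```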
    Type
    a
  10. Mai, J.-E.: Actors, domains, and constraints in the design and construction of controlled vocabularies (2008) 0.00
    0.0014647468 = product of:
      0.0029294936 = sum of:
        0.0029294936 = product of:
          0.005858987 = sum of:
            0.005858987 = weight(_text_:a in 1921) [ClassicSimilarity], result of:
              0.005858987 = score(doc=1921,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11032722 = fieldWeight in 1921, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1921)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Classification schemes, thesauri, taxonomies, and other controlled vocabularies play important roles in the organization and retrieval of information in many different environments. While the design and construction of controlled vocabularies have been prescribed at the technical level in great detail over the past decades, the methodological level has been somewhat neglected. However, classification research has in recent years focused on developing approaches to the analysis of users, domains, and activities that could produce requirements for the design of controlled vocabularies. Researchers have often argued that the design, construction, and use of controlled vocabularies need to be based on analyses and understandings of the contexts in which these controlled vocabularies function. While one would assume that the growing body of research on human information behavior might help guide the development of controlled vocabularies and shed light on these contexts, unfortunately, much of the research in this area is descriptive in nature and of little use for systems design. This paper discusses these trends and outlines a holistic approach that demonstrates how the design of controlled vocabularies can be informed by investigations of people's interactions with information. This approach is based on the Cognitive Work Analysis framework and outlines several dimensions of human-information interactions. Application of this approach will result in a comprehensive understanding of the contexts in which the controlled vocabulary will function, which can be used for the development of controlled vocabularies.
    Type
    a
  11. Milstead, J.L.: Standards for relationships between subject indexing terms (2001) 0.00
    0.0014351527 = product of:
      0.0028703054 = sum of:
        0.0028703054 = product of:
          0.005740611 = sum of:
            0.005740611 = weight(_text_:a in 1148) [ClassicSimilarity], result of:
              0.005740611 = score(doc=1148,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.10809815 = fieldWeight in 1148, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1148)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Relationships between the terms in thesauri and indexes are the subject of national and international standards. The standards for thesauri enumerate and provide criteria for three basic types of relationship: equivalence, hierarchical, and associative. Standards and guidelines for indexes draw on the thesaurus standards to provide less detailed guidance for showing relationships between the terms used in an index. The international standard for multilingual thesauri adds recommendations for assuring equal treatment of the languages of a thesaurus. The present standards were developed when lookup and search were essentially manual, and the value of the kinds of relationships has never been determined. It is not clear whether users understand or can use the distinctions between kinds of relationships. On the other hand, sophisticated text analysis systems may be able both to assist with development of more powerful term relationship schemes and to use the relationships to improve retrieval.
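The three relationship types the thesaurus standards enumerate are conventionally recorded with USE/UF (equivalence), BT/NT (hierarchical), and RT (associative) tags. A minimal sketch of such a record; the terms are invented, and the tag names follow common thesaurus-standard practice:

```python
# One entry per term, carrying the three standard relationship types.
thesaurus = {
    "automobiles": {
        "UF": ["cars"],                   # equivalence: non-preferred synonyms
        "BT": ["vehicles"],               # hierarchical: broader term
        "NT": ["electric automobiles"],   # hierarchical: narrower terms
        "RT": ["highways"],               # associative: related term
    },
    "cars": {"USE": "automobiles"},       # lead-in entry to the preferred term
}

def preferred(term):
    """Follow a USE reference to the preferred term, if one exists."""
    return thesaurus.get(term, {}).get("USE", term)
```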
    Type
    a
  12. Engerer, V.: Control and syntagmatization : vocabulary requirements in information retrieval thesauri and natural language lexicons (2017) 0.00
    0.0014351527 = product of:
      0.0028703054 = sum of:
        0.0028703054 = product of:
          0.005740611 = sum of:
            0.005740611 = weight(_text_:a in 3678) [ClassicSimilarity], result of:
              0.005740611 = score(doc=3678,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.10809815 = fieldWeight in 3678, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3678)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper explores the relationships between natural language lexicons in lexical semantics and thesauri in information retrieval research. These different areas of knowledge have different restrictions on use of vocabulary; thesauri are used only in information search and retrieval contexts, whereas lexicons are mental systems and generally applicable in all domains of life. A set of vocabulary requirements that defines the more concrete characteristics of vocabulary items in the 2 contexts can be derived from this framework: lexicon items have to be learnable, complex, transparent, etc., whereas thesaurus terms must be effective, current and relevant, searchable, etc. The differences in vocabulary properties correlate with 2 other factors, the well-known dimension of Control (deliberate, social activities of building and maintaining vocabularies), and Syntagmatization, which is less known and describes vocabulary items' varying formal preparedness to exit the thesaurus/lexicon, enter into linear syntactic constructions, and, finally, acquire communicative functionality. It is proposed that there is an inverse relationship between Control and Syntagmatization.
    Type
    a
  13. Fugmann, R.: Unusual possibilities in indexing and classification (1990) 0.00
    0.001353075 = product of:
      0.00270615 = sum of:
        0.00270615 = product of:
          0.0054123 = sum of:
            0.0054123 = weight(_text_:a in 4781) [ClassicSimilarity], result of:
              0.0054123 = score(doc=4781,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.10191591 = fieldWeight in 4781, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4781)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  14. Zhou, G.D.; Zhang, M.: Extracting relation information from text documents by exploring various types of knowledge (2007) 0.00
    0.0011959607 = product of:
      0.0023919214 = sum of:
        0.0023919214 = product of:
          0.0047838427 = sum of:
            0.0047838427 = weight(_text_:a in 927) [ClassicSimilarity], result of:
              0.0047838427 = score(doc=927,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.090081796 = fieldWeight in 927, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=927)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Extracting semantic relationships between entities from text documents is challenging in information extraction and important for deep information processing and management. This paper investigates the incorporation of diverse lexical, syntactic, and semantic knowledge in feature-based relation extraction using support vector machines. Our study illustrates that base phrase chunking information is very effective for relation extraction and contributes most of the performance improvement on the syntactic side, while commonly used features from full parsing give limited further enhancement. This suggests that most of the useful information in full parse trees for relation extraction is shallow and can be captured by chunking, and that a cheap and robust solution in relation extraction can be achieved without giving up much performance. We also demonstrate how semantic information such as WordNet can be used in feature-based relation extraction to further improve performance. Evaluation on the ACE benchmark corpora shows that effective incorporation of diverse features enables our system to outperform the previously best-reported systems. It also shows that our feature-based system significantly outperforms tree kernel-based systems, suggesting that current tree kernels fail to effectively exploit structured syntactic information in relation extraction.
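The lexical and chunk-level features the abstract describes can be sketched as a toy feature builder for an entity-pair mention. This illustrates feature-based relation extraction in general, not the authors' actual feature set; the feature names are invented:

```python
def relation_features(tokens, chunks, e1, e2):
    """Build a sparse feature dict for the entity pair at token indices
    e1 < e2. Lexical features use the two head words and the words in
    between; the chunk-path feature strings together the phrase-chunk
    labels between the entities, standing in for full-parse information."""
    feats = {
        f"head1={tokens[e1]}": 1,
        f"head2={tokens[e2]}": 1,
        f"chunk_path={'-'.join(chunks[e1 + 1 : e2])}": 1,
    }
    for word in tokens[e1 + 1 : e2]:
        feats[f"between={word}"] = 1
    return feats

# Toy sentence with one chunk label per token:
tokens = ["Smith", "works", "for", "Acme"]
chunks = ["NP", "VP", "PP", "NP"]
feats = relation_features(tokens, chunks, 0, 3)
```

Such a dict would typically be vectorized and fed to a linear classifier (e.g. an SVM, as in the paper).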
    Type
    a