Search (2 results, page 1 of 1)

  • author_ss:"Ibekwe-SanJuan, F."
  • theme_ss:"Wissensrepräsentation"
  1. Ibekwe-SanJuan, F.: Constructing and maintaining knowledge organization tools : a symbolic approach (2006) 0.02
    0.021779295 = product of:
      0.04355859 = sum of:
        0.011892734 = weight(_text_:information in 5595) [ClassicSimilarity], result of:
          0.011892734 = score(doc=5595,freq=6.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.1343758 = fieldWeight in 5595, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=5595)
        0.031665858 = product of:
          0.063331716 = sum of:
            0.063331716 = weight(_text_:organization in 5595) [ClassicSimilarity], result of:
              0.063331716 = score(doc=5595,freq=10.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.35233274 = fieldWeight in 5595, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5595)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
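
The explain tree above can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the constants are copied from the explain output, while the function names are illustrative and not part of any search-engine API.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.050415643
FIELD_NORM = 0.03125


def idf(doc_freq, max_docs=MAX_DOCS):
    """Classic TF-IDF inverse document frequency (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq):
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf * term_idf * FIELD_NORM
    return query_weight * field_weight


def doc_score(information_freq, organization_freq):
    """Combine both term scores with the coord factors shown in the tree."""
    information = term_score(information_freq, doc_freq=20772)
    # The "organization" clause sits in a nested query: coord(1/2).
    organization = term_score(organization_freq, doc_freq=3399) * 0.5
    # Two of four top-level clauses matched: coord(2/4).
    return (information + organization) * 0.5


score_5595 = doc_score(6.0, 10.0)  # first hit
score_3949 = doc_score(6.0, 2.0)   # second hit
print(f"{score_5595:.6f} {score_3949:.6f}")
```

Both hits share the same "information" contribution; the ranking gap comes entirely from the term frequency of "organization" (10 versus 2).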
    
    Abstract
    Purpose - To propose a comprehensive and semi-automatic method for constructing or updating knowledge organization tools such as thesauri. Design/methodology/approach - The paper proposes a comprehensive methodology for thesaurus construction and maintenance combining shallow NLP with a clustering algorithm and an information visualization interface. The resulting system, TermWatch, extracts terms from a text collection, mines semantic relations between them using complementary linguistic approaches, and clusters terms using these semantic relations. The clusters are mapped onto a 2D map using an integrated visualization tool. Findings - The clusters formed exhibit the different relations necessary to populate a thesaurus or ontology: synonymy, generic/specific and relatedness. The clusters represent, for a given term, its closest neighbours in terms of semantic relations. Practical implications - This could change the way in which information professionals (librarians and documentalists) undertake knowledge organization tasks. TermWatch can be useful either as a starting point for grasping the conceptual organization of knowledge in a huge text collection without having to read the texts, or as a suggestive tool for populating different hierarchies of a thesaurus or an ontology, because its clusters are based on semantic relations. Originality/value - This lies in several points: combined use of linguistic relations with an adapted clustering algorithm, which is scalable and can handle sparse data. The paper proposes a comprehensive approach to semantic relations acquisition whereas existing studies often use one or two approaches. The domain knowledge maps produced by the system represent an added advantage over existing approaches to automatic thesaurus construction in that clusters are formed using semantic relations between domain terms. Thus, while offering a meaningful synthesis of the information contained in the original corpus through clustering, the results can be used for knowledge organization tasks (thesaurus building and ontology population). The system also constitutes a platform for performing several knowledge-oriented tasks such as science and technology watch, text mining, and query refinement.
  2. Ibekwe-SanJuan, F.: Semantic metadata annotation : tagging Medline abstracts for enhanced information access (2010) 0.01
    0.013027068 = product of:
      0.026054136 = sum of:
        0.011892734 = weight(_text_:information in 3949) [ClassicSimilarity], result of:
          0.011892734 = score(doc=3949,freq=6.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.1343758 = fieldWeight in 3949, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3949)
        0.014161401 = product of:
          0.028322803 = sum of:
            0.028322803 = weight(_text_:organization in 3949) [ClassicSimilarity], result of:
              0.028322803 = score(doc=3949,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.15756798 = fieldWeight in 3949, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3949)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose - The object of this study is to develop methods for automatically annotating the argumentative role of sentences in scientific abstracts. Working from Medline abstracts, sentences were classified into four major argumentative roles: objective, method, result, and conclusion. The idea is that, if the role of each sentence can be marked up, then these metadata can be used during information retrieval to seek particular types of information such as novelty, conclusions, methodologies, aims/goals of a scientific piece of work. Design/methodology/approach - Two approaches were tested: linguistic cues and positional heuristics. Linguistic cues are lexico-syntactic patterns modelled as regular expressions implemented in a linguistic parser. Positional heuristics make use of the relative position of a sentence in the abstract to deduce its argumentative class. Findings - The experiments showed that positional heuristics attained a much higher degree of accuracy on Medline abstracts, with an F-score of 64 per cent, whereas the linguistic cues only attained an F-score of 12 per cent. This is mostly because sentences from different argumentative roles are not always announced by surface linguistic cues. Research limitations/implications - A limitation of the study was the inability to test other methods to perform this task, such as machine learning techniques, which have been reported to perform better on Medline abstracts. Also, to compare the results of the study with earlier studies using Medline abstracts, the different argumentative roles present in Medline had to be mapped onto four major argumentative roles. This may have favourably biased the performance of the sentence classification by positional heuristics. Originality/value - To the best of the author's knowledge, this study presents the first instance of evaluating linguistic cues and positional heuristics on the same corpus.
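
The positional heuristic described in this abstract can be illustrated with a short sketch. This is not the author's implementation: the function name and the boundary values (0.25, 0.5, 0.85) are hypothetical, chosen only to show how a sentence's relative position could be mapped onto the four roles named above.

```python
import bisect

# The four argumentative roles named in the abstract, in reading order.
ROLES = ("objective", "method", "result", "conclusion")


def classify_by_position(sentences, boundaries=(0.25, 0.5, 0.85)):
    """Label each sentence by its relative position in the abstract.

    `boundaries` are hypothetical cut-off points on the 0..1 scale;
    they are assumptions for illustration, not values from the study.
    """
    total = len(sentences)
    labels = []
    for index in range(total):
        # Use the midpoint of the sentence slot as its relative position.
        relative = (index + 0.5) / total
        labels.append(ROLES[bisect.bisect_right(boundaries, relative)])
    return labels


example = [
    "We aim to annotate sentence roles.",
    "Two approaches were tested.",
    "Positional heuristics scored higher.",
    "Position is a useful signal.",
]
labels = classify_by_position(example)
print(list(zip(labels, example)))
```

A rule of this kind never reads the sentence text, which is consistent with the abstract's remark that roles are not always announced by surface cues.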
    Footnote
    Contribution to a special issue: Content architecture: exploiting and managing diverse resources: proceedings of the first national conference of the United Kingdom chapter of the International Society for Knowledge Organization (ISKO)