Search (5 results, page 1 of 1)

Ibekwe-SanJuan, F.: ¬The impact of geographic location on the development of a specialty field : a case study of Sloan Digital Sky Survey in astronomy (2008) 0.00
```
0.0021859813 = product of:
  0.013115887 = sum of:
    0.013115887 = weight(_text_:in in 3241) [ClassicSimilarity], result of:
      0.013115887 = score(doc=3241,freq=12.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22087781 = fieldWeight in 3241, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3241)
  0.16666667 = coord(1/6)
```
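The score breakdown above can be reproduced by hand. The following is a minimal sketch, not the search engine's actual code: the function name `classic_similarity_score` is hypothetical, and the formulas `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))` are assumptions inferred from the values printed in the explain output (they match the figures shown). `queryNorm`, `fieldNorm` and `coord` are taken directly from the listing rather than derived.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, field_norm, query_norm, coord):
    """Recompute a single-term score from the values in an explain listing.

    Hypothetical helper: the tf and idf formulas are inferred from the
    numbers in the listing, not taken from the engine's source.
    """
    tf = math.sqrt(freq)                              # 3.4641016 for freq=12
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # 1.3602545 for docFreq=30841
    query_weight = idf * query_norm                   # "queryWeight" in the listing
    field_weight = tf * idf * field_norm              # "fieldWeight" in the listing
    return coord * query_weight * field_weight        # coord(1/6): 1 of 6 query terms matched


# Values copied from the explain block for doc 3241.
score = classic_similarity_score(
    freq=12.0,
    doc_freq=30841,
    max_docs=44218,
    field_norm=0.046875,
    query_norm=0.043654136,
    coord=1 / 6,
)
# Should agree with the 0.0021859813 reported above, up to floating-point rounding.
print(f"{score:.10f}")
```

Because `idf`, `queryNorm` and `coord` are identical for all five results, the ranking here depends only on `sqrt(freq) * fieldNorm`: a record with fewer occurrences of the term can outrank one with more if its field is shorter (larger `fieldNorm`).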
Abstract

We analyze the scientific discourse of researchers in a specialty field in Astronomy by examining the influence that geographic location may have on the development of this field. Using as a case study the Sloan Digital Sky Survey (SDSS) project, we analyzed texts from bibliographic records along three geographic axes: US-only publications, non-US publications and international collaboration. Each geographic region reflected authors affiliated with research institutions in that region. International collaboration refers to papers published by both US-based and non-US-based institutions. Through clustering of domain terms used in the title and abstract fields of the bibliographic records, we were able to automatically identify the topology of topics peculiar to each geographic region and identify the research topics common to the three geographic zones. The results showed that US-only and non-US research in SDSS shared more commonalities with international collaboration than with one another, thus indicating that the former two focused on rather distinct topics.
Ibekwe-SanJuan, F.; Eric SanJuan, E.: Mining for knowledge chunks in a terminology network (2004) 0.00
```
0.0020609628 = product of:
  0.012365777 = sum of:
    0.012365777 = weight(_text_:in in 2623) [ClassicSimilarity], result of:
      0.012365777 = score(doc=2623,freq=6.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.2082456 = fieldWeight in 2623, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=2623)
  0.16666667 = coord(1/6)
```
Abstract

This paper further examines a research hypothesis that syntactic variations are an interesting alternative to the clustering approach and that they offer meaningful ways of highlighting and organising associated research topics in a corpus. A textmining and topic mapping system, TermWatch, has been developed based on this hypothesis. Preliminary results obtained on a large IR corpus are promising and call for further systematic investigation.

Series

Advances in knowledge organization; vol.9
Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.00
```
0.0019676082 = product of:
  0.011805649 = sum of:
    0.011805649 = weight(_text_:in in 1853) [ClassicSimilarity], result of:
      0.011805649 = score(doc=1853,freq=14.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.19881277 = fieldWeight in 1853, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1853)
  0.16666667 = coord(1/6)
```
Abstract

In a scientific and technological watch (STW) task, an expert user needs to survey the evolution of research topics in his area of specialisation in order to detect interesting changes. The majority of methods proposing evaluation metrics (bibliometrics and scientometrics studies) for STW rely solely on statistical data analysis methods (co-citation analysis, co-word analysis). Such methods usually work on structured databases where the units of analysis (words, keywords) are already attributed to documents by human indexers. The advent of huge amounts of unstructured textual data has rendered necessary the integration of natural language processing (NLP) techniques to first extract meaningful units from texts. We propose a method for STW which is NLP-oriented. The method not only analyses texts linguistically in order to extract terms from them, but also uses linguistic relations (syntactic variations) as the basis for clustering. Terms and variation relations are formalised as weighted di-graphs which the clustering algorithm, CPCL (Classification by Preferential Clustered Link), will seek to reduce in order to produce classes. These classes ideally represent the research topics present in the corpus. The results of the classification are subjected to validation by an expert in STW.
Chen, C.; Ibekwe-SanJuan, F.; Pinho, R.; Zhang, J.: ¬The impact of the sloan digital sky survey on astronomical research : the role of culture, identity, and international collaboration (2008) 0.00
```
0.0017848461 = product of:
  0.010709076 = sum of:
    0.010709076 = weight(_text_:in in 2275) [ClassicSimilarity], result of:
      0.010709076 = score(doc=2275,freq=8.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.18034597 = fieldWeight in 2275, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2275)
  0.16666667 = coord(1/6)
```
Content

We investigate the influence of culture and identity (geographic location) on the constitution of a specific research field. Using as a case study the Sloan Digital Sky Survey (SDSS) project in the Astronomy field, we analyzed texts from bibliographic records of publications along three cultural and geographic axes: US-only publications, non-US publications and international collaboration. Using three text mining systems (CiteSpace, TermWatch and PEx), we were able to automatically identify the topics specific to each cultural and geographic region as well as isolate the core research topics common to all geographic zones. The results tended to show that US-only and non-US research in this field shared more commonalities with international collaboration than with one another, thus indicating that the former two (US-only and non-US research) focused on rather distinct topics.

Series

Advances in knowledge organization; vol.11

Source

Culture and identity in knowledge organization: Proceedings of the Tenth International ISKO Conference 5-8 August 2008, Montreal, Canada. Ed. by Clément Arsenault and Joseph T. Tennis
Ibekwe-SanJuan, F.: Constructing and maintaining knowledge organization tools : a symbolic approach (2006) 0.00
```
0.0014573209 = product of:
  0.008743925 = sum of:
    0.008743925 = weight(_text_:in in 5595) [ClassicSimilarity], result of:
      0.008743925 = score(doc=5595,freq=12.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.14725187 = fieldWeight in 5595, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=5595)
  0.16666667 = coord(1/6)
```
Abstract

Purpose - To propose a comprehensive and semi-automatic method for constructing or updating knowledge organization tools such as thesauri. Design/methodology/approach - The paper proposes a comprehensive methodology for thesaurus construction and maintenance combining shallow NLP with a clustering algorithm and an information visualization interface. The resulting system, TermWatch, extracts terms from a text collection, mines semantic relations between them using complementary linguistic approaches and clusters terms using these semantic relations. The clusters are mapped onto a 2D map using an integrated visualization tool. Findings - The clusters formed exhibit the different relations necessary to populate a thesaurus or ontology: synonymy, generic/specific and relatedness. The clusters represent, for a given term, its closest neighbours in terms of semantic relations. Practical implications - This could change the way in which information professionals (librarians and documentalists) undertake knowledge organization tasks. TermWatch can be useful either as a starting point for grasping the conceptual organization of knowledge in a huge text collection without having to read the texts, or as a suggestive tool for populating different hierarchies of a thesaurus or an ontology, because its clusters are based on semantic relations. Originality/value - This lies in several points: combined use of linguistic relations with an adapted clustering algorithm, which is scalable and can handle sparse data. The paper proposes a comprehensive approach to semantic relations acquisition whereas existing studies often use one or two approaches. The domain knowledge maps produced by the system represent an added advantage over existing approaches to automatic thesaurus construction in that clusters are formed using semantic relations between domain terms. Thus, while offering a meaningful synthesis of the information contained in the original corpus through clustering, the results can be used for knowledge organization tasks (thesaurus building and ontology population). The system also constitutes a platform for performing several knowledge-oriented tasks such as science and technology watch, textmining and query refinement.
