Search (48 results, page 1 of 3)

Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2012) 0.04

0.038934015 = product of:
  0.05840102 = sum of:
    0.030980824 = product of:
      0.09294247 = sum of:
        0.09294247 = weight(_text_:authors in 1967) [ClassicSimilarity], result of:
          0.09294247 = score(doc=1967,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.42738882 = fieldWeight in 1967, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
      0.33333334 = coord(1/3)
    0.027420195 = product of:
      0.05484039 = sum of:
        0.05484039 = weight(_text_:22 in 1967) [ClassicSimilarity], result of:
          0.05484039 = score(doc=1967,freq=4.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.32829654 = fieldWeight in 1967, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
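
The explain tree above can be checked by hand. The following is a minimal sketch, assuming the default ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and not part of any library. It uses Python doubles, whereas the tree shows single-precision floats, so the last digits may differ slightly.

```python
import math

# Values copied from the explain tree for doc 1967.
QUERY_NORM = 0.047702286
MAX_DOCS = 44218
FIELD_NORM = 0.046875


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity (assumed default)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM               # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Two matching terms, each with freq=4.0 in this document.
authors_score = term_score(4.0, doc_freq=1258, field_norm=FIELD_NORM)
t22_score = term_score(4.0, doc_freq=3622, field_norm=FIELD_NORM)

# Each clause is scaled by its own coord factor, then the sum by coord(2/3).
authors_part = authors_score * (1 / 3)
t22_part = t22_score * (1 / 2)
total = (authors_part + t22_part) * (2 / 3)

print(f"authors: {authors_score:.8f}")
print(f"22:      {t22_score:.8f}")
print(f"total:   {total:.8f}")  # should be close to the reported 0.038934015
```

The nesting mirrors the tree: the innermost products are per-term scores, and each `coord(m/n)` factor scales a clause by the fraction of its sub-clauses that matched.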

Abstract: This paper reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The paper discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the DDC (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.

Carrasco, L.; Vidotti, S.: Handling multilinguality in heterogeneous digital cultural heritage systems trough CIDOC CRM ontology (2016) 0.04

0.03544181 = product of:
  0.10632542 = sum of:
    0.10632542 = product of:
      0.21265084 = sum of:
        0.21265084 = weight(_text_:guimarães in 4925) [ClassicSimilarity], result of:
          0.21265084 = score(doc=4925,freq=2.0), product of:
            0.33877054 = queryWeight, product of:
              7.1017675 = idf(docFreq=98, maxDocs=44218)
              0.047702286 = queryNorm
            0.6277135 = fieldWeight in 4925, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.1017675 = idf(docFreq=98, maxDocs=44218)
              0.0625 = fieldNorm(doc=4925)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Knowledge organization for a sustainable world: challenges and perspectives for cultural, scientific, and technological sharing in a connected society : proceedings of the Fourteenth International ISKO Conference 27-29 September 2016, Rio de Janeiro, Brazil / organized by International Society for Knowledge Organization (ISKO), ISKO-Brazil, São Paulo State University ; edited by José Augusto Chaves Guimarães, Suellen Oliveira Milani, Vera Dodebei

Frâncu, V.; Sabo, C.-N.: Implementation of a UDC-based multilingual thesaurus in a library catalogue : the case of BiblioPhil (2010) 0.03

0.033579886 = product of:
  0.05036983 = sum of:
    0.030980824 = product of:
      0.09294247 = sum of:
        0.09294247 = weight(_text_:authors in 3697) [ClassicSimilarity], result of:
          0.09294247 = score(doc=3697,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.42738882 = fieldWeight in 3697, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.046875 = fieldNorm(doc=3697)
      0.33333334 = coord(1/3)
    0.019389004 = product of:
      0.038778007 = sum of:
        0.038778007 = weight(_text_:22 in 3697) [ClassicSimilarity], result of:
          0.038778007 = score(doc=3697,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.23214069 = fieldWeight in 3697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3697)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: In order to enhance the use of Universal Decimal Classification (UDC) numbers in information retrieval, the authors have represented classification with multilingual thesaurus descriptors and implemented this solution in an automated way. The authors illustrate a solution implemented in a BiblioPhil library system. The standard formats used are UNIMARC for subject authority records (i.e. the UDC-based multilingual thesaurus) and MARC XML support for data transfer. The multilingual thesaurus was built according to existing standards, the constituent parts of the classification notations being used as the basis for search terms in the multilingual information retrieval. The verbal equivalents, descriptors and non-descriptors, are used to expand the number of concepts and are given in Romanian, English and French. This approach saves the time of the indexer and provides more user-friendly and easier access to the bibliographic information. The multilingual aspect of the thesaurus enhances information access for a greater number of online users.
Date: 22. 7.2010 20:40:56

Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2014) 0.03

0.03244501 = product of:
  0.048667513 = sum of:
    0.025817355 = product of:
      0.07745206 = sum of:
        0.07745206 = weight(_text_:authors in 1962) [ClassicSimilarity], result of:
          0.07745206 = score(doc=1962,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.35615736 = fieldWeight in 1962, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1962)
      0.33333334 = coord(1/3)
    0.02285016 = product of:
      0.04570032 = sum of:
        0.04570032 = weight(_text_:22 in 1962) [ClassicSimilarity], result of:
          0.04570032 = score(doc=1962,freq=4.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.27358043 = fieldWeight in 1962, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1962)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: This article reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The article discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the Dewey Decimal Classification [DDC] (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.

Weihs, J.: Three tales of multilingual cataloguing (1998) 0.02

0.017234672 = product of:
  0.05170401 = sum of:
    0.05170401 = product of:
      0.10340802 = sum of:
        0.10340802 = weight(_text_:22 in 6063) [ClassicSimilarity], result of:
          0.10340802 = score(doc=6063,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.61904186 = fieldWeight in 6063, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=6063)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
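
Several hits on this page match only the term `22` with freq=2.0, so their ranking is decided by fieldNorm alone. The sketch below illustrates this using values copied from the explain trees for docs 6063, 126, 4157 and 7887; the helper name `single_term_score` is hypothetical. In ClassicSimilarity, fieldNorm is a length normalization stored at low precision, so shorter fields tend to get larger values.

```python
import math

# Constants copied from the explain trees for the term "22".
QUERY_NORM = 0.047702286
IDF_22 = 3.5018296  # idf(docFreq=3622, maxDocs=44218)

# fieldNorm per document, as reported in the trees (all with freq=2.0).
FIELD_NORMS = {6063: 0.125, 126: 0.09375, 4157: 0.078125, 7887: 0.0625}


def single_term_score(freq: float, field_norm: float) -> float:
    """Final score of a hit that matches only "22": term score * coord(1/2) * coord(1/3)."""
    query_weight = IDF_22 * QUERY_NORM
    field_weight = math.sqrt(freq) * IDF_22 * field_norm
    return query_weight * field_weight * (1 / 2) * (1 / 3)


scores = {doc: single_term_score(2.0, norm) for doc, norm in FIELD_NORMS.items()}

# Sort descending by score; the order follows fieldNorm exactly.
ranking = sorted(scores, key=scores.get, reverse=True)
for doc in ranking:
    print(doc, f"{scores[doc]:.9f}")
```

Because every other factor is identical, the score is linear in fieldNorm: doubling fieldNorm from 0.0625 to 0.125 doubles the final score.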

Date: 2. 8.2001 8:55:22

Dini, L.: CACAO : multilingual access to bibliographic records (2007) 0.01

0.012926003 = product of:
  0.038778007 = sum of:
    0.038778007 = product of:
      0.077556014 = sum of:
        0.077556014 = weight(_text_:22 in 126) [ClassicSimilarity], result of:
          0.077556014 = score(doc=126,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.46428138 = fieldWeight in 126, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=126)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Content: Presentation given at the workshop: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.01

0.010771669 = product of:
  0.03231501 = sum of:
    0.03231501 = product of:
      0.06463002 = sum of:
        0.06463002 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.06463002 = score(doc=4157,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Edited by Marlies Ockenfeld and Gerhard J. Mantwill

Landry, P.: MACS: multilingual access to subject and link management : Extending the Multilingual Capacity of TEL in the EDL Project (2007) 0.01

0.010771669 = product of:
  0.03231501 = sum of:
    0.03231501 = product of:
      0.06463002 = sum of:
        0.06463002 = weight(_text_:22 in 1287) [ClassicSimilarity], result of:
          0.06463002 = score(doc=1287,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.38690117 = fieldWeight in 1287, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1287)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Content: Presentation given at the workshop: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".

Zhou, Y. et al.: Analysing entity context in multilingual Wikipedia to support entity-centric retrieval applications (2016) 0.01

0.010771669 = product of:
  0.03231501 = sum of:
    0.03231501 = product of:
      0.06463002 = sum of:
        0.06463002 = weight(_text_:22 in 2758) [ClassicSimilarity], result of:
          0.06463002 = score(doc=2758,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.38690117 = fieldWeight in 2758, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2758)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 1. 2.2016 18:25:22

Celli, F. et al.: Enabling multilingual search through controlled vocabularies : the AGRIS approach (2016) 0.01

0.010771669 = product of:
  0.03231501 = sum of:
    0.03231501 = product of:
      0.06463002 = sum of:
        0.06463002 = weight(_text_:22 in 3278) [ClassicSimilarity], result of:
          0.06463002 = score(doc=3278,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.38690117 = fieldWeight in 3278, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3278)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou

Jorna, K.; Davies, S.: Multilingual thesauri for the modern world : no ideal solution? (2001) 0.01
```
0.0103269415 = product of:
  0.030980824 = sum of:
    0.030980824 = product of:
      0.09294247 = sum of:
        0.09294247 = weight(_text_:authors in 4486) [ClassicSimilarity], result of:
          0.09294247 = score(doc=4486,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.42738882 = fieldWeight in 4486, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.046875 = fieldNorm(doc=4486)
      0.33333334 = coord(1/3)
  0.33333334 = coord(1/3)
```

Abstract: In the 21st century, multilingual tools are gaining importance as increasingly diverse user groups from different cultural and linguistic backgrounds seek access to equally diverse pieces of information. The authors of this paper believe that most current forms of multilingual information access are inadequate for this role, and that a new form of multilingual thesaurus is required. The core of this paper introduces their pilot thesaurus InfoDEFT as a possible model for new online thesauri, which are semantically structured, encyclopedic and multilingual. The authors conclude that while the manual construction of such thesauri is labour intensive and hence costly, pilot thesauri can be used as training sets for artificial learning programmes, thus increasing their volume considerably at relatively little extra cost.

Clough, P.; Sanderson, M.: User experiments with the Eurovision Cross-Language Image Retrieval System (2006) 0.01
```
0.0103269415 = product of:
  0.030980824 = sum of:
    0.030980824 = product of:
      0.09294247 = sum of:
        0.09294247 = weight(_text_:authors in 5052) [ClassicSimilarity], result of:
          0.09294247 = score(doc=5052,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.42738882 = fieldWeight in 5052, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.046875 = fieldNorm(doc=5052)
      0.33333334 = coord(1/3)
  0.33333334 = coord(1/3)
```

Abstract: In this article the authors present Eurovision, a text-based system for cross-language (CL) image retrieval. The system is evaluated by multilingual users for two search tasks with the system configured in English and five other languages. To the authors' knowledge, this is the first published set of user experiments for CL image retrieval. They show that (a) it is possible to create a usable multilingual search engine using little knowledge of any language other than English, (b) categorizing images assists the user's search, and (c) there are differences in the way users search between the proposed search tasks. Based on the two search tasks and user feedback, they describe important aspects of any CL image retrieval system.

Kwasnik, B.H.; Rubin, V.L.: Stretching conceptual structures in classifications across languages and cultures (2003) 0.01
```
0.0103269415 = product of:
  0.030980824 = sum of:
    0.030980824 = product of:
      0.09294247 = sum of:
        0.09294247 = weight(_text_:authors in 5517) [ClassicSimilarity], result of:
          0.09294247 = score(doc=5517,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.42738882 = fieldWeight in 5517, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.046875 = fieldNorm(doc=5517)
      0.33333334 = coord(1/3)
  0.33333334 = coord(1/3)
```

Abstract: The authors describe the difficulties of translating classifications from a source language and culture to another language and culture. To demonstrate these problems, kinship terms and concepts from native speakers of fourteen languages were collected and analyzed to find differences between their terms and structures and those used in English. Using the representations of kinship terms in the Library of Congress Classification (LCC) and the Dewey Decimal Classification (DDC) as examples, the authors identified the source of possible lack of mapping between the domain of kinship in the fourteen languages studied and the LCC and DDC. Finally, some preliminary suggestions for how to make translated classifications more linguistically and culturally hospitable are offered.

Timotin, A.: Multilingvism si tezaure de concepte (1994) 0.01

0.008617336 = product of:
  0.025852006 = sum of:
    0.025852006 = product of:
      0.05170401 = sum of:
        0.05170401 = weight(_text_:22 in 7887) [ClassicSimilarity], result of:
          0.05170401 = score(doc=7887,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.30952093 = fieldWeight in 7887, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=7887)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Probleme de Informare si Documentare. 28(1994) no.1, S.13-22

Cao, L.; Leong, M.-K.; Low, H.-B.: Searching heterogeneous multilingual bibliographic sources (1998) 0.01

0.008617336 = product of:
  0.025852006 = sum of:
    0.025852006 = product of:
      0.05170401 = sum of:
        0.05170401 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
          0.05170401 = score(doc=3564,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.30952093 = fieldWeight in 3564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3564)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 1. 8.1996 22:08:06

Reinisch, F.: Wer suchet - der findet? : oder Die Überwindung der sprachlichen Grenzen bei der Suche in Volltextdatenbanken (2000) 0.01

0.008617336 = product of:
  0.025852006 = sum of:
    0.025852006 = product of:
      0.05170401 = sum of:
        0.05170401 = weight(_text_:22 in 4919) [ClassicSimilarity], result of:
          0.05170401 = score(doc=4919,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.30952093 = fieldWeight in 4919, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4919)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 22. 7.2000 17:48:06

Heinzelin, D. de; d'Hautcourt, F.; Pols, R.: Un nouveaux thesaurus multilingue informatise relatif aux instruments de musique (1998) 0.01

0.008617336 = product of:
  0.025852006 = sum of:
    0.025852006 = product of:
      0.05170401 = sum of:
        0.05170401 = weight(_text_:22 in 932) [ClassicSimilarity], result of:
          0.05170401 = score(doc=932,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.30952093 = fieldWeight in 932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=932)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 1. 8.1996 22:01:00

Park, J.-r.: Cross-lingual name and subject access : mechanisms and challenge (2007) 0.01

0.008617336 = product of:
  0.025852006 = sum of:
    0.025852006 = product of:
      0.05170401 = sum of:
        0.05170401 = weight(_text_:22 in 255) [ClassicSimilarity], result of:
          0.05170401 = score(doc=255,freq=2.0), product of:
            0.16704528 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047702286 = queryNorm
            0.30952093 = fieldWeight in 255, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=255)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 10. 9.2000 17:38:22

Li, K.W.; Yang, C.C.: Conceptual analysis of parallel corpus collected from the Web (2006) 0.01
```
0.008605786 = product of:
  0.025817355 = sum of:
    0.025817355 = product of:
      0.07745206 = sum of:
        0.07745206 = weight(_text_:authors in 5051) [ClassicSimilarity], result of:
          0.07745206 = score(doc=5051,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.35615736 = fieldWeight in 5051, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5051)
      0.33333334 = coord(1/3)
  0.33333334 = coord(1/3)
```

Abstract: As illustrated by the World Wide Web, the volume of information in languages other than English has grown significantly in recent years. This highlights the importance of multilingual corpora. Much effort has been devoted to the compilation of multilingual corpora for the purpose of cross-lingual information retrieval and machine translation. Existing parallel corpora mostly involve European languages, such as English-French and English-Spanish. There is still a lack of parallel corpora between European languages and Asian languages. In the authors' previous work, an alignment method to identify one-to-one Chinese and English title pairs was developed to construct an English-Chinese parallel corpus that works automatically from the World Wide Web, and 100% precision and 87% recall were obtained. Careful analysis of these results has helped the authors to understand how the alignment method can be improved. A conceptual analysis was conducted, which includes the analysis of conceptual equivalent and conceptual information alternation in the aligned and nonaligned English-Chinese title pairs that are obtained by the alignment method. The result of the analysis not only reflects the characteristics of parallel corpora, but also gives insight into the strengths and weaknesses of the alignment method. In particular, conceptual alternation, such as omission and addition, is found to have a significant impact on the performance of the alignment method.

Tsuji, K.; Kageura, K.: Automatic generation of Japanese-English bilingual thesauri based on bilingual corpora (2006) 0.01
```
0.008605786 = product of:
  0.025817355 = sum of:
    0.025817355 = product of:
      0.07745206 = sum of:
        0.07745206 = weight(_text_:authors in 5061) [ClassicSimilarity], result of:
          0.07745206 = score(doc=5061,freq=4.0), product of:
            0.21746585 = queryWeight, product of:
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.047702286 = queryNorm
            0.35615736 = fieldWeight in 5061, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.558814 = idf(docFreq=1258, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5061)
      0.33333334 = coord(1/3)
  0.33333334 = coord(1/3)
```

Abstract: The authors propose a method for automatically generating Japanese-English bilingual thesauri based on bilingual corpora. The term bilingual thesaurus refers to a set of bilingual equivalent words and their synonyms. Most of the methods proposed so far for extracting bilingual equivalent word clusters from bilingual corpora depend heavily on word frequency and are not effective for dealing with low-frequency clusters. These low-frequency bilingual clusters are worth extracting because they contain many newly coined terms that are in demand but are not listed in existing bilingual thesauri. Assuming that single language-pair-independent methods such as frequency-based ones have reached their limitations and that a language-pair-dependent method used in combination with other methods shows promise, the authors propose the following approach: (a) Extract translation pairs based on transliteration patterns; (b) remove the pairs from among the candidate words; (c) extract translation pairs based on word frequency from the remaining candidate words; and (d) generate bilingual clusters based on the extracted pairs using a graph-theoretic method. The proposed method has been found to be significantly more effective than other methods.
