Search (61 results, page 1 of 4)

  • theme_ss:"Multilinguale Probleme"
  1. De Luca, E.W.; Dahlberg, I.: Including knowledge domains from the ICC into the multilingual lexical linked data cloud (2014) 0.04
    0.040833745 = product of:
      0.08166749 = sum of:
        0.063353375 = weight(_text_:data in 1493) [ClassicSimilarity], result of:
          0.063353375 = score(doc=1493,freq=18.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.52404076 = fieldWeight in 1493, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1493)
        0.018314114 = product of:
          0.036628228 = sum of:
            0.036628228 = weight(_text_:22 in 1493) [ClassicSimilarity], result of:
              0.036628228 = score(doc=1493,freq=4.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.27358043 = fieldWeight in 1493, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1493)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
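The indented block above is Lucene's ClassicSimilarity "explain" output. As a cross-check, the factors it lists recombine exactly as shown: tf is sqrt(freq), queryWeight is idf · queryNorm, fieldWeight is tf · idf · fieldNorm, and each coord(m/n) scales by the fraction of query clauses matched. A few lines of Python (all constants copied verbatim from the tree, nothing else assumed) reproduce both term weights and the final document score:

```python
import math

# Constants copied verbatim from the explain tree for result 1 (doc 1493)
IDF_DATA = 3.1620505      # idf(docFreq=5088, maxDocs=44218) for "data"
IDF_22 = 3.5018296        # idf(docFreq=3622, maxDocs=44218) for "22"
QUERY_NORM = 0.03823278
FIELD_NORM = 0.0390625    # field length normalisation

def term_score(freq, idf):
    """ClassicSimilarity per-term score: queryWeight * fieldWeight."""
    query_weight = idf * QUERY_NORM                    # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight

s_data = term_score(18.0, IDF_DATA)   # 0.063353375 in the tree
s_22 = term_score(4.0, IDF_22) * 0.5  # inner coord(1/2)
total = (s_data + s_22) * 0.5         # outer coord(2/4)
print(total)                          # ~ 0.040833745, the listed score
```

The same arithmetic, with each entry's own constants, accounts for every score tree in this result list.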
    
    Abstract
     A lot of information that is already available on the Web, or retrieved from local information systems and social networks, is structured in data silos that are not semantically related. Semantic technologies make it apparent that typed links which directly express their relations are an advantage for every application that can reuse the incorporated knowledge about the data. For this reason, data integration, through reengineering (e.g. triplify) or querying (e.g. D2R), is an important task in order to make information available for everyone. Thus, in order to build a semantic map of the data, we need knowledge about the data items themselves and the relations between heterogeneous data items. In this paper, we present our work of providing Lexical Linked Data (LLD) through a meta-model that contains all the resources and gives the possibility to retrieve and navigate them from different perspectives. We combine the existing work done on knowledge domains (based on the Information Coding Classification) within the Multilingual Lexical Linked Data Cloud (based on the RDF/OWL EuroWordNet and the related integrated lexical resources: MultiWordNet, EuroWordNet, MEMODATA Lexicon, Hamburg Metaphor DB).
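The abstract's central idea, typed links that make the relations between heterogeneous data items explicit and navigable, can be sketched with plain RDF-style triples. The resource names and link types below are illustrative assumptions, not identifiers from the actual LLD cloud:

```python
# Each relation is an explicit, typed link (subject, predicate, object),
# instead of knowledge sitting in unrelated silos.
triples = {
    ("ewn:apple-n-1", "lld:sameConceptAs", "mwn:mela-n-1"),
    ("ewn:apple-n-1", "lld:hasKnowledgeDomain", "icc:AgricultureAndFood"),
    ("mwn:mela-n-1", "lld:lexicalisedIn", "lang:it"),
}

def linked_to(resource):
    """Navigate the outgoing typed links of one resource."""
    return {(p, o) for s, p, o in triples if s == resource}

print(sorted(linked_to("ewn:apple-n-1")))
```

Any application that understands the link types can now follow them to reuse knowledge from another resource, which is the advantage the abstract describes.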
    Date
    22. 9.2014 19:01:18
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  2. Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2012) 0.04
    0.03632982 = product of:
      0.07265964 = sum of:
        0.0506827 = weight(_text_:data in 1967) [ClassicSimilarity], result of:
          0.0506827 = score(doc=1967,freq=8.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.4192326 = fieldWeight in 1967, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
        0.021976938 = product of:
          0.043953877 = sum of:
            0.043953877 = weight(_text_:22 in 1967) [ClassicSimilarity], result of:
              0.043953877 = score(doc=1967,freq=4.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.32829654 = fieldWeight in 1967, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1967)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     This paper reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The paper discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the DDC (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
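The FRSAD-based modelling the abstract describes rests on the distinction between a thema (the subject entity itself, here a DDC class) and its nomens (the appellations by which it is known in particular languages and schemes). A minimal sketch, with captions that are illustrative rather than quoted from DDC 22:

```python
from dataclasses import dataclass, field

@dataclass
class Thema:
    """One abstract subject (a DDC class), independent of language."""
    notation: str
    # (language, scheme) -> appellation (a nomen of this thema)
    nomens: dict = field(default_factory=dict)

dogs = Thema("636.7")
dogs.nomens[("en", "DDC 22")] = "Dogs"                 # English edition
dogs.nomens[("sv", "DDC 22 mixed")] = "Hundar"         # Swedish-English mixed

def appellations(thema, language):
    """All nomens of a thema in one language: the multilingual access points."""
    return [n for (lang, _), n in thema.nomens.items() if lang == language]

print(appellations(dogs, "sv"))
```

Because every nomen points back to the same thema, a Swedish query and an English query resolve to one class, which is the discovery scenario the case studies examine.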
  3. Zhou, Y. et al.: Analysing entity context in multilingual Wikipedia to support entity-centric retrieval applications (2016) 0.03
    0.03406783 = product of:
      0.06813566 = sum of:
        0.042235587 = weight(_text_:data in 2758) [ClassicSimilarity], result of:
          0.042235587 = score(doc=2758,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.34936053 = fieldWeight in 2758, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.078125 = fieldNorm(doc=2758)
        0.02590007 = product of:
          0.05180014 = sum of:
            0.05180014 = weight(_text_:22 in 2758) [ClassicSimilarity], result of:
              0.05180014 = score(doc=2758,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.38690117 = fieldWeight in 2758, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2758)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    1. 2.2016 18:25:22
    Source
    Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al
  4. Ludwig, L.: Lösung zum multilingualen Wissensmanagement semantischer Informationen (2010) 0.03
    0.031300645 = product of:
      0.06260129 = sum of:
        0.021117793 = weight(_text_:data in 4281) [ClassicSimilarity], result of:
          0.021117793 = score(doc=4281,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.17468026 = fieldWeight in 4281, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4281)
        0.041483495 = product of:
          0.08296699 = sum of:
            0.08296699 = weight(_text_:lexikon in 4281) [ClassicSimilarity], result of:
              0.08296699 = score(doc=4281,freq=2.0), product of:
                0.23962554 = queryWeight, product of:
                  6.2675414 = idf(docFreq=227, maxDocs=44218)
                  0.03823278 = queryNorm
                0.346236 = fieldWeight in 4281, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.2675414 = idf(docFreq=227, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4281)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Content
     "Until a few years ago, shorter physical documents and books were the preferred means of composing and exchanging ideas. Document registers helped with finding, tables of contents supported orientation, and subject indexes, where available, assisted in picking out details. This incremental orientation is increasingly giving way to pure keyword search in electronic document corpora, above all the WWW. Document registers, tables of contents and subject indexes are bypassed by search engines built on word indexes. The search result points directly to a single text excerpt (snippet). Orientation within the document and discovery of the right documents (or document suggestions) now happen, if at all, in reverse order and can accordingly be very laborious. Such a search succeeds immediately, however, when the target document appears perfectly tailored to the keyword, that is, when excerpt, chapter and document literally coincide. The pull of the search engines shatters the traditional sequential structure of documents, and ultimately shatters the document itself into ever smaller, search-engine-friendly units. In this way, indexing by individual words finally dissolves the document itself. What remains is merely a collection of information units ordered according to the index: the lexicon or the catalogue. In electronically supported knowledge management, the wiki now takes the place of the lexicon, and the named wiki entry takes the place of the document."
    Source
    Semantic web & linked data: Elemente zukünftiger Informationsinfrastrukturen ; 1. DGI-Konferenz ; 62. Jahrestagung der DGI ; Frankfurt am Main, 7. - 9. Oktober 2010 ; Proceedings / Deutsche Gesellschaft für Informationswissenschaft und Informationspraxis. Hrsg.: M. Ockenfeld
  5. Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2014) 0.03
    0.03027485 = product of:
      0.0605497 = sum of:
        0.042235587 = weight(_text_:data in 1962) [ClassicSimilarity], result of:
          0.042235587 = score(doc=1962,freq=8.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.34936053 = fieldWeight in 1962, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1962)
        0.018314114 = product of:
          0.036628228 = sum of:
            0.036628228 = weight(_text_:22 in 1962) [ClassicSimilarity], result of:
              0.036628228 = score(doc=1962,freq=4.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.27358043 = fieldWeight in 1962, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1962)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This article reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The article discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the Dewey Decimal Classification [DDC] (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
  6. Luca, E.W. de; Dahlberg, I.: Die Multilingual Lexical Linked Data Cloud : eine mögliche Zugangsoptimierung? (2014) 0.03
    0.029716276 = product of:
      0.05943255 = sum of:
        0.04389251 = weight(_text_:data in 1736) [ClassicSimilarity], result of:
          0.04389251 = score(doc=1736,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.3630661 = fieldWeight in 1736, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=1736)
        0.015540041 = product of:
          0.031080082 = sum of:
            0.031080082 = weight(_text_:22 in 1736) [ClassicSimilarity], result of:
              0.031080082 = score(doc=1736,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.23214069 = fieldWeight in 1736, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1736)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     A great deal of information is already available on the Web or can be drawn from isolated structured data stores such as information systems and social networks. Data integration through reengineering or through query mechanisms (e.g. D2R) is therefore important in order to make information generally usable. Semantic technologies enable the use of defined connections (typed links) that record the relations between data, which benefits every application that can reuse the knowledge contained in the data. To build a semantic map of the data, we need knowledge about the individual data items and their relations to other data. This contribution presents our work on using Lexical Linked Data (LLD) through a meta-model that contains all the resources and also makes it possible to find them from different perspectives. We thereby connect existing work on knowledge domains (based on the Information Coding Classification) with the Multilingual Lexical Linked Data Cloud (based on the RDF/OWL representation of EuroWordNet and the similarly integrated lexical resources MultiWordNet, MEMODATA and the Hamburg Metaphor DB).
    Date
    22. 9.2014 19:00:13
  7. Frâncu, V.; Sabo, C.-N.: Implementation of a UDC-based multilingual thesaurus in a library catalogue : the case of BiblioPhil (2010) 0.02
    0.020440696 = product of:
      0.04088139 = sum of:
        0.02534135 = weight(_text_:data in 3697) [ClassicSimilarity], result of:
          0.02534135 = score(doc=3697,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.2096163 = fieldWeight in 3697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3697)
        0.015540041 = product of:
          0.031080082 = sum of:
            0.031080082 = weight(_text_:22 in 3697) [ClassicSimilarity], result of:
              0.031080082 = score(doc=3697,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.23214069 = fieldWeight in 3697, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3697)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In order to enhance the use of Universal Decimal Classification (UDC) numbers in information retrieval, the authors have represented classification with multilingual thesaurus descriptors and implemented this solution in an automated way. The authors illustrate a solution implemented in a BiblioPhil library system. The standard formats used are UNIMARC for subject authority records (i.e. the UDC-based multilingual thesaurus) and MARC XML support for data transfer. The multilingual thesaurus was built according to existing standards, the constituent parts of the classification notations being used as the basis for search terms in the multilingual information retrieval. The verbal equivalents, descriptors and non-descriptors, are used to expand the number of concepts and are given in Romanian, English and French. This approach saves the time of the indexer and provides more user-friendly and easier access to the bibliographic information. The multilingual aspect of the thesaurus enhances information access for a greater number of online users
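The mechanism described, using the constituent parts of a composite UDC notation as the basis for multilingual search terms, can be sketched as follows. The toy authority entries are illustrative, not records from the BiblioPhil thesaurus:

```python
import re

# Tiny stand-ins for UNIMARC subject authority records: each notation part
# carries verbal equivalents in Romanian, English and French.
authority = {
    "811": {"ro": "Limbi", "en": "Languages", "fr": "Langues"},
    "(498)": {"ro": "România", "en": "Romania", "fr": "Roumanie"},
}

def search_terms(notation, language):
    """Split a composite UDC notation into its constituent parts and
    expand each part to its descriptor in the requested language."""
    parts = re.findall(r"\(\d+\)|[\d.]+", notation)
    return [authority[p][language] for p in parts if p in authority]

print(search_terms("811(498)", "en"))
```

A user searching in any of the three languages thus reaches records classed under the raw notation 811(498), which is the access improvement the abstract claims.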
    Date
    22. 7.2010 20:40:56
  8. Luca, E.W. de: Extending the linked data cloud with multilingual lexical linked data (2013) 0.02
    0.019035323 = product of:
      0.07614129 = sum of:
        0.07614129 = weight(_text_:data in 1073) [ClassicSimilarity], result of:
          0.07614129 = score(doc=1073,freq=26.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.6298187 = fieldWeight in 1073, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1073)
      0.25 = coord(1/4)
    
    Abstract
     A lot of information that is already available on the Web, or retrieved from local information systems and social networks, is structured in data silos that are not semantically related. Semantic technologies make it apparent that typed links which directly express their relations are an advantage for every application that can reuse the incorporated knowledge about the data. For this reason, data integration, through reengineering (e.g., triplify) or querying (e.g., D2R), is an important task in order to make information available for everyone. Thus, in order to build a semantic map of the data, we need knowledge about the data items themselves and the relations between heterogeneous data items. Here we present our work of providing Lexical Linked Data (LLD) through a meta-model that contains all the resources and gives the possibility to retrieve and navigate them from different perspectives. After giving the definition of Lexical Linked Data, we describe the existing datasets we collected and the new datasets we included. We describe their format and show some use cases where we link lexical data, and show how to reuse and infer semantic data derived from lexical data. Different lexical resources (MultiWordNet, EuroWordNet, MEMODATA Lexicon, the Hamburg Metaphor Database) are connected to each other towards an Integrated Vocabulary for LLD that we evaluate and present.
  9. Larkey, L.S.; Connell, M.E.: Structured queries, language modelling, and relevance modelling in cross-language information retrieval (2005) 0.02
    0.017033914 = product of:
      0.03406783 = sum of:
        0.021117793 = weight(_text_:data in 1022) [ClassicSimilarity], result of:
          0.021117793 = score(doc=1022,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.17468026 = fieldWeight in 1022, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 1022) [ClassicSimilarity], result of:
              0.02590007 = score(doc=1022,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 1022, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1022)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries--one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.
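The synonym operator at the heart of structured query translation treats all candidate translations of a source term as occurrences of a single term: their frequencies are summed, and document frequency is taken over the union. A minimal sketch of that scoring idea (the toy Spanish corpus and the translation set are assumptions; the actual experiments used INQUERY on Arabic and Spanish collections):

```python
import math

docs = {
    "d1": "el banco cerró y el banco abrió".split(),
    "d2": "un banco del parque".split(),
    "d3": "la orilla del río".split(),
    "d4": "la casa del pueblo".split(),
}
# All translations of the English query term act as one synonym set.
translations = {"bank": {"banco", "orilla"}}

def syn_tf(doc_id, term):
    """Synonym operator: sum the frequencies of every translation."""
    return sum(1 for w in docs[doc_id] if w in translations[term])

def syn_score(doc_id, term):
    """tf-idf where df is computed over the union of the translations."""
    df = sum(1 for d in docs if syn_tf(d, term) > 0)
    idf = math.log(len(docs) / df)
    return syn_tf(doc_id, term) * idf

ranked = sorted(docs, key=lambda d: syn_score(d, "bank"), reverse=True)
print(ranked)
```

Because the shared statistics keep one rare translation from dominating, this treatment is what made the operator "extremely useful" for CLIR; the language-modelling alternative instead weights each translation by an explicit translation probability.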
    Date
    26.12.2007 20:22:11
  10. Gupta, P.; Banchs, R.E.; Rosso, P.: Continuous space models for CLIR (2017) 0.01
    0.012670675 = product of:
      0.0506827 = sum of:
        0.0506827 = weight(_text_:data in 3295) [ClassicSimilarity], result of:
          0.0506827 = score(doc=3295,freq=8.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.4192326 = fieldWeight in 3295, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3295)
      0.25 = coord(1/4)
    
    Abstract
    We present and evaluate a novel technique for learning cross-lingual continuous space models to aid cross-language information retrieval (CLIR). Our model, which is referred to as external-data composition neural network (XCNN), is based on a composition function that is implemented on top of a deep neural network that provides a distributed learning framework. Different from most existing models, which rely only on available parallel data for training, our learning framework provides a natural way to exploit monolingual data and its associated relevance metadata for learning continuous space representations of language. Cross-language extensions of the obtained models can then be trained by using a small set of parallel data. This property is very helpful for resource-poor languages, therefore, we carry out experiments on the English-Hindi language pair. On the conducted comparative evaluation, the proposed model is shown to outperform state-of-the-art continuous space models with statistically significant margin on two different tasks: parallel sentence retrieval and ad-hoc retrieval.
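The retrieval setting the model operates in, where queries and documents from different languages are compared in one shared continuous space, can be sketched as below. The 3-d vectors and the averaging composition are illustrative stand-ins for the learned XCNN composition function, not the actual model:

```python
import math

# Toy shared cross-lingual embedding space (English and Hindi words).
embedding = {
    "river": (0.90, 0.10, 0.00),
    "water": (0.80, 0.20, 0.10),
    "नदी": (0.88, 0.12, 0.02),    # "river" in Hindi
    "संगीत": (0.00, 0.10, 0.90),  # "music" in Hindi
}

def compose(words):
    """Stand-in composition function: average the word vectors."""
    vecs = [embedding[w] for w in words]
    return tuple(sum(xs) / len(vecs) for xs in zip(*vecs))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = compose(["river", "water"])  # English query vector
# Hindi terms are ranked directly against the English query.
print(cosine(query, embedding["नदी"]) > cosine(query, embedding["संगीत"]))
```

No translation step occurs at query time; relatedness across the language pair is read straight off the geometry, which is why a small amount of parallel data suffices to extend a monolingually trained space.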
  11. EuropeanaTech and Multilinguality : Issue 1 of EuropeanaTech Insight (2015) 0.01
    0.011174486 = product of:
      0.044697944 = sum of:
        0.044697944 = weight(_text_:data in 1832) [ClassicSimilarity], result of:
          0.044697944 = score(doc=1832,freq=14.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.36972845 = fieldWeight in 1832, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1832)
      0.25 = coord(1/4)
    
    Abstract
     Welcome to the very first issue of EuropeanaTech Insight, a multimedia publication about research and development within the EuropeanaTech community. EuropeanaTech is a very active community. It spans all of Europe and is made up of technical experts from the various disciplines within digital cultural heritage. At any given moment, members can be found presenting their work in project meetings, seminars and conferences around the world. Now, through EuropeanaTech Insight, we can share that inspiring work with the whole community. In our first three issues, we're showcasing topics discussed at the EuropeanaTech 2015 Conference, an exciting event that gave rise to lots of innovative ideas and fruitful conversations on the themes of data quality, data modelling, open data, data re-use, multilingualism and discovery. Welcome, bienvenue, bienvenido, Välkommen, Tervetuloa to the first issue of EuropeanaTech Insight. Are we talking your language? No? Well, I can guarantee you Europeana is. One of the European Union's great beauties and strengths is its diversity. That diversity is perhaps most evident in the 24 different languages spoken in the EU. Making it possible for all European citizens to easily and seamlessly communicate in their native language with others who do not speak that language is a huge technical undertaking. Translating documents, news, speeches and historical texts was once done exclusively by hand. Clearly, that takes a huge amount of time and resources and means that not everything can be translated. However, with the advances in machine and automatic translation, it is becoming more possible to provide instant and fairly accurate translations. Europeana provides access to over 40 million digitised cultural heritage objects, offering content in over 33 languages. But what value does Europeana provide if people can only find results in their native language? None. That's why the EuropeanaTech community is collectively working towards making it possible for everyone to discover our collections in their native language. In this issue of EuropeanaTech Insight, we hear from community members who are making great strides in machine translation and enrichment tools to help improve not only access to data, but also how we retrieve, browse and understand it.

    Content
    Juliane Stiller, J.: Automatic Solutions to Improve Multilingual Access in Europeana / Vila-Suero, D. and A. Gómez-Pérez: Multilingual Linked Data / Pilos, S.: Automated Translation: Connecting Culture / Karlgren, J.: Big Data, Libraries, and Multilingual New Text / Ziedins, J.: Latvia translates with hugo.lv
  12. Mitchell, J.S.; Rype, I.; Svanberg, M.: Mixed translation models for the Dewey Decimal Classification (DDC) System (2008) 0.01
    0.010973128 = product of:
      0.04389251 = sum of:
        0.04389251 = weight(_text_:data in 2246) [ClassicSimilarity], result of:
          0.04389251 = score(doc=2246,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.3630661 = fieldWeight in 2246, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2246)
      0.25 = coord(1/4)
    
    Content
     This paper explores the feasibility of developing mixed translations of the Dewey Decimal Classification (DDC) system in countries/language groups where English enjoys wide use in academic and social discourse. A mixed translation uses existing DDC data in the vernacular plus additional data from the English-language full edition of the DDC to form a single mixed edition. Two approaches to mixed translations using Norwegian/English and Swedish/English DDC data are described, along with the design of a pilot study to evaluate use of a mixed translation as a classifier's tool.
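The mixed-edition mechanism can be sketched as a simple fallback merge: vernacular captions wherever the partial translation provides them, English full-edition captions everywhere else. The captions below are illustrative, not actual DDC records:

```python
# Partial vernacular (here Norwegian) translation: only some classes exist.
vernacular = {"500": "Naturvitenskap", "510": "Matematikk"}

# English full edition: complete coverage.
english_full = {
    "500": "Natural sciences",
    "510": "Mathematics",
    "512": "Algebra",
    "516": "Geometry",
}

# The mixed edition prefers the vernacular caption and falls back to English.
mixed = {notation: vernacular.get(notation, caption)
         for notation, caption in english_full.items()}

print(mixed["510"], "|", mixed["512"])
```

The classifier thus works in a single edition whose coverage equals the English full edition, which is the feasibility question the pilot study is designed to test.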
  13. Musmann, K.: The diffusion of knowledge across the linguistic frontier : an examination of monographic translations (1989) 0.01
    0.010558897 = product of:
      0.042235587 = sum of:
        0.042235587 = weight(_text_:data in 602) [ClassicSimilarity], result of:
          0.042235587 = score(doc=602,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.34936053 = fieldWeight in 602, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.078125 = fieldNorm(doc=602)
      0.25 = coord(1/4)
    
    Abstract
    Presents a preliminary assessment of the extent and characteristics of the translations of monographs as a form of information transfer and communication between language blocs. The study was based on statistical data provided by Unesco.
  14. Jahns, Y.: Sacherschließung - zeitgemäß und zukunftsfähig (2010) 0.01
    0.010558897 = product of:
      0.042235587 = sum of:
        0.042235587 = weight(_text_:data in 3278) [ClassicSimilarity], result of:
          0.042235587 = score(doc=3278,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.34936053 = fieldWeight in 3278, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.078125 = fieldNorm(doc=3278)
      0.25 = coord(1/4)
    
    Content
    Vortragende: Patrice Landry (MACS); Helga Karg (CrissCross); Armin Kühn (BibScout); Joachim Neubert (Linked data); Dörte Braune-Egloff u. Ester Scheven (RSWK, SWD); Heidrun Wiesenmüller (LCSH); Guido Bee (DDC Deutsch)
  15. Weihs, J.: Three tales of multilingual cataloguing (1998) 0.01
    Date
    2. 8.2001 8:55:22
  16. Strobel, S.; Marín-Arraiza, P.: Metadata for scientific audiovisual media : current practices and perspectives of the TIB / AV-portal (2015) 0.01
    Abstract
    Descriptive metadata play a key role in finding relevant search results in large amounts of unstructured data. However, current scientific audiovisual media are provided with little metadata, which makes them hard to find, let alone individual sequences. In this paper, the TIB / AV-Portal is presented as a use case where methods concerning the automatic generation of metadata, a semantic search and cross-lingual retrieval (German/English) have already been applied. These methods result in a better discoverability of the scientific audiovisual media hosted in the portal. Text, speech, and image content of the video are automatically indexed by specialised GND (Gemeinsame Normdatei) subject headings. A semantic search is established based on properties of the GND ontology. The cross-lingual retrieval uses English 'translations' that were derived by an ontology mapping (DBpedia i. a.). Further ways of increasing the discoverability and reuse of the metadata are publishing them as Linked Open Data and interlinking them with other data sets.
  17. Borgman, C.L.: Multi-media, multi-cultural, and multi-lingual digital libraries : or how do we exchange data In 400 languages? (1997) 0.01
    Abstract
    The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages. Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure, for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard. The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. 
This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, lastly addressing technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.
  18. Salomonsen, A.: ¬The European National Libraries Cooperative Project on CD-ROM : results, experience and perspectives (1993) 0.01
    Abstract
    In 1989 a consortium of the national libraries of Denmark, France, Germany, Italy, the Netherlands, Portugal and the UK agreed to cooperate in investigating the potential of CD-ROMs as a means of distributing and using national bibliographic data. The project, which was divided into 10 manageable sub-projects, was launched in Jan 90. One major result is a draft specification of requirements for a common retrieval interface for bibliographic data, designed to match as closely as possible the needs of four user groups: acquisition librarians, cataloguers, reference librarians and end users. A second is the production of a pilot CD-ROM in UNIMARC, The Explorers, containing records from the national bibliographies of Denmark, Italy, the Netherlands and Portugal. Other major products are MARC to UNIMARC conversion tables and a multilingual interface. Valuable, if sometimes painful, experience was gained during the project.
  19. Hainebach, R.: ¬The EUROCAT project : the integration of European community multidisciplinary and document-oriented databases on CD-ROM; an exercise in merging data from several databases into a single database as well as solving the problem of multilingualism (1993) 0.01
    Abstract
    The Institutions of the European Communities produce document-oriented databases based on publications and documents distributed either by the Office for Official Publications of the European Communities or by the individual EC institutions themselves. These databases are known under the names of ABEL, CATEL, CELEX, CORDIS RTD publications, ECLAS, EPOQUE, EURISTOTE, RAPID and SCAD and are available via hosts such as EUROBASES, ECHO and the Office for Official Publications. Until the establishment of the EUROCAT project, no single database held a comprehensive and complete collection of all European Community documents and publications. Describes the work on integrating and harmonising the data from the databases to produce the multilingual EUROCAT database using MS-DOS based software. The resulting database will be available on CD-ROM
  20. Landry, P.: Providing multilingual subject access through linking of subject heading languages : the MACS approach (2009) 0.01
    Abstract
    The MACS project aims at providing multilingual subject access to library catalogues through the use of concordances between subject headings from LCSH, RAMEAU and SWD. The manual approach, as used by MACS, has been up to now the most reliable method for ensuring accurate multilingual subject access to bibliographic data. The presentation will give an overview on the development of the project and will outline the strategy and methods used by the MACS project. The presentation will also include a demonstration of the search interface developed by The European Library (TEL).
