Search (40 results, page 2 of 2)

  • × theme_ss:"Multilinguale Probleme"
  • × type_ss:"a"
  • × year_i:[2010 TO 2020}
  1. Ma, X.; Carranza, E.J.M.; Wu, C.; Meer, F.D. van der; Liu, G.: ¬A SKOS-based multilingual thesaurus of geological time scale for interoperability of online geological maps (2011) 0.01
    0.0060736625 = product of:
      0.018220987 = sum of:
        0.008743925 = weight(_text_:in in 4800) [ClassicSimilarity], result of:
          0.008743925 = score(doc=4800,freq=12.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.14725187 = fieldWeight in 4800, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=4800)
        0.009477063 = weight(_text_:und in 4800) [ClassicSimilarity], result of:
          0.009477063 = score(doc=4800,freq=2.0), product of:
            0.09675359 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.043654136 = queryNorm
            0.09795051 = fieldWeight in 4800, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.03125 = fieldNorm(doc=4800)
      0.33333334 = coord(2/6)
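
    The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which are consistent with the figures shown; the function names `idf` and `term_score` are illustrative, not part of any library API.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed classic formula; it matches the idf values printed in the tree.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain output.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain tree for doc 4800.
QUERY_NORM = 0.043654136
MAX_DOCS = 44218
FIELD_NORM = 0.03125

score_in = term_score(12.0, 30841, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_und = term_score(2.0, 13101, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/6): two of the six query clauses matched this document.
total = (score_in + score_und) * (2 / 6)
print(score_in, score_und, total)
```

    The same calculation applies to the other records; only freq, fieldNorm, and the coord factor change.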
    
    Abstract
    The usefulness of online geological maps is hindered by linguistic barriers. Multilingual geoscience thesauri alleviate linguistic barriers of geological maps. However, the benefits of multilingual geoscience thesauri for online geological maps are less studied. In this regard, we developed a multilingual thesaurus of geological time scale (GTS) to alleviate linguistic barriers of GTS records among online geological maps. We extended the Simple Knowledge Organization System (SKOS) model to represent the ordinal hierarchical structure of GTS terms. We collected GTS terms in seven languages and encoded them into a thesaurus by using the extended SKOS model. We implemented methods of characteristic-oriented term retrieval in JavaScript programs for accessing Web Map Services (WMS), recognizing GTS terms, and making translations. With the developed thesaurus and programs, we set up a pilot system to test recognitions and translations of GTS terms in online geological maps. Results of this pilot system proved the accuracy of the developed thesaurus and the functionality of the developed programs. Therefore, with proper deployments, SKOS-based multilingual geoscience thesauri can be functional for alleviating linguistic barriers among online geological maps and, thus, improving their interoperability.
    Content
    Article Outline 1. Introduction 2. SKOS-based multilingual thesaurus of geological time scale 2.1. Addressing the insufficiency of SKOS in the context of the Semantic Web 2.2. Addressing semantics and syntax/lexicon in multilingual GTS terms 2.3. Extending SKOS model to capture GTS structure 2.4. Summary of building the SKOS-based MLTGTS 3. Recognizing and translating GTS terms retrieved from WMS 4. Pilot system, results, and evaluation 5. Discussion 6. Conclusions Cf.: http://www.sciencedirect.com/science?_ob=MiamiImageURL&_cid=271720&_user=3865853&_pii=S0098300411000744&_check=y&_origin=&_coverDate=31-Oct-2011&view=c&wchp=dGLbVlt-zSkzS&_valck=1&md5=e2c1daf53df72d034d22278212578f42&ie=/sdarticle.pdf.
    Theme
    Konzeption und Anwendung des Prinzips Thesaurus
  2. Strobel, S.: Englischsprachige Erweiterung des TIB / AV-Portals : Ein GND/DBpedia-Mapping zur Gewinnung eines englischen Begriffssystems (2014) 0.01
    0.0060522384 = product of:
      0.018156715 = sum of:
        0.006310384 = weight(_text_:in in 2876) [ClassicSimilarity], result of:
          0.006310384 = score(doc=2876,freq=4.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.10626988 = fieldWeight in 2876, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2876)
        0.01184633 = weight(_text_:und in 2876) [ClassicSimilarity], result of:
          0.01184633 = score(doc=2876,freq=2.0), product of:
            0.09675359 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.043654136 = queryNorm
            0.12243814 = fieldWeight in 2876, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2876)
      0.33333334 = coord(2/6)
    
    Abstract
    The videos of the TIB / AV-Portal are automatically indexed with a total of 63,356 GND subject terms from science and technology. Besides German-language videos, the TIB / AV-Portal also holds numerous English-language videos. The GND contains very few English labels for the subject terms used in the TIB / AV-Portal knowledge base. An English indexing vocabulary with which the English-language videos could be indexed automatically is therefore missing. The solution to this problem is as follows: the English labels are to be obtained by mapping the GND subject terms to other datasets that contain an English translation of the terms. The mapping strategies used draw on DBpedia, LCSH, MACS results, and the WTI thesaurus. In the end, 35,025 GND subject terms were assigned (at least) one English label. These English labels can be used directly for the automatic indexing of the English-language videos. A further 11,694 GND subject terms could not be "translated" into English, but could at least be associated with a broader term that has an English translation. This association serves to expand the search results.
    Content
    Contribution as the elaborated version of a talk given at the 103rd Deutscher Bibliothekartag in Bremen. Cf.: https://www.o-bib.de/article/view/2014H1S197-204.
  3. Flores, F.N.; Moreira, V.P.: Assessing the impact of stemming accuracy on information retrieval : a multilingual perspective (2016) 0.00
    0.0023611297 = product of:
      0.014166778 = sum of:
        0.014166778 = weight(_text_:in in 3187) [ClassicSimilarity], result of:
          0.014166778 = score(doc=3187,freq=14.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.23857531 = fieldWeight in 3187, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=3187)
      0.16666667 = coord(1/6)
    
    Abstract
    The quality of stemming algorithms is typically measured in two different ways: (i) how accurately they map the variant forms of a word to the same stem; or (ii) how much improvement they bring to Information Retrieval systems. In this article, we evaluate various stemming algorithms, in four languages, in terms of accuracy and in terms of their aid to Information Retrieval. The aim is to assess whether the most accurate stemmers are also the ones that bring the biggest gain in Information Retrieval. Experiments in English, French, Portuguese, and Spanish show that this is not always the case, as stemmers with higher error rates yield better retrieval quality. As a byproduct, we also identified the most accurate stemmers and the best for Information Retrieval purposes.
  4. Luo, M.M.; Nahl, D.: Let's Google : uncertainty and bilingual search (2019) 0.00
    0.0023611297 = product of:
      0.014166778 = sum of:
        0.014166778 = weight(_text_:in in 5363) [ClassicSimilarity], result of:
          0.014166778 = score(doc=5363,freq=14.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.23857531 = fieldWeight in 5363, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=5363)
      0.16666667 = coord(1/6)
    
    Abstract
    This study applies Kuhlthau's Information Search Process stage (ISP) model to understand bilingual users' Internet search experience. We conducted a quasi-field experiment with 30 bilingual searchers, and the results suggested that the ISP model was applicable in studying searchers' information retrieval behavior in simple tasks. However, searchers' emotional responses differed from those of the ISP model for a complex task. By testing searchers using different search strategies, the results suggested that search engines with multilanguage search functions provide an advantage for bilingual searchers in the Internet's multilingual environment. The findings showed that when searchers used a search engine as a tool for problem solving, they might experience different feelings in each ISP stage than in searching for information for a term paper using a library. The results echo other research findings that indicate that information seeking is a multifaceted phenomenon.
  5. Carrasco, L.; Vidotti, S.: Handling multilinguality in heterogeneous digital cultural heritage systems through CIDOC CRM ontology (2016) 0.00
    0.0020609628 = product of:
      0.012365777 = sum of:
        0.012365777 = weight(_text_:in in 4925) [ClassicSimilarity], result of:
          0.012365777 = score(doc=4925,freq=6.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.2082456 = fieldWeight in 4925, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=4925)
      0.16666667 = coord(1/6)
    
    Series
    Advances in knowledge organization; vol.15
    Source
    Knowledge organization for a sustainable world: challenges and perspectives for cultural, scientific, and technological sharing in a connected society : proceedings of the Fourteenth International ISKO Conference 27-29 September 2016, Rio de Janeiro, Brazil / organized by International Society for Knowledge Organization (ISKO), ISKO-Brazil, São Paulo State University ; edited by José Augusto Chaves Guimarães, Suellen Oliveira Milani, Vera Dodebei
  6. Mitchell, J.S.; Rype, I.; Svanberg, M.: Mixed translations of the DDC : design, usability, and implications for knowledge organization in multilingual environments (2011) 0.00
    0.0019955188 = product of:
      0.011973113 = sum of:
        0.011973113 = weight(_text_:in in 3034) [ClassicSimilarity], result of:
          0.011973113 = score(doc=3034,freq=10.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.20163295 = fieldWeight in 3034, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=3034)
      0.16666667 = coord(1/6)
    
    Abstract
    This paper reports on an ongoing investigation of mixed translation models for the Dewey Decimal Classification (DDC) system to support classification and access. A mixed translation uses DDC classes in the vernacular to form the basic framework of the mixed edition; English-language records are ingested directly to complete hierarchies where needed. Separate indexes of available terminology in the vernacular and English are provided. Specific Norwegian and Swedish mixed models are described, along with testing results of the Norwegian model. General implications of mixed translation models for knowledge organization in multilingual environments are considered.
    Source
    Subject access: preparing for the future. Conference on August 20 - 21, 2009 in Florence, the IFLA Classification and Indexing Section sponsored an IFLA satellite conference entitled "Looking at the Past and Preparing for the Future". Eds.: P. Landry et al
  7. Vilares, J.; Alonso, M.A.; Doval, Y.; Vilares, M.: Studying the effect and treatment of misspelled queries in Cross-Language Information Retrieval (2016) 0.00
    0.0019955188 = product of:
      0.011973113 = sum of:
        0.011973113 = weight(_text_:in in 2974) [ClassicSimilarity], result of:
          0.011973113 = score(doc=2974,freq=10.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.20163295 = fieldWeight in 2974, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=2974)
      0.16666667 = coord(1/6)
    
    Abstract
    General graph random walk has been successfully applied in multi-document summarization, but processing documents in this way has some limitations. In this paper, we propose a novel hypergraph-based vertex-reinforced random walk framework for multi-document summarization. The framework first exploits the Hierarchical Dirichlet Process (HDP) topic model to learn a word-topic probability distribution in sentences. Then the hypergraph is used to capture both the cluster relationship based on the word-topic probability distribution and the pairwise similarity among sentences. Finally, a time-variant random walk algorithm for hypergraphs is developed to rank sentences, which ensures sentence diversity by vertex-reinforcement in summaries. Experimental results on the publicly available dataset demonstrate the effectiveness of our framework.
  8. He, D.; Wu, D.: Enhancing query translation with relevance feedback in translingual information retrieval : a study of the medication process (2011) 0.00
    0.001821651 = product of:
      0.010929906 = sum of:
        0.010929906 = weight(_text_:in in 4244) [ClassicSimilarity], result of:
          0.010929906 = score(doc=4244,freq=12.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.18406484 = fieldWeight in 4244, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4244)
      0.16666667 = coord(1/6)
    
    Abstract
    As an effective technique for improving retrieval effectiveness, relevance feedback (RF) has been widely studied in both monolingual and translingual information retrieval (TLIR). The studies of RF in TLIR have been focused on query expansion (QE), in which queries are reformulated before and/or after they are translated. However, RF in TLIR actually not only can help select better query terms, but also can enhance query translation by adjusting translation probabilities and even resolving some out-of-vocabulary terms. In this paper, we propose a novel relevance feedback method called translation enhancement (TE), which uses the extracted translation relationships from relevant documents to revise the translation probabilities of query terms and to identify extra available translation alternatives so that the translated queries are more tuned to the current search. We studied TE using pseudo-relevance feedback (PRF) and interactive relevance feedback (IRF). Our results show that TE can significantly improve TLIR with both types of relevance feedback methods, and that the improvement is comparable to that of query expansion. More importantly, the effects of translation enhancement and query expansion are complementary. Their integration can produce further improvement, and makes TLIR more robust for a variety of queries.
  9. Ye, Z.; Huang, J.X.; He, B.; Lin, H.: Mining a multilingual association dictionary from Wikipedia for cross-language information retrieval (2012) 0.00
    0.001821651 = product of:
      0.010929906 = sum of:
        0.010929906 = weight(_text_:in in 513) [ClassicSimilarity], result of:
          0.010929906 = score(doc=513,freq=12.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.18406484 = fieldWeight in 513, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=513)
      0.16666667 = coord(1/6)
    
    Abstract
    Wikipedia is characterized by its dense link structure and a large number of articles in different languages, which make it a notable Web corpus for knowledge extraction and mining, in particular for mining the multilingual associations. In this paper, motivated by a psychological theory of word meaning, we propose a graph-based approach to constructing a cross-language association dictionary (CLAD) from Wikipedia, which can be used in a variety of cross-language accessing and processing applications. In order to evaluate the quality of the mined CLAD, and to demonstrate how the mined CLAD can be used in practice, we explore two different applications of the mined CLAD to cross-language information retrieval (CLIR). First, we use the mined CLAD to conduct cross-language query expansion; and, second, we use it to filter out translation candidates with low translation probabilities. Experimental results on a variety of standard CLIR test collections show that the CLIR retrieval performance can be substantially improved with the above two applications of CLAD, which indicates that the mined CLAD is of sound quality.
  10. Strobel, S.; Marín-Arraiza, P.: Metadata for scientific audiovisual media : current practices and perspectives of the TIB / AV-portal (2015) 0.00
    0.001821651 = product of:
      0.010929906 = sum of:
        0.010929906 = weight(_text_:in in 3667) [ClassicSimilarity], result of:
          0.010929906 = score(doc=3667,freq=12.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.18406484 = fieldWeight in 3667, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3667)
      0.16666667 = coord(1/6)
    
    Abstract
    Descriptive metadata play a key role in finding relevant search results in large amounts of unstructured data. However, current scientific audiovisual media are provided with little metadata, which makes them hard to find, let alone individual sequences. In this paper, the TIB / AV-Portal is presented as a use case where methods concerning the automatic generation of metadata, a semantic search and cross-lingual retrieval (German/English) have already been applied. These methods result in a better discoverability of the scientific audiovisual media hosted in the portal. Text, speech, and image content of the video are automatically indexed by specialised GND (Gemeinsame Normdatei) subject headings. A semantic search is established based on properties of the GND ontology. The cross-lingual retrieval uses English 'translations' that were derived by an ontology mapping (DBpedia i. a.). Further ways of increasing the discoverability and reuse of the metadata are publishing them as Linked Open Data and interlinking them with other data sets.
    Series
    Communications in computer and information science; 544
  11. Vassilakaki, E.; Garoufallou, E.; Johnson, F.; Hartley, R.J.: ¬An exploration of users' needs for multilingual information retrieval and access (2015) 0.00
    0.0017848461 = product of:
      0.010709076 = sum of:
        0.010709076 = weight(_text_:in in 2394) [ClassicSimilarity], result of:
          0.010709076 = score(doc=2394,freq=8.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.18034597 = fieldWeight in 2394, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=2394)
      0.16666667 = coord(1/6)
    
    Abstract
    The need for promoting Multilingual Information Retrieval (MLIR) and Access (MLIA) has become evident, now more than ever, given the increase of the online information produced daily in languages other than English. This study aims to explore users' information needs when searching for information across languages. Specifically, a questionnaire was employed to shed light on Library and Information Science (LIS) undergraduate students' use of search engines, databases, and digital libraries when searching, as well as their needs for multilingual access. This study contributes to informing the design of MLIR systems by focusing on the reasons and situations under which users would search and use information in multiple languages.
    Series
    Communications in computer and information science; 544
  12. Olvera-Lobo, M.-D.; García-Santiago, L.: Analysis of errors in the automatic translation of questions for translingual QA systems (2010) 0.00
    0.0016629322 = product of:
      0.009977593 = sum of:
        0.009977593 = weight(_text_:in in 3956) [ClassicSimilarity], result of:
          0.009977593 = score(doc=3956,freq=10.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.16802745 = fieldWeight in 3956, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3956)
      0.16666667 = coord(1/6)
    
    Abstract
    Purpose - This study aims to focus on the evaluation of systems for the automatic translation of questions destined to translingual question-answer (QA) systems. The efficacy of online translators when performing as tools in QA systems is analysed using a collection of documents in the Spanish language. Design/methodology/approach - Automatic translation is evaluated in terms of the functionality of actual translations produced by three online translators (Google Translator, Promt Translator, and Worldlingo) by means of objective and subjective evaluation measures, and the typology of errors produced was identified. For this purpose, a comparative study of the quality of the translation of factual questions of the CLEF collection of queries was carried out, from German and French to Spanish. Findings - It was observed that the rates of error for the three systems evaluated here are greater in the translations pertaining to the language pair German-Spanish. Promt was identified as the most reliable translator of the three (on average) for the two linguistic combinations evaluated. However, for the Spanish-German pair, a good assessment of the Google online translator was obtained as well. Most errors (46.38 percent) tended to be of a lexical nature, followed by those due to a poor translation of the interrogative particle of the query (31.16 percent). Originality/value - The evaluation methodology applied focuses above all on the finality of the translation. That is, does the resulting question serve as effective input into a translingual QA system? Thus, instead of searching for "perfection", the functionality of the question and its capacity to lead one to an adequate response are appraised. The results obtained contribute to the development of improved translingual QA systems.
  13. Yu, L.-C.; Wu, C.-H.; Chang, R.-Y.; Liu, C.-H.; Hovy, E.H.: Annotation and verification of sense pools in OntoNotes (2010) 0.00
    0.0016629322 = product of:
      0.009977593 = sum of:
        0.009977593 = weight(_text_:in in 4236) [ClassicSimilarity], result of:
          0.009977593 = score(doc=4236,freq=10.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.16802745 = fieldWeight in 4236, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4236)
      0.16666667 = coord(1/6)
    
    Abstract
    The paper describes the OntoNotes, a multilingual (English, Chinese and Arabic) corpus with large-scale semantic annotations, including predicate-argument structure, word senses, ontology linking, and coreference. The underlying semantic model of OntoNotes involves word senses that are grouped into so-called sense pools, i.e., sets of near-synonymous senses of words. Such information is useful for many applications, including query expansion for information retrieval (IR) systems, (near-)duplicate detection for text summarization systems, and alternative word selection for writing support systems. Although a sense pool provides a set of near-synonymous senses of words, there is still no knowledge about whether two words in a pool are interchangeable in practical use. Therefore, this paper devises an unsupervised algorithm that incorporates Google n-grams and a statistical test to determine whether a word in a pool can be substituted by other words in the same pool. The n-gram features are used to measure the degree of context mismatch for a substitution. The statistical test is then applied to determine whether the substitution is adequate based on the degree of mismatch. The proposed method is compared with a supervised method, namely Linear Discriminant Analysis (LDA). Experimental results show that the proposed unsupervised method can achieve comparable performance with the supervised method.
  14. Tsai, M.-F.; Chen, H.-H.; Wang, Y.-T.: Learning a merge model for multilingual information retrieval (2011) 0.00
    0.0016629322 = product of:
      0.009977593 = sum of:
        0.009977593 = weight(_text_:in in 2750) [ClassicSimilarity], result of:
          0.009977593 = score(doc=2750,freq=10.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.16802745 = fieldWeight in 2750, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2750)
      0.16666667 = coord(1/6)
    
    Abstract
    This paper proposes a learning approach for the merging process in multilingual information retrieval (MLIR). To conduct the learning approach, we present a number of features that may influence the MLIR merging process. These features are mainly extracted from three levels: query, document, and translation. After the feature extraction, we then use the FRank ranking algorithm to construct a merge model. To the best of our knowledge, this practice is the first attempt to use a learning-based ranking algorithm to construct a merge model for MLIR merging. In our experiments, three test collections for the task of crosslingual information retrieval (CLIR) in NTCIR3, 4, and 5 are employed to assess the performance of our proposed method. Moreover, several merging methods are also carried out for a comparison, including traditional merging methods, the 2-step merging strategy, and the merging method based on logistic regression. The experimental results show that our proposed method can significantly improve merging quality on two different types of datasets. In addition to the effectiveness, through the merge model generated by FRank, our method can further identify key factors that influence the merging process. This information might provide us more insight and understanding into MLIR merging.
    Content
    Contribution to a special topic section "Managing and Mining Multilingual Documents". Cf.: 10.1016/j.ipm.2009.12.002.
  15. Ménard, E.: Ordinary image retrieval in a multilingual context : a comparison of two indexing vocabularies (2010) 0.00
    0.0015740865 = product of:
      0.009444519 = sum of:
        0.009444519 = weight(_text_:in in 3946) [ClassicSimilarity], result of:
          0.009444519 = score(doc=3946,freq=14.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.15905021 = fieldWeight in 3946, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=3946)
      0.16666667 = coord(1/6)
    
    Abstract
    Purpose - This paper seeks to examine image retrieval within two different contexts: a monolingual context where the language of the query is the same as the indexing language and a multilingual context where the language of the query is different from the indexing language. The study also aims to compare two different approaches for the indexing of ordinary images representing common objects: traditional image indexing with the use of a controlled vocabulary and free image indexing using uncontrolled vocabulary. Design/methodology/approach - This research uses three data collection methods. An analysis of the indexing terms was employed in order to examine the multiplicity of term types assigned to images. A simulation of the retrieval process involving a set of 30 images was performed with 60 participants. The quantification of the retrieval performance of each indexing approach was based on the usability measures, that is, effectiveness, efficiency and satisfaction of the user. Finally, a questionnaire was used to gather information on searcher satisfaction during and after the retrieval process. Findings - The results of this research are twofold. The analysis of indexing terms associated with all the 3,950 images provides a comprehensive description of the characteristics of the four non-combined indexing forms used for the study. Also, the retrieval simulation results offer information about the relative performance of the six indexing forms (combined and non-combined) in terms of their effectiveness, efficiency (temporal and human) and the image searcher's satisfaction. Originality/value - The findings of the study suggest that, in the near future, information systems could benefit from allowing an increased coexistence of controlled vocabularies and uncontrolled vocabularies, resulting from collaborative image tagging, for example, and giving the users the possibility to dynamically participate in the image-indexing process, in a more user-centred way.
    Footnote
    Contribution to a special issue: Content architecture: exploiting and managing diverse resources: proceedings of the first national conference of the United Kingdom chapter of the International Society for Knowledge Organization (ISKO)
  16. Rettinger, A.; Schumilin, A.; Thoma, S.; Ell, B.: Learning a cross-lingual semantic representation of relations expressed in text (2015) 0.00
    0.0014873719 = product of:
      0.008924231 = sum of:
        0.008924231 = weight(_text_:in in 2027) [ClassicSimilarity], result of:
          0.008924231 = score(doc=2027,freq=2.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.15028831 = fieldWeight in 2027, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.078125 = fieldNorm(doc=2027)
      0.16666667 = coord(1/6)
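
    The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and a single matching query term; the function name and parameter names are illustrative and not part of the search system.

```python
import math

def classic_similarity_score(freq, doc_freq, max_docs, query_norm,
                             field_norm, matched_clauses, total_clauses):
    """Recompute a one-term ClassicSimilarity score from explain-tree inputs."""
    tf = math.sqrt(freq)                               # tf(freq=2.0) -> 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight
    coord = matched_clauses / total_clauses            # coord(1/6)
    return query_weight * field_weight * coord

# Values taken from the explain tree for doc 2027 (term "in").
score = classic_similarity_score(
    freq=2.0, doc_freq=30841, max_docs=44218,
    query_norm=0.043654136, field_norm=0.078125,
    matched_clauses=1, total_clauses=6,
)
print(f"{score:.10f}")
```

    The result agrees with the listed 0.0014873719 up to single-precision rounding, since Lucene computes these factors as 32-bit floats.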
    
  17. Freire, N.; Charles, V.; Isaac, A.: Subject information and multilingualism in European bibliographic datasets : experiences with Universal Decimal Classification (2015) 0.00
    0.0014873719 = product of:
      0.008924231 = sum of:
        0.008924231 = weight(_text_:in in 2289) [ClassicSimilarity], result of:
          0.008924231 = score(doc=2289,freq=2.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.15028831 = fieldWeight in 2289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.078125 = fieldNorm(doc=2289)
      0.16666667 = coord(1/6)
    
  18. Niininen, S.; Nykyri, S.; Suominen, O.: ¬The future of metadata : open, linked, and multilingual - the YSO case (2017) 0.00
    0.0014873719 = product of:
      0.008924231 = sum of:
        0.008924231 = weight(_text_:in in 3707) [ClassicSimilarity], result of:
          0.008924231 = score(doc=3707,freq=8.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.15028831 = fieldWeight in 3707, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3707)
      0.16666667 = coord(1/6)
    
    Abstract
    Purpose - The purpose of this paper is threefold: to focus on the process of multilingual concept scheme construction and the challenges involved; to address concrete challenges faced in the construction process, especially those related to equivalence between terms and concepts; and to briefly outline the translation strategies developed during the process of concept scheme construction. Design/methodology/approach - The analysis is based on experience acquired during the establishment of the Finnish thesaurus and ontology service Finto as well as the trilingual General Finnish Ontology YSO, both of which are being maintained and further developed at the National Library of Finland. Findings - Although uniform resource identifiers can be considered language-independent, they do not render concept schemes and their construction free of language-related challenges. The fundamental issue with all the challenges faced is how to maintain consistency and predictability when the nature of language requires each concept to be treated individually. The key to such challenges is to recognise the function of the vocabulary and the needs of its intended users. Social implications - Open science increases the transparency of not only research products, but also metadata tools. Gaining a deeper understanding of the challenges involved in their construction is important for a great variety of users - e.g. indexers, vocabulary builders and information seekers. Today, multilingualism is an essential aspect at both the national and international information society level. Originality/value - This paper draws on the practical challenges faced in concept scheme construction in a trilingual environment, with a focus on "concept scheme" as a translation and mapping unit.
  19. Luca, E.W. de: Extending the linked data cloud with multilingual lexical linked data (2013) 0.00
    0.0012881019 = product of:
      0.007728611 = sum of:
        0.007728611 = weight(_text_:in in 1073) [ClassicSimilarity], result of:
          0.007728611 = score(doc=1073,freq=6.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.1301535 = fieldWeight in 1073, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1073)
      0.16666667 = coord(1/6)
    
    Abstract
    A lot of information that is already available on the Web, or retrieved from local information systems and social networks, is structured in data silos that are not semantically related. Semantic technologies make it apparent that the use of typed links that directly express their relations is an advantage for every application that can reuse the incorporated knowledge about the data. For this reason, data integration, through reengineering (e.g., triplify) or querying (e.g., D2R), is an important task in order to make information available for everyone. Thus, in order to build a semantic map of the data, we need knowledge about the data items themselves and the relations between heterogeneous data items. Here we present our work of providing Lexical Linked Data (LLD) through a meta-model that contains all the resources and gives the possibility to retrieve and navigate them from different perspectives. After giving the definition of Lexical Linked Data, we describe the existing datasets we collected and the new datasets we included. Here we describe their format and show some use cases where we link lexical data, and show how to reuse and infer semantic data derived from lexical data. Different lexical resources (MultiWordNet, EuroWordNet, MEMODATA Lexicon, the Hamburg Metaphor Database) are connected to each other towards an Integrated Vocabulary for LLD that we evaluate and present.
  20. Kim, S.; Ko, Y.; Oard, D.W.: Combining lexical and statistical translation evidence for cross-language information retrieval (2015) 0.00
    8.9242304E-4 = product of:
      0.005354538 = sum of:
        0.005354538 = weight(_text_:in in 1606) [ClassicSimilarity], result of:
          0.005354538 = score(doc=1606,freq=2.0), product of:
            0.059380736 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.043654136 = queryNorm
            0.09017298 = fieldWeight in 1606, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=1606)
      0.16666667 = coord(1/6)
    
    Abstract
    This article explores how best to use lexical and statistical translation evidence together for cross-language information retrieval (CLIR). Lexical translation evidence is assembled from Wikipedia and from a large machine-readable dictionary, statistical translation evidence is drawn from parallel corpora, and evidence from co-occurrence in the document language provides a basis for limiting the adverse effect of translation ambiguity. Coverage statistics for NII Testbeds and Community for Information Access Research (NTCIR) queries confirm that these resources have complementary strengths. Experiments with translation evidence from a small parallel corpus indicate that even rather rough estimates of translation probabilities can yield further improvements over a strong technique for translation weighting based on using Jensen-Shannon divergence as a term-association measure. Finally, a novel approach to posttranslation query expansion using a random walk over the Wikipedia concept link graph is shown to yield further improvements over alternative techniques for posttranslation query expansion. Evaluation results on the NTCIR-5 English-Korean test collection show statistically significant improvements over strong baselines.

Languages

  • e 30
  • d 10