Search (31 results, page 1 of 2)

  • × year_i:[2010 TO 2020}
  • × theme_ss:"Multilinguale Probleme"
  1. Stiller, J.; Gäde, M.; Petras, V.: Multilingual access to digital libraries : the Europeana use case (2013) 0.06
    0.05502244 = product of:
      0.11004488 = sum of:
        0.10171618 = weight(_text_:digitale in 902) [ClassicSimilarity], result of:
          0.10171618 = score(doc=902,freq=4.0), product of:
            0.18027179 = queryWeight, product of:
              5.158747 = idf(docFreq=690, maxDocs=44218)
              0.034944877 = queryNorm
            0.56423795 = fieldWeight in 902, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.158747 = idf(docFreq=690, maxDocs=44218)
              0.0546875 = fieldNorm(doc=902)
        0.008328702 = weight(_text_:information in 902) [ClassicSimilarity], result of:
          0.008328702 = score(doc=902,freq=2.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.13576832 = fieldWeight in 902, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=902)
      0.5 = coord(2/4)
    
    Abstract
    The article summarizes components for multilingual access in digital libraries, with a focus on libraries for digital cultural heritage. An analysis of current (existing) information systems in the so-called GLAM sector (galleries, libraries, archives, museums) describes applied solutions for retrieving (searching and browsing) and interacting with multilingual content. Europeana, the European digital library for cultural heritage, is highlighted as a case study, and example interaction scenarios for multilingual search are presented. The challenges in implementing components for multilingual information access, as well as recommendations for their improved use, are presented and discussed.
    Source
    Information - Wissenschaft und Praxis. 64(2013) H.2/3, S.86-95
  2. Vassilakaki, E.; Garoufallou, E.; Johnson, F.; Hartley, R.J.: An exploration of users' needs for multilingual information retrieval and access (2015) 0.01
    0.0050479556 = product of:
      0.020191822 = sum of:
        0.020191822 = weight(_text_:information in 2394) [ClassicSimilarity], result of:
          0.020191822 = score(doc=2394,freq=16.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.3291521 = fieldWeight in 2394, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2394)
      0.25 = coord(1/4)
    
    Abstract
    The need for promoting Multilingual Information Retrieval (MLIR) and Access (MLIA) has become evident, now more than ever, given the increase of the online information produced daily in languages other than English. This study aims to explore users' information needs when searching for information across languages. Specifically, a questionnaire was employed to shed light on Library and Information Science (LIS) undergraduate students' use of search engines, databases, and digital libraries when searching, as well as their needs for multilingual access. This study contributes to informing the design of MLIR systems by focusing on the reasons and situations under which users would search and use information in multiple languages.
    Series
    Communications in computer and information science; 544
  3. Peters, C.; Braschler, M.; Clough, P.: Multilingual information retrieval : from research to practice (2012) 0.00
    0.004608132 = product of:
      0.018432528 = sum of:
        0.018432528 = weight(_text_:information in 361) [ClassicSimilarity], result of:
          0.018432528 = score(doc=361,freq=30.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.3004734 = fieldWeight in 361, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=361)
      0.25 = coord(1/4)
    
    Abstract
    We are living in a multilingual world and the diversity in languages which are used to interact with information access systems has generated a wide variety of challenges to be addressed by computer and information scientists. The growing amount of non-English information accessible globally and the increased worldwide exposure of enterprises also necessitates the adaptation of Information Retrieval (IR) methods to new, multilingual settings. Peters, Braschler and Clough present a comprehensive description of the technologies involved in designing and developing systems for Multilingual Information Retrieval (MLIR). They provide readers with broad coverage of the various issues involved in creating systems to make accessible digitally stored materials regardless of the language(s) they are written in. Details on Cross-Language Information Retrieval (CLIR) are also covered that help readers to understand how to develop retrieval systems that cross language boundaries. Their work is divided into six chapters and accompanies the reader step-by-step through the various stages involved in building, using and evaluating MLIR systems. The book concludes with some examples of recent applications that utilise MLIR technologies. Some of the techniques described have recently started to appear in commercial search systems, while others have the potential to be part of future incarnations. The book is intended for graduate students, scholars, and practitioners with a basic understanding of classical text retrieval methods. It offers guidelines and information on all aspects that need to be taken into consideration when building MLIR systems, while avoiding too many 'hands-on details' that could rapidly become obsolete. Thus it bridges the gap between the material covered by most of the classical IR textbooks and the novel requirements related to the acquisition and dissemination of information in whatever language it is stored.
    Content
    Contents: 1 Introduction; 2 Within-Language Information Retrieval; 3 Cross-Language Information Retrieval; 4 Interaction and User Interfaces; 5 Evaluation for Multilingual Information Retrieval Systems; 6 Applications of Multilingual Information Access
    RSWK
    Information-Retrieval-System / Mehrsprachigkeit / Abfrage / Zugriff
    Subject
    Information-Retrieval-System / Mehrsprachigkeit / Abfrage / Zugriff
  4. Flores, F.N.; Moreira, V.P.: Assessing the impact of stemming accuracy on information retrieval : a multilingual perspective (2016) 0.00
    0.004371658 = product of:
      0.017486632 = sum of:
        0.017486632 = weight(_text_:information in 3187) [ClassicSimilarity], result of:
          0.017486632 = score(doc=3187,freq=12.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.2850541 = fieldWeight in 3187, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3187)
      0.25 = coord(1/4)
    
    Abstract
    The quality of stemming algorithms is typically measured in two different ways: (i) how accurately they map the variant forms of a word to the same stem; or (ii) how much improvement they bring to Information Retrieval systems. In this article, we evaluate various stemming algorithms, in four languages, in terms of accuracy and in terms of their aid to Information Retrieval. The aim is to assess whether the most accurate stemmers are also the ones that bring the biggest gain in Information Retrieval. Experiments in English, French, Portuguese, and Spanish show that this is not always the case, as stemmers with higher error rates yield better retrieval quality. As a byproduct, we also identified the most accurate stemmers and the best for Information Retrieval purposes.
    Source
    Information processing and management. 52(2016) no.5, S.840-854
  5. Luo, M.M.; Nahl, D.: Let's Google : uncertainty and bilingual search (2019) 0.00
    0.004371658 = product of:
      0.017486632 = sum of:
        0.017486632 = weight(_text_:information in 5363) [ClassicSimilarity], result of:
          0.017486632 = score(doc=5363,freq=12.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.2850541 = fieldWeight in 5363, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5363)
      0.25 = coord(1/4)
    
    Abstract
    This study applies Kuhlthau's Information Search Process stage (ISP) model to understand bilingual users' Internet search experience. We conducted a quasi-field experiment with 30 bilingual searchers, and the results suggested that the ISP model was applicable in studying searchers' information retrieval behavior in simple tasks. However, searchers' emotional responses differed from those of the ISP model for a complex task. By testing searchers using different search strategies, the results suggested that search engines with multilanguage search functions provide an advantage for bilingual searchers in the Internet's multilingual environment. The findings showed that when searchers used a search engine as a tool for problem solving, they might experience different feelings in each ISP stage than in searching for information for a term paper using a library. The results echo other research findings that indicate that information seeking is a multifaceted phenomenon.
    Source
    Journal of the Association for Information Science and Technology. 70(2019) no.9, S.1014-1025
  6. Wang, J.; Oard, D.W.: Matching meaning for cross-language information retrieval (2012) 0.00
    0.004164351 = product of:
      0.016657405 = sum of:
        0.016657405 = weight(_text_:information in 7430) [ClassicSimilarity], result of:
          0.016657405 = score(doc=7430,freq=8.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.27153665 = fieldWeight in 7430, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7430)
      0.25 = coord(1/4)
    
    Abstract
    This article describes a framework for cross-language information retrieval that efficiently leverages statistical estimation of translation probabilities. The framework provides a unified perspective into which some earlier work on techniques for cross-language information retrieval based on translation probabilities can be cast. Modeling synonymy and filtering translation probabilities using bidirectional evidence are shown to yield a balance between retrieval effectiveness and query-time (or indexing-time) efficiency that seems well suited to large-scale applications. Evaluations with six test collections show consistent improvements over strong baselines.
    Source
    Information processing and management. 48(2012) no.4, S.631-653
  7. Frâncu, V.; Sabo, C.-N.: Implementation of a UDC-based multilingual thesaurus in a library catalogue : the case of BiblioPhil (2010) 0.00
    0.0035694437 = product of:
      0.014277775 = sum of:
        0.014277775 = weight(_text_:information in 3697) [ClassicSimilarity], result of:
          0.014277775 = score(doc=3697,freq=8.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.23274569 = fieldWeight in 3697, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3697)
      0.25 = coord(1/4)
    
    Abstract
    In order to enhance the use of Universal Decimal Classification (UDC) numbers in information retrieval, the authors have represented classification with multilingual thesaurus descriptors and implemented this solution in an automated way. The authors illustrate a solution implemented in a BiblioPhil library system. The standard formats used are UNIMARC for subject authority records (i.e. the UDC-based multilingual thesaurus) and MARC XML support for data transfer. The multilingual thesaurus was built according to existing standards, the constituent parts of the classification notations being used as the basis for search terms in the multilingual information retrieval. The verbal equivalents, descriptors and non-descriptors, are used to expand the number of concepts and are given in Romanian, English and French. This approach saves the time of the indexer and provides more user-friendly and easier access to the bibliographic information. The multilingual aspect of the thesaurus enhances information access for a greater number of online users.
  8. Kim, S.; Ko, Y.; Oard, D.W.: Combining lexical and statistical translation evidence for cross-language information retrieval (2015) 0.00
    0.0035694437 = product of:
      0.014277775 = sum of:
        0.014277775 = weight(_text_:information in 1606) [ClassicSimilarity], result of:
          0.014277775 = score(doc=1606,freq=8.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.23274569 = fieldWeight in 1606, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1606)
      0.25 = coord(1/4)
    
    Abstract
    This article explores how best to use lexical and statistical translation evidence together for cross-language information retrieval (CLIR). Lexical translation evidence is assembled from Wikipedia and from a large machine-readable dictionary, statistical translation evidence is drawn from parallel corpora, and evidence from co-occurrence in the document language provides a basis for limiting the adverse effect of translation ambiguity. Coverage statistics for NII Testbeds and Community for Information Access Research (NTCIR) queries confirm that these resources have complementary strengths. Experiments with translation evidence from a small parallel corpus indicate that even rather rough estimates of translation probabilities can yield further improvements over a strong technique for translation weighting based on using Jensen-Shannon divergence as a term-association measure. Finally, a novel approach to posttranslation query expansion using a random walk over the Wikipedia concept link graph is shown to yield further improvements over alternative techniques for posttranslation query expansion. Evaluation results on the NTCIR-5 English-Korean test collection show statistically significant improvements over strong baselines.
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.23-39
  9. Stiller, J.; Király, P.: Multilinguality of metadata : measuring the multilingual degree of Europeana's metadata (2017) 0.00
    0.0033653039 = product of:
      0.013461215 = sum of:
        0.013461215 = weight(_text_:information in 3558) [ClassicSimilarity], result of:
          0.013461215 = score(doc=3558,freq=4.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.21943474 = fieldWeight in 3558, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3558)
      0.25 = coord(1/4)
    
    Source
    Everything changes, everything stays the same? - Understanding information spaces : Proceedings of the 15th International Symposium of Information Science (ISI 2017), Berlin/Germany, 13th - 15th March 2017. Eds.: M. Gäde, V. Trkulja and V. Petras
  10. Tsai, M.-F.; Chen, H.-H.; Wang, Y.-T.: Learning a merge model for multilingual information retrieval (2011) 0.00
    0.0033256328 = product of:
      0.013302531 = sum of:
        0.013302531 = weight(_text_:information in 2750) [ClassicSimilarity], result of:
          0.013302531 = score(doc=2750,freq=10.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.21684799 = fieldWeight in 2750, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2750)
      0.25 = coord(1/4)
    
    Abstract
    This paper proposes a learning approach for the merging process in multilingual information retrieval (MLIR). To conduct the learning approach, we present a number of features that may influence the MLIR merging process. These features are mainly extracted from three levels: query, document, and translation. After the feature extraction, we then use the FRank ranking algorithm to construct a merge model. To the best of our knowledge, this practice is the first attempt to use a learning-based ranking algorithm to construct a merge model for MLIR merging. In our experiments, three test collections for the task of crosslingual information retrieval (CLIR) in NTCIR3, 4, and 5 are employed to assess the performance of our proposed method. Moreover, several merging methods are also carried out for a comparison, including traditional merging methods, the 2-step merging strategy, and the merging method based on logistic regression. The experimental results show that our proposed method can significantly improve merging quality on two different types of datasets. In addition to the effectiveness, through the merge model generated by FRank, our method can further identify key factors that influence the merging process. This information might provide us more insight and understanding into MLIR merging.
    Source
    Information processing and management. 47(2011) no.5, S.635-646
  11. Luca, E.W. de: Extending the linked data cloud with multilingual lexical linked data (2013) 0.00
    0.0029745363 = product of:
      0.011898145 = sum of:
        0.011898145 = weight(_text_:information in 1073) [ClassicSimilarity], result of:
          0.011898145 = score(doc=1073,freq=8.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.19395474 = fieldWeight in 1073, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1073)
      0.25 = coord(1/4)
    
    Abstract
    A lot of information that is already available on the Web, or retrieved from local information systems and social networks, is structured in data silos that are not semantically related. Semantic technologies make it apparent that the use of typed links that directly express their relations is an advantage for every application that can reuse the incorporated knowledge about the data. For this reason, data integration, through reengineering (e.g., triplify) or querying (e.g., D2R), is an important task in order to make information available for everyone. Thus, in order to build a semantic map of the data, we need knowledge about the data items themselves and the relations between heterogeneous data items. Here we present our work of providing Lexical Linked Data (LLD) through a meta-model that contains all the resources and gives the possibility to retrieve and navigate them from different perspectives. After giving the definition of Lexical Linked Data, we describe the existing datasets we collected and the new datasets we included. We describe their format, show some use cases where we link lexical data, and show how to reuse and infer semantic data derived from lexical data. Different lexical resources (MultiWordNet, EuroWordNet, MEMODATA Lexicon, the Hamburg Metaphor Database) are connected to each other towards an Integrated Vocabulary for LLD that we evaluate and present.
    Footnote
    Part of a section "Papers from the 13th Meeting of the German ISKO "Theory, Information, and Organization of Knowledge," Potsdam, 19-20 March 2013"
  12. De Luca, E.W.; Dahlberg, I.: Including knowledge domains from the ICC into the multilingual lexical linked data cloud (2014) 0.00
    0.0029745363 = product of:
      0.011898145 = sum of:
        0.011898145 = weight(_text_:information in 1493) [ClassicSimilarity], result of:
          0.011898145 = score(doc=1493,freq=8.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.19395474 = fieldWeight in 1493, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1493)
      0.25 = coord(1/4)
    
    Abstract
    A lot of information that is already available on the Web, or retrieved from local information systems and social networks, is structured in data silos that are not semantically related. Semantic technologies make it apparent that the use of typed links that directly express their relations is an advantage for every application that can reuse the incorporated knowledge about the data. For this reason, data integration, through reengineering (e.g. triplify) or querying (e.g. D2R), is an important task in order to make information available for everyone. Thus, in order to build a semantic map of the data, we need knowledge about the data items themselves and the relations between heterogeneous data items. In this paper, we present our work of providing Lexical Linked Data (LLD) through a meta-model that contains all the resources and gives the possibility to retrieve and navigate them from different perspectives. We combine the existing work done on knowledge domains (based on the Information Coding Classification) within the Multilingual Lexical Linked Data Cloud (based on the RDF/OWL EuroWordNet and the related integrated lexical resources (MultiWordNet, EuroWordNet, MEMODATA Lexicon, Hamburg Metaphor DB)).
  13. Rettinger, A.; Schumilin, A.; Thoma, S.; Ell, B.: Learning a cross-lingual semantic representation of relations expressed in text (2015) 0.00
    0.0029745363 = product of:
      0.011898145 = sum of:
        0.011898145 = weight(_text_:information in 2027) [ClassicSimilarity], result of:
          0.011898145 = score(doc=2027,freq=2.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.19395474 = fieldWeight in 2027, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=2027)
      0.25 = coord(1/4)
    
    Series
    Information Systems and Applications, incl. Internet/Web, and HCI; Bd. 9088
  14. Freire, N.; Charles, V.; Isaac, A.: Subject information and multilingualism in European bibliographic datasets : experiences with Universal Decimal Classification (2015) 0.00
    0.0029745363 = product of:
      0.011898145 = sum of:
        0.011898145 = weight(_text_:information in 2289) [ClassicSimilarity], result of:
          0.011898145 = score(doc=2289,freq=2.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.19395474 = fieldWeight in 2289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=2289)
      0.25 = coord(1/4)
    
  15. Celli, F. et al.: Enabling multilingual search through controlled vocabularies : the AGRIS approach (2016) 0.00
    0.0029745363 = product of:
      0.011898145 = sum of:
        0.011898145 = weight(_text_:information in 3278) [ClassicSimilarity], result of:
          0.011898145 = score(doc=3278,freq=2.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.19395474 = fieldWeight in 3278, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=3278)
      0.25 = coord(1/4)
    
    Series
    Communications in computer and information science; 672
  16. Yu, L.-C.; Wu, C.-H.; Chang, R.-Y.; Liu, C.-H.; Hovy, E.H.: Annotation and verification of sense pools in OntoNotes (2010) 0.00
    0.0025760243 = product of:
      0.010304097 = sum of:
        0.010304097 = weight(_text_:information in 4236) [ClassicSimilarity], result of:
          0.010304097 = score(doc=4236,freq=6.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.16796975 = fieldWeight in 4236, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4236)
      0.25 = coord(1/4)
    
    Abstract
    The paper describes OntoNotes, a multilingual (English, Chinese and Arabic) corpus with large-scale semantic annotations, including predicate-argument structure, word senses, ontology linking, and coreference. The underlying semantic model of OntoNotes involves word senses that are grouped into so-called sense pools, i.e., sets of near-synonymous senses of words. Such information is useful for many applications, including query expansion for information retrieval (IR) systems, (near-)duplicate detection for text summarization systems, and alternative word selection for writing support systems. Although a sense pool provides a set of near-synonymous senses of words, there is still no knowledge about whether two words in a pool are interchangeable in practical use. Therefore, this paper devises an unsupervised algorithm that incorporates Google n-grams and a statistical test to determine whether a word in a pool can be substituted by other words in the same pool. The n-gram features are used to measure the degree of context mismatch for a substitution. The statistical test is then applied to determine whether the substitution is adequate based on the degree of mismatch. The proposed method is compared with a supervised method, namely Linear Discriminant Analysis (LDA). Experimental results show that the proposed unsupervised method can achieve performance comparable to that of the supervised method.
    Source
    Information processing and management. 46(2010) no.4, S.436-447
  17. He, D.; Wu, D.: Enhancing query translation with relevance feedback in translingual information retrieval : a study of the medication process (2011) 0.00
    0.0025760243 = product of:
      0.010304097 = sum of:
        0.010304097 = weight(_text_:information in 4244) [ClassicSimilarity], result of:
          0.010304097 = score(doc=4244,freq=6.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.16796975 = fieldWeight in 4244, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4244)
      0.25 = coord(1/4)
    
    Abstract
    As an effective technique for improving retrieval effectiveness, relevance feedback (RF) has been widely studied in both monolingual and translingual information retrieval (TLIR). The studies of RF in TLIR have been focused on query expansion (QE), in which queries are reformulated before and/or after they are translated. However, RF in TLIR actually not only can help select better query terms, but also can enhance query translation by adjusting translation probabilities and even resolving some out-of-vocabulary terms. In this paper, we propose a novel relevance feedback method called translation enhancement (TE), which uses the extracted translation relationships from relevant documents to revise the translation probabilities of query terms and to identify extra available translation alternatives so that the translated queries are more tuned to the current search. We studied TE using pseudo-relevance feedback (PRF) and interactive relevance feedback (IRF). Our results show that TE can significantly improve TLIR with both types of relevance feedback methods, and that the improvement is comparable to that of query expansion. More importantly, the effects of translation enhancement and query expansion are complementary. Their integration can produce further improvement, and makes TLIR more robust for a variety of queries.
    Source
    Information processing and management. 47(2011) no.1, S.1-17
  18. Ye, Z.; Huang, J.X.; He, B.; Lin, H.: Mining a multilingual association dictionary from Wikipedia for cross-language information retrieval (2012) 0.00
    0.0025760243 = product of:
      0.010304097 = sum of:
        0.010304097 = weight(_text_:information in 513) [ClassicSimilarity], result of:
          0.010304097 = score(doc=513,freq=6.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.16796975 = fieldWeight in 513, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=513)
      0.25 = coord(1/4)
    
    Abstract
    Wikipedia is characterized by its dense link structure and a large number of articles in different languages, which make it a notable Web corpus for knowledge extraction and mining, in particular for mining the multilingual associations. In this paper, motivated by a psychological theory of word meaning, we propose a graph-based approach to constructing a cross-language association dictionary (CLAD) from Wikipedia, which can be used in a variety of cross-language accessing and processing applications. In order to evaluate the quality of the mined CLAD, and to demonstrate how the mined CLAD can be used in practice, we explore two different applications of the mined CLAD to cross-language information retrieval (CLIR). First, we use the mined CLAD to conduct cross-language query expansion; and, second, we use it to filter out translation candidates with low translation probabilities. Experimental results on a variety of standard CLIR test collections show that the CLIR retrieval performance can be substantially improved with the above two applications of CLAD, which indicates that the mined CLAD is of sound quality.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.12, S.2474-2487
  19. Huckstorf, A.; Petras, V.: Mind the lexical gap : EuroVoc Building Block of the Semantic Web (2011) 0.00
    0.0025239778 = product of:
      0.010095911 = sum of:
        0.010095911 = weight(_text_:information in 2782) [ClassicSimilarity], result of:
          0.010095911 = score(doc=2782,freq=4.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.16457605 = fieldWeight in 2782, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2782)
      0.25 = coord(1/4)
    
    Abstract
    A conference of a special kind took place on 18 and 19 November 2010 in Luxembourg. At the initiative of the Publications Office of the European Union (http://publications.europa.eu), librarians and information professionals were invited to discuss the future of multilingual controlled vocabularies in information systems and, in particular, their contribution to the Semantic Web. The conference was organized by the EuroVoc team, which maintains the thesaurus of the European Union. The previous EuroVoc conference took place in 2006. Since then, EuroVoc has moved to an ontology-based thesaurus management system, has systematically begun to use Semantic Web technologies for editing and representation, and has started linking itself with other vocabularies. A productive exchange took place with the producers of other European and international vocabularies (e.g. United Nations or FAO) and with representatives of projects working on automatic indexing (here in particular of parliamentary and legal documents) and on interoperability between vocabularies.
    Source
    Information - Wissenschaft und Praxis. 62(2011) H.2/3, S.125-126
  20. Luca, E.W. de; Dahlberg, I.: ¬Die Multilingual Lexical Linked Data Cloud : eine mögliche Zugangsoptimierung? (2014) 0.00
    0.0025239778 = product of:
      0.010095911 = sum of:
        0.010095911 = weight(_text_:information in 1736) [ClassicSimilarity], result of:
          0.010095911 = score(doc=1736,freq=4.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.16457605 = fieldWeight in 1736, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1736)
      0.25 = coord(1/4)
    
    Abstract
    A great deal of information is already available on the Web or can be obtained from isolated structured data stores such as information systems and social networks. Data integration through post-processing or through search mechanisms (e.g. D2R) is therefore important for making information generally usable. Semantic technologies allow the use of defined connections (typed links) that record how data items relate to one another, which benefits any application that can reuse the knowledge contained in the data. To produce a semantic data map, we need knowledge about the individual data items and their relationships to other data. This paper presents our work on using Lexical Linked Data (LLD) through a meta-model that contains all resources and also makes it possible to find them from different perspectives. We thereby connect existing work on knowledge fields (based on the Information Coding Classification) with the Multilingual Lexical Linked Data Cloud (based on the RDF/OWL representation of EuroWordNet and the similar integrated lexical resources MultiWordNet, MEMODATA and the Hamburg Metapher DB).
    Source
    Information - Wissenschaft und Praxis. 65(2014) H.4/5, S.279-287
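
The score breakdowns shown with each result can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical and not part of any search engine API. The input values are copied from the breakdowns for results 18 and 19.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one matching term, following the structure of the explain tree.

    Assumes the classic TF-IDF formulas; names are illustrative only.
    """
    tf = math.sqrt(freq)                              # tf(freq=...)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                   # queryWeight
    field_weight = tf * idf * field_norm              # fieldWeight
    return query_weight * field_weight


def final_score(term_scores, total_clauses):
    """Sum the matching term scores and apply coord(matched/total)."""
    coord = len(term_scores) / total_clauses
    return sum(term_scores) * coord


# Result 18: "information" occurs 6 times, fieldNorm 0.0390625, coord(1/4)
term_18 = classic_term_score(6.0, 20772, 44218, 0.034944877, 0.0390625)
score_18 = final_score([term_18], 4)

# Result 19: "information" occurs 4 times, fieldNorm 0.046875, coord(1/4)
term_19 = classic_term_score(4.0, 20772, 44218, 0.034944877, 0.046875)
score_19 = final_score([term_19], 4)

print(f"{score_18:.7f} {score_19:.7f}")
```

Because only one of the four query clauses matches in these results, the coord factor of 0.25 scales each term score down to the small totals that are displayed rounded as 0.00.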

Languages

  • e 26
  • d 5

Types

  • a 28
  • m 2
  • el 1