Search (189 results, page 1 of 10)

  • theme_ss:"Multilinguale Probleme"
  1. Zhou, Y. et al.: Analysing entity context in multilingual Wikipedia to support entity-centric retrieval applications (2016) 0.10
    0.09993737 = product of:
      0.16656227 = sum of:
        0.04679445 = weight(_text_:retrieval in 2758) [ClassicSimilarity], result of:
          0.04679445 = score(doc=2758,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33420905 = fieldWeight in 2758, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=2758)
        0.0884113 = weight(_text_:semantic in 2758) [ClassicSimilarity], result of:
          0.0884113 = score(doc=2758,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.45938298 = fieldWeight in 2758, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.078125 = fieldNorm(doc=2758)
        0.031356532 = product of:
          0.062713064 = sum of:
            0.062713064 = weight(_text_:22 in 2758) [ClassicSimilarity], result of:
              0.062713064 = score(doc=2758,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.38690117 = fieldWeight in 2758, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2758)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Date
    1. 2.2016 18:25:22
    Source
    Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al
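The score breakdown attached to each result is a Lucene "explain" tree for ClassicSimilarity (TF-IDF) ranking. A minimal sketch that recomputes the first term weight of result 1 from the values shown above, assuming the standard ClassicSimilarity components (tf = sqrt(freq), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm):

```python
import math

# Values copied from the explain tree for weight(_text_:retrieval in 2758) above.
freq = 2.0
idf = 3.024915           # idf(docFreq=5836, maxDocs=44218)
query_norm = 0.04628742  # queryNorm
field_norm = 0.078125    # fieldNorm(doc=2758)

tf = math.sqrt(freq)                  # 1.4142135 = tf(freq=2.0)
query_weight = idf * query_norm       # 0.14001551 = queryWeight
field_weight = tf * idf * field_norm  # 0.33420905 = fieldWeight
score = query_weight * field_weight   # 0.04679445 = weight(_text_:retrieval)

print(round(score, 8))
```

The three matching term weights sum to 0.16656227 and are multiplied by the coordination factor coord(3/5) = 0.6, reproducing the document score 0.09993737 at the head of the entry; the idf values are consistent with Lucene's classic formula 1 + ln(maxDocs / (docFreq + 1)).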
  2. Li, K.W.; Yang, C.C.: Automatic crosslingual thesaurus generated from the Hong Kong SAR Police Department Web Corpus for Crime Analysis (2005) 0.08
    0.08180248 = product of:
      0.13633746 = sum of:
        0.04185423 = weight(_text_:retrieval in 3391) [ClassicSimilarity], result of:
          0.04185423 = score(doc=3391,freq=10.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.29892567 = fieldWeight in 3391, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=3391)
        0.079077475 = weight(_text_:semantic in 3391) [ClassicSimilarity], result of:
          0.079077475 = score(doc=3391,freq=10.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.41088465 = fieldWeight in 3391, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.03125 = fieldNorm(doc=3391)
        0.0154057555 = product of:
          0.030811511 = sum of:
            0.030811511 = weight(_text_:web in 3391) [ClassicSimilarity], result of:
              0.030811511 = score(doc=3391,freq=4.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.2039694 = fieldWeight in 3391, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3391)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    For the sake of national security, very large volumes of data and information are generated and gathered daily. Much of this data and information is written in different languages, stored in different locations, and may be seemingly unconnected. Crosslingual semantic interoperability is a major challenge to generate an overview of this disparate data and information so that it can be analyzed, shared, searched, and summarized. The recent terrorist attacks and the tragic events of September 11, 2001 have prompted increased attention on national security and criminal analysis. Many Asian countries and cities, such as Japan, Taiwan, and Singapore, have been advised that they may become the next targets of terrorist attacks. Semantic interoperability has been a focus in digital library research. Traditional information retrieval (IR) approaches normally require a document to share some common keywords with the query. Generating the associations for the related terms between the two term spaces of users and documents is an important issue. The problem can be viewed as the creation of a thesaurus. Apart from this, terrorists and criminals may communicate through letters, e-mails, and faxes in languages other than English. The translation ambiguity significantly exacerbates the retrieval problem. The problem is expanded to crosslingual semantic interoperability. In this paper, we focus on the English/Chinese crosslingual semantic interoperability problem. However, the developed techniques are not limited to English and Chinese languages but can be applied to many other languages. English and Chinese are popular languages in the Asian region. Much information about national security or crime is communicated in these languages. An efficient automatically generated thesaurus between these languages is important to crosslingual information retrieval between English and Chinese languages. To facilitate crosslingual information retrieval, a corpus-based approach uses the term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model to cross the language boundary. In this paper, the text-based approach to align English/Chinese Hong Kong Police press release documents from the Web is first presented. We also introduce an algorithmic approach to generate a robust knowledge base based on statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus. The research output consisted of a thesaurus-like, semantic network knowledge base, which can aid in semantics-based crosslingual information management and retrieval.
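A minimal sketch of the corpus-based step described above, scoring candidate translations by how often terms co-occur in aligned document pairs; the toy aligned corpus and the Dice weighting are illustrative assumptions, not the authors' actual data or association measure.

```python
from collections import Counter
from itertools import product

# Toy aligned corpus: (terms of an English press release, terms of its Chinese counterpart).
aligned_docs = [
    ({"robbery", "arrest", "kowloon"}, {"劫案", "拘捕", "九龍"}),
    ({"robbery", "suspect"},           {"劫案", "疑犯"}),
    ({"arrest", "suspect", "kowloon"}, {"拘捕", "疑犯", "九龍"}),
]

en_freq, zh_freq, cooc = Counter(), Counter(), Counter()
for en_terms, zh_terms in aligned_docs:
    en_freq.update(en_terms)
    zh_freq.update(zh_terms)
    cooc.update(product(en_terms, zh_terms))

def dice(en, zh):
    """Dice coefficient over document-level co-occurrence counts."""
    return 2 * cooc[(en, zh)] / (en_freq[en] + zh_freq[zh])

# Rank Chinese candidates for an English term, strongest association first.
ranked = sorted(zh_freq, key=lambda zh: dice("robbery", zh), reverse=True)
print([(zh, round(dice("robbery", zh), 2)) for zh in ranked])
```

Thresholding such association scores over the whole vocabulary yields the kind of thesaurus-like semantic network the abstract describes.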
  3. Rettinger, A.; Schumilin, A.; Thoma, S.; Ell, B.: Learning a cross-lingual semantic representation of relations expressed in text (2015) 0.08
    0.08012127 = product of:
      0.20030317 = sum of:
        0.15313287 = weight(_text_:semantic in 2027) [ClassicSimilarity], result of:
          0.15313287 = score(doc=2027,freq=6.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.7956747 = fieldWeight in 2027, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.078125 = fieldNorm(doc=2027)
        0.0471703 = product of:
          0.0943406 = sum of:
            0.0943406 = weight(_text_:web in 2027) [ClassicSimilarity], result of:
              0.0943406 = score(doc=2027,freq=6.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.6245262 = fieldWeight in 2027, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2027)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Series
    Information Systems and Applications, incl. Internet/Web, and HCI; vol. 9088
    Source
    The Semantic Web: latest advances and new domains. 12th European Semantic Web Conference, ESWC 2015, Portoroz, Slovenia, May 31 -- June 4, 2015. Proceedings. Eds.: F. Gandon et al.
  4. Fluhr, C.: Crosslingual access to photo databases (2012) 0.07
    0.07314605 = product of:
      0.12191008 = sum of:
        0.028076671 = weight(_text_:retrieval in 93) [ClassicSimilarity], result of:
          0.028076671 = score(doc=93,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.20052543 = fieldWeight in 93, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=93)
        0.075019486 = weight(_text_:semantic in 93) [ClassicSimilarity], result of:
          0.075019486 = score(doc=93,freq=4.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.38979942 = fieldWeight in 93, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=93)
        0.01881392 = product of:
          0.03762784 = sum of:
            0.03762784 = weight(_text_:22 in 93) [ClassicSimilarity], result of:
              0.03762784 = score(doc=93,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.23214069 = fieldWeight in 93, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=93)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    This paper is about searching for photos in the databases of agencies that sell photos over the Internet. The problem is far removed both from photo databases managed by librarians and from the corpora generally used for research purposes. The descriptions consist mainly of single words, and it is well known that this is not the best basis for a good search. This increases the problem of semantic ambiguity. This problem of semantic ambiguity is crucial for cross-language querying. On the other hand, users are not aware of documentation techniques, generally use very simple queries, but want to get precise answers. This paper reports the experience gained in three years of use (2006-2008) of cross-language access to several of the main international commercial photo databases. The languages used were French, English, and German.
    Date
    17. 4.2012 14:25:22
    Source
    Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
  5. Subirats, I.; Prasad, A.R.D.; Keizer, J.; Bagdanov, A.: Implementation of rich metadata formats and semantic tools using DSpace (2008) 0.07
    0.06936182 = product of:
      0.11560303 = sum of:
        0.01871778 = weight(_text_:retrieval in 2656) [ClassicSimilarity], result of:
          0.01871778 = score(doc=2656,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.13368362 = fieldWeight in 2656, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=2656)
        0.050012987 = weight(_text_:semantic in 2656) [ClassicSimilarity], result of:
          0.050012987 = score(doc=2656,freq=4.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.25986627 = fieldWeight in 2656, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.03125 = fieldNorm(doc=2656)
        0.046872254 = sum of:
          0.021787029 = weight(_text_:web in 2656) [ClassicSimilarity], result of:
            0.021787029 = score(doc=2656,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.14422815 = fieldWeight in 2656, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.03125 = fieldNorm(doc=2656)
          0.025085226 = weight(_text_:22 in 2656) [ClassicSimilarity], result of:
            0.025085226 = score(doc=2656,freq=2.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.15476047 = fieldWeight in 2656, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=2656)
      0.6 = coord(3/5)
    
    Abstract
    This poster explores the customization of DSpace to allow the use of the AGRIS Application Profile metadata standard and the AGROVOC thesaurus. The objective is the adaptation of DSpace, through the least invasive code changes either in the form of plug-ins or add-ons, to the specific needs of the Agricultural Sciences and Technology community. Metadata standards such as AGRIS AP, and Knowledge Organization Systems such as the AGROVOC thesaurus, provide mechanisms for sharing information in a standardized manner by recommending the use of common semantics and interoperable syntax (Subirats et al., 2007). AGRIS AP was created to enhance the description, exchange and subsequent retrieval of agricultural Document-like Information Objects (DLIOs). It is a metadata schema which draws from metadata standards such as Dublin Core (DC), the Australian Government Locator Service Metadata (AGLS) and the Agricultural Metadata Element Set (AgMES) namespaces. It allows sharing of information across dispersed bibliographic systems (FAO, 2005). AGROVOC is a multilingual structured thesaurus covering agricultural and related domains. Its main role is to standardize the indexing process in order to make searching simpler and more efficient. AGROVOC is developed by FAO (Lauser et al., 2006). The customization of DSpace is taking place in several phases. First, the AGRIS AP metadata schema was mapped onto the DSpace metadata model, with several enhancements implemented to support AGRIS AP elements. Next, AGROVOC will be integrated as a controlled vocabulary accessed through a local SKOS or OWL file. Eventually the system will be configurable to access AGROVOC through local files or remotely via web services. Finally, spell checking and tooltips will be incorporated in the user interface to support metadata editing. Adapting DSpace to support AGRIS AP and annotation using the semantically rich AGROVOC thesaurus transforms DSpace into a powerful, domain-specific system for annotation and exchange of bibliographic metadata in the agricultural domain.
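A minimal sketch of the kind of schema mapping the first customization phase describes, crosswalking a few AGRIS AP elements onto flat DSpace metadata fields; the element and field names are hypothetical placeholders, not the project's actual mapping.

```python
# Hypothetical crosswalk from AGRIS AP elements to DSpace metadata fields.
AGRIS_TO_DSPACE = {
    "dc:title":           "dc.title",
    "dc:creator":         "dc.contributor.author",
    "ags:citationTitle":  "dc.relation.ispartof",   # assumed local field
    "dc:subject@agrovoc": "dc.subject.agrovoc",     # AGROVOC descriptor, assumed field
}

def crosswalk(agris_record: dict) -> dict:
    """Map an AGRIS AP record (element -> list of values) onto DSpace fields."""
    dspace = {}
    for element, values in agris_record.items():
        field = AGRIS_TO_DSPACE.get(element)
        if field:                      # drop elements the profile does not map
            dspace.setdefault(field, []).extend(values)
    return dspace

record = {
    "dc:title": ["Soil salinity in irrigated rice systems"],
    "dc:subject@agrovoc": ["soil salinity", "irrigation"],
}
print(crosswalk(record))
```

The AGROVOC descriptors attached to the subject field would later be validated against the controlled vocabulary, loaded from a local SKOS/OWL file or fetched via web services as described above.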
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
    Theme
    Semantic Web
  6. Levergood, B.; Farrenkopf, S.; Frasnelli, E.: The specification of the language of the field and interoperability : cross-language access to catalogues and online libraries (CACAO) (2008) 0.06
    0.05996243 = product of:
      0.09993738 = sum of:
        0.028076671 = weight(_text_:retrieval in 2646) [ClassicSimilarity], result of:
          0.028076671 = score(doc=2646,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.20052543 = fieldWeight in 2646, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2646)
        0.05304678 = weight(_text_:semantic in 2646) [ClassicSimilarity], result of:
          0.05304678 = score(doc=2646,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.2756298 = fieldWeight in 2646, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=2646)
        0.01881392 = product of:
          0.03762784 = sum of:
            0.03762784 = weight(_text_:22 in 2646) [ClassicSimilarity], result of:
              0.03762784 = score(doc=2646,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.23214069 = fieldWeight in 2646, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2646)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    The CACAO Project (Cross-language Access to Catalogues and Online Libraries) has been designed to implement natural language processing and cross-language information retrieval techniques to provide cross-language access to information in libraries, a critical issue in the linguistically diverse European Union. This project report addresses two metadata-related challenges for the library community in this context: "false friends" (identical words having different meanings in different languages) and term ambiguity. The possible solutions involve enriching the metadata with attributes specifying language or the source authority file, or associating potential search terms to classes in a classification system. The European Library will evaluate an early implementation of this work in late 2008.
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  7. Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2012) 0.06
    0.05557645 = product of:
      0.13894112 = sum of:
        0.05304678 = weight(_text_:semantic in 1967) [ClassicSimilarity], result of:
          0.05304678 = score(doc=1967,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.2756298 = fieldWeight in 1967, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
        0.085894346 = sum of:
          0.03268054 = weight(_text_:web in 1967) [ClassicSimilarity], result of:
            0.03268054 = score(doc=1967,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.21634221 = fieldWeight in 1967, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.046875 = fieldNorm(doc=1967)
          0.0532138 = weight(_text_:22 in 1967) [ClassicSimilarity], result of:
            0.0532138 = score(doc=1967,freq=4.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.32829654 = fieldWeight in 1967, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1967)
      0.4 = coord(2/5)
    
    Abstract
    This paper reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The paper discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the DDC (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
    Source
    Beyond libraries - subject metadata in the digital environment and semantic web. IFLA Satellite Post-Conference, 17-18 August 2012, Tallinn
  8. Cross-language information retrieval (1998) 0.06
    0.055131674 = product of:
      0.09188612 = sum of:
        0.04679445 = weight(_text_:retrieval in 6299) [ClassicSimilarity], result of:
          0.04679445 = score(doc=6299,freq=32.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33420905 = fieldWeight in 6299, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.038283218 = weight(_text_:semantic in 6299) [ClassicSimilarity], result of:
          0.038283218 = score(doc=6299,freq=6.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.19891867 = fieldWeight in 6299, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.0068084467 = product of:
          0.013616893 = sum of:
            0.013616893 = weight(_text_:web in 6299) [ClassicSimilarity], result of:
              0.013616893 = score(doc=6299,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.09014259 = fieldWeight in 6299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=6299)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Content
    Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness
    Footnote
    Review in: Machine translation review: 1999, no.10, pp.26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davies (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts. Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
    Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process. Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
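The LSI step singled out in the review (building a term-document matrix and 'fitting' documents into a reduced space) can be sketched with a plain truncated SVD; the matrix below is toy monolingual data, not the systems described in the volume.

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents.
terms = ["treatment", "salary", "hospital", "contract"]
A = np.array([
    [2.0, 0.0, 1.0],   # treatment
    [0.0, 2.0, 0.0],   # salary
    [1.0, 0.0, 2.0],   # hospital
    [0.0, 1.0, 0.0],   # contract
])

k = 2                                         # latent dimensions to keep
U, s, Vt = np.linalg.svd(A, full_matrices=False)

def to_latent(vec):
    """Project a term-space vector into the k-dimensional latent space."""
    return (U[:, :k].T @ vec) / s[:k]

docs_latent = np.array([to_latent(A[:, j]) for j in range(A.shape[1])])
query_latent = to_latent(np.array([1.0, 0.0, 1.0, 0.0]))   # query: "treatment hospital"

cosine = docs_latent @ query_latent / (
    np.linalg.norm(docs_latent, axis=1) * np.linalg.norm(query_latent)
)
print(cosine.round(3))   # similarity of each document to the query in latent space
```

For cross-language retrieval the training matrix is typically built from documents paired with their translations, so that a query folded into the latent space lands near related documents regardless of language.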
    Series
    The Kluwer International series on information retrieval
  9. De Luca, E.W.; Dahlberg, I.: Including knowledge domains from the ICC into the multilingual lexical linked data cloud (2014) 0.05
    0.053637948 = product of:
      0.13409486 = sum of:
        0.062516235 = weight(_text_:semantic in 1493) [ClassicSimilarity], result of:
          0.062516235 = score(doc=1493,freq=4.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.32483283 = fieldWeight in 1493, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1493)
        0.07157862 = sum of:
          0.027233787 = weight(_text_:web in 1493) [ClassicSimilarity], result of:
            0.027233787 = score(doc=1493,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.18028519 = fieldWeight in 1493, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1493)
          0.04434483 = weight(_text_:22 in 1493) [ClassicSimilarity], result of:
            0.04434483 = score(doc=1493,freq=4.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.27358043 = fieldWeight in 1493, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1493)
      0.4 = coord(2/5)
    
    Abstract
    A lot of information that is already available on the Web, or retrieved from local information systems and social networks, is structured in data silos that are not semantically related. Semantic technologies make it apparent that typed links which directly express their relations are an advantage for every application that can reuse the incorporated knowledge about the data. For this reason, data integration, through reengineering (e.g. Triplify) or querying (e.g. D2R), is an important task in order to make information available for everyone. Thus, in order to build a semantic map of the data, we need knowledge about the data items themselves and the relations between heterogeneous data items. In this paper, we present our work of providing Lexical Linked Data (LLD) through a meta-model that contains all the resources and gives the possibility to retrieve and navigate them from different perspectives. We combine the existing work done on knowledge domains (based on the Information Coding Classification) within the Multilingual Lexical Linked Data Cloud (based on the RDF/OWL EuroWordNet and the related integrated lexical resources MultiWordNet, EuroWordNet, MEMODATA Lexicon, and Hamburg Metaphor DB).
    Date
    22. 9.2014 19:01:18
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  10. Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.05
    0.05286736 = product of:
      0.088112265 = sum of:
        0.04185423 = weight(_text_:retrieval in 6068) [ClassicSimilarity], result of:
          0.04185423 = score(doc=6068,freq=10.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.29892567 = fieldWeight in 6068, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=6068)
        0.03536452 = weight(_text_:semantic in 6068) [ClassicSimilarity], result of:
          0.03536452 = score(doc=6068,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.18375319 = fieldWeight in 6068, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.03125 = fieldNorm(doc=6068)
        0.010893514 = product of:
          0.021787029 = sum of:
            0.021787029 = weight(_text_:web in 6068) [ClassicSimilarity], result of:
              0.021787029 = score(doc=6068,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.14422815 = fieldWeight in 6068, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=6068)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    Over the past 50 years, a variety of language-related capabilities has been developed in machine translation, information retrieval, speech recognition, text summarization, and so on. These applications rest upon a set of core techniques such as language modeling, information extraction, parsing, generation, and multimedia planning and integration; and they involve methods using statistics, rules, grammars, lexicons, ontologies, training techniques, and so on. It is a puzzling fact that although all of this work deals with language in some form or other, the major applications have each developed a separate research field. For example, there is no reason why speech recognition techniques involving n-grams and hidden Markov models could not have been used in machine translation 15 years earlier than they were, or why some of the lexical and semantic insights from the subarea called Computational Linguistics are still not used in information retrieval.
    This picture will rapidly change. The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual and multi-modal information robustly and efficiently, with as high quality performance as possible. The most effective way for us to address such a mammoth task, and to ensure that our various techniques and applications fit together, is to start talking across the artificial research boundaries. Extending the current technologies will require integrating the various capabilities into multi-functional and multi-lingual natural language systems. However, at this time there is no clear vision of how these technologies could or should be assembled into a coherent framework. What would be involved in connecting a speech recognition system to an information retrieval engine, and then using machine translation and summarization software to process the retrieved text? How can traditional parsing and generation be enhanced with statistical techniques? What would be the effect of carefully crafted lexicons on traditional information retrieval? At which points should machine translation be interleaved within information retrieval systems to enable multilingual processing?
  11. Huckstorf, A.; Petras, V.: Mind the lexical gap : EuroVoc Building Block of the Semantic Web (2011) 0.05
    0.048072767 = product of:
      0.12018191 = sum of:
        0.091879725 = weight(_text_:semantic in 2782) [ClassicSimilarity], result of:
          0.091879725 = score(doc=2782,freq=6.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.47740483 = fieldWeight in 2782, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=2782)
        0.028302183 = product of:
          0.056604367 = sum of:
            0.056604367 = weight(_text_:web in 2782) [ClassicSimilarity], result of:
              0.056604367 = score(doc=2782,freq=6.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.37471575 = fieldWeight in 2782, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2782)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    A conference event of a special kind took place on 18 and 19 November 2010 in Luxembourg. At the initiative of the Publications Office of the European Union (http://publications.europa.eu), librarians and information professionals were invited to discuss the future of multilingual controlled vocabularies in information systems and, in particular, their contribution to the Semantic Web. The conference was organized by the EuroVoc team, which maintains the thesaurus of the European Union. The last EuroVoc conference had taken place in 2006. In the meantime, EuroVoc has moved to an ontology-based thesaurus management system and has systematically begun to use Semantic Web technologies for editing and representation and to link itself with other vocabularies. A productive exchange took place with the producers of other European and international vocabularies (e.g. the United Nations or FAO) as well as with representatives of projects working on automatic indexing (here especially of parliamentary and legal documents) and on interoperability between vocabularies.
  12. Sartini, B.; Erp, M. van; Gangemi, A.: Marriage is a peach and a chalice : modelling cultural symbolism on the Semantic Web (2021) 0.05
    0.048072767 = product of:
      0.12018191 = sum of:
        0.091879725 = weight(_text_:semantic in 557) [ClassicSimilarity], result of:
          0.091879725 = score(doc=557,freq=6.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.47740483 = fieldWeight in 557, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=557)
        0.028302183 = product of:
          0.056604367 = sum of:
            0.056604367 = weight(_text_:web in 557) [ClassicSimilarity], result of:
              0.056604367 = score(doc=557,freq=6.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.37471575 = fieldWeight in 557, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=557)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    In this work, we fill the gap in the Semantic Web in the context of Cultural Symbolism. Building upon our earlier work (Sartini et al. 2021), we introduce the Simulation Ontology, an ontology that models the background knowledge of symbolic meanings, developed by combining the concepts taken from the authoritative theory of Simulacra and Simulations of Jean Baudrillard with symbolic structures and content taken from "Symbolism: a Comprehensive Dictionary" by Steven Olderr. We re-engineered the symbolic knowledge already present in heterogeneous resources by converting it into our ontology schema to create HyperReal, the first knowledge graph completely dedicated to cultural symbolism. A first experiment run on the knowledge graph is presented to show the potential of quantitative research on symbolism.
    Theme
    Semantic Web
  13. Freitas-Junior, H.R.; Ribeiro-Neto, B.A.; Freitas-Vale, R. de; Laender, A.H.F.; Lima, L.R.S. de: Categorization-driven cross-language retrieval of medical information (2006) 0.05
    0.046360638 = product of:
      0.11590159 = sum of:
        0.057311267 = weight(_text_:retrieval in 5282) [ClassicSimilarity], result of:
          0.057311267 = score(doc=5282,freq=12.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.40932083 = fieldWeight in 5282, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5282)
        0.05859032 = sum of:
          0.027233787 = weight(_text_:web in 5282) [ClassicSimilarity], result of:
            0.027233787 = score(doc=5282,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.18028519 = fieldWeight in 5282, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5282)
          0.031356532 = weight(_text_:22 in 5282) [ClassicSimilarity], result of:
            0.031356532 = score(doc=5282,freq=2.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.19345059 = fieldWeight in 5282, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5282)
      0.4 = coord(2/5)
    
    Abstract
    The Web has become a large repository of documents (or pages) written in many different languages. In this context, traditional information retrieval (IR) techniques cannot be used whenever the user query and the documents being retrieved are in different languages. To address this problem, new cross-language information retrieval (CLIR) techniques have been proposed. In this work, we describe a method for cross-language retrieval of medical information. This method combines query terms and related medical concepts obtained automatically through a categorization procedure. The medical concepts are used to create a linguistic abstraction that allows retrieval of information in a language-independent way, minimizing linguistic problems such as polysemy. To evaluate our method, we carried out experiments using the OHSUMED test collection, whose documents are written in English, with queries expressed in Portuguese, Spanish, and French. The results indicate that our cross-language retrieval method is as effective as a standard vector space model algorithm operating on queries and documents in the same language. Further, our results are better than previous results in the literature.
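A minimal sketch of the comparison drawn here: a standard vector-space (cosine) match of a query against documents, with the query optionally augmented by a language-independent concept identifier of the kind a categorization procedure would supply; the concept codes, terms, and counts are invented placeholders.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# English documents, already indexed with terms and (hypothetical) concept codes.
docs = {
    "d1": Counter({"myocardial": 2, "infarction": 2, "treatment": 1, "C14.280": 1}),
    "d2": Counter({"fracture": 2, "treatment": 1, "C26.404": 1}),
}

# Portuguese query: its surface terms share nothing with the English documents,
# but the concept code attached by the categorizer is language-independent.
query_terms = Counter({"enfarte": 1, "miocárdio": 1})
query_concepts = Counter({"C14.280": 1})   # assumed concept identifier

print({d: round(cosine(v, query_terms), 3) for d, v in docs.items()})
print({d: round(cosine(v, query_terms + query_concepts), 3) for d, v in docs.items()})
```

Without the concept the cross-language query matches nothing; with it, the relevant document is retrieved, which illustrates the linguistic abstraction the abstract refers to.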
    Date
    22. 7.2006 16:46:36
  14. Evens, M.: Thesaural relations in information retrieval (2002) 0.05
    0.046331253 = product of:
      0.11582813 = sum of:
        0.06278135 = weight(_text_:retrieval in 1201) [ClassicSimilarity], result of:
          0.06278135 = score(doc=1201,freq=10.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.44838852 = fieldWeight in 1201, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1201)
        0.05304678 = weight(_text_:semantic in 1201) [ClassicSimilarity], result of:
          0.05304678 = score(doc=1201,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.2756298 = fieldWeight in 1201, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=1201)
      0.4 = coord(2/5)
    
    Abstract
    Thesaural relations have long been used in information retrieval to enrich queries; they have sometimes been used to cluster documents as well. Sometimes the first query to an information retrieval system yields no results at all, or, what can be even more disconcerting, many thousands of hits. One solution is to rephrase the query, improving the choice of query terms by using related terms of different types. A collection of related terms is often called a thesaurus. This chapter describes the lexical-semantic relations that have been used in building thesauri and summarizes some of the effects of using these relational thesauri in information retrieval experiments
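A minimal sketch of the query enrichment described above, adding related terms of different relation types from a small hand-made relational thesaurus; the relation types, weights, and entries are assumptions for illustration.

```python
# Tiny relational thesaurus: term -> {relation type -> related terms}.
THESAURUS = {
    "car": {
        "synonym":  ["automobile"],
        "narrower": ["sedan", "hatchback"],
        "broader":  ["vehicle"],
    },
}

# Assumed contribution of each relation type to the expanded query.
RELATION_WEIGHT = {"synonym": 1.0, "narrower": 0.7, "broader": 0.4}

def expand(query_terms):
    """Return {term: weight} for the original terms plus thesaurally related ones."""
    expanded = {t: 1.0 for t in query_terms}
    for t in query_terms:
        for relation, related in THESAURUS.get(t, {}).items():
            for r in related:
                expanded[r] = max(expanded.get(r, 0.0), RELATION_WEIGHT[relation])
    return expanded

print(expand(["car", "rental"]))
```

Rerunning a failed query with the expanded, weighted term set is one way to escape the "no results at all" case mentioned above; down-weighting the added terms guards against drowning out the original ones when the first query already returns thousands of hits.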
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  15. Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2014) 0.05
    0.04631371 = product of:
      0.11578427 = sum of:
        0.04420565 = weight(_text_:semantic in 1962) [ClassicSimilarity], result of:
          0.04420565 = score(doc=1962,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.22969149 = fieldWeight in 1962, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1962)
        0.07157862 = sum of:
          0.027233787 = weight(_text_:web in 1962) [ClassicSimilarity], result of:
            0.027233787 = score(doc=1962,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.18028519 = fieldWeight in 1962, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1962)
          0.04434483 = weight(_text_:22 in 1962) [ClassicSimilarity], result of:
            0.04434483 = score(doc=1962,freq=4.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.27358043 = fieldWeight in 1962, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1962)
      0.4 = coord(2/5)
    
    Abstract
    This article reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The article discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the Dewey Decimal Classification [DDC] (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
    Footnote
    Contribution in a special issue "Beyond libraries: Subject metadata in the digital environment and Semantic Web" - Contains contributions from the IFLA Satellite Post-Conference of the same name, 17-18 August 2012, Tallinn.
  16. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.04
    0.044768713 = product of:
      0.11192178 = sum of:
        0.028076671 = weight(_text_:retrieval in 4436) [ClassicSimilarity], result of:
          0.028076671 = score(doc=4436,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.20052543 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.08384511 = sum of:
          0.046217266 = weight(_text_:web in 4436) [ClassicSimilarity], result of:
            0.046217266 = score(doc=4436,freq=4.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.3059541 = fieldWeight in 4436, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.046875 = fieldNorm(doc=4436)
          0.03762784 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
            0.03762784 = score(doc=4436,freq=2.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.23214069 = fieldWeight in 4436, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=4436)
      0.4 = coord(2/5)
    
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between the speed performance and the translation performance, and in what form the translated result is presented. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
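A minimal sketch of the query-translation step described above: bilingual dictionary lookup followed by selection of the candidate that is most frequent in a monolingual target-language corpus; the dictionary entries and frequencies are toy data, and MTIR's actual selection strategy is more elaborate.

```python
# Toy bilingual dictionary: Chinese query term -> candidate English translations.
DICTIONARY = {
    "銀行": ["bank", "banking house"],
    "利率": ["interest rate", "rate of interest"],
}

# Toy frequencies from a monolingual English corpus, used to rank translation candidates.
CORPUS_FREQ = {"bank": 5400, "banking house": 12, "interest rate": 2100, "rate of interest": 300}

def translate_query(terms):
    """Pick, for each source term, the candidate most frequent in the target-language corpus."""
    translated = []
    for term in terms:
        candidates = DICTIONARY.get(term, [term])   # pass unknown terms through untranslated
        translated.append(max(candidates, key=lambda c: CORPUS_FREQ.get(c, 0)))
    return translated

print(translate_query(["銀行", "利率"]))   # -> ['bank', 'interest rate']
```

Unknown terms fall through untranslated here; in MTIR proper names are instead handled by the machine transliteration algorithm mentioned in the abstract.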
    Date
    16. 2.2000 14:22:39
  17. Lassalle, E.: Text retrieval : from a monolingual system to a multilingual system (1993) 0.04
    0.043284822 = product of:
      0.108212054 = sum of:
        0.04632414 = weight(_text_:retrieval in 7403) [ClassicSimilarity], result of:
          0.04632414 = score(doc=7403,freq=4.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33085006 = fieldWeight in 7403, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7403)
        0.06188791 = weight(_text_:semantic in 7403) [ClassicSimilarity], result of:
          0.06188791 = score(doc=7403,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.32156807 = fieldWeight in 7403, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7403)
      0.4 = coord(2/5)
    
    Abstract
    Describes the TELMI monolingual text retrieval system and its future extension, a multilingual system. TELMI is designed for medium sized databases containing short texts. The characteristics of the system are fine-grained natural language processing (NLP); an open domain and a large scale knowledge base; automated indexing based on conceptual representation of texts and reusability of the NLP tools. Discusses the French MINITEL service, the MGS information service and the TELMI research system covering the full text system; NLP architecture; the lexical level; the syntactic level; the semantic level and an example of the use of a generic system
  18. McCulloch, E.: Multiple terminologies : an obstacle to information retrieval (2004) 0.04
    0.043284822 = product of:
      0.108212054 = sum of:
        0.04632414 = weight(_text_:retrieval in 2798) [ClassicSimilarity], result of:
          0.04632414 = score(doc=2798,freq=4.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33085006 = fieldWeight in 2798, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2798)
        0.06188791 = weight(_text_:semantic in 2798) [ClassicSimilarity], result of:
          0.06188791 = score(doc=2798,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.32156807 = fieldWeight in 2798, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2798)
      0.4 = coord(2/5)
    
    Abstract
    An issue currently at the forefront of digital library research is the prevalence of disparate terminologies and the associated limitations imposed on user searching. It is thought that semantic interoperability is achievable by improving the compatibility between terminologies and classification schemes, enabling users to search multiple resources simultaneously and improve retrieval effectiveness through the use of associated terms drawn from several schemes. This column considers the terminology issue before outlining various proposed methods of tackling it, with a particular focus on terminology mapping.
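A minimal sketch of the kind of terminology mapping discussed in this column: a user's term is expanded with equivalent terms drawn from other schemes before searching. The mapping table is a made-up illustration, not an actual crosswalk between published terminologies.

```python
# Minimal sketch of query expansion via terminology mapping; the mapping table
# below is an invented illustration, not a real crosswalk between schemes.

TERM_MAPPINGS = {
    # preferred term -> equivalent or associated terms drawn from other schemes
    "myocardial infarction": ["heart attack", "cardiac infarction"],
    "neoplasms": ["tumours", "cancer"],
}

def expand_query(term: str) -> list[str]:
    """Return the user's term plus mapped terms from other terminologies."""
    return [term] + TERM_MAPPINGS.get(term.lower(), [])

print(expand_query("Myocardial infarction"))
# -> ['Myocardial infarction', 'heart attack', 'cardiac infarction']
```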
  19. Ma, X.; Carranza, E.J.M.; Wu, C.; Meer, F.D. van der; Liu, G.: ¬A SKOS-based multilingual thesaurus of geological time scale for interoperability of online geological maps (2011) 0.04
    0.041692834 = product of:
      0.069488056 = sum of:
        0.01871778 = weight(_text_:retrieval in 4800) [ClassicSimilarity], result of:
          0.01871778 = score(doc=4800,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.13368362 = fieldWeight in 4800, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=4800)
        0.03536452 = weight(_text_:semantic in 4800) [ClassicSimilarity], result of:
          0.03536452 = score(doc=4800,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.18375319 = fieldWeight in 4800, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.03125 = fieldNorm(doc=4800)
        0.0154057555 = product of:
          0.030811511 = sum of:
            0.030811511 = weight(_text_:web in 4800) [ClassicSimilarity], result of:
              0.030811511 = score(doc=4800,freq=4.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.2039694 = fieldWeight in 4800, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4800)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    The usefulness of online geological maps is hindered by linguistic barriers. Multilingual geoscience thesauri alleviate linguistic barriers of geological maps. However, the benefits of multilingual geoscience thesauri for online geological maps are less studied. In this regard, we developed a multilingual thesaurus of geological time scale (GTS) to alleviate linguistic barriers of GTS records among online geological maps. We extended the Simple Knowledge Organization System (SKOS) model to represent the ordinal hierarchical structure of GTS terms. We collected GTS terms in seven languages and encoded them into a thesaurus by using the extended SKOS model. We implemented methods of characteristic-oriented term retrieval in JavaScript programs for accessing Web Map Services (WMS), recognizing GTS terms, and making translations. With the developed thesaurus and programs, we set up a pilot system to test recognitions and translations of GTS terms in online geological maps. Results of this pilot system proved the accuracy of the developed thesaurus and the functionality of the developed programs. Therefore, with proper deployments, SKOS-based multilingual geoscience thesauri can be functional for alleviating linguistic barriers among online geological maps and, thus, improving their interoperability.
    Content
    Article Outline: 1. Introduction - 2. SKOS-based multilingual thesaurus of geological time scale (2.1. Addressing the insufficiency of SKOS in the context of the Semantic Web; 2.2. Addressing semantics and syntax/lexicon in multilingual GTS terms; 2.3. Extending SKOS model to capture GTS structure; 2.4. Summary of building the SKOS-based MLTGTS) - 3. Recognizing and translating GTS terms retrieved from WMS - 4. Pilot system, results, and evaluation - 5. Discussion - 6. Conclusions.
    See: http://www.sciencedirect.com/science?_ob=MiamiImageURL&_cid=271720&_user=3865853&_pii=S0098300411000744&_check=y&_origin=&_coverDate=31-Oct-2011&view=c&wchp=dGLbVlt-zSkzS&_valck=1&md5=e2c1daf53df72d034d22278212578f42&ie=/sdarticle.pdf.
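The following sketch, using Python's rdflib, shows roughly what a multilingual SKOS concept for a geological time scale term looks like. The namespace, the `gts:order` property standing in for the authors' SKOS extension for ordinal structure, and the label set are assumptions made for illustration, not the published thesaurus.

```python
# Minimal sketch of a multilingual SKOS concept for a geological time scale term.
# URIs, the gts:order property, and the labels are illustrative assumptions.

from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, SKOS

GTS = Namespace("http://example.org/gts/")  # hypothetical namespace

g = Graph()
g.bind("skos", SKOS)
g.bind("gts", GTS)

jurassic = GTS["Jurassic"]
mesozoic = GTS["Mesozoic"]

g.add((jurassic, RDF.type, SKOS.Concept))
g.add((jurassic, SKOS.prefLabel, Literal("Jurassic", lang="en")))
g.add((jurassic, SKOS.prefLabel, Literal("Jura", lang="de")))
g.add((jurassic, SKOS.prefLabel, Literal("侏罗纪", lang="zh")))
g.add((jurassic, SKOS.broader, mesozoic))
# A custom ordinal property stands in for the kind of SKOS extension the paper
# proposes for the ordered structure of the time scale (property name assumed).
g.add((jurassic, GTS["order"], Literal(2)))

print(g.serialize(format="turtle"))
```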
  20. Strobel, S.; Marín-Arraiza, P.: Metadata for scientific audiovisual media : current practices and perspectives of the TIB / AV-portal (2015) 0.04
    0.038241964 = product of:
      0.09560491 = sum of:
        0.033088673 = weight(_text_:retrieval in 3667) [ClassicSimilarity], result of:
          0.033088673 = score(doc=3667,freq=4.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.23632148 = fieldWeight in 3667, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3667)
        0.062516235 = weight(_text_:semantic in 3667) [ClassicSimilarity], result of:
          0.062516235 = score(doc=3667,freq=4.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.32483283 = fieldWeight in 3667, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3667)
      0.4 = coord(2/5)
    
    Abstract
    Descriptive metadata play a key role in finding relevant search results in large amounts of unstructured data. However, current scientific audiovisual media are provided with little metadata, which makes them hard to find, let alone individual sequences. In this paper, the TIB / AV-Portal is presented as a use case where methods concerning the automatic generation of metadata, a semantic search and cross-lingual retrieval (German/English) have already been applied. These methods result in better discoverability of the scientific audiovisual media hosted in the portal. Text, speech, and image content of the video are automatically indexed by specialised GND (Gemeinsame Normdatei) subject headings. A semantic search is established based on properties of the GND ontology. The cross-lingual retrieval uses English 'translations' that were derived by an ontology mapping (DBpedia, among others). Further ways of increasing the discoverability and reuse of the metadata are publishing them as Linked Open Data and interlinking them with other data sets.
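A minimal sketch of the cross-lingual retrieval idea described above: German GND subject headings are bridged to English labels via an ontology-derived mapping, so an English query can match German-indexed media. The mapping and the sample records are invented placeholders, not TIB / AV-Portal data.

```python
# Minimal sketch of cross-lingual subject retrieval via an ontology-derived
# label mapping; the mapping and the sample records are invented placeholders.

GND_TO_ENGLISH = {
    "Fernerkundung": "remote sensing",
    "Maschinelles Lernen": "machine learning",
}

VIDEOS = [
    {"id": "av-001", "subjects": ["Fernerkundung"]},
    {"id": "av-002", "subjects": ["Maschinelles Lernen"]},
]

def search(query: str) -> list[str]:
    """Match an English query against German subject headings via the mapping."""
    query = query.lower()
    hits = []
    for video in VIDEOS:
        labels = {s.lower() for s in video["subjects"]}
        labels |= {GND_TO_ENGLISH.get(s, "").lower() for s in video["subjects"]}
        if query in labels:
            hits.append(video["id"])
    return hits

print(search("remote sensing"))  # -> ['av-001']
```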

Languages

  • e 163
  • d 20
  • f 2
  • ro 2
  • m 1
  • sp 1

Types

  • a 178
  • el 11
  • m 4
  • s 2
  • x 2
  • r 1