Search (9 results, page 1 of 1)

  • × theme_ss:"Automatisches Indexieren"
  • × theme_ss:"Multilinguale Probleme"
  1. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.03
    0.032337323 = product of:
      0.064674646 = sum of:
        0.064674646 = sum of:
          0.00859806 = weight(_text_:a in 4157) [ClassicSimilarity], result of:
            0.00859806 = score(doc=4157,freq=4.0), product of:
              0.04772363 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.041389145 = queryNorm
              0.18016359 = fieldWeight in 4157, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.078125 = fieldNorm(doc=4157)
          0.056076583 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
            0.056076583 = score(doc=4157,freq=2.0), product of:
              0.14493774 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.041389145 = queryNorm
              0.38690117 = fieldWeight in 4157, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
    Type
    a
  2. Ferber, R.: Automated indexing with thesaurus descriptors : a co-occurence based approach to multilingual retrieval (1997) 0.00
    0.002843541 = product of:
      0.005687082 = sum of:
        0.005687082 = product of:
          0.011374164 = sum of:
            0.011374164 = weight(_text_:a in 4144) [ClassicSimilarity], result of:
              0.011374164 = score(doc=4144,freq=28.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.23833402 = fieldWeight in 4144, product of:
                  5.2915025 = tf(freq=28.0), with freq of:
                    28.0 = termFreq=28.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4144)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Indexing documents with descriptors from a multilingual thesaurus is an approach to multilingual information retrieval. However, manual indexing is expensive. Automazed indexing methods in general use terms found in the document. Thesaurus descriptors are complex terms that are often not used in documents or have specific meanings within the thesaurus; therefore most weighting schemes of automated indexing methods are not suited to select thesaurus descriptors. In this paper a linear associative system is described that uses similarity values extracted from a large corpus of manually indexed documents to construct a rank ordering of the descriptors for a given document title. The system is adaptive and has to be tuned with a training sample of records for the specific task. The system was tested on a corpus of some 80.000 bibliographic records. The results show a high variability with changing parameter values. This indicated that it is very important to empirically adapt the model to the specific situation it is used in. The overall median of the manually assigned descriptors in the automatically generated ranked list of all 3.631 descriptors is 14 for the set used to adapt the system and 11 for a test set not used in the optimization process. This result shows that the optimization is not a fitting to a specific training set but a real adaptation of the model to the setting
    Type
    a
  3. Hlava, M.M.K.: Machine-Aided Indexing (MAI) in a multilingual environemt (1992) 0.00
    0.0027189455 = product of:
      0.005437891 = sum of:
        0.005437891 = product of:
          0.010875782 = sum of:
            0.010875782 = weight(_text_:a in 2378) [ClassicSimilarity], result of:
              0.010875782 = score(doc=2378,freq=10.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.22789092 = fieldWeight in 2378, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2378)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The Machine-Aided Indexing (MAI) program, developed by Access Innovations, Inc., is a semantic based, Boolean statement, rule interpreting application designed to operate in a multilingual environment. Use of MAI across several languages with controlled vocabularies for each language provides a consistency in indexing not available through any other mechanism
    Type
    a
  4. Lassalle, E.: Text retrieval : from a monolingual system to a multilingual system (1993) 0.00
    0.0026061484 = product of:
      0.0052122967 = sum of:
        0.0052122967 = product of:
          0.010424593 = sum of:
            0.010424593 = weight(_text_:a in 7403) [ClassicSimilarity], result of:
              0.010424593 = score(doc=7403,freq=12.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.21843673 = fieldWeight in 7403, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7403)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes the TELMI monolingual text retrieval system and its future extension, a multilingual system. TELMI is designed for medium sized databases containing short texts. The characteristics of the system are fine-grained natural language processing (NLP); an open domain and a large scale knowledge base; automated indexing based on conceptual representation of texts and reusability of the NLP tools. Discusses the French MINITEL service, the MGS information service and the TELMI research system covering the full text system; NLP architecture; the lexical level; the syntactic level; the semantic level and an example of the use of a generic system
    Type
    a
  5. Hlava, M.M.K.: Machine aided indexing (MAI) in a multilingual environment (1993) 0.00
    0.0026061484 = product of:
      0.0052122967 = sum of:
        0.0052122967 = product of:
          0.010424593 = sum of:
            0.010424593 = weight(_text_:a in 7405) [ClassicSimilarity], result of:
              0.010424593 = score(doc=7405,freq=12.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.21843673 = fieldWeight in 7405, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7405)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The machine aided indexing (MAI) software devloped by Access Innovations, Inc., is a semantic based, Boolean statement, rule interpreting application with 3 modules: the MA engine which accepts input files, matches terms in the knowledge base, interprets rules, and outputs a text file with suggested indexing terms; a rule building application allowing each Boolean style rule in the knowledge base to be created or modifies; and a statistical computation module which analyzes performance of the MA software against text manually indexed by professional human indexers. The MA software can be applied across multiple languages and can be used where the text to be searched is in one language and the indexes to be output are in another
    Type
    a
  6. Strobel, S.; Marín-Arraiza, P.: Metadata for scientific audiovisual media : current practices and perspectives of the TIB / AV-portal (2015) 0.00
    0.0020106873 = product of:
      0.0040213745 = sum of:
        0.0040213745 = product of:
          0.008042749 = sum of:
            0.008042749 = weight(_text_:a in 3667) [ClassicSimilarity], result of:
              0.008042749 = score(doc=3667,freq=14.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.1685276 = fieldWeight in 3667, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3667)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Descriptive metadata play a key role in finding relevant search results in large amounts of unstructured data. However, current scientific audiovisual media are provided with little metadata, which makes them hard to find, let alone individual sequences. In this paper, the TIB / AV-Portal is presented as a use case where methods concerning the automatic generation of metadata, a semantic search and cross-lingual retrieval (German/English) have already been applied. These methods result in a better discoverability of the scientific audiovisual media hosted in the portal. Text, speech, and image content of the video are automatically indexed by specialised GND (Gemeinsame Normdatei) subject headings. A semantic search is established based on properties of the GND ontology. The cross-lingual retrieval uses English 'translations' that were derived by an ontology mapping (DBpedia i. a.). Further ways of increasing the discoverability and reuse of the metadata are publishing them as Linked Open Data and interlinking them with other data sets.
    Type
    a
  7. Flores, F.N.; Moreira, V.P.: Assessing the impact of stemming accuracy on information retrieval : a multilingual perspective (2016) 0.00
    0.001823924 = product of:
      0.003647848 = sum of:
        0.003647848 = product of:
          0.007295696 = sum of:
            0.007295696 = weight(_text_:a in 3187) [ClassicSimilarity], result of:
              0.007295696 = score(doc=3187,freq=8.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.15287387 = fieldWeight in 3187, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3187)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The quality of stemming algorithms is typically measured in two different ways: (i) how accurately they map the variant forms of a word to the same stem; or (ii) how much improvement they bring to Information Retrieval systems. In this article, we evaluate various stemming algorithms, in four languages, in terms of accuracy and in terms of their aid to Information Retrieval. The aim is to assess whether the most accurate stemmers are also the ones that bring the biggest gain in Information Retrieval. Experiments in English, French, Portuguese, and Spanish show that this is not always the case, as stemmers with higher error rates yield better retrieval quality. As a byproduct, we also identified the most accurate stemmers and the best for Information Retrieval purposes.
    Type
    a
  8. Stegentritt, E.: Evaluationsresultate des mehrsprachigen Suchsystems CANAL/LS (1998) 0.00
    0.001719612 = product of:
      0.003439224 = sum of:
        0.003439224 = product of:
          0.006878448 = sum of:
            0.006878448 = weight(_text_:a in 7216) [ClassicSimilarity], result of:
              0.006878448 = score(doc=7216,freq=4.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.14413087 = fieldWeight in 7216, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7216)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The search system CANAL/LS simplifies the searching of library catalogues by analyzing search questions linguistically and translating them if required. The linguistic analysis reduces the search question words to their basic forms so that they can be compared with basic title forms. Consequently all variants of words and parts of compounds in German can be found. Presents the results of an analysis of search questions in a catalogue of 45.000 titles in the field of psychology
    Type
    a
  9. Strobel, S.: Englischsprachige Erweiterung des TIB / AV-Portals : Ein GND/DBpedia-Mapping zur Gewinnung eines englischen Begriffssystems (2014) 0.00
    7.5996824E-4 = product of:
      0.0015199365 = sum of:
        0.0015199365 = product of:
          0.003039873 = sum of:
            0.003039873 = weight(_text_:a in 2876) [ClassicSimilarity], result of:
              0.003039873 = score(doc=2876,freq=2.0), product of:
                0.04772363 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.041389145 = queryNorm
                0.06369744 = fieldWeight in 2876, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2876)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a