Search (6 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Retrievalstudien"
  • year_i:[2000 TO 2010}
  1. Talvensaari, T.; Laurikkala, J.; Järvelin, K.; Juhola, M.: A study on automatic creation of a comparable document collection in cross-language information retrieval (2006) 0.03
    0.029645318 = product of:
      0.059290636 = sum of:
        0.059290636 = product of:
          0.11858127 = sum of:
            0.11858127 = weight(_text_:400 in 5601) [ClassicSimilarity], result of:
              0.11858127 = score(doc=5601,freq=2.0), product of:
                0.32745647 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.049953517 = queryNorm
                0.36212835 = fieldWeight in 5601, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5601)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Purpose - To present a method for creating a comparable document collection from two document collections in different languages. Design/methodology/approach - The best query keys were extracted from a Finnish source collection (articles of the newspaper Aamulehti) with the relative average term frequency formula. The keys were translated into English with a dictionary-based query translation program. The resulting lists of words were used as queries that were run against the target collection (Los Angeles Times articles) with the nearest neighbor method. The documents were aligned with unrestricted and date-restricted alignment schemes, which were also combined. Findings - The combined alignment scheme was found to be the best when the relatedness of the document pairs was assessed with a five-degree relevance scale. Of the 400 document pairs, roughly 40 percent were highly or fairly related and 75 percent included at least lexical similarity. Research limitations/implications - The number of alignment pairs was small due to the short common time period of the two collections and their geographical (and thus topical) remoteness. In future, our aim is to build larger comparable corpora in various languages and use them as a source of translation knowledge for the purposes of cross-language information retrieval (CLIR). Practical implications - Readily available parallel corpora are scarce. With this method, two unrelated document collections can relatively easily be aligned to create a CLIR resource. Originality/value - The method can be applied to weakly linked collections and morphologically complex languages, such as Finnish.
  2. Voorhees, E.M.; Harman, D.: Overview of the Sixth Text REtrieval Conference (TREC-6) (2000) 0.02
    0.023688043 = product of:
      0.047376085 = sum of:
        0.047376085 = product of:
          0.09475217 = sum of:
            0.09475217 = weight(_text_:22 in 6438) [ClassicSimilarity], result of:
              0.09475217 = score(doc=6438,freq=2.0), product of:
                0.17492871 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049953517 = queryNorm
                0.5416616 = fieldWeight in 6438, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6438)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    11. 8.2001 16:22:19
  3. Leininger, K.: Interindexer consistency in PsycINFO (2000) 0.01
    0.010152018 = product of:
      0.020304035 = sum of:
        0.020304035 = product of:
          0.04060807 = sum of:
            0.04060807 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
              0.04060807 = score(doc=2552,freq=2.0), product of:
                0.17492871 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049953517 = queryNorm
                0.23214069 = fieldWeight in 2552, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2552)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    9. 2.1997 18:44:22
  4. King, D.W.: Blazing new trails : in celebration of an audacious career (2000) 0.01
    0.008460015 = product of:
      0.01692003 = sum of:
        0.01692003 = product of:
          0.03384006 = sum of:
            0.03384006 = weight(_text_:22 in 1184) [ClassicSimilarity], result of:
              0.03384006 = score(doc=1184,freq=2.0), product of:
                0.17492871 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049953517 = queryNorm
                0.19345059 = fieldWeight in 1184, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1184)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.1997 19:16:05
  5. Petrelli, D.: On the role of user-centred evaluation in the advancement of interactive information retrieval (2008) 0.01
    0.008460015 = product of:
      0.01692003 = sum of:
        0.01692003 = product of:
          0.03384006 = sum of:
            0.03384006 = weight(_text_:22 in 2026) [ClassicSimilarity], result of:
              0.03384006 = score(doc=2026,freq=2.0), product of:
                0.17492871 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049953517 = queryNorm
                0.19345059 = fieldWeight in 2026, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2026)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 44(2008) no.1, S.22-38
  6. Larsen, B.; Ingwersen, P.; Lund, B.: Data fusion according to the principle of polyrepresentation (2009) 0.01
    0.006768012 = product of:
      0.013536024 = sum of:
        0.013536024 = product of:
          0.027072048 = sum of:
            0.027072048 = weight(_text_:22 in 2752) [ClassicSimilarity], result of:
              0.027072048 = score(doc=2752,freq=2.0), product of:
                0.17492871 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049953517 = queryNorm
                0.15476047 = fieldWeight in 2752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2752)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 18:48:28
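
Note on the score trees: each explanation above multiplies a queryWeight (idf × queryNorm) by a fieldWeight (tf × idf × fieldNorm) and then applies the coord factors. The following is a minimal Python sketch of that arithmetic, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the trees. The function name and parameters are hypothetical and are not part of Lucene or Solr. queryNorm and fieldNorm are taken as given from the explain output, and because Lucene computes in 32-bit floats, the last digits may differ slightly.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute one single-term score from an explain tree (illustrative helper).

    All arguments are read off the explain output; nothing is looked up
    in an index. `coords` holds the coord(...) factors, e.g. [0.5, 0.5].
    """
    tf = math.sqrt(freq)                              # tf(freq=2.0) -> 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                   # queryWeight
    field_weight = tf * idf * field_norm              # fieldWeight
    score = query_weight * field_weight               # weight(_text_:term in doc)
    for c in coords:                                  # apply each coord(1/2)
        score *= c
    return score


if __name__ == "__main__":
    # Values copied from result 1 (doc 5601, term "400").
    s = classic_similarity_score(
        freq=2.0, doc_freq=170, max_docs=44218,
        query_norm=0.049953517, field_norm=0.0390625,
        coords=[0.5, 0.5],
    )
    print(round(s, 4))  # close to the 0.029645318 shown for result 1
```

Since queryNorm, tf, and the coord factors are identical for results 2 to 6 (all match the term "22" twice), their ranking differs only through fieldNorm, which is the length-normalization factor stored for each document.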