Search (4 results, page 1 of 1)

  • author_ss:"Kettunen, K."
  • theme_ss:"Computerlinguistik"
  1. Kettunen, K.; Kunttu, T.; Järvelin, K.: To stem or lemmatize a highly inflectional language in a probabilistic IR environment? (2005) 0.02
    0.016073458 = product of:
      0.032146916 = sum of:
        0.032146916 = product of:
          0.048220374 = sum of:
            0.04525219 = weight(_text_:k in 4395) [ClassicSimilarity], result of:
              0.04525219 = score(doc=4395,freq=4.0), product of:
                0.16225883 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04545348 = queryNorm
                0.2788889 = fieldWeight in 4395, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4395)
            0.0029681858 = weight(_text_:s in 4395) [ClassicSimilarity], result of:
              0.0029681858 = score(doc=4395,freq=2.0), product of:
                0.049418733 = queryWeight, product of:
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.04545348 = queryNorm
                0.060061958 = fieldWeight in 4395, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4395)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Journal of documentation. 61(2005) no.4, pp.476-496
  2. Airio, E.; Kettunen, K.: Does dictionary based bilingual retrieval work in a non-normalized index? (2009) 0.02
    0.015454079 = product of:
      0.030908158 = sum of:
        0.030908158 = product of:
          0.046362236 = sum of:
            0.038397755 = weight(_text_:k in 4224) [ClassicSimilarity], result of:
              0.038397755 = score(doc=4224,freq=2.0), product of:
                0.16225883 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04545348 = queryNorm
                0.23664509 = fieldWeight in 4224, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4224)
            0.007964479 = weight(_text_:s in 4224) [ClassicSimilarity], result of:
              0.007964479 = score(doc=4224,freq=10.0), product of:
                0.049418733 = queryWeight, product of:
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.04545348 = queryNorm
                0.16116315 = fieldWeight in 4224, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4224)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    Many operational IR indexes are non-normalized, i.e. no lemmatization or stemming techniques, etc. have been employed in indexing. This poses a challenge for dictionary-based cross-language retrieval (CLIR), because translations are mostly lemmas. In this study, we face the challenge of dictionary-based CLIR in a non-normalized index. We test two optional approaches: FCG (Frequent Case Generation) and s-gramming. The idea of FCG is to automatically generate the most frequent inflected forms for a given lemma. FCG has been tested in monolingual retrieval and has been shown to be a good method for inflected retrieval, especially for highly inflected languages. S-gramming is an approximate string matching technique (an extension of n-gramming). The language pairs in our tests were English-Finnish, English-Swedish, Swedish-Finnish and Finnish-Swedish. Both our approaches performed quite well, but the results varied depending on the language pair. S-gramming and FCG performed quite equally in all the other language pairs except Finnish-Swedish, where s-gramming outperformed FCG.
    Source
    Information processing and management. 45(2009) no.6, pp.703-713
  3. Kettunen, K.: Reductive and generative approaches to management of morphological variation of keywords in monolingual information retrieval : an overview (2009) 0.01
    0.013986527 = product of:
      0.027973054 = sum of:
        0.027973054 = product of:
          0.04195958 = sum of:
            0.038397755 = weight(_text_:k in 2835) [ClassicSimilarity], result of:
              0.038397755 = score(doc=2835,freq=2.0), product of:
                0.16225883 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04545348 = queryNorm
                0.23664509 = fieldWeight in 2835, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2835)
            0.003561823 = weight(_text_:s in 2835) [ClassicSimilarity], result of:
              0.003561823 = score(doc=2835,freq=2.0), product of:
                0.049418733 = queryWeight, product of:
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.04545348 = queryNorm
                0.072074346 = fieldWeight in 2835, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2835)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Journal of documentation. 65(2009) no.2, pp.267-290
  4. Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.01
    0.011655438 = product of:
      0.023310876 = sum of:
        0.023310876 = product of:
          0.034966312 = sum of:
            0.031998128 = weight(_text_:k in 3223) [ClassicSimilarity], result of:
              0.031998128 = score(doc=3223,freq=2.0), product of:
                0.16225883 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04545348 = queryNorm
                0.19720423 = fieldWeight in 3223, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3223)
            0.0029681858 = weight(_text_:s in 3223) [ClassicSimilarity], result of:
              0.0029681858 = score(doc=3223,freq=2.0), product of:
                0.049418733 = queryWeight, product of:
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.04545348 = queryNorm
                0.060061958 = fieldWeight in 3223, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.0872376 = idf(docFreq=40523, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3223)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.12, pp.2928-2946
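
The score trees above all follow the same arithmetic, so a minimal sketch of it may help when reading them. It assumes the standard ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm, fieldNorm and the coord factors directly from the listing rather than deriving them. The function names (idf, term_score, explain_score) are illustrative and are not part of any Lucene API.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency, assuming the ClassicSimilarity form.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in each "weight(...)" subtree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def explain_score(terms, max_docs, query_norm, field_norm, coords):
    # Sum the per-term scores, then apply each coord(matched/total) factor.
    total = sum(
        term_score(freq, doc_freq, max_docs, query_norm, field_norm)
        for freq, doc_freq in terms
    )
    for matched, clauses in coords:
        total *= matched / clauses
    return total


# Values copied from the explanation for result 1 (doc=4395):
# term "k": freq=4.0, docFreq=3384; term "s": freq=2.0, docFreq=40523.
score_4395 = explain_score(
    terms=[(4.0, 3384), (2.0, 40523)],
    max_docs=44218,
    query_norm=0.04545348,
    field_norm=0.0390625,
    coords=[(2, 3), (1, 2)],
)
print(score_4395)  # the listing reports 0.016073458 for this document
```

Small differences in the last digits are to be expected, since the listing shows single-precision values while Python computes in double precision. The same call with the freq and fieldNorm values of the other three results should reproduce their scores in the same way.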