Search (7 results, page 1 of 1)

  • author_ss:"Järvelin, K."
  1. Toivonen, J.; Pirkola, A.; Keskustalo, H.; Visala, K.; Järvelin, K.: Translating cross-lingual spelling variants using transformation rules (2005) 0.03
    0.02627486 = product of:
      0.1313743 = sum of:
        0.1313743 = weight(_text_:dictionaries in 1052) [ClassicSimilarity], result of:
          0.1313743 = score(doc=1052,freq=2.0), product of:
            0.2864761 = queryWeight, product of:
              6.9177637 = idf(docFreq=118, maxDocs=44218)
              0.041411664 = queryNorm
            0.4585873 = fieldWeight in 1052, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.9177637 = idf(docFreq=118, maxDocs=44218)
              0.046875 = fieldNorm(doc=1052)
      0.2 = coord(1/5)
    
    Abstract
    Technical terms and proper names constitute a major problem in dictionary-based cross-language information retrieval (CLIR). However, technical terms and proper names in different languages often share the same Latin or Greek origin, being thus spelling variants of each other. In this paper we present a novel two-step fuzzy translation technique for cross-lingual spelling variants. In the first step, transformation rules are applied to source words to render them more similar to their target language equivalents. The rules are generated automatically using translation dictionaries as source data. In the second step, the intermediate forms obtained in the first step are translated into a target language using fuzzy matching. The effectiveness of the technique was evaluated empirically using five source languages and English as a target language. The two-step technique performed better, in some cases considerably better, than fuzzy matching alone. Even using the first step as such showed promising results.
  2. Lehtokangas, R.; Keskustalo, H.; Järvelin, K.: Experiments with transitive dictionary translation and pseudo-relevance feedback using graded relevance assessments (2008) 0.01
    0.009260482 = product of:
      0.046302408 = sum of:
        0.046302408 = product of:
          0.092604816 = sum of:
            0.092604816 = weight(_text_:german in 1349) [ClassicSimilarity], result of:
              0.092604816 = score(doc=1349,freq=2.0), product of:
                0.24051933 = queryWeight, product of:
                  5.808009 = idf(docFreq=360, maxDocs=44218)
                  0.041411664 = queryNorm
                0.38502026 = fieldWeight in 1349, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.808009 = idf(docFreq=360, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1349)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    In this article, the authors present evaluation results for transitive dictionary-based cross-language information retrieval (CLIR) using graded relevance assessments in a best match retrieval environment. A text database containing newspaper articles and a related set of 35 search topics were used in the tests. Source language topics (in English, German, and Swedish) were automatically translated into the target language (Finnish) via an intermediate (or pivot) language. Effectiveness of the transitively translated queries was compared to that of the directly translated and monolingual Finnish queries. Pseudo-relevance feedback (PRF) was also used to expand the original transitive target queries. Cross-language information retrieval performance was evaluated on three relevance thresholds: stringent, regular, and liberal. The transitive translations performed well achieving, on the average, 85-93% of the direct translation performance, and 66-72% of monolingual performance. Moreover, PRF was successful in raising the performance of transitive translation routes in absolute terms as well as in relation to monolingual and direct translation performance applying PRF.
  3. Saarikoski, J.; Laurikkala, J.; Järvelin, K.; Juhola, M.: A study of the use of self-organising maps in information retrieval (2009) 0.01
    0.007717067 = product of:
      0.038585335 = sum of:
        0.038585335 = product of:
          0.07717067 = sum of:
            0.07717067 = weight(_text_:german in 2836) [ClassicSimilarity], result of:
              0.07717067 = score(doc=2836,freq=2.0), product of:
                0.24051933 = queryWeight, product of:
                  5.808009 = idf(docFreq=360, maxDocs=44218)
                  0.041411664 = queryNorm
                0.3208502 = fieldWeight in 2836, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.808009 = idf(docFreq=360, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2836)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Purpose - The aim of this paper is to explore the possibility of retrieving information with Kohonen self-organising maps, which are known to be effective at grouping objects according to their similarity or dissimilarity. Design/methodology/approach - After conventional preprocessing, such as transforming into vector space, documents from a German document collection were trained for a neural network of Kohonen self-organising map type. Such an unsupervised network forms a document map from which relevant objects can be found according to queries. Findings - Self-organising maps ordered documents into groups from which it was possible to find relevant targets. Research limitations/implications - The number of documents used was moderate due to the limited number of documents associated with test topics. The training of self-organising maps entails rather long running times, which is their practical limitation. In future, the aim will be to build larger networks by compressing document matrices, and to develop document searching in them. Practical implications - With self-organising maps the distribution of documents can be visualised and relevant documents found in document collections of limited size. Originality/value - The paper reports on an approach that can be especially used to group documents and also for information search. So far self-organising maps have rarely been studied for information retrieval. Instead, they have been applied to document grouping tasks.
  4. Järvelin, K.; Kristensen, J.; Niemi, T.; Sormunen, E.; Keskustalo, H.: A deductive data model for query expansion (1996) 0.00
    0.0033664254 = product of:
      0.016832126 = sum of:
        0.016832126 = product of:
          0.033664253 = sum of:
            0.033664253 = weight(_text_:22 in 2230) [ClassicSimilarity], result of:
              0.033664253 = score(doc=2230,freq=2.0), product of:
                0.1450166 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041411664 = queryNorm
                0.23214069 = fieldWeight in 2230, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2230)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Source
    Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '96), Zürich, Switzerland, August 18-22, 1996. Eds.: H.P. Frei et al.
  5. Saastamoinen, M.; Järvelin, K.: Search task features in work tasks of varying types and complexity (2017) 0.00
    0.0033664254 = product of:
      0.016832126 = sum of:
        0.016832126 = product of:
          0.033664253 = sum of:
            0.033664253 = weight(_text_:22 in 3589) [ClassicSimilarity], result of:
              0.033664253 = score(doc=3589,freq=2.0), product of:
                0.1450166 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041411664 = queryNorm
                0.23214069 = fieldWeight in 3589, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3589)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Information searching in practice seldom is an end in itself. In work, work task (WT) performance forms the context, which information searching should serve. Therefore, information retrieval (IR) systems development/evaluation should take the WT context into account. The present paper analyzes how WT features (task complexity and task types) affect information searching in authentic work: the types of information needs, search processes, and search media. We collected data on 22 information professionals in authentic work situations in three organization types: city administration, universities, and companies. The data comprise 286 WTs and 420 search tasks (STs). The data include transaction logs, video recordings, daily questionnaires, interviews, and observation. The data were analyzed quantitatively. Even if the participants used a range of search media, most STs were simple throughout the data, and up to 42% of WTs did not include searching. WT's effects on STs are not straightforward: different WT types react differently to WT complexity. Due to the simplicity of authentic searching, the WT/ST types in interactive IR experiments should be reconsidered.
  6. Näppilä, T.; Järvelin, K.; Niemi, T.: A tool for data cube construction from structurally heterogeneous XML documents (2008) 0.00
    0.0028053545 = product of:
      0.014026772 = sum of:
        0.014026772 = product of:
          0.028053544 = sum of:
            0.028053544 = weight(_text_:22 in 1369) [ClassicSimilarity], result of:
              0.028053544 = score(doc=1369,freq=2.0), product of:
                0.1450166 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041411664 = queryNorm
                0.19345059 = fieldWeight in 1369, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1369)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Date
    9. 2.2008 17:22:42
  7. Vakkari, P.; Järvelin, K.; Chang, Y.-W.: The association of disciplinary background with the evolution of topics and methods in Library and Information Science research 1995-2015 (2023) 0.00
    0.0028053545 = product of:
      0.014026772 = sum of:
        0.014026772 = product of:
          0.028053544 = sum of:
            0.028053544 = weight(_text_:22 in 998) [ClassicSimilarity], result of:
              0.028053544 = score(doc=998,freq=2.0), product of:
                0.1450166 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041411664 = queryNorm
                0.19345059 = fieldWeight in 998, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=998)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Date
    22. 6.2023 18:15:06
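
The score breakdowns listed under each result follow the TF-IDF scheme labelled [ClassicSimilarity]. As a rough sketch of how those numbers fit together, the Python below recomputes a single-term score from the values shown in the breakdowns. The function names (idf, term_score, explain_score) are hypothetical helpers chosen for illustration, not part of any search engine API, and the formulas are the commonly documented ones (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))), assumed here to match what produced the listing.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


def explain_score(freq: float, doc_freq: int, max_docs: int,
                  query_norm: float, field_norm: float,
                  coords: list[tuple[int, int]]) -> float:
    """Apply each coord(matched/total) factor to the single-term score."""
    score = term_score(freq, doc_freq, max_docs, query_norm, field_norm)
    for matched, total in coords:
        score *= matched / total
    return score


# Inputs copied from the breakdown of result 1 (term "dictionaries", doc 1052).
result_1 = explain_score(2.0, 118, 44218, 0.041411664, 0.046875, [(1, 5)])

# Inputs copied from result 2 (term "german", doc 1349), which has two
# nested coord factors: coord(1/2) and coord(1/5).
result_2 = explain_score(2.0, 360, 44218, 0.041411664, 0.046875,
                         [(1, 2), (1, 5)])

print(f"result 1: {result_1:.8f}")
print(f"result 2: {result_2:.8f}")
```

The printed values should agree with the totals listed for results 1 and 2 up to floating-point rounding, since the listing appears to use single-precision arithmetic while this sketch uses double precision. The remaining results differ only in docFreq, fieldNorm, and the matched term.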