Search (37 results, page 1 of 2)

Talvensaari, T.; Juhola, M.; Laurikkala, J.; Järvelin, K.: Corpus-based cross-language information retrieval in retrieval of highly relevant documents (2007) 0.03

0.025605785 = product of:
  0.08962024 = sum of:
    0.032137483 = weight(_text_:wide in 139) [ClassicSimilarity], result of:
      0.032137483 = score(doc=139,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.24476713 = fieldWeight in 139, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.017435152 = weight(_text_:web in 139) [ClassicSimilarity], result of:
      0.017435152 = score(doc=139,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.18028519 = fieldWeight in 139, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.010089659 = weight(_text_:information in 139) [ClassicSimilarity], result of:
      0.010089659 = score(doc=139,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.19395474 = fieldWeight in 139, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.029957948 = weight(_text_:retrieval in 139) [ClassicSimilarity], result of:
      0.029957948 = score(doc=139,freq=8.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33420905 = fieldWeight in 139, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
  0.2857143 = coord(4/14)
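
The explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and not part of any library API. The numeric inputs are copied from the tree for document 139.

```python
import math

def idf(doc_freq, max_docs):
    # Classic TF-IDF inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Classic term-frequency factor: square root of the raw term count
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, matching the two "product of" branches
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 139
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.0390625

# term -> (termFreq, docFreq)
terms = {
    "wide": (2.0, 1430),
    "web": (2.0, 4597),
    "information": (8.0, 20772),
    "retrieval": (8.0, 5836),
}

term_scores = {
    term: term_score(freq, df, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for term, (freq, df) in terms.items()
}

# coord(4/14): 4 of the 14 query clauses matched this document
total = sum(term_scores.values()) * (4 / 14)
print(term_scores)
print(total)
```

The per-term values and the final total should agree with the tree to within floating-point rounding, since the engine reports single-precision numbers.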

Abstract: Information retrieval systems' ability to retrieve highly relevant documents has become more and more important in the age of extremely large collections, such as the World Wide Web (WWW). The authors' aim was to find out how corpus-based cross-language information retrieval (CLIR) manages in retrieving highly relevant documents. They created a Finnish-Swedish comparable corpus from two loosely related document collections and used it as a source of knowledge for query translation. Finnish test queries were translated into Swedish and run against a Swedish test collection. Graded relevance assessments were used in evaluating the results, and three relevance criterion levels (liberal, regular, and stringent) were applied. The runs were also evaluated with generalized recall and precision, which weight the retrieved documents according to their relevance level. The performance of the Comparable Corpus Translation system (COCOT) was compared to that of a dictionary-based query translation program; the two translation methods were also combined. The results indicate that corpus-based CLIR performs particularly well with highly relevant documents. In average precision, COCOT even matched the monolingual baseline on the highest relevance level. The performance of the different query translation methods was further analyzed by finding out reasons for poor rankings of highly relevant documents.
Source: Journal of the American Society for Information Science and Technology. 58(2007) no.3, S.322-334

Pirkola, A.; Hedlund, T.; Keskustalo, H.; Järvelin, K.: Dictionary-based cross-language information retrieval : problems, methods, and research findings (2001) 0.01

0.0113271745 = product of:
  0.07929022 = sum of:
    0.019976506 = weight(_text_:information in 3908) [ClassicSimilarity], result of:
      0.019976506 = score(doc=3908,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3840108 = fieldWeight in 3908, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.109375 = fieldNorm(doc=3908)
    0.05931371 = weight(_text_:retrieval in 3908) [ClassicSimilarity], result of:
      0.05931371 = score(doc=3908,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.6617001 = fieldWeight in 3908, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.109375 = fieldNorm(doc=3908)
  0.14285715 = coord(2/14)

Source: Information retrieval. 4(2001), S.209-230

Pharo, N.; Järvelin, K.: The SST method : a tool for analysing Web information search processes (2004) 0.01

0.010910697 = product of:
  0.050916586 = sum of:
    0.02465703 = weight(_text_:web in 2533) [ClassicSimilarity], result of:
      0.02465703 = score(doc=2533,freq=4.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.25496176 = fieldWeight in 2533, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2533)
    0.011280581 = weight(_text_:information in 2533) [ClassicSimilarity], result of:
      0.011280581 = score(doc=2533,freq=10.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.21684799 = fieldWeight in 2533, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2533)
    0.014978974 = weight(_text_:retrieval in 2533) [ClassicSimilarity], result of:
      0.014978974 = score(doc=2533,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.16710453 = fieldWeight in 2533, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2533)
  0.21428572 = coord(3/14)

Abstract: The article presents the search situation transition (SST) method for analysing Web information search (WIS) processes. The idea of the method is to analyse searching behaviour, the process, in detail and connect both the searchers' actions (captured in a log) and his/her intentions and goals, which log analysis never captures. On the other hand, ex post facto surveys, while popular in WIS research, cannot capture the actual search processes. The method is presented through three facets: its domain, its procedure, and its justification. The method's domain is presented in the form of a conceptual framework which maps five central categories that influence WIS processes: the searcher, the social/organisational environment, the work task, the search task, and the process itself. The method's procedure includes various techniques for data collection and analysis. The article presents examples from real WIS processes and shows how the method can be used to identify the interplay of the categories during the processes. It is shown that the method presents a new approach in information seeking and retrieval by focusing on the search process as a phenomenon and by explicating how different information seeking factors directly affect the search process.
Source: Information processing and management. 40(2004) no.4, S.633-654

Saastamoinen, M.; Järvelin, K.: Search task features in work tasks of varying types and complexity (2017) 0.01

0.009004591 = product of:
  0.042021424 = sum of:
    0.016016837 = weight(_text_:information in 3589) [ClassicSimilarity], result of:
      0.016016837 = score(doc=3589,freq=14.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3078936 = fieldWeight in 3589, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=3589)
    0.01797477 = weight(_text_:retrieval in 3589) [ClassicSimilarity], result of:
      0.01797477 = score(doc=3589,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 3589, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=3589)
    0.008029819 = product of:
      0.024089456 = sum of:
        0.024089456 = weight(_text_:22 in 3589) [ClassicSimilarity], result of:
          0.024089456 = score(doc=3589,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.23214069 = fieldWeight in 3589, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3589)
      0.33333334 = coord(1/3)
  0.21428572 = coord(3/14)
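
This tree contains a nested clause: the match on "22" is first scaled by its own coord(1/3) and only then summed with the two top-level term weights. A minimal sketch of that combination step, using the leaf scores copied from the tree for document 3589 (the helper name `combine` is illustrative, not a library function):

```python
def combine(clause_scores, matched, total):
    # Sum the matching clause scores, then scale by the coordination
    # factor coord(matched/total), as each "product of" node does.
    return sum(clause_scores) * (matched / total)

# Inner clause group: one of three sub-clauses matched (the term "22")
inner = combine([0.024089456], matched=1, total=3)

# Outer query: information, retrieval, and the nested group; 3 of 14 clauses matched
outer = combine([0.016016837, 0.01797477, inner], matched=3, total=14)

print(inner)
print(outer)
```

Under these inputs, `inner` and `outer` should agree with the nested product and the document's final score in the tree, up to floating-point rounding.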

Abstract: Information searching in practice seldom is an end in itself. In work, work task (WT) performance forms the context, which information searching should serve. Therefore, information retrieval (IR) systems development/evaluation should take the WT context into account. The present paper analyzes how WT features (task complexity and task types) affect information searching in authentic work: the types of information needs, search processes, and search media. We collected data on 22 information professionals in authentic work situations in three organization types: city administration, universities, and companies. The data comprise 286 WTs and 420 search tasks (STs). The data include transaction logs, video recordings, daily questionnaires, interviews, and observation. The data were analyzed quantitatively. Even if the participants used a range of search media, most STs were simple throughout the data, and up to 42% of WTs did not include searching. WT's effects on STs are not straightforward: different WT types react differently to WT complexity. Due to the simplicity of authentic searching, the WT/ST types in interactive IR experiments should be reconsidered.
Source: Journal of the Association for Information Science and Technology. 68(2017) no.5, S.1111-1123

Hansen, P.; Järvelin, K.: Collaborative Information Retrieval in an information-intensive domain (2005) 0.01
```
0.008884343 = product of:
  0.0621904 = sum of:
    0.018161386 = weight(_text_:information in 1040) [ClassicSimilarity], result of:
      0.018161386 = score(doc=1040,freq=18.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.34911853 = fieldWeight in 1040, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1040)
    0.044029012 = weight(_text_:retrieval in 1040) [ClassicSimilarity], result of:
      0.044029012 = score(doc=1040,freq=12.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.49118498 = fieldWeight in 1040, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1040)
  0.14285715 = coord(2/14)
```
Abstract

In this article we investigate the expressions of collaborative activities within information seeking and retrieval processes (IS&R). Generally, information seeking and retrieval is regarded as an individual and isolated process in IR research. We assume that an IS&R situation is not merely an individual effort, but inherently involves various collaborative activities. We present empirical results from a real-life and information-intensive setting within the patent domain, showing that the patent task performance process involves highly collaborative aspects throughout the stages of the information seeking and retrieval process. Furthermore, we show that these activities may be categorised and related to different stages in an information seeking and retrieval process. Therefore, the assumption that information retrieval performance is purely individual needs to be reconsidered. Finally, we also propose a refined IR framework involving collaborative aspects.

Source

Information processing and management. 41(2005) no.5, S.1101-1120

Järvelin, K.; Kristensen, J.; Niemi, T.; Sormunen, E.; Keskustalo, H.: A deductive data model for query expansion (1996) 0.01

0.0084650945 = product of:
  0.039503776 = sum of:
    0.0060537956 = weight(_text_:information in 2230) [ClassicSimilarity], result of:
      0.0060537956 = score(doc=2230,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.116372846 = fieldWeight in 2230, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2230)
    0.025420163 = weight(_text_:retrieval in 2230) [ClassicSimilarity], result of:
      0.025420163 = score(doc=2230,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.2835858 = fieldWeight in 2230, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2230)
    0.008029819 = product of:
      0.024089456 = sum of:
        0.024089456 = weight(_text_:22 in 2230) [ClassicSimilarity], result of:
          0.024089456 = score(doc=2230,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.23214069 = fieldWeight in 2230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2230)
      0.33333334 = coord(1/3)
  0.21428572 = coord(3/14)

Source: Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '96), Zürich, Switzerland, August 18-22, 1996. Eds.: H.P. Frei et al
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Järvelin, K.: Evaluation (2011) 0.01

0.008009522 = product of:
  0.05606665 = sum of:
    0.014125523 = weight(_text_:information in 548) [ClassicSimilarity], result of:
      0.014125523 = score(doc=548,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.27153665 = fieldWeight in 548, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.109375 = fieldNorm(doc=548)
    0.04194113 = weight(_text_:retrieval in 548) [ClassicSimilarity], result of:
      0.04194113 = score(doc=548,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.46789268 = fieldWeight in 548, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.109375 = fieldNorm(doc=548)
  0.14285715 = coord(2/14)

Source: Interactive information seeking, behaviour and retrieval. Eds.: Ruthven, I. u. D. Kelly

Järvelin, K.; Niemi, T.: Deductive information retrieval based on classifications (1993) 0.01
```
0.007471486 = product of:
  0.0523004 = sum of:
    0.012107591 = weight(_text_:information in 2229) [ClassicSimilarity], result of:
      0.012107591 = score(doc=2229,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23274569 = fieldWeight in 2229, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2229)
    0.04019281 = weight(_text_:retrieval in 2229) [ClassicSimilarity], result of:
      0.04019281 = score(doc=2229,freq=10.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.44838852 = fieldWeight in 2229, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2229)
  0.14285715 = coord(2/14)
```
Abstract

Modern fact databases contain abundant data classified through several classifications. Typically, users must consult these classifications in separate manuals or files, thus making their effective use difficult. Contemporary database systems do little to support deductive use of classifications. In this study we show how deductive data management techniques can be applied to the utilization of data value classifications. Computation of transitive class relationships is of primary importance here. We define a representation of classifications which supports transitive computation and present an operation-oriented deductive query language tailored for classification-based deductive information retrieval. The operations of this language are on the same abstraction level as relational algebra operations and can be integrated with these to form a powerful and flexible query language for deductive information retrieval. We define the integration of these operations and demonstrate the usefulness of the language in terms of several sample queries.

Source

Journal of the American Society for Information Science. 44(1993) no.10, S.557-578

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Klassifikationssysteme im Online-Retrieval

Pharo, N.; Järvelin, K.: "Irrational" searchers and IR-rational researchers (2006) 0.01

0.0072008185 = product of:
  0.050405726 = sum of:
    0.041844364 = weight(_text_:web in 4922) [ClassicSimilarity], result of:
      0.041844364 = score(doc=4922,freq=8.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.43268442 = fieldWeight in 4922, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=4922)
    0.00856136 = weight(_text_:information in 4922) [ClassicSimilarity], result of:
      0.00856136 = score(doc=4922,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 4922, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4922)
  0.14285715 = coord(2/14)

Abstract: In this article the authors look at the prescriptions advocated by Web search textbooks in the light of a selection of empirical data of real Web information search processes. They use the strategy of disjointed incrementalism, which is a theoretical foundation from decision making, to focus on how people face complex problems, and claim that such problem solving can be compared to the tasks searchers perform when interacting with the Web. The findings suggest that textbooks on Web searching should take into account that searchers only tend to take a certain number of sources into consideration, that the searchers adjust their goals and objectives during searching, and that searchers reconsider the usefulness of sources at different stages of their work tasks as well as their search tasks.
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.2, S.222-232

Pirkola, A.; Puolamäki, D.; Järvelin, K.: Applying query structuring in cross-language retrieval (2003) 0.01
```
0.0063587003 = product of:
  0.0445109 = sum of:
    0.00856136 = weight(_text_:information in 1074) [ClassicSimilarity], result of:
      0.00856136 = score(doc=1074,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 1074, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1074)
    0.03594954 = weight(_text_:retrieval in 1074) [ClassicSimilarity], result of:
      0.03594954 = score(doc=1074,freq=8.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.40105087 = fieldWeight in 1074, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1074)
  0.14285715 = coord(2/14)
```
Abstract

We will explore various ways to apply query structuring in cross-language information retrieval. In the first test, English queries were translated into Finnish using an electronic dictionary, and were run in a Finnish newspaper database of 55,000 articles. Queries were structured by combining the Finnish translation equivalents of the same English query key using the syn-operator of the InQuery retrieval system. Structured queries performed markedly better than unstructured queries. Second, the effects of compound-based structuring using a proximity operator for the translation equivalents of query language compound components were tested. The method was not useful in syn-based queries but resulted in a decrease in retrieval effectiveness. Proper names are often non-identical spelling variants in different languages. This allows n-gram based translation of names not included in a dictionary. In the third test, a query structuring method where the Boolean and-operator was used to assign more weight to keys translated through n-gram matching gave good results.

Source

Information processing and management. 39(2003) no.3, S.391-402

Järvelin, K.; Ingwersen, P.: User-oriented and cognitive models of information retrieval (2009) 0.01

0.005984274 = product of:
  0.041889917 = sum of:
    0.012233062 = weight(_text_:information in 3901) [ClassicSimilarity], result of:
      0.012233062 = score(doc=3901,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23515764 = fieldWeight in 3901, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3901)
    0.029656855 = weight(_text_:retrieval in 3901) [ClassicSimilarity], result of:
      0.029656855 = score(doc=3901,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33085006 = fieldWeight in 3901, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3901)
  0.14285715 = coord(2/14)

Abstract: The domain of user-oriented and cognitive information retrieval (IR) is first discussed, followed by a discussion on the dimensions and types of models one may build for the domain. The focus of the present entry is on the models of user-oriented and cognitive IR, not on their empirical applications. Several models with different emphases on user-oriented and cognitive IR are presented, ranging from overall approaches and relevance models to procedural models, cognitive models, and task-based models. The present entry does not discuss empirical findings based on the models.
Source: Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates

Lehtokangas, R.; Keskustalo, H.; Järvelin, K.: Experiments with transitive dictionary translation and pseudo-relevance feedback using graded relevance assessments (2008) 0.01
```
0.0059455284 = product of:
  0.041618697 = sum of:
    0.0104854815 = weight(_text_:information in 1349) [ClassicSimilarity], result of:
      0.0104854815 = score(doc=1349,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.20156369 = fieldWeight in 1349, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1349)
    0.031133216 = weight(_text_:retrieval in 1349) [ClassicSimilarity], result of:
      0.031133216 = score(doc=1349,freq=6.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.34732026 = fieldWeight in 1349, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1349)
  0.14285715 = coord(2/14)
```
Abstract

In this article, the authors present evaluation results for transitive dictionary-based cross-language information retrieval (CLIR) using graded relevance assessments in a best match retrieval environment. A text database containing newspaper articles and a related set of 35 search topics were used in the tests. Source language topics (in English, German, and Swedish) were automatically translated into the target language (Finnish) via an intermediate (or pivot) language. Effectiveness of the transitively translated queries was compared to that of the directly translated and monolingual Finnish queries. Pseudo-relevance feedback (PRF) was also used to expand the original transitive target queries. Cross-language information retrieval performance was evaluated on three relevance thresholds: stringent, regular, and liberal. The transitive translations performed well achieving, on the average, 85-93% of the direct translation performance, and 66-72% of monolingual performance. Moreover, PRF was successful in raising the performance of transitive translation routes in absolute terms as well as in relation to monolingual and direct translation performance applying PRF.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.3, S.476-488

Vakkari, P.; Järvelin, K.: Explanation in information seeking and retrieval (2005) 0.01
```
0.0058251214 = product of:
  0.040775847 = sum of:
    0.013980643 = weight(_text_:information in 643) [ClassicSimilarity], result of:
      0.013980643 = score(doc=643,freq=24.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.2687516 = fieldWeight in 643, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=643)
    0.026795205 = weight(_text_:retrieval in 643) [ClassicSimilarity], result of:
      0.026795205 = score(doc=643,freq=10.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.29892567 = fieldWeight in 643, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=643)
  0.14285715 = coord(2/14)
```
Abstract

Information Retrieval (IR) is a research area both within Computer Science and Information Science. It has by and large two communities: a Computer Science oriented experimental approach and a user-oriented Information Science approach with a Social Science background. The communities hold a critical stance towards each other (e.g., Ingwersen, 1996), the latter suspecting the realism of the former, and the former suspecting the usefulness of the latter. Within Information Science the study of information seeking (IS) also has a Social Science background. There is a lot of research in each of these particular areas of information seeking and retrieval (IS&R). However, the three communities do not really communicate with each other. Why is this, and could the relationships be otherwise? Do the communities in fact belong together? Or perhaps each community is better off forgetting about the existence of the other two? We feel that the relationships between the research areas have not been properly analyzed. One way to analyze the relationships is to examine what each research area is trying to find out: which phenomena are being explained and how. We believe that IS&R research would benefit from being analytic about its frameworks, models and theories, not just at the level of meta-theories, but also much more concretely at the level of study designs. Over the years there have been calls for more context in the study of IS&R. Work tasks as well as cultural activities/interests have been proposed as the proper context for information access. For example, Wersig (1973) conceptualized information needs from the tasks perspective. He argued that in order to learn about information needs and seeking, one needs to take into account the whole active professional role of the individuals being investigated. Byström and Järvelin (1995) analysed IS processes in the light of tasks of varying complexity. 
Ingwersen (1996) discussed the role of tasks and their descriptions and problematic situations from a cognitive perspective on IR. Most recently, Vakkari (2003) reviewed task-based IR and Järvelin and Ingwersen (2004) proposed the extension of IS&R research toward the task context. Therefore there is much support to the task context, but how should it be applied in IS&R?

Series

The information retrieval series, vol. 19

Source

New directions in cognitive information retrieval. Eds.: A. Spink, C. Cole

Tuomaala, O.; Järvelin, K.; Vakkari, P.: Evolution of library and information science, 1965-2005 : content analysis of journal articles (2014) 0.01
```
0.0055227536 = product of:
  0.038659275 = sum of:
    0.017475804 = weight(_text_:information in 1309) [ClassicSimilarity], result of:
      0.017475804 = score(doc=1309,freq=24.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3359395 = fieldWeight in 1309, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1309)
    0.021183468 = weight(_text_:retrieval in 1309) [ClassicSimilarity], result of:
      0.021183468 = score(doc=1309,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 1309, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1309)
  0.14285715 = coord(2/14)
```
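The score explanation above can be checked by hand. The following is a minimal Python sketch, assuming the classic Lucene TF-IDF formulation (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which is consistent with the figures shown. The function and variable names are illustrative only and are not part of any Lucene API. It recomputes the two term weights for doc 1309 and applies the coord(2/14) factor.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed classic TF-IDF inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                       # tf(freq) in the explanation
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # queryWeight
    field_weight = tf * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight


# Values copied from the explanation for doc 1309.
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.0390625

information = term_score(24.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)
retrieval = term_score(4.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/14): 2 of the 14 query clauses matched this document.
coord = 2 / 14
total = (information + retrieval) * coord

print(f"information={information:.9f} retrieval={retrieval:.9f} total={total:.10f}")
```

The recomputed values should agree with the listed ones up to small rounding differences, since the explanation appears to use single-precision arithmetic while Python uses double precision.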
Abstract

This article first analyzes library and information science (LIS) research articles published in core LIS journals in 2005. It also examines the development of LIS from 1965 to 2005 in light of comparable data sets for 1965, 1985, and 2005. In both cases, the authors report (a) how the research articles are distributed by topic and (b) what approaches, research strategies, and methods were applied in the articles. In 2005, the largest research areas in LIS by this measure were information storage and retrieval, scientific communication, library and information-service activities, and information seeking. The same research areas constituted the quantitative core of LIS in the previous years since 1965. Information retrieval has been the most popular area of research over the years. The proportion of research on library and information-service activities decreased after 1985, but the popularity of information seeking and of scientific communication grew during the period studied. The viewpoint of research has shifted from library and information organizations to end users and development of systems for the latter. The proportion of empirical research strategies was high and rose over time, with the survey method being the single most important method. However, attention to evaluation and experiments increased considerably after 1985. Conceptual research strategies and system analysis, description, and design were quite popular, but declining. The most significant changes from 1965 to 2005 are the decreasing interest in library and information-service activities and the growth of research into information seeking and scientific communication.

Source

Journal of the Association for Information Science and Technology. 65(2014) no.7, S.1446-1462
Järvelin, K.: An analysis of two approaches in information retrieval : from frameworks to study designs (2007) 0.01
```
0.005361108 = product of:
  0.037527755 = sum of:
    0.012107591 = weight(_text_:information in 326) [ClassicSimilarity], result of:
      0.012107591 = score(doc=326,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23274569 = fieldWeight in 326, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=326)
    0.025420163 = weight(_text_:retrieval in 326) [ClassicSimilarity], result of:
      0.025420163 = score(doc=326,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.2835858 = fieldWeight in 326, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=326)
  0.14285715 = coord(2/14)
```
Abstract

There is a well-known gap between systems-oriented information retrieval (IR) and user-oriented IR, which cognitive IR seeks to bridge. It is therefore interesting to analyze approaches at the level of frameworks, models, and study designs. This article is an exercise in such an analysis, focusing on two significant approaches to IR: the lab IR approach and P. Ingwersen's (1996) cognitive IR approach. The article focuses on their research frameworks, models, hypotheses, laws and theories, study designs, and possible contributions. The two approaches are quite different, which becomes apparent in the use of independent, controlled, and dependent variables in the study designs of each approach. Thus, each approach is capable of contributing very differently to understanding and developing information access. The article also discusses integrating the approaches at the study-design level.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.7, S.971-986
Halttunen, K.; Järvelin, K.: Assessing learning outcomes in two information retrieval learning environments (2005) 0.01
```
0.005129378 = product of:
  0.035905644 = sum of:
    0.0104854815 = weight(_text_:information in 996) [ClassicSimilarity], result of:
      0.0104854815 = score(doc=996,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.20156369 = fieldWeight in 996, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=996)
    0.025420163 = weight(_text_:retrieval in 996) [ClassicSimilarity], result of:
      0.025420163 = score(doc=996,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.2835858 = fieldWeight in 996, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=996)
  0.14285715 = coord(2/14)
```
Abstract

In order to design information retrieval (IR) learning environments and instruction, it is important to explore learning outcomes of different pedagogical solutions. Learning outcomes have seldom been evaluated in IR instruction. The particular focus of this study is the assessment of learning outcomes in an experimental, but naturalistic, learning environment compared to more traditional instruction. The 57 participants of an introductory course on IR were selected for this study, and the analysis illustrates their learning outcomes regarding both conceptual change and development of IR skill. Concept mapping of student essays was used to analyze conceptual change, and log files of search exercises provided data for performance assessment. Students in the experimental learning environment changed their conceptions more regarding linguistic aspects of IR and placed more emphasis on planning and management of the search process. Performance assessment indicates that anchored instruction and scaffolding with an instructional tool, the IR Game, with performance feedback enable students to construct queries with fewer semantic knowledge errors also in operational IR systems.

Source

Information processing and management. 41(2005) no.4, S.949-972
Vakkari, P.; Chang, Y.-W.; Järvelin, K.: Disciplinary contributions to research topics and methodology in Library and Information Science : leading to fragmentation? (2022) 0.00
```
0.0047915326 = product of:
  0.033540726 = sum of:
    0.012357258 = weight(_text_:information in 767) [ClassicSimilarity], result of:
      0.012357258 = score(doc=767,freq=12.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23754507 = fieldWeight in 767, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=767)
    0.021183468 = weight(_text_:retrieval in 767) [ClassicSimilarity], result of:
      0.021183468 = score(doc=767,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 767, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=767)
  0.14285715 = coord(2/14)
```
Abstract

The study analyses contributions to Library and Information Science (LIS) by researchers representing various disciplines. How are such contributions associated with the choice of research topics and methodology? The study employs a quantitative content analysis of articles published in 31 scholarly LIS journals in 2015. Each article is seen as a contribution to LIS by the authors' disciplines, which are inferred from their affiliations. The unit of analysis is the article-discipline pair. Of the contribution instances, the share of LIS is one third. Computer Science contributes one fifth and Business and Economics one sixth. The latter disciplines dominate the contributions in information retrieval, information seeking, and scientific communication, indicating strong influences in LIS. Correspondence analysis reveals three clusters of research, one focusing on traditional LIS with contributions from LIS and Humanities and survey-type research; another on information retrieval with contributions from Computer Science and experimental research; and the third on scientific communication with contributions from Natural Sciences and Medicine and citation analytic research. The strong differentiation of scholarly contributions in LIS hints at the fragmentation of LIS as a discipline.

Source

Journal of the Association for Information Science and Technology. 73(2022) no.12, S.1706-1722
Sormunen, E.; Kekäläinen, J.; Koivisto, J.; Järvelin, K.: Document text characteristics affect the ranking of the most relevant documents by expanded structured queries (2001) 0.00
```
0.00446759 = product of:
  0.031273127 = sum of:
    0.010089659 = weight(_text_:information in 4487) [ClassicSimilarity], result of:
      0.010089659 = score(doc=4487,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.19395474 = fieldWeight in 4487, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4487)
    0.021183468 = weight(_text_:retrieval in 4487) [ClassicSimilarity], result of:
      0.021183468 = score(doc=4487,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 4487, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4487)
  0.14285715 = coord(2/14)
```
Abstract

The increasing flood of documentary information through the Internet and other information sources challenges the developers of information retrieval systems. It is not enough that an IR system is able to make a distinction between relevant and non-relevant documents. The reduction of information overload requires that IR systems provide the capability of screening the most valuable documents out of the mass of potentially or marginally relevant documents. This paper introduces a new concept-based method to analyse the text characteristics of documents at varying relevance levels. The results of the document analysis were applied in an experiment on query expansion (QE) in a probabilistic IR system. Statistical differences in textual characteristics of highly relevant and less relevant documents were investigated by applying a facet analysis technique. In highly relevant documents a larger number of aspects of the request were discussed, searchable expressions for the aspects were distributed over a larger set of text paragraphs, and a larger set of unique expressions were used per aspect than in marginally relevant documents. A query expansion experiment verified that the findings of the text analysis can be exploited in formulating more effective queries for best match retrieval in the search for highly relevant documents. The results revealed that expanded queries with concept-based structures performed better than unexpanded queries or "natural language" queries. Further, it was shown that highly relevant documents benefit essentially more from the concept-based QE in ranking than marginally relevant documents.
Saarikoski, J.; Laurikkala, J.; Järvelin, K.; Juhola, M.: A study of the use of self-organising maps in information retrieval (2009) 0.00
```
0.00446759 = product of:
  0.031273127 = sum of:
    0.010089659 = weight(_text_:information in 2836) [ClassicSimilarity], result of:
      0.010089659 = score(doc=2836,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.19395474 = fieldWeight in 2836, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2836)
    0.021183468 = weight(_text_:retrieval in 2836) [ClassicSimilarity], result of:
      0.021183468 = score(doc=2836,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 2836, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2836)
  0.14285715 = coord(2/14)
```
Abstract

Purpose - The aim of this paper is to explore the possibility of retrieving information with Kohonen self-organising maps, which are known to be effective at grouping objects according to their similarity or dissimilarity. Design/methodology/approach - After conventional preprocessing, such as transforming into vector space, documents from a German document collection were used to train a neural network of the Kohonen self-organising map type. Such an unsupervised network forms a document map from which relevant objects can be found according to queries. Findings - Self-organising maps ordered documents into groups from which it was possible to find relevant targets. Research limitations/implications - The number of documents used was moderate due to the limited number of documents associated with test topics. The training of self-organising maps entails rather long running times, which is their practical limitation. In future, the aim will be to build larger networks by compressing document matrices, and to develop document searching in them. Practical implications - With self-organising maps the distribution of documents can be visualised and relevant documents found in document collections of limited size. Originality/value - The paper reports on an approach that can be especially used to group documents and also for information search. So far self-organising maps have rarely been studied for information retrieval. Instead, they have been applied to document grouping tasks.
Talvensaari, T.; Laurikkala, J.; Järvelin, K.; Juhola, M.: A study on automatic creation of a comparable document collection in cross-language information retrieval (2006) 0.00
```
0.0040454194 = product of:
  0.028317936 = sum of:
    0.0071344664 = weight(_text_:information in 5601) [ClassicSimilarity], result of:
      0.0071344664 = score(doc=5601,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.13714671 = fieldWeight in 5601, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5601)
    0.021183468 = weight(_text_:retrieval in 5601) [ClassicSimilarity], result of:
      0.021183468 = score(doc=5601,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 5601, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5601)
  0.14285715 = coord(2/14)
```
Abstract

Purpose - To present a method for creating a comparable document collection from two document collections in different languages. Design/methodology/approach - The best query keys were extracted from a Finnish source collection (articles of the newspaper Aamulehti) with the relative average term frequency formula. The keys were translated into English with a dictionary-based query translation program. The resulting lists of words were used as queries that were run against the target collection (Los Angeles Times articles) with the nearest neighbor method. The documents were aligned with unrestricted and date-restricted alignment schemes, which were also combined. Findings - The combined alignment scheme was found to be the best when the relatedness of the document pairs was assessed with a five-degree relevance scale. Of the 400 document pairs, roughly 40 percent were highly or fairly related and 75 percent included at least lexical similarity. Research limitations/implications - The number of alignment pairs was small due to the short common time period of the two collections, and their geographical (and thus, topical) remoteness. In future, our aim is to build larger comparable corpora in various languages and use them as a source of translation knowledge for the purposes of cross-language information retrieval (CLIR). Practical implications - Readily available parallel corpora are scarce. With this method, two unrelated document collections can relatively easily be aligned to create a CLIR resource. Originality/value - The method can be applied to weakly linked collections and morphologically complex languages, such as Finnish.
