Search (31 results, page 1 of 2)

Järvelin, K.; Kristensen, J.; Niemi, T.; Sormunen, E.; Keskustalo, H.: A deductive data model for query expansion (1996) 0.01

```
0.005982068 = product of:
  0.020937236 = sum of:
    0.006214436 = product of:
      0.03107218 = sum of:
        0.03107218 = weight(_text_:retrieval in 2230) [ClassicSimilarity], result of:
          0.03107218 = score(doc=2230,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.2835858 = fieldWeight in 2230, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2230)
      0.2 = coord(1/5)
    0.0147228 = product of:
      0.0294456 = sum of:
        0.0294456 = weight(_text_:22 in 2230) [ClassicSimilarity], result of:
          0.0294456 = score(doc=2230,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.23214069 = fieldWeight in 2230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2230)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```

Source: Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '96), Zürich, Switzerland, August 18-22, 1996. Eds.: H.P. Frei et al
Theme: Semantisches Umfeld in Indexierung u. Retrieval
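
The score explanation for this first hit can be re-derived by hand. The sketch below is a minimal illustration, assuming the classic TF-IDF formulas that the listed numbers are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names are hypothetical and not part of the search system; the input values are copied from the explanation for doc 2230.

```python
import math

# Values copied from the score explanation for doc 2230.
MAX_DOCS = 44218
QUERY_NORM = 0.03622214


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed form): 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight, as laid out in the explanation."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Per-term scores for the two matching terms.
retrieval = term_score(freq=4.0, doc_freq=5836, field_norm=0.046875)
term_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.046875)

# Each inner clause is scaled by its own coord factor, then the sum is
# scaled by the outer coord(2/7).
total = (retrieval * (1 / 5) + term_22 * (1 / 2)) * (2 / 7)

# Should land close to the 0.005982068 shown in the listing
# (small differences come from single- vs. double-precision arithmetic).
print(f"{total:.9f}")
```

The same function can be applied to any other hit in the list by substituting that hit's freq, docFreq, fieldNorm, and coord values.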

Saastamoinen, M.; Järvelin, K.: Search task features in work tasks of varying types and complexity (2017) 0.01

```
0.00546202 = product of:
  0.01911707 = sum of:
    0.00439427 = product of:
      0.02197135 = sum of:
        0.02197135 = weight(_text_:retrieval in 3589) [ClassicSimilarity], result of:
          0.02197135 = score(doc=3589,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.20052543 = fieldWeight in 3589, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3589)
      0.2 = coord(1/5)
    0.0147228 = product of:
      0.0294456 = sum of:
        0.0294456 = weight(_text_:22 in 3589) [ClassicSimilarity], result of:
          0.0294456 = score(doc=3589,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.23214069 = fieldWeight in 3589, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3589)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```

Abstract: Information searching in practice is seldom an end in itself. In work, work task (WT) performance forms the context that information searching should serve. Therefore, information retrieval (IR) systems development/evaluation should take the WT context into account. The present paper analyzes how WT features (task complexity and task types) affect information searching in authentic work: the types of information needs, search processes, and search media. We collected data on 22 information professionals in authentic work situations in three organization types: city administration, universities, and companies. The data comprise 286 WTs and 420 search tasks (STs). The data include transaction logs, video recordings, daily questionnaires, interviews, and observation. The data were analyzed quantitatively. Even if the participants used a range of search media, most STs were simple throughout the data, and up to 42% of WTs did not include searching. WT's effects on STs are not straightforward: different WT types react differently to WT complexity. Due to the simplicity of authentic searching, the WT/ST types in interactive IR experiments should be reconsidered.

Näppilä, T.; Järvelin, K.; Niemi, T.: A tool for data cube construction from structurally heterogeneous XML documents (2008) 0.00
```
0.004639679 = product of:
  0.016238876 = sum of:
    0.003969876 = product of:
      0.01984938 = sum of:
        0.01984938 = weight(_text_:system in 1369) [ClassicSimilarity], result of:
          0.01984938 = score(doc=1369,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.17398985 = fieldWeight in 1369, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1369)
      0.2 = coord(1/5)
    0.0122690005 = product of:
      0.024538001 = sum of:
        0.024538001 = weight(_text_:22 in 1369) [ClassicSimilarity], result of:
          0.024538001 = score(doc=1369,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.19345059 = fieldWeight in 1369, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1369)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```
Abstract

Data cubes for OLAP (On-Line Analytical Processing) often need to be constructed from data located in several distributed and autonomous information sources. Such a data integration process is challenging due to semantic, syntactic, and structural heterogeneity among the data. While XML (extensible markup language) is the de facto standard for data exchange, the three types of heterogeneity remain. Moreover, popular path-oriented XML query languages, such as XQuery, require the user to know in much detail the structure of the documents to be processed and are, thus, effectively impractical in many real-world data integration tasks. Several Lowest Common Ancestor (LCA)-based XML query evaluation strategies have recently been introduced to provide a more structure-independent way to access XML documents. We shall, however, show that this approach leads in the context of certain - not uncommon - types of XML documents to undesirable results. This article introduces a novel high-level data extraction primitive that utilizes the purpose-built Smallest Possible Context (SPC) query evaluation strategy. We demonstrate, through a system prototype for OLAP data cube construction and a sample application in informetrics, that our approach has real advantages in data integration.

Date

9. 2.2008 17:22:42

Pirkola, A.; Puolamäki, D.; Järvelin, K.: Applying query structuring in cross-language retrieval (2003) 0.00

```
0.0038721121 = product of:
  0.027104784 = sum of:
    0.027104784 = product of:
      0.06776196 = sum of:
        0.0439427 = weight(_text_:retrieval in 1074) [ClassicSimilarity], result of:
          0.0439427 = score(doc=1074,freq=8.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.40105087 = fieldWeight in 1074, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1074)
        0.023819257 = weight(_text_:system in 1074) [ClassicSimilarity], result of:
          0.023819257 = score(doc=1074,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.20878783 = fieldWeight in 1074, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=1074)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```

Abstract: We will explore various ways to apply query structuring in cross-language information retrieval. In the first test, English queries were translated into Finnish using an electronic dictionary, and were run in a Finnish newspaper database of 55,000 articles. Queries were structured by combining the Finnish translation equivalents of the same English query key using the syn-operator of the InQuery retrieval system. Structured queries performed markedly better than unstructured queries. Second, the effects of compound-based structuring using a proximity operator for the translation equivalents of query language compound components were tested. The method was not useful in syn-based queries; instead, it resulted in a decrease in retrieval effectiveness. Proper names are often non-identical spelling variants in different languages. This allows n-gram based translation of names not included in a dictionary. In the third test, a query structuring method where the Boolean and-operator was used to assign more weight to keys translated through n-gram matching gave good results.
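
The n-gram matching idea mentioned in this abstract can be illustrated with a small sketch. The abstract does not state the n-gram length or the similarity measure, so the choices below (character bigrams and the Dice coefficient) are assumptions, and the function names and sample strings are hypothetical illustrations rather than data from the study.

```python
def char_ngrams(word: str, n: int = 2) -> set[str]:
    """Return the set of character n-grams of a lowercased word."""
    w = word.lower()
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def dice(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient over the two words' n-gram sets (assumed measure)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def best_match(name: str, index_terms: list[str], n: int = 2) -> str:
    """Pick the target-language index term most similar to the source name."""
    return max(index_terms, key=lambda term: dice(name, term, n))


# Made-up example: a source-language name matched against a tiny,
# invented list of target-language index terms.
candidates = ["gorbatshov", "helsinki", "talous"]
print(best_match("Gorbachev", candidates))
```

A spelling variant shares most of its character n-grams with the original name, which is why it can be found even when the name is missing from the translation dictionary.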

Talvensaari, T.; Juhola, M.; Laurikkala, J.; Järvelin, K.: Corpus-based cross-language information retrieval in retrieval of highly relevant documents (2007) 0.00
```
0.00322676 = product of:
  0.02258732 = sum of:
    0.02258732 = product of:
      0.0564683 = sum of:
        0.03661892 = weight(_text_:retrieval in 139) [ClassicSimilarity], result of:
          0.03661892 = score(doc=139,freq=8.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.33420905 = fieldWeight in 139, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=139)
        0.01984938 = weight(_text_:system in 139) [ClassicSimilarity], result of:
          0.01984938 = score(doc=139,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.17398985 = fieldWeight in 139, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=139)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```
Abstract

Information retrieval systems' ability to retrieve highly relevant documents has become more and more important in the age of extremely large collections, such as the World Wide Web (WWW). The authors' aim was to find out how corpus-based cross-language information retrieval (CLIR) manages in retrieving highly relevant documents. They created a Finnish-Swedish comparable corpus from two loosely related document collections and used it as a source of knowledge for query translation. Finnish test queries were translated into Swedish and run against a Swedish test collection. Graded relevance assessments were used in evaluating the results and three relevance criterion levels (liberal, regular, and stringent) were applied. The runs were also evaluated with generalized recall and precision, which weight the retrieved documents according to their relevance level. The performance of the Comparable Corpus Translation system (COCOT) was compared to that of a dictionary-based query translation program; the two translation methods were also combined. The results indicate that corpus-based CLIR performs particularly well with highly relevant documents. In average precision, COCOT even matched the monolingual baseline on the highest relevance level. The performance of the different query translation methods was further analyzed by finding out reasons for poor rankings of highly relevant documents.

Sormunen, E.; Kekäläinen, J.; Koivisto, J.; Järvelin, K.: Document text characteristics affect the ranking of the most relevant documents by expanded structured queries (2001) 0.00
```
0.0030837 = product of:
  0.0215859 = sum of:
    0.0215859 = product of:
      0.05396475 = sum of:
        0.025893483 = weight(_text_:retrieval in 4487) [ClassicSimilarity], result of:
          0.025893483 = score(doc=4487,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.23632148 = fieldWeight in 4487, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4487)
        0.028071264 = weight(_text_:system in 4487) [ClassicSimilarity], result of:
          0.028071264 = score(doc=4487,freq=4.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.24605882 = fieldWeight in 4487, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4487)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```
Abstract

The increasing flood of documentary information through the Internet and other information sources challenges the developers of information retrieval systems. It is not enough that an IR system is able to make a distinction between relevant and non-relevant documents. The reduction of information overload requires that IR systems provide the capability of screening the most valuable documents out of the mass of potentially or marginally relevant documents. This paper introduces a new concept-based method to analyse the text characteristics of documents at varying relevance levels. The results of the document analysis were applied in an experiment on query expansion (QE) in a probabilistic IR system. Statistical differences in textual characteristics of highly relevant and less relevant documents were investigated by applying a facet analysis technique. In highly relevant documents a larger number of aspects of the request were discussed, searchable expressions for the aspects were distributed over a larger set of text paragraphs, and a larger set of unique expressions were used per aspect than in marginally relevant documents. A query expansion experiment verified that the findings of the text analysis can be exploited in formulating more effective queries for best match retrieval in the search for highly relevant documents. The results revealed that expanded queries with concept-based structures performed better than unexpanded queries or "natural language" queries. Further, it was shown that highly relevant documents benefit essentially more from the concept-based QE in ranking than marginally relevant documents.

Kettunen, K.; Kunttu, T.; Järvelin, K.: To stem or lemmatize a highly inflectional language in a probabilistic IR environment? (2005) 0.00
```
0.002650327 = product of:
  0.018552288 = sum of:
    0.018552288 = product of:
      0.04638072 = sum of:
        0.01830946 = weight(_text_:retrieval in 4395) [ClassicSimilarity], result of:
          0.01830946 = score(doc=4395,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.16710453 = fieldWeight in 4395, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4395)
        0.028071264 = weight(_text_:system in 4395) [ClassicSimilarity], result of:
          0.028071264 = score(doc=4395,freq=4.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.24605882 = fieldWeight in 4395, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4395)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```
Abstract

Purpose - To show that stem generation compares well with lemmatization as a morphological tool for a highly inflectional language for IR purposes in a best-match retrieval system. Design/methodology/approach - Effects of three different morphological methods - lemmatization, stemming and stem production - for Finnish are compared in a probabilistic IR environment (INQUERY). Evaluation is done using a four-point relevance scale which is partitioned differently in different test settings. Findings - Results show that stem production, a lighter method than morphological lemmatization, compares well with lemmatization in a best-match IR environment. Differences in performance between stem production and lemmatization are small and they are not statistically significant in most of the tested settings. It is also shown that hitherto a rather neglected method of morphological processing for Finnish, stemming, performs reasonably well although the stemmer used - a Porter stemmer implementation - is far from optimal for a morphologically complex language like Finnish. In another series of tests, the effects of compound splitting and derivational expansion of queries are tested. Practical implications - Usefulness of morphological lemmatization and stem generation for IR purposes can be estimated with many factors. On the average P-R level they seem to behave very close to each other in a probabilistic IR system. Thus, the choice of the used method with highly inflectional languages needs to be estimated along other dimensions too. Originality/value - Results are achieved using Finnish as an example of a highly inflectional language. The results are of interest for anyone who is interested in processing of morphological variation of a highly inflected language for IR purposes.

Tuomaala, O.; Järvelin, K.; Vakkari, P.: Evolution of library and information science, 1965-2005 : content analysis of journal articles (2014) 0.00
```
0.002613878 = product of:
  0.018297145 = sum of:
    0.018297145 = product of:
      0.045742862 = sum of:
        0.025893483 = weight(_text_:retrieval in 1309) [ClassicSimilarity], result of:
          0.025893483 = score(doc=1309,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.23632148 = fieldWeight in 1309, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1309)
        0.01984938 = weight(_text_:system in 1309) [ClassicSimilarity], result of:
          0.01984938 = score(doc=1309,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.17398985 = fieldWeight in 1309, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1309)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```
Abstract

This article first analyzes library and information science (LIS) research articles published in core LIS journals in 2005. It also examines the development of LIS from 1965 to 2005 in light of comparable data sets for 1965, 1985, and 2005. In both cases, the authors report (a) how the research articles are distributed by topic and (b) what approaches, research strategies, and methods were applied in the articles. In 2005, the largest research areas in LIS by this measure were information storage and retrieval, scientific communication, library and information-service activities, and information seeking. The same research areas constituted the quantitative core of LIS in the previous years since 1965. Information retrieval has been the most popular area of research over the years. The proportion of research on library and information-service activities decreased after 1985, but the popularity of information seeking and of scientific communication grew during the period studied. The viewpoint of research has shifted from library and information organizations to end users and development of systems for the latter. The proportion of empirical research strategies was high and rose over time, with the survey method being the single most important method. However, attention to evaluation and experiments increased considerably after 1985. Conceptual research strategies and system analysis, description, and design were quite popular, but declining. The most significant changes from 1965 to 2005 are the decreasing interest in library and information-service activities and the growth of research into information seeking and scientific communication.

Ingwersen, P.; Järvelin, K.: The turn : integration of information seeking and retrieval in context (2005) 0.00
```
0.0023021426 = product of:
  0.016114997 = sum of:
    0.016114997 = product of:
      0.04028749 = sum of:
        0.0303628 = weight(_text_:retrieval in 1323) [ClassicSimilarity], result of:
          0.0303628 = score(doc=1323,freq=22.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.2771115 = fieldWeight in 1323, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1323)
        0.00992469 = weight(_text_:system in 1323) [ClassicSimilarity], result of:
          0.00992469 = score(doc=1323,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.08699492 = fieldWeight in 1323, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1323)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```
Abstract

The Turn analyzes the research of information seeking and retrieval (IS&R) and proposes a new direction of integrating research in these two areas: the fields should turn off their separate and narrow paths and construct a new avenue of research. An essential direction for this avenue is context as given in the subtitle Integration of Information Seeking and Retrieval in Context. Other essential themes in the book include: IS&R research models, frameworks and theories; search and work tasks and situations in context; interaction between humans and machines; information acquisition, relevance and information use; research design and methodology based on a structured set of explicit variables - all set into the holistic cognitive approach. The present monograph invites the reader into a construction project - there is much research to do for a contextual understanding of IS&R. The Turn represents a wide-ranging perspective of IS&R by providing a novel unique research framework, covering both individual and social aspects of information behavior, including the generation, searching, retrieval and use of information. Regarding traditional laboratory information retrieval research, the monograph proposes the extension of research toward actors, search and work tasks, IR interaction and utility of information. Regarding traditional information seeking research, it proposes the extension toward information access technology and work task contexts. The Turn is the first synthesis of research in the broad area of IS&R ranging from systems oriented laboratory IR research to social science oriented information seeking studies.
TOC:Introduction.- The Cognitive Framework for Information.- The Development of Information Seeking Research.- Systems-Oriented Information Retrieval.- Cognitive and User-Oriented Information Retrieval.- The Integrated IS&R Research Framework.- Implications of the Cognitive Framework for IS&R.- Towards a Research Program.- Conclusion.- Definitions.- References.- Index.

Footnote

Review in: Mitt. VÖB 59(2006) H.2, S.81-83 (O. Oberhauser): "With this volume, two outstanding representatives of European information science, Professors Peter Ingwersen (Copenhagen) and Kalervo Järvelin (Tampere), have presented a work that may one day be described as their opus magnum. This would not surprise me, because the authors make an ambitious attempt to unite, under a holistic cognitive approach, two research traditions in information science that have so far had rather little contact with each other: the research field of "Information Seeking and Retrieval" (IS&R), rooted primarily in the social sciences, and "Information Retrieval" (IR), located predominantly in computer science. They also aim to extend the cognitive approach, which has been dominant for a number of years but has also been criticized as too individualistic, so that technological, behavioral, and cooperative aspects are taken into account in a coherent way. This is done in nine chapters as follows:
- First, the two "camps" (the IR tradition oriented toward systems and laboratory experiments, and the IS&R faction oriented toward user questions) are contrasted, and some central concepts are clarified.
- The second chapter gives a detailed account of the cognitive school of information science, particularly with regard to the concept of information.
- This is followed by an overview of previous research on "Information Seeking" (IS): an extremely useful introduction to the research questions and models, the research methodology, and the questions still open in this area, e.g. the insufficient consideration of user-system interaction that results from the one-sided focus on the user.
- Similarly, the fourth chapter presents system-oriented IR research in a concentrated overview that covers both the "laboratory model" and approaches such as natural language processing and expert systems. Aspects such as relevance, query modification, and performance measurement are addressed, as is the methodology, from the first laboratory experiments to TREC and beyond.

Series

The Kluwer international series on information retrieval ; 18

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Kekäläinen, J.; Järvelin, K.: Using graded relevance assessments in IR evaluation (2002) 0.00
```
0.0021805053 = product of:
  0.015263537 = sum of:
    0.015263537 = product of:
      0.03815884 = sum of:
        0.01830946 = weight(_text_:retrieval in 5225) [ClassicSimilarity], result of:
          0.01830946 = score(doc=5225,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.16710453 = fieldWeight in 5225, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5225)
        0.01984938 = weight(_text_:system in 5225) [ClassicSimilarity], result of:
          0.01984938 = score(doc=5225,freq=2.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.17398985 = fieldWeight in 5225, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5225)
      0.4 = coord(2/5)
  0.14285715 = coord(1/7)
```
Abstract

Kekäläinen and Järvelin use what they term generalized, nonbinary recall and precision measures, where recall is the sum of the relevance scores of the retrieved documents divided by the sum of the relevance scores of all documents in the database, and precision is the sum of the relevance scores of the retrieved documents divided by the number of retrieved documents; the relevance scores are real numbers between zero and one. Using the InQuery system and a text database of 53,893 newspaper articles with 30 queries selected from those for which four relevance categories to provide recall measures were available, search results were evaluated by four judges. Searches were done by average key term weight, Boolean expression, and by average term weight where the terms are grouped by a synonym operator, and for each case with and without expansion of the original terms. Use of higher standards of relevance appears to increase the superiority of the best method. Some methods do a better job of getting the highly relevant documents but do not increase retrieval of marginal ones. There is evidence that generalized precision provides more equitable results, while binary precision provides undeserved merit to some methods. Generally graded relevance measures seem to provide additional insight into IR evaluation.
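
The two measures described in this abstract can be written out as a short sketch. It follows the verbal definitions above, taking the precision denominator to be the number of retrieved documents; the function names and the sample relevance scores are hypothetical and not taken from the study.

```python
def generalized_precision(retrieved_scores: list[float]) -> float:
    """Sum of relevance scores of retrieved documents / number of retrieved documents."""
    if not retrieved_scores:
        return 0.0
    return sum(retrieved_scores) / len(retrieved_scores)


def generalized_recall(retrieved_scores: list[float], all_scores: list[float]) -> float:
    """Sum of relevance scores of retrieved documents / sum of scores of all documents."""
    total_relevance = sum(all_scores)
    if total_relevance == 0:
        return 0.0
    return sum(retrieved_scores) / total_relevance


# Invented example: five documents in the database with graded relevance
# scores in [0, 1]; the search retrieves three of them.
all_scores = [1.0, 0.5, 0.5, 0.0, 0.0]
retrieved_scores = [1.0, 0.5, 0.0]

print(generalized_precision(retrieved_scores))           # 1.5 / 3
print(generalized_recall(retrieved_scores, all_scores))  # 1.5 / 2.0
```

With binary scores (every score either 0 or 1) both functions reduce to ordinary precision and recall, which is what makes them a generalization.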

Pirkola, A.; Hedlund, T.; Keskustalo, H.; Järvelin, K.: Dictionary-based cross-language information retrieval : problems, methods, and research findings (2001) 0.00

```
0.0020714786 = product of:
  0.01450035 = sum of:
    0.01450035 = product of:
      0.07250175 = sum of:
        0.07250175 = weight(_text_:retrieval in 3908) [ClassicSimilarity], result of:
          0.07250175 = score(doc=3908,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.6617001 = fieldWeight in 3908, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=3908)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```

Source: Information retrieval. 4(2001), S.209-230

Vakkari, P.; Järvelin, K.; Chang, Y.-W.: The association of disciplinary background with the evolution of topics and methods in Library and Information Science research 1995-2015 (2023) 0.00

```
0.0017527144 = product of:
  0.0122690005 = sum of:
    0.0122690005 = product of:
      0.024538001 = sum of:
        0.024538001 = weight(_text_:22 in 998) [ClassicSimilarity], result of:
          0.024538001 = score(doc=998,freq=2.0), product of:
            0.12684377 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03622214 = queryNorm
            0.19345059 = fieldWeight in 998, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=998)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)
```

Date: 22. 6.2023 18:15:06

Hansen, P.; Järvelin, K.: Collaborative Information Retrieval in an information-intensive domain (2005) 0.00
```
0.0015376742 = product of:
  0.010763719 = sum of:
    0.010763719 = product of:
      0.053818595 = sum of:
        0.053818595 = weight(_text_:retrieval in 1040) [ClassicSimilarity], result of:
          0.053818595 = score(doc=1040,freq=12.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.49118498 = fieldWeight in 1040, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1040)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

In this article we investigate the expressions of collaborative activities within information seeking and retrieval processes (IS&R). Generally, information seeking and retrieval is regarded as an individual and isolated process in IR research. We assume that an IS&R situation is not merely an individual effort, but inherently involves various collaborative activities. We present empirical results from a real-life and information-intensive setting within the patent domain, showing that the patent task performance process involves highly collaborative aspects throughout the stages of the information seeking and retrieval process. Furthermore, we show that these activities may be categorised and related to different stages in an information seeking and retrieval process. Therefore, the assumption that information retrieval performance is purely individual needs to be reconsidered. Finally, we also propose a refined IR framework involving collaborative aspects.

Järvelin, K.: Evaluation (2011) 0.00

0.0014647568 = product of:
  0.010253297 = sum of:
    0.010253297 = product of:
      0.051266484 = sum of:
        0.051266484 = weight(_text_:retrieval in 548) [ClassicSimilarity], result of:
          0.051266484 = score(doc=548,freq=2.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.46789268 = fieldWeight in 548, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=548)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)

Source: Interactive information seeking, behaviour and retrieval. Eds.: I. Ruthven, D. Kelly

Järvelin, K.; Niemi, T.: Deductive information retrieval based on classifications (1993) 0.00
```
0.0014036982 = product of:
  0.009825887 = sum of:
    0.009825887 = product of:
      0.049129434 = sum of:
        0.049129434 = weight(_text_:retrieval in 2229) [ClassicSimilarity], result of:
          0.049129434 = score(doc=2229,freq=10.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.44838852 = fieldWeight in 2229, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2229)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

Modern fact databases contain abundant data classified through several classifications. Typically, users must consult these classifications in separate manuals or files, which makes their effective use difficult. Contemporary database systems do little to support deductive use of classifications. In this study we show how deductive data management techniques can be applied to the utilization of data value classifications. Computation of transitive class relationships is of primary importance here. We define a representation of classifications which supports transitive computation and present an operation-oriented deductive query language tailored for classification-based deductive information retrieval. The operations of this language are on the same abstraction level as relational algebra operations and can be integrated with these to form a powerful and flexible query language for deductive information retrieval. We define the integration of these operations and demonstrate the usefulness of the language in terms of several sample queries.
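
As a rough illustration of what computing transitive class relationships involves, the following sketch collects all direct and indirect superclasses of a class. It is not the authors' representation or query language; the function name, the dictionary layout and the sample classification are hypothetical.

```python
def transitive_superclasses(direct_parents, cls):
    """Return every class reachable from `cls` by following
    direct class -> superclass links (the transitive closure)."""
    seen = set()
    stack = list(direct_parents.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            # Continue upwards from the newly found superclass.
            stack.extend(direct_parents.get(parent, ()))
    return seen


# Hypothetical regional classification: each value lists direct superclasses.
regions = {
    "Tampere": ["Pirkanmaa"],
    "Pirkanmaa": ["Finland"],
    "Finland": ["Europe"],
}

ancestors = transitive_superclasses(regions, "Tampere")
```

A query for data classified under "Europe" could then also match records labelled "Tampere", without the user consulting the classification by hand.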

Theme: Semantisches Umfeld in Indexierung u. Retrieval; Klassifikationssysteme im Online-Retrieval

Kumpulainen, S.; Järvelin, K.: Barriers to task-based information access in molecular medicine (2012) 0.00
```
0.0011342503 = product of:
  0.007939752 = sum of:
    0.007939752 = product of:
      0.03969876 = sum of:
        0.03969876 = weight(_text_:system in 4965) [ClassicSimilarity], result of:
          0.03969876 = score(doc=4965,freq=8.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.3479797 = fieldWeight in 4965, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4965)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

We analyze barriers to task-based information access in molecular medicine, focusing on research tasks, which provide task performance sessions of varying complexity. Molecular medicine is a relevant domain because it offers thousands of digital resources as the information environment. Data were collected through shadowing of real work tasks. Thirty work task sessions were analyzed and the barriers in these identified. The barriers were classified by their character (conceptual, syntactic, and technological) and by their context of appearance (work task, system integration, or system). Also, work task sessions were grouped into three complexity classes and the frequency of barriers of varying types across task complexity levels was analyzed. Our findings indicate that although most of the barriers are on the system level, there is a quantum of barriers in integration and work task contexts. These barriers might be overcome through attention to the integrated use of multiple systems, at least for the most frequent uses. This can be done by means of standardization and harmonization of the data and by taking the requirements of the work tasks into account in system design and development, because information access is seldom an end in itself, but rather serves to reach the goals of work tasks.

Lehtokangas, R.; Keskustalo, H.; Järvelin, K.: Experiments with transitive dictionary translation and pseudo-relevance feedback using graded relevance assessments (2008) 0.00
```
0.0010873 = product of:
  0.0076110996 = sum of:
    0.0076110996 = product of:
      0.0380555 = sum of:
        0.0380555 = weight(_text_:retrieval in 1349) [ClassicSimilarity], result of:
          0.0380555 = score(doc=1349,freq=6.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.34732026 = fieldWeight in 1349, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1349)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

In this article, the authors present evaluation results for transitive dictionary-based cross-language information retrieval (CLIR) using graded relevance assessments in a best match retrieval environment. A text database containing newspaper articles and a related set of 35 search topics were used in the tests. Source language topics (in English, German, and Swedish) were automatically translated into the target language (Finnish) via an intermediate (or pivot) language. Effectiveness of the transitively translated queries was compared to that of the directly translated and monolingual Finnish queries. Pseudo-relevance feedback (PRF) was also used to expand the original transitive target queries. Cross-language information retrieval performance was evaluated on three relevance thresholds: stringent, regular, and liberal. The transitive translations performed well, achieving on average 85-93% of the direct translation performance and 66-72% of monolingual performance. Moreover, PRF was successful in raising the performance of transitive translation routes in absolute terms as well as in relation to monolingual and direct translation performance applying PRF.
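
The idea of translating through a pivot language can be sketched as two chained dictionary lookups. This is only an illustration under simple assumptions, not the authors' system: the function names and the toy dictionaries are hypothetical, and words missing from a dictionary are passed through unchanged.

```python
def translate(words, dictionary):
    """Replace each word by all of its dictionary equivalents;
    words without an entry are kept as they are (an assumption)."""
    out = []
    for word in words:
        out.extend(dictionary.get(word, [word]))
    return out


def transitive_translate(words, source_to_pivot, pivot_to_target):
    """Translate source -> pivot, then pivot -> target."""
    return translate(translate(words, source_to_pivot), pivot_to_target)


# Toy dictionaries: English -> Swedish (pivot) -> Finnish (target).
en_sv = {"book": ["bok"]}
sv_fi = {"bok": ["kirja", "pyökki"]}  # Swedish "bok" is both "book" and "beech"

target_query = transitive_translate(["book"], en_sv, sv_fi)
```

The toy example also shows why the pivot step can add extra target words: every sense of the pivot word is translated, whether or not the source word shares it.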

Järvelin, K.; Ingwersen, P.: User-oriented and cognitive models of information retrieval (2009) 0.00

0.0010357393 = product of:
  0.007250175 = sum of:
    0.007250175 = product of:
      0.036250874 = sum of:
        0.036250874 = weight(_text_:retrieval in 3901) [ClassicSimilarity], result of:
          0.036250874 = score(doc=3901,freq=4.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.33085006 = fieldWeight in 3901, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3901)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)

Abstract: The domain of user-oriented and cognitive information retrieval (IR) is first discussed, followed by a discussion on the dimensions and types of models one may build for the domain. The focus of the present entry is on the models of user-oriented and cognitive IR, not on their empirical applications. Several models with different emphases on user-oriented and cognitive IR are presented, ranging from overall approaches and relevance models to procedural models, cognitive models, and task-based models. The present entry does not discuss empirical findings based on the models.

Ferro, N.; Silvello, G.; Keskustalo, H.; Pirkola, A.; Järvelin, K.: ¬The twist measure for IR evaluation : taking user's effort into account (2016) 0.00
```
9.822896E-4 = product of:
  0.0068760267 = sum of:
    0.0068760267 = product of:
      0.034380134 = sum of:
        0.034380134 = weight(_text_:system in 2771) [ClassicSimilarity], result of:
          0.034380134 = score(doc=2771,freq=6.0), product of:
            0.11408355 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03622214 = queryNorm
            0.30135927 = fieldWeight in 2771, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2771)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

We present a novel measure for ranking evaluation, called Twist (t). It is a measure for informational intents, which handles both binary and graded relevance. t stems from the observation that searching is currently taken for granted and it is natural for users to assume that search engines are available and work well. As a consequence, users may take for granted the utility they have in finding relevant documents, which is the focus of traditional measures. On the contrary, they may feel uneasy when the system returns nonrelevant documents because they are then forced to do additional work to get the desired information, and this causes avoidable effort. The latter is the focus of t, which evaluates the effectiveness of a system from the point of view of the effort required of users to retrieve the desired information. We provide a formal definition of t, a demonstration of its properties, and introduce the notion of effort/gain plots, which complement traditional utility-based measures. By means of an extensive experimental evaluation, t is shown to grasp different aspects of system performance, to not require extensive and costly assessments, and to be a robust tool for detecting differences between systems.

Vakkari, P.; Järvelin, K.: Explanation in information seeking and retrieval (2005) 0.00
```
9.3579874E-4 = product of:
  0.006550591 = sum of:
    0.006550591 = product of:
      0.032752953 = sum of:
        0.032752953 = weight(_text_:retrieval in 643) [ClassicSimilarity], result of:
          0.032752953 = score(doc=643,freq=10.0), product of:
            0.109568894 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03622214 = queryNorm
            0.29892567 = fieldWeight in 643, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=643)
      0.2 = coord(1/5)
  0.14285715 = coord(1/7)
```
Abstract

Information Retrieval (IR) is a research area within both Computer Science and Information Science. It has by and large two communities: a Computer Science oriented experimental approach and a user-oriented Information Science approach with a Social Science background. The communities hold a critical stance towards each other (e.g., Ingwersen, 1996), the latter suspecting the realism of the former, and the former suspecting the usefulness of the latter. Within Information Science the study of information seeking (IS) also has a Social Science background. There is a lot of research in each of these particular areas of information seeking and retrieval (IS&R). However, the three communities do not really communicate with each other. Why is this, and could the relationships be otherwise? Do the communities in fact belong together? Or perhaps each community is better off forgetting about the existence of the other two? We feel that the relationships between the research areas have not been properly analyzed. One way to analyze the relationships is to examine what each research area is trying to find out: which phenomena are being explained and how. We believe that IS&R research would benefit from being analytic about its frameworks, models and theories, not just at the level of meta-theories, but also much more concretely at the level of study designs. Over the years there have been calls for more context in the study of IS&R. Work tasks as well as cultural activities/interests have been proposed as the proper context for information access. For example, Wersig (1973) conceptualized information needs from the tasks perspective. He argued that in order to learn about information needs and seeking, one needs to take into account the whole active professional role of the individuals being investigated. Byström and Järvelin (1995) analysed IS processes in the light of tasks of varying complexity.
Ingwersen (1996) discussed the role of tasks and their descriptions and problematic situations from a cognitive perspective on IR. Most recently, Vakkari (2003) reviewed task-based IR and Järvelin and Ingwersen (2004) proposed the extension of IS&R research toward the task context. Therefore there is much support for the task context, but how should it be applied in IS&R?

Series: The information retrieval series, vol. 19

Source: New directions in cognitive information retrieval. Eds.: A. Spink, C. Cole
