Search (13 results, page 1 of 1)

  • author_ss:"Schaer, P."
  1. Wilde, A.; Wenninger, A.; Hopt, O.; Schaer, P.; Zapilko, B.: Aktivitäten von GESIS im Kontext von Open Data und Zugang zu sozialwissenschaftlichen Forschungsergebnissen (2010) 0.05
    0.04654293 = product of:
      0.10472159 = sum of:
        0.005161823 = weight(_text_:in in 4275) [ClassicSimilarity], result of:
          0.005161823 = score(doc=4275,freq=2.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.09017298 = fieldWeight in 4275, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=4275)
        0.024478322 = weight(_text_:zu in 4275) [ClassicSimilarity], result of:
          0.024478322 = score(doc=4275,freq=2.0), product of:
            0.12465679 = queryWeight, product of:
              2.9621663 = idf(docFreq=6214, maxDocs=44218)
              0.04208298 = queryNorm
            0.19636573 = fieldWeight in 4275, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9621663 = idf(docFreq=6214, maxDocs=44218)
              0.046875 = fieldNorm(doc=4275)
        0.036257274 = weight(_text_:und in 4275) [ClassicSimilarity], result of:
          0.036257274 = score(doc=4275,freq=14.0), product of:
            0.09327133 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04208298 = queryNorm
            0.38872904 = fieldWeight in 4275, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.046875 = fieldNorm(doc=4275)
        0.038824175 = product of:
          0.07764835 = sum of:
            0.07764835 = weight(_text_:gesellschaft in 4275) [ClassicSimilarity], result of:
              0.07764835 = score(doc=4275,freq=4.0), product of:
                0.18669544 = queryWeight, product of:
                  4.4363647 = idf(docFreq=1422, maxDocs=44218)
                  0.04208298 = queryNorm
                0.41590917 = fieldWeight in 4275, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.4363647 = idf(docFreq=1422, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4275)
          0.5 = coord(1/2)
      0.44444445 = coord(4/9)
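The explain tree above is plain Lucene ClassicSimilarity (tf-idf) output: each leaf is the product of queryWeight (idf * queryNorm) and fieldWeight (sqrt(tf) * idf * fieldNorm), the leaves are summed, and the sum is multiplied by the coordination factor (here coord(4/9) = 0.44444445, because 4 of 9 query clauses matched). A minimal sketch that reproduces a leaf from the constants printed in the tree:

```python
import math

def classic_similarity(freq, doc_freq, max_docs, query_norm, field_norm):
    """One leaf of a Lucene ClassicSimilarity explain tree:
    score = queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                             # e.g. 2.0 = tf(freq=4.0)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # e.g. 4.4363647 = idf(docFreq=1422, maxDocs=44218)
    query_weight = idf * query_norm                  # idf * queryNorm
    field_weight = tf * idf * field_norm             # tf * idf * fieldNorm
    return query_weight * field_weight

# the "gesellschaft" leaf of doc 4275: freq=4.0, docFreq=1422, maxDocs=44218
leaf = classic_similarity(4.0, 1422, 44218, 0.04208298, 0.046875)
# leaf ~ 0.07764835, matching the explain output above
```

The document's total then follows as 0.10472159 (sum of the four matching leaves, with the nested coord(1/2)) times 0.44444445, giving 0.04654293.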
    
    Abstract
    GESIS - Leibniz Institute for the Social Sciences operates two platforms for documenting scientific results in the form of publications and primary data: the full-text server SSOAR and da|ra, the registration agency for social science research data. Both systems rely on the consistent use of persistent identifiers (URN and DOI), which makes it possible to link the data registered via da|ra with the full-text documents from SSOAR as well as with other information from the GESIS holdings. In addition, the use of semantic technologies such as SKOS and RDF establishes a connection to the Semantic Web.
    Series
    Tagungen der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis ; Bd. 14 (DGI-Konferenz ; 1)
    Source
    Semantic web & linked data: Elemente zukünftiger Informationsinfrastrukturen ; 1. DGI-Konferenz ; 62. Jahrestagung der DGI ; Frankfurt am Main, 7. - 9. Oktober 2010 ; Proceedings / Deutsche Gesellschaft für Informationswissenschaft und Informationspraxis. Hrsg.: M. Ockenfeld
  2. Schaer, P.: Integration von Open-Access-Repositorien in Fachportale (2010) 0.04
    0.04447403 = product of:
      0.13342209 = sum of:
        0.013465886 = weight(_text_:in in 2320) [ClassicSimilarity], result of:
          0.013465886 = score(doc=2320,freq=10.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.23523843 = fieldWeight in 2320, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2320)
        0.015987955 = weight(_text_:und in 2320) [ClassicSimilarity], result of:
          0.015987955 = score(doc=2320,freq=2.0), product of:
            0.09327133 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04208298 = queryNorm
            0.17141339 = fieldWeight in 2320, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2320)
        0.103968255 = sum of:
          0.06405661 = weight(_text_:gesellschaft in 2320) [ClassicSimilarity], result of:
            0.06405661 = score(doc=2320,freq=2.0), product of:
              0.18669544 = queryWeight, product of:
                4.4363647 = idf(docFreq=1422, maxDocs=44218)
                0.04208298 = queryNorm
              0.34310755 = fieldWeight in 2320, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4363647 = idf(docFreq=1422, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2320)
          0.039911643 = weight(_text_:22 in 2320) [ClassicSimilarity], result of:
            0.039911643 = score(doc=2320,freq=2.0), product of:
              0.14736743 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04208298 = queryNorm
              0.2708308 = fieldWeight in 2320, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2320)
      0.33333334 = coord(3/9)
    
    Abstract
    Open access repositories are online archives for publications whose full texts are freely accessible on the Internet. However, open access materials, and the open access repositories themselves, are only insufficiently integrated into central subject portals (e.g. virtual subject libraries). This contribution presents SSOAR - Social Science Open Access Repository, a disciplinary open access full-text server for the social sciences, and shows how it is being integrated into the social science subject portal Sowiport.
    Series
    Fortschritte in der Wissensorganisation; Bd.11
    Source
    Wissensspeicher in digitalen Räumen: Nachhaltigkeit - Verfügbarkeit - semantische Interoperabilität. Proceedings der 11. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation, Konstanz, 20. bis 22. Februar 2008. Hrsg.: J. Sieglerschmidt u. H.P. Ohly
  3. Mayr, P.; Mutschke, P.; Schaer, P.; Sure, Y.: Mehrwertdienste für das Information Retrieval (2013) 0.02
    0.024811616 = product of:
      0.07443485 = sum of:
        0.01043063 = weight(_text_:in in 935) [ClassicSimilarity], result of:
          0.01043063 = score(doc=935,freq=6.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.1822149 = fieldWeight in 935, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=935)
        0.03197591 = weight(_text_:und in 935) [ClassicSimilarity], result of:
          0.03197591 = score(doc=935,freq=8.0), product of:
            0.09327133 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04208298 = queryNorm
            0.34282678 = fieldWeight in 935, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=935)
        0.032028306 = product of:
          0.06405661 = sum of:
            0.06405661 = weight(_text_:gesellschaft in 935) [ClassicSimilarity], result of:
              0.06405661 = score(doc=935,freq=2.0), product of:
                0.18669544 = queryWeight, product of:
                  4.4363647 = idf(docFreq=1422, maxDocs=44218)
                  0.04208298 = queryNorm
                0.34310755 = fieldWeight in 935, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4363647 = idf(docFreq=1422, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=935)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    The goal of the project is the development and testing of metadata-based value-added services for retrieval environments with several databases: a) a Search Term Recommender (STR) as a service that automatically suggests search terms from controlled vocabularies, b) Bradfordizing as a service for re-ranking result sets by core journals, and c) author centrality as a service for re-ranking result sets by the centrality of the authors in co-author networks. The focus of the project is the prototypical implementation of the three value-added services in an integrated retrieval test environment and, in particular, their quantitative and qualitative evaluation with regard to the improvement of retrieval quality when the services are used.
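Bradfordizing, named under b) in the abstract, re-ranks a result set so that hits from the few "core" journals contributing most of the hits come first. A minimal sketch of the idea, with an illustrative "journal" field that is not the project's actual schema:

```python
from collections import Counter

def bradfordize(docs):
    """Re-rank documents so that hits from high-frequency (core) journals
    come first; sorted() is stable, so ties keep the original order."""
    journal_freq = Counter(d["journal"] for d in docs)
    return sorted(docs, key=lambda d: -journal_freq[d["journal"]])

hits = [
    {"title": "A", "journal": "J. Rare"},
    {"title": "B", "journal": "Core J."},
    {"title": "C", "journal": "Core J."},
    {"title": "D", "journal": "Core J."},
]
reranked = bradfordize(hits)  # core-journal hits B, C, D move ahead of A
```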
    Series
    Fortschritte in der Wissensorganisation; Bd.12
    Source
    Wissen - Wissenschaft - Organisation: Proceedings der 12. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation Bonn, 19. bis 21. Oktober 2009. Hrsg.: H.P. Ohly
  4. Schaer, P.: Sprachmodelle und neuronale Netze im Information Retrieval (2023) 0.02
    0.020383175 = product of:
      0.061149523 = sum of:
        0.010536527 = weight(_text_:in in 799) [ClassicSimilarity], result of:
          0.010536527 = score(doc=799,freq=12.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.18406484 = fieldWeight in 799, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=799)
        0.020398602 = weight(_text_:zu in 799) [ClassicSimilarity], result of:
          0.020398602 = score(doc=799,freq=2.0), product of:
            0.12465679 = queryWeight, product of:
              2.9621663 = idf(docFreq=6214, maxDocs=44218)
              0.04208298 = queryNorm
            0.16363811 = fieldWeight in 799, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.9621663 = idf(docFreq=6214, maxDocs=44218)
              0.0390625 = fieldNorm(doc=799)
        0.030214394 = weight(_text_:und in 799) [ClassicSimilarity], result of:
          0.030214394 = score(doc=799,freq=14.0), product of:
            0.09327133 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04208298 = queryNorm
            0.32394084 = fieldWeight in 799, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0390625 = fieldNorm(doc=799)
      0.33333334 = coord(3/9)
    
    Abstract
    In recent years, language model technologies of the most varied kinds have found their way into information science. What these language models, known under names such as GPT, ELMo or BERT, have in common is that, thanks to very large web corpora, they can draw on a data basis that was unthinkable for earlier language modelling approaches. At the same time, these models build on newer developments in machine learning, in particular artificial neural networks. These technologies have also gained a foothold in information retrieval (IR), where they produced sudden, substantial performance gains shortly after their introduction. Neural networks, in combination with large pre-trained language models and contextualized word embeddings, have driven these gains. Whereas in past years a stagnating retrieval performance was repeatedly lamented, with improvements only over "weak baselines", these technical and methodological innovations have made impressive performance gains possible in tasks such as classic ad-hoc retrieval, machine translation and question answering. This chapter gives a brief overview of the foundations of language models and neural networks, in order to explain the basic building blocks behind current technologies such as ELMo or BERT, which currently dominate the world of NLP and IR.
    Source
    Grundlagen der Informationswissenschaft. Hrsg.: Rainer Kuhlen, Dirk Lewandowski, Wolfgang Semar und Christa Womser-Hacker. 7., völlig neu gefasste Ausg
  5. Fühles-Ubach, S.; Schaer, P.; Lepsky, K.; Seidler-de Alwis, R.: Data Librarian : ein neuer Studienschwerpunkt für wissenschaftliche Bibliotheken und Forschungseinrichtungen (2019) 0.01
    0.009423676 = product of:
      0.04240654 = sum of:
        0.01043063 = weight(_text_:in in 5836) [ClassicSimilarity], result of:
          0.01043063 = score(doc=5836,freq=6.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.1822149 = fieldWeight in 5836, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5836)
        0.03197591 = weight(_text_:und in 5836) [ClassicSimilarity], result of:
          0.03197591 = score(doc=5836,freq=8.0), product of:
            0.09327133 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04208298 = queryNorm
            0.34282678 = fieldWeight in 5836, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5836)
      0.22222222 = coord(2/9)
    
    Abstract
    This article deals with the new study focus "Data Librarian" in the degree programme "Data and Information Science", which has been offered at the Institute of Information Science at Technische Hochschule Köln since the winter semester of 2018/19. Developed as part of a joint accreditation of all of the institute's bachelor's programmes, it teaches comprehensive skills in the areas of data structures, data processing, information systems, data analysis and information research during the first semesters. The six-month practical semester takes place in an academic library or information institution, before the specializations research data I+II, scholarly communication, scientometrics and automatic indexing are taught.
    Source
    Bibliothek: Forschung und Praxis. 43(2019) H.2, S.255-261
  6. Mayr, P.; Mutschke, P.; Petras, V.; Schaer, P.; Sure, Y.: Applying science models for search (2010) 0.01
    0.0072717485 = product of:
      0.032722868 = sum of:
        0.0068824305 = weight(_text_:in in 4663) [ClassicSimilarity], result of:
          0.0068824305 = score(doc=4663,freq=2.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.120230645 = fieldWeight in 4663, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=4663)
        0.025840437 = weight(_text_:und in 4663) [ClassicSimilarity], result of:
          0.025840437 = score(doc=4663,freq=4.0), product of:
            0.09327133 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04208298 = queryNorm
            0.27704588 = fieldWeight in 4663, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.0625 = fieldNorm(doc=4663)
      0.22222222 = coord(2/9)
    
    Abstract
    The paper proposes three different kinds of science models as value-added services that are integrated into the retrieval process to enhance retrieval quality. The paper discusses the approaches Search Term Recommendation, Bradfordizing and Author Centrality on a general level and addresses implementation issues of the models within a real-life retrieval environment.
    Source
    Information und Wissen: global, sozial und frei? Proceedings des 12. Internationalen Symposiums für Informationswissenschaft (ISI 2011) ; Hildesheim, 9. - 11. März 2011. Hrsg.: J. Griesbaum, T. Mandl u. C. Womser-Hacker
  7. Mayr, P.; Schaer, P.; Mutschke, P.: ¬A science model driven retrieval prototype (2011) 0.01
    0.0053394684 = product of:
      0.024027606 = sum of:
        0.010323646 = weight(_text_:in in 649) [ClassicSimilarity], result of:
          0.010323646 = score(doc=649,freq=8.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.18034597 = fieldWeight in 649, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=649)
        0.013703961 = weight(_text_:und in 649) [ClassicSimilarity], result of:
          0.013703961 = score(doc=649,freq=2.0), product of:
            0.09327133 = queryWeight, product of:
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.04208298 = queryNorm
            0.14692576 = fieldWeight in 649, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.216367 = idf(docFreq=13101, maxDocs=44218)
              0.046875 = fieldNorm(doc=649)
      0.22222222 = coord(2/9)
    
    Abstract
    This paper is about a better understanding of the structure and dynamics of science and the usage of these insights for compensating the typical problems that arise in metadata-driven digital libraries. Three science model driven retrieval services are presented: co-word analysis based query expansion, re-ranking via Bradfordizing and author centrality. The services are evaluated with relevance assessments, from which two important implications emerge: (1) precision values of the retrieval services are the same or better than the tf-idf retrieval baseline and (2) each service retrieved a disjoint set of documents. The different services each favor quite different - but still relevant - documents than pure term-frequency based rankings. The proposed models and derived retrieval services therefore open up new viewpoints on the scientific knowledge space and provide an alternative framework to structure scholarly information systems.
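The author-centrality service re-ranks results by how central their authors are in the co-authorship network. A minimal sketch using simple degree centrality; the paper does not prescribe this particular measure, so treat it as an illustrative assumption:

```python
from collections import defaultdict
from itertools import combinations

def degree_centrality(papers):
    """Count the distinct co-authors of each author in a co-authorship graph
    built from author lists."""
    coauthors = defaultdict(set)
    for authors in papers:
        for a, b in combinations(authors, 2):
            coauthors[a].add(b)
            coauthors[b].add(a)
    return {a: len(cs) for a, cs in coauthors.items()}

def rank_by_centrality(papers):
    """Re-rank papers by the highest centrality among their authors."""
    c = degree_centrality(papers)
    return sorted(papers, key=lambda authors: -max(c.get(a, 0) for a in authors))

papers = [["Mayr", "Schaer", "Mutschke"], ["Schaer"], ["Mutschke", "Sure"]]
ranking = rank_by_centrality(papers)  # papers with well-connected authors first
```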
    Series
    Bibliotheca Academica - Reihe Informations- und Bibliothekswissenschaften; Bd. 1
    Source
    Concepts in context: Proceedings of the Cologne Conference on Interoperability and Semantics in Knowledge Organization July 19th - 20th, 2010. Eds.: F. Boteram, W. Gödert u. J. Hubrich
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  8. Posch, L.; Schaer, P.; Bleier, A.; Strohmaier, M.: ¬A system for probabilistic linking of thesauri and classification systems (2015) 0.00
    0.0016390155 = product of:
      0.014751139 = sum of:
        0.014751139 = weight(_text_:in in 2515) [ClassicSimilarity], result of:
          0.014751139 = score(doc=2515,freq=12.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.2576908 = fieldWeight in 2515, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2515)
      0.11111111 = coord(1/9)
    
    Abstract
    This paper presents a system which creates and visualizes probabilistic semantic links between concepts in a thesaurus and classes in a classification system. For creating the links, we build on the Polylingual Labeled Topic Model (PLL-TM) (Posch et al., in KI 2015: advances in artificial intelligence, 2015). PLL-TM identifies probable thesaurus descriptors for each class in the classification system by using information from the natural language text of documents, their assigned thesaurus descriptors and their designated classes. The links are then presented to users of the system in an interactive visualization, providing them with an automatically generated overview of the relations between the thesaurus and the classification system.
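Stripped of the topic model, the linking step described above amounts to: for every class in the classification system, keep the k most probable thesaurus descriptors and surface them as weighted links. A toy sketch with hand-made probabilities (not PLL-TM output; the class and descriptor names are invented):

```python
def top_links(prob, k=2):
    """For each classification class, keep the k most probable
    thesaurus descriptors as (descriptor, probability) pairs."""
    return {
        cls: sorted(descr.items(), key=lambda kv: -kv[1])[:k]
        for cls, descr in prob.items()
    }

# hypothetical P(descriptor | class) values
prob = {"Information Science": {"retrieval": 0.5, "indexing": 0.3, "sports": 0.01}}
links = top_links(prob)  # the visualization would render these as weighted edges
```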
  9. Munkelt, J.; Schaer, P.; Lepsky, K.: Towards an IR test collection for the German National Library (2018) 0.00
    0.0015174334 = product of:
      0.013656901 = sum of:
        0.013656901 = weight(_text_:in in 4311) [ClassicSimilarity], result of:
          0.013656901 = score(doc=4311,freq=14.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.23857531 = fieldWeight in 4311, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=4311)
      0.11111111 = coord(1/9)
    
    Abstract
    Automatic content indexing is one of the innovations that are increasingly changing the way libraries work. In theory, it promises a cataloguing service that would hardly be possible with humans in terms of speed, quantity and maybe quality. The German National Library (DNB) has also recognised this potential and is increasingly relying on the automatic indexing of its catalogue content. The DNB took a major step in this direction in 2017, which was announced in two papers. The announcement was rather restrained, but the content of the papers is all the more explosive for the library community: since September 2017, the DNB has discontinued the intellectual indexing of series B and H and has switched to an automatic process for these series. The subject indexing of online publications (series O) has been purely automatic since 2010; from September 2017, monographs and periodicals published outside the publishing industry and university publications will no longer be indexed by people. This raises the question: what is the quality of the automatic indexing compared to the manual work, or, in other words, to what degree can automatic indexing replace people without a significant drop in quality?
  10. Breuer, T.; Tavakolpoursaleh, N.; Schaer, P.; Hienert, D.; Schaible, J.; Castro, L.J.: Online Information Retrieval Evaluation using the STELLA Framework (2022) 0.00
    9.933934E-4 = product of:
      0.00894054 = sum of:
        0.00894054 = weight(_text_:in in 640) [ClassicSimilarity], result of:
          0.00894054 = score(doc=640,freq=6.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.1561842 = fieldWeight in 640, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=640)
      0.11111111 = coord(1/9)
    
    Abstract
    Involving users in early phases of software development has become a common strategy, as it enables developers to consider user needs from the beginning. Once a system is in production, new opportunities to observe, evaluate and learn from users emerge as more information becomes available. Gathering information from users to continuously evaluate their behavior is common practice for commercial software, while the Cranfield paradigm remains the preferred option for Information Retrieval (IR) and recommendation systems in the academic world. Here we introduce the Infrastructures for Living Labs STELLA project, which aims to create an evaluation infrastructure that allows experimental systems to run alongside production web-based academic search systems with real users. STELLA combines user interactions and log-file analyses to enable large-scale A/B experiments for academic search.
  11. Schaer, P.; Mayr, P.; Sünkler, S.; Lewandowski, D.: How relevant is the long tail? : a relevance assessment study on million short (2016) 0.00
    9.5589313E-4 = product of:
      0.008603038 = sum of:
        0.008603038 = weight(_text_:in in 3144) [ClassicSimilarity], result of:
          0.008603038 = score(doc=3144,freq=8.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.15028831 = fieldWeight in 3144, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3144)
      0.11111111 = coord(1/9)
    
    Abstract
    Users of web search engines are known to mostly focus on the top-ranked results of the search engine result page. While many studies support this well-known information-seeking pattern, only a few studies concentrate on the question of what users are missing by neglecting lower-ranked results. To learn more about the relevance distributions in the so-called long tail, we conducted a relevance assessment study with the Million Short long-tail web search engine. While we see a clear difference in the content between the head and the tail of the search engine result list, we see no statistically significant differences in the binary relevance judgments and only weakly significant differences when using graded relevance. The tail contains different but still valuable results. We argue that the long tail can be a rich source for the diversification of web search engine result lists, but it needs more evaluation to clearly describe the differences.
    Footnote
    To appear in Experimental IR Meets Multilinguality, Multimodality, and Interaction. 7th International Conference of the CLEF Association, CLEF 2016, Évora, Portugal, September 5-8, 2016.
  12. Neumann, M.; Steinberg, J.; Schaer, P.: Web scraping for non-programmers : introducing OXPath for digital library metadata harvesting (2017) 0.00
    8.2782784E-4 = product of:
      0.0074504507 = sum of:
        0.0074504507 = weight(_text_:in in 3895) [ClassicSimilarity], result of:
          0.0074504507 = score(doc=3895,freq=6.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.1301535 = fieldWeight in 3895, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3895)
      0.11111111 = coord(1/9)
    
    Abstract
    Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted, which is usually done with the help of software developers, as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website, custom web scrapers are needed. This may be the case for small to medium-sized publishers, research institutes or funding agencies. As data curation is a typical task that is done by people with a library and information science background, these people are usually proficient with XML technologies but are not full-stack programmers. Therefore we would like to present a web scraping tool that does not require digital library curators to program custom web scrapers from scratch. We present the open-source tool OXPath, an extension of XPath, that allows the user to define data to be extracted from websites in a declarative way. Taking one of our own use cases as an example, we guide you in more detail through the process of creating an OXPath wrapper for metadata harvesting. We also point out some practical things to consider when creating a web scraper (with OXPath). On top of that, we also present a syntax highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.
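OXPath's core idea (declare what to extract instead of programming how to crawl) can be illustrated with the XPath subset in Python's standard library; the page below is invented for illustration, and OXPath itself additionally supports form filling, clicks and extraction markers:

```python
import xml.etree.ElementTree as ET

# An invented, well-formed publication listing, standing in for a repository page.
PAGE = """
<html><body>
  <div class="record">
    <span class="title">Paper One</span><span class="year">2016</span>
  </div>
  <div class="record">
    <span class="title">Paper Two</span><span class="year">2017</span>
  </div>
</body></html>
"""

def harvest(page):
    """Pull metadata records out of a page via declarative XPath expressions:
    one dict per record, keyed by the span's class attribute."""
    root = ET.fromstring(page)
    return [
        {span.get("class"): span.text for span in rec.findall("span")}
        for rec in root.findall(".//div[@class='record']")
    ]

records = harvest(PAGE)  # [{'title': 'Paper One', 'year': '2016'}, ...]
```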
  13. Balog, K.; Schuth, A.; Dekker, P.; Tavakolpoursaleh, N.; Schaer, P.; Chuang, P.-Y.: Overview of the TREC 2016 Open Search track Academic Search Edition (2016) 0.00
    7.647145E-4 = product of:
      0.0068824305 = sum of:
        0.0068824305 = weight(_text_:in in 43) [ClassicSimilarity], result of:
          0.0068824305 = score(doc=43,freq=2.0), product of:
            0.057243563 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.04208298 = queryNorm
            0.120230645 = fieldWeight in 43, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=43)
      0.11111111 = coord(1/9)
    
    Abstract
    We present the TREC Open Search track, which represents a new evaluation paradigm for information retrieval. It offers the possibility for researchers to evaluate their approaches in a live setting, with real, unsuspecting users of an existing search engine. The first edition of the track focuses on the academic search domain and features the ad-hoc scientific literature search task. We report on experiments with three different academic search engines: Cite-SeerX, SSOAR, and Microsoft Academic Search.