Search (4 results, page 1 of 1)

  • × author_ss:"Rijke, M. de"
  • × year_i:[2010 TO 2020}
  1. Graus, D.; Odijk, D.; Rijke, M. de: The birth of collective memories : analyzing emerging entities in text streams (2018) 0.01
    0.014386819 = product of:
      0.057547275 = sum of:
        0.057547275 = weight(_text_:social in 4252) [ClassicSimilarity], result of:
          0.057547275 = score(doc=4252,freq=4.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.3115296 = fieldWeight in 4252, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4252)
      0.25 = coord(1/4)
    
    Abstract
    We study how collective memories are formed online. We do so by tracking entities that emerge in public discourse, that is, in online text streams such as social media and news streams, before they are incorporated into Wikipedia, which, we argue, can be viewed as an online place for collective memory. By tracking how entities emerge in public discourse, that is, the temporal patterns between their first mention in online text streams and their subsequent incorporation into collective memory, we gain insight into how the collective remembrance process happens online. Specifically, we analyze nearly 80,000 entities as they emerge in online text streams before they are incorporated into Wikipedia. The online text streams we use for our analysis comprise social media and news streams, and span more than 579 million documents over a period of 18 months. We discover two main emergence patterns: some entities emerge in a "bursty" fashion, that is, they appear in public discourse without precedent, burst into activity, and transition into collective memory; other entities display a "delayed" pattern, where they appear in public discourse, experience a period of inactivity, and then resurface before transitioning into our cultural collective memory.
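
    A minimal sketch of how such emergence patterns might be detected from an entity's mention timestamps. The gap-based rule, the 30-day inactivity threshold, and the classify_emergence helper are illustrative assumptions, not the authors' actual analysis method.

        from datetime import datetime, timedelta

        def classify_emergence(mention_times, incorporation_time,
                               gap=timedelta(days=30)):
            """Label an entity's emergence pattern from its mention history.

            mention_times: datetimes of the entity's mentions in the text
            stream, up to its incorporation into Wikipedia. Returns "delayed"
            if the mention history contains a long period of inactivity,
            "bursty" if activity is continuous from the first mention onward.
            The 30-day threshold is a hypothetical choice.
            """
            times = sorted(t for t in mention_times if t <= incorporation_time)
            if len(times) < 2:
                return "bursty"  # a single mention leaves no gap to observe
            longest_gap = max(b - a for a, b in zip(times, times[1:]))
            return "delayed" if longest_gap >= gap else "bursty"

        # Mentions clustered right before incorporation -> "bursty"
        mentions = [datetime(2015, 3, d) for d in (1, 2, 2, 3, 5)]
        print(classify_emergence(mentions, datetime(2015, 3, 10)))
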
  2. Berendsen, R.; Rijke, M. de; Balog, K.; Bogers, T.; Bosch, A. van den: On the assessment of expertise profiles (2013) 0.01
    0.0065351077 = product of:
      0.026140431 = sum of:
        0.026140431 = product of:
          0.052280862 = sum of:
            0.052280862 = weight(_text_:aspects in 1089) [ClassicSimilarity], result of:
              0.052280862 = score(doc=1089,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.2496898 = fieldWeight in 1089, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1089)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Expertise retrieval has attracted significant interest in the field of information retrieval. Expert finding has been studied extensively, with less attention going to the complementary task of expert profiling, that is, automatically identifying topics about which a person is knowledgeable. We describe a test collection for expert profiling in which expert users have self-selected their knowledge areas. Motivated by the sparseness of this set of knowledge areas, we report on an assessment experiment in which academic experts judge a profile that has been automatically generated by state-of-the-art expert-profiling algorithms; optionally, experts can indicate a level of expertise for relevant areas. Experts may also give feedback on the quality of the system-generated knowledge areas. We report on a content analysis of these comments and gain insights into what aspects of profiles matter to experts. We provide an error analysis of the system-generated profiles, identifying factors that help explain why certain experts may be harder to profile than others. We also analyze how using self-selected versus judged system-generated knowledge areas as ground truth affects the evaluation of expert-profiling systems; the two rank systems somewhat differently but detect about the same number of pairwise significant differences, despite the judged system-generated assessments being sparser.
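
    The ground-truth comparison described above could be reproduced in outline as follows: rank the systems under each of the two ground truths and measure how much the rankings agree. Kendall's tau and the system scores below are illustrative stand-ins, not the paper's data or its exact analysis.

        from itertools import combinations

        def kendall_tau(scores_a, scores_b):
            """Kendall rank correlation between two score dicts over the same systems."""
            systems = list(scores_a)
            concordant = discordant = 0
            for s1, s2 in combinations(systems, 2):
                product = (scores_a[s1] - scores_a[s2]) * (scores_b[s1] - scores_b[s2])
                if product > 0:
                    concordant += 1
                elif product < 0:
                    discordant += 1
            n_pairs = len(systems) * (len(systems) - 1) / 2
            return (concordant - discordant) / n_pairs

        # Hypothetical MAP scores under the two alternative ground truths
        self_selected = {"sysA": 0.31, "sysB": 0.28, "sysC": 0.22}
        judged = {"sysA": 0.29, "sysB": 0.33, "sysC": 0.21}
        print(kendall_tau(self_selected, judged))  # 0.33: the rankings differ
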
  3. Bron, M.; Gorp, J. Van; Rijke, M. de: Media studies research in the data-driven age : how research questions evolve (2016) 0.01
    0.0065351077 = product of:
      0.026140431 = sum of:
        0.026140431 = product of:
          0.052280862 = sum of:
            0.052280862 = weight(_text_:aspects in 3008) [ClassicSimilarity], result of:
              0.052280862 = score(doc=3008,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.2496898 = fieldWeight in 3008, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3008)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The introduction of new technologies and access to new information channels continue to change the way media studies researchers work and the questions they seek to answer. We investigate the current practices of media studies researchers and how these practices affect their research questions. Through the analysis of 27 interviews about the research practices of media studies researchers during a research project, we developed a model of the activities in their research cycle. We find that information gathering and analysis activities dominate the research cycle. These activities influence the research outcomes, as they determine how the research questions asked by media studies researchers evolve. Specifically, we show how research questions are related to the availability and accessibility of data as well as to new information sources for contextualizing the research topic. Our contribution is a comprehensive account of the overall research cycle of media studies researchers as well as of specific aspects of that cycle, i.e., information sources, information-seeking challenges, and the development of research questions. This work confirms the findings of previous work in this area with a previously unstudied group of researchers and provides new details about how research questions evolve.
  4. Kenter, T.; Balog, K.; Rijke, M. de: Evaluating document filtering systems over time (2015) 0.01
    0.005228086 = product of:
      0.020912344 = sum of:
        0.020912344 = product of:
          0.041824687 = sum of:
            0.041824687 = weight(_text_:aspects in 2672) [ClassicSimilarity], result of:
              0.041824687 = score(doc=2672,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.19975184 = fieldWeight in 2672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2672)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Document filtering is a popular task in information retrieval. A stream of documents arriving over time is filtered for documents relevant to a set of topics. The distinguishing feature of document filtering is the temporal aspect introduced by the stream of documents. Document filtering systems have, up to now, been evaluated in terms of traditional metrics like (micro- or macro-averaged) precision, recall, MAP, nDCG, F1, and utility. We argue that these metrics do not capture all relevant aspects of the systems being evaluated. In particular, they lack support for the temporal dimension of the task. We propose a time-sensitive way of measuring the performance of document filtering systems over time by employing trend estimation. In short, the performance is calculated for batches, a trend line is fitted to the results, and the estimated performance of systems at the end of the evaluation period is used to compare systems. We detail the application of our proposed trend estimation framework and examine the assumptions that need to hold for valid significance testing. Additionally, we analyze the requirements a document filtering metric has to meet and show that traditional macro-averaged, true-positive-based metrics such as precision, recall, and utility fail to capture essential information when applied in a batch setting. In particular, false positives returned in a batch for topics that are absent from the ground truth in that batch go unnoticed. This is a serious flaw, as a system's over-generation might be overlooked. We propose a new metric, aptness, that does capture false positives. We incorporate this metric into an overall score and show that this new score meets all requirements. To demonstrate the results of our proposed evaluation methodology, we analyze the runs submitted to the two most recent editions of a document filtering evaluation campaign. We re-evaluate the runs submitted to the Cumulative Citation Recommendation task of the 2012 and 2013 editions of the TREC Knowledge Base Acceleration track, and show that important new insights emerge.
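
    The trend-estimation procedure is described concretely enough to sketch: compute a metric per batch, fit a least-squares line to the per-batch scores, and compare systems by the line's value at the last batch. The per-batch F1 values below are invented, and the paper's exact fitting and significance-testing procedure may differ.

        def fit_trend(batch_scores):
            """Least-squares line through (batch index, score); returns (slope, intercept)."""
            n = len(batch_scores)
            xs = range(n)
            mean_x = sum(xs) / n
            mean_y = sum(batch_scores) / n
            cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, batch_scores))
            var = sum((x - mean_x) ** 2 for x in xs)
            slope = cov / var
            return slope, mean_y - slope * mean_x

        def end_of_period_estimate(batch_scores):
            """Estimated score at the last batch, read off the fitted trend line."""
            slope, intercept = fit_trend(batch_scores)
            return slope * (len(batch_scores) - 1) + intercept

        # Hypothetical per-batch F1 for two systems over six batches
        sys_a = [0.40, 0.42, 0.45, 0.47, 0.50, 0.52]  # improving over time
        sys_b = [0.55, 0.52, 0.50, 0.47, 0.45, 0.42]  # degrading over time
        print(end_of_period_estimate(sys_a))  # ~0.52, favored at period end
        print(end_of_period_estimate(sys_b))  # ~0.42, despite the higher mean

    Comparing end-of-period estimates favors sys_a even though sys_b has the higher mean F1, which illustrates the temporal sensitivity that averaged metrics lack.
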