Search (9 results, page 1 of 1)

  • year_i:[2010 TO 2020}
  • author_ss:"Lalmas, M."
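The two active filters above use Lucene query syntax; in the year range, "[2010 TO 2020}" mixes brackets so that the lower bound 2010 is inclusive and the upper bound 2020 is exclusive. A minimal sketch of how such filters might be sent to a Solr-style search endpoint as "fq" (filter query) parameters; the parameter layout is an assumption for illustration, not taken from this page:

```python
from urllib.parse import urlencode

# Active filters from the result page. '[2010 TO 2020}' is Lucene range
# syntax: '[' includes the lower bound, '}' excludes the upper bound.
filters = [
    'year_i:[2010 TO 2020}',
    'author_ss:"Lalmas, M."',
]

# Build a query string with a match-all main query and one fq per filter.
params = urlencode([('q', '*:*')] + [('fq', f) for f in filters])
```

Each filter restricts the result set without affecting relevance scoring, which is why both appear as separate facet chips rather than as part of the ranked query.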
  1. Blanke, T.; Lalmas, M.; Huibers, T.: A framework for the theoretical evaluation of XML retrieval (2012) 0.01
    0.007658652 = product of:
      0.053610563 = sum of:
        0.0060537956 = weight(_text_:information in 509) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=509,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 509, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=509)
        0.04755677 = weight(_text_:retrieval in 509) [ClassicSimilarity], result of:
          0.04755677 = score(doc=509,freq=14.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.5305404 = fieldWeight in 509, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=509)
      0.14285715 = coord(2/14)
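The breakdown above follows Lucene's ClassicSimilarity (TF-IDF) formula: per term, score = queryWeight * fieldWeight, with tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm; the per-term scores are summed and scaled by the coordination factor. A minimal sketch reproducing the top-line score of result 1 from the numbers shown:

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # tf = sqrt(freq); queryWeight = idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm; score = queryWeight * fieldWeight
    i = idf(doc_freq, max_docs)
    return (i * query_norm) * (math.sqrt(freq) * i * field_norm)

# Constants as reported in the explain output for doc 509
query_norm = 0.029633347
field_norm = 0.046875
max_docs = 44218

s_information = term_score(2.0, 20772, max_docs, query_norm, field_norm)
s_retrieval = term_score(14.0, 5836, max_docs, query_norm, field_norm)

# coord(2/14): only 2 of 14 query clauses matched this document
total = (s_information + s_retrieval) * (2.0 / 14.0)
# total comes out to ~0.00765865, matching the reported 0.007658652
```

The same arithmetic accounts for every explain tree below; only the term frequencies, document frequencies, and field norms change per document.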
    
    Abstract
    We present a theoretical framework to evaluate XML retrieval. XML retrieval deals with retrieving those document components, the XML elements, that specifically answer a query. In this article, theoretical evaluation is concerned with the formal representation of qualitative properties of retrieval models. It complements experimental methods by showing the properties of the underlying reasoning assumptions that decide when a document is about a query. We define a theoretical methodology based on the idea of "aboutness" and apply it to current XML retrieval models. This allows us to compare and analyze the reasoning behavior of the XML retrieval models experimented with in the INEX evaluation campaigns. For each model we derive functional and qualitative properties that qualify its formal behavior. We then use these properties to explain experimental results obtained with some of the XML retrieval models.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.12, pp.2463-2473
  2. Szlávik, Z.; Tombros, A.; Lalmas, M.: Summarisation of the logical structure of XML documents (2012) 0.01
    0.005129378 = product of:
      0.035905644 = sum of:
        0.0104854815 = weight(_text_:information in 2731) [ClassicSimilarity], result of:
          0.0104854815 = score(doc=2731,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.20156369 = fieldWeight in 2731, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2731)
        0.025420163 = weight(_text_:retrieval in 2731) [ClassicSimilarity], result of:
          0.025420163 = score(doc=2731,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 2731, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2731)
      0.14285715 = coord(2/14)
    
    Abstract
    Summarisation is traditionally used to produce summaries of the textual contents of documents. In this paper, it is argued that summarisation methods can also be applied to the logical structure of XML documents. Structure summarisation selects the most important elements of the logical structure and ensures that the user's attention is focused on sections, subsections, etc. that are believed to be of particular interest. Structure summaries are shown to users as hierarchical tables of contents. This paper discusses methods for structure summarisation that use various features of XML elements in order to select document portions on which a user's attention should be focused. An evaluation methodology for structure summarisation is also introduced, and summarisation results using various summariser versions are presented and compared to one another. We show that data sets used in information retrieval evaluation can be used effectively in order to produce high quality (query independent) structure summaries. We also discuss the choice and effectiveness of particular summariser features with respect to several evaluation measures.
    Content
    Contribution to a special issue "Large-Scale and Distributed Systems for Information Retrieval". Cf. doi:10.1016/j.ipm.2011.11.002.
    Source
    Information processing and management. 48(2012) no.5, pp.956-968
  3. Nikolov, D.; Lalmas, M.; Flammini, A.; Menczer, F.: Quantifying biases in online information exposure (2019) 0.00
    0.004963813 = product of:
      0.034746688 = sum of:
        0.02465703 = weight(_text_:web in 4986) [ClassicSimilarity], result of:
          0.02465703 = score(doc=4986,freq=4.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.25496176 = fieldWeight in 4986, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4986)
        0.010089659 = weight(_text_:information in 4986) [ClassicSimilarity], result of:
          0.010089659 = score(doc=4986,freq=8.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.19395474 = fieldWeight in 4986, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4986)
      0.14285715 = coord(2/14)
    
    Abstract
    Our consumption of online information is mediated by filtering, ranking, and recommendation algorithms that introduce unintentional biases as they attempt to deliver relevant and engaging content. It has been suggested that our reliance on online technologies such as search engines and social media may limit exposure to diverse points of view and make us vulnerable to manipulation by disinformation. In this article, we mine a massive data set of web traffic to quantify two kinds of bias: (i) homogeneity bias, which is the tendency to consume content from a narrow set of information sources, and (ii) popularity bias, which is the selective exposure to content from top sites. Our analysis reveals different bias levels across several widely used web platforms. Search exposes users to a diverse set of sources, while social media traffic tends to exhibit high popularity and homogeneity bias. When we focus our analysis on traffic to news sites, we find higher levels of popularity bias, with smaller differences across applications. Overall, our results quantify the extent to which our choices of online systems confine us inside "social bubbles."
    Source
    Journal of the Association for Information Science and Technology. 70(2019) no.3, pp.218-229
  4. Arapakis, I.; Cambazoglu, B.B.; Lalmas, M.: On the feasibility of predicting popular news at cold start (2017) 0.00
    0.004243123 = product of:
      0.029701859 = sum of:
        0.02465703 = weight(_text_:web in 3595) [ClassicSimilarity], result of:
          0.02465703 = score(doc=3595,freq=4.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.25496176 = fieldWeight in 3595, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3595)
        0.0050448296 = weight(_text_:information in 3595) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=3595,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 3595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3595)
      0.14285715 = coord(2/14)
    
    Abstract
    Prominent news sites on the web provide hundreds of news articles daily. The abundance of news content competing to attract online attention, coupled with the manual effort involved in article selection, necessitates the timely prediction of future popularity of these news articles. The future popularity of a news article can be estimated using signals indicating the article's penetration in social media (e.g., number of tweets) in addition to traditional web analytics (e.g., number of page views). In practice, it is important to make such estimations as early as possible, preferably before the article is made available on the news site (i.e., at cold start). In this paper we perform a study on cold-start news popularity prediction using a collection of 13,319 news articles obtained from Yahoo News, a major news provider. We characterize the popularity of news articles through a set of online metrics and try to predict their values across time using machine learning techniques on a large collection of features obtained from various sources. Our findings indicate that predicting news popularity at cold start is a difficult task, contrary to the findings of a prior work on the same topic. Most articles' popularity may not be accurately anticipated solely on the basis of content features, without having the early-stage popularity values.
    Source
    Journal of the Association for Information Science and Technology. 68(2017) no.5, pp.1149-1164
  5. Arapakis, I.; Lalmas, M.; Ceylan, H.; Donmez, P.: Automatically embedding newsworthy links to articles : from implementation to evaluation (2014) 0.00
    0.003211426 = product of:
      0.022479981 = sum of:
        0.017435152 = weight(_text_:web in 1185) [ClassicSimilarity], result of:
          0.017435152 = score(doc=1185,freq=2.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.18028519 = fieldWeight in 1185, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1185)
        0.0050448296 = weight(_text_:information in 1185) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=1185,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 1185, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1185)
      0.14285715 = coord(2/14)
    
    Abstract
    News portals are a popular destination for web users. News providers are therefore interested in attaining higher visitor rates and promoting greater engagement with their content. One aspect of engagement deals with keeping users on site longer by allowing them to have enhanced click-through experiences. News portals have invested in ways to embed links within news stories but so far these links have been curated by news editors. Given the manual effort involved, the use of such links is limited to a small scale. In this article, we evaluate a system-based approach that detects newsworthy events in a news article and locates other articles related to these events. Our system does not rely on resources like Wikipedia to identify events, and it was designed to be domain independent. A rigorous evaluation, using Amazon's Mechanical Turk, was performed to assess the system-embedded links against the manually-curated ones. Our findings reveal that our system's performance is comparable with that of professional editors, and that users find the automatically generated highlights interesting and the associated articles worthy of reading. Our evaluation also provides quantitative and qualitative insights into the curation of links, from the perspective of users and professional editors.
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.1, pp.129-145
  6. Piwowarski, B.; Amini, M.R.; Lalmas, M.: On using a quantum physics formalism for multidocument summarization (2012) 0.00
    8.0575584E-4 = product of:
      0.011280581 = sum of:
        0.011280581 = weight(_text_:information in 236) [ClassicSimilarity], result of:
          0.011280581 = score(doc=236,freq=10.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.21684799 = fieldWeight in 236, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=236)
      0.071428575 = coord(1/14)
    
    Abstract
    Multidocument summarization (MDS) aims for each given query to extract compressed and relevant information with respect to the different query-related themes present in a set of documents. Many approaches operate in two steps. Themes are first identified from the set, and then a summary is formed by extracting salient sentences within the different documents of each of the identified themes. Among these approaches, latent semantic analysis (LSA) based approaches rely on spectral decomposition techniques to identify the themes. In this article, we propose a major extension of these techniques that relies on the quantum information access (QIA) framework. The latter is a framework developed for modeling information access based on the probabilistic formalism of quantum physics. The QIA framework not only points out the limitations of the current LSA-based approaches, but motivates a new principled criterion to tackle multidocument summarization that addresses these limitations. As a byproduct, it also provides a way to enhance the LSA-based approaches. Extensive experiments on the DUC 2005, 2006 and 2007 datasets show that the proposed approach consistently improves over both the LSA-based approaches and the systems that competed in the yearly DUC competitions. This demonstrates the potential impact of quantum-inspired approaches to information access in general, and of the QIA framework in particular.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.5, pp.865-888
  7. Arapakis, I.; Lalmas, M.; Cambazoglu, B.B.; Marcos, M.-C.; Jose, J.M.: User engagement in online news : under the scope of sentiment, interest, affect, and gaze (2014) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 1497) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=1497,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 1497, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1497)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.10, pp.1988-2005
  8. Lehmann, J.; Castillo, C.; Lalmas, M.; Baeza-Yates, R.: Story-focused reading in online news and its potential for user engagement (2017) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 3529) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=3529,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 3529, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3529)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 68(2017) no.4, pp.869-883
  9. Goyal, N.; Bron, M.; Lalmas, M.; Haines, A.; Cramer, H.: Designing for mobile experience beyond the native ad click : exploring landing page presentation style and media usage (2018) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 4289) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=4289,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 4289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4289)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.7, pp.913-923