Search (23 results, page 1 of 2)

Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.02

0.018567387 = product of:
  0.055702157 = sum of:
    0.055702157 = sum of:
      0.020087399 = weight(_text_:of in 1451) [ClassicSimilarity], result of:
        0.020087399 = score(doc=1451,freq=16.0), product of:
          0.06850986 = queryWeight, product of:
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.043811057 = queryNorm
          0.2932045 = fieldWeight in 1451, product of:
            4.0 = tf(freq=16.0), with freq of:
              16.0 = termFreq=16.0
            1.5637573 = idf(docFreq=25162, maxDocs=44218)
            0.046875 = fieldNorm(doc=1451)
      0.03561476 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
        0.03561476 = score(doc=1451,freq=2.0), product of:
          0.15341885 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.043811057 = queryNorm
          0.23214069 = fieldWeight in 1451, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=1451)
  0.33333334 = coord(1/3)
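
The score tree above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (`tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`), which agree with the figures shown. The function names are illustrative and are not part of any library API.

```python
import math

# Constants copied from the explain output for doc 1451.
MAX_DOCS = 44218
QUERY_NORM = 0.043811057
FIELD_NORM = 0.046875


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed ClassicSimilarity form)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one query term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Two of the three query clauses matched, but the listing applies coord(1/3)
# to their sum, so the same factor is used here.
score_of = term_score(16.0, 25162, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)
total = (score_of + score_22) * (1.0 / 3.0)

print(f"of: {score_of:.8f}  22: {score_22:.8f}  total: {total:.8f}")
```

The printed values should agree with the listing to within single-precision rounding, since the search engine stores these quantities as 32-bit floats.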

Abstract: Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
Date: 22. 3.2003 19:27:36
Source: Journal of the American Society for Information Science and Technology. 54(2003) no.4, S.281-284

Lalmas, M.; Ruthven, I.: Representing and retrieving structured documents using the Dempster-Shafer theory of evidence : modelling and evaluation (1998) 0.00
```
0.0047837105 = product of:
  0.014351131 = sum of:
    0.014351131 = product of:
      0.028702263 = sum of:
        0.028702263 = weight(_text_:of in 1076) [ClassicSimilarity], result of:
          0.028702263 = score(doc=1076,freq=24.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.41895083 = fieldWeight in 1076, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1076)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Reports on a theoretical model of structured document indexing and retrieval based on the Dempster-Shafer Theory of Evidence. Includes a description of the model of structured document retrieval, the representation of structured documents, the representation of individual components, how components are combined, details of the combination process, and how relevance is captured within the model. Also presents a detailed account of an implementation of the model, and an evaluation scheme designed to test the effectiveness of the model.

Source

Journal of documentation. 54(1998) no.5, S.529-565
Kazai, G.; Lalmas, M.; Fuhr, N.; Gövert, N.: A report on the first year of the INitiative for the Evaluation of XML Retrieval (INEX'02) (2004) 0.00
```
0.0045800544 = product of:
  0.013740162 = sum of:
    0.013740162 = product of:
      0.027480325 = sum of:
        0.027480325 = weight(_text_:of in 2267) [ClassicSimilarity], result of:
          0.027480325 = score(doc=2267,freq=22.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.40111488 = fieldWeight in 2267, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2267)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

The INitiative for the Evaluation of XML retrieval (INEX) aims at providing an infrastructure to evaluate the effectiveness of content-oriented XML retrieval systems. To this end, in the first round of INEX in 2002, a test collection of real world XML documents along with a set of topics and respective relevance assessments have been created with the collaboration of 36 participating organizations. In this article, we provide an overview of the first round of the INEX initiative.

Source

Journal of the American Society for Information Science and Technology. 55(2004) no.6, S.551-556
Arapakis, I.; Lalmas, M.; Cambazoglu, B.B.; Marcos, M.-C.; Jose, J.M.: User engagement in online news : under the scope of sentiment, interest, affect, and gaze (2014) 0.00
```
0.0038202507 = product of:
  0.011460752 = sum of:
    0.011460752 = product of:
      0.022921504 = sum of:
        0.022921504 = weight(_text_:of in 1497) [ClassicSimilarity], result of:
          0.022921504 = score(doc=1497,freq=30.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.33457235 = fieldWeight in 1497, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1497)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Online content providers, such as news portals and social media platforms, constantly seek new ways to attract large shares of online attention by keeping their users engaged. A common challenge is to identify which aspects of online interaction influence user engagement the most. In this article, through an analysis of a news article collection obtained from Yahoo News US, we demonstrate that news articles exhibit considerable variation in terms of the sentimentality and polarity of their content, depending on factors such as news provider and genre. Moreover, through a laboratory study, we observe the effect of sentimentality and polarity of news and comments on a set of subjective and objective measures of engagement. In particular, we show that attention, affect, and gaze differ across news of varying interestingness. As part of our study, we also explore methods that exploit the sentiments expressed in user comments to reorder the lists of comments displayed in news pages. Our results indicate that user engagement can be predicted if we account for the sentimentality and polarity of the content as well as other factors that drive attention and inspire human curiosity.

Source

Journal of the Association for Information Science and Technology. 65(2014) no.10, S.1988-2005
Arapakis, I.; Cambazoglu, B.B.; Lalmas, M.: On the feasibility of predicting popular news at cold start (2017) 0.00
```
0.0038202507 = product of:
  0.011460752 = sum of:
    0.011460752 = product of:
      0.022921504 = sum of:
        0.022921504 = weight(_text_:of in 3595) [ClassicSimilarity], result of:
          0.022921504 = score(doc=3595,freq=30.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.33457235 = fieldWeight in 3595, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3595)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Prominent news sites on the web provide hundreds of news articles daily. The abundance of news content competing to attract online attention, coupled with the manual effort involved in article selection, necessitates the timely prediction of future popularity of these news articles. The future popularity of a news article can be estimated using signals indicating the article's penetration in social media (e.g., number of tweets) in addition to traditional web analytics (e.g., number of page views). In practice, it is important to make such estimations as early as possible, preferably before the article is made available on the news site (i.e., at cold start). In this paper we perform a study on cold-start news popularity prediction using a collection of 13,319 news articles obtained from Yahoo News, a major news provider. We characterize the popularity of news articles through a set of online metrics and try to predict their values across time using machine learning techniques on a large collection of features obtained from various sources. Our findings indicate that predicting news popularity at cold start is a difficult task, contrary to the findings of a prior work on the same topic. Most articles' popularity may not be accurately anticipated solely on the basis of content features, without having the early-stage popularity values.

Source

Journal of the Association for Information Science and Technology. 68(2017) no.5, S.1149-1164
Szlávik, Z.; Tombros, A.; Lalmas, M.: Summarisation of the logical structure of XML documents (2012) 0.00
```
0.003743066 = product of:
  0.0112291975 = sum of:
    0.0112291975 = product of:
      0.022458395 = sum of:
        0.022458395 = weight(_text_:of in 2731) [ClassicSimilarity], result of:
          0.022458395 = score(doc=2731,freq=20.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.32781258 = fieldWeight in 2731, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2731)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Summarisation is traditionally used to produce summaries of the textual contents of documents. In this paper, it is argued that summarisation methods can also be applied to the logical structure of XML documents. Structure summarisation selects the most important elements of the logical structure and ensures that the user's attention is focused towards sections, subsections, etc. that are believed to be of particular interest. Structure summaries are shown to users as hierarchical tables of contents. This paper discusses methods for structure summarisation that use various features of XML elements in order to select document portions that a user's attention should be focused to. An evaluation methodology for structure summarisation is also introduced and summarisation results using various summariser versions are presented and compared to one another. We show that data sets used in information retrieval evaluation can be used effectively in order to produce high quality (query independent) structure summaries. We also discuss the choice and effectiveness of particular summariser features with respect to several evaluation measures.
Rijsbergen, C.J. van; Lalmas, M.: Information calculus for information retrieval (1996) 0.00
```
0.00355646 = product of:
  0.0106693795 = sum of:
    0.0106693795 = product of:
      0.021338759 = sum of:
        0.021338759 = weight(_text_:of in 4201) [ClassicSimilarity], result of:
          0.021338759 = score(doc=4201,freq=26.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.31146988 = fieldWeight in 4201, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4201)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Information is and always has been an elusive concept; nevertheless many philosophers, mathematicians, logicians and computer scientists have felt that it is fundamental. Many attempts have been made to come up with some sensible and intuitively acceptable definition of information; up to now, none of these have succeeded. This work is based on the approach followed by Dretske, Barwise, and Devlin, who claimed that the notion of information starts from the position that given an ontology of objects individuated by a cognitive agent, it makes sense to speak of the information an object (e.g., a text, an image, a video) contains about another object (e.g., the query). This phenomenon is captured by the flow of information between objects. Its exploitation is the task of an information retrieval system. These authors propose a theory of information that provides an analysis of the concept of information (any type, from any media) and the manner in which intelligent organisms (referred to as cognitive agents) handle and respond to the information picked up from their environment. They defined the nature of information flow and the mechanisms that give rise to such a flow. The theory, which is based on Situation Theory, is expressed with a calculus defined on channels. The calculus was defined so that it satisfies properties that are attributed to information and its flows. This paper demonstrates the connection between this calculus and information retrieval, and proposes a model of an information retrieval system based on this calculus.

Source

Journal of the American Society for Information Science. 47(1996) no.5, S.385-398

Lalmas, M.; Ruthven, I.: A model for structured document retrieval : empirical investigations (1997) 0.00

0.0035289964 = product of:
  0.010586989 = sum of:
    0.010586989 = product of:
      0.021173978 = sum of:
        0.021173978 = weight(_text_:of in 727) [ClassicSimilarity], result of:
          0.021173978 = score(doc=727,freq=10.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.3090647 = fieldWeight in 727, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=727)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Documents often display a structure, e.g. several sections, each with several subsections and so on. Taking into account the structure of a document allows the retrieval process to focus on those parts of the document that are most relevant to an information need. In previous work, we developed a model for the representation and the retrieval of structured documents. This paper reports the first experimental study of the effectiveness and applicability of the model.

Ruthven, I.; Lalmas, M.; Rijsbergen, K. van: Combining and selecting characteristics of information use (2002) 0.00
```
0.0035289964 = product of:
  0.010586989 = sum of:
    0.010586989 = product of:
      0.021173978 = sum of:
        0.021173978 = weight(_text_:of in 5208) [ClassicSimilarity], result of:
          0.021173978 = score(doc=5208,freq=40.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.3090647 = fieldWeight in 5208, product of:
              6.3245554 = tf(freq=40.0), with freq of:
                40.0 = termFreq=40.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=5208)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Ruthven, Lalmas, and van Rijsbergen use traditional term importance measures like inverse document frequency, noise, based upon in-document frequency, and term frequency supplemented by theme value, which is calculated from differences of expected positions of words in a text from their actual positions, on the assumption that even distribution indicates term association with a main topic, and context, which is based on a query term's distance from the nearest other query term relative to the average expected distribution of all query terms in the document. They then define document characteristics like specificity, the sum of all idf values in a document over the total terms in the document, or document complexity, measured by the document's average idf value; and information to noise ratio, info-noise, tokens after stopping and stemming over tokens before these processes, measuring the ratio of useful and non-useful information in a document. Retrieval tests are then carried out using each characteristic, combinations of the characteristics, and relevance feedback to determine the correct combination of characteristics. A file ranks independently of query terms by both specificity and info-noise, but if presence of a query term is required, unique rankings are generated. Tested on five standard collections the traditional characteristics outperformed the new characteristics, which did, however, outperform random retrieval. All possible combinations of characteristics were also tested both with and without a set of scaling weights applied. All characteristics can benefit by combination with another characteristic or set of characteristics and performance as a single characteristic is a good indicator of performance in combination. Larger combinations tended to be more effective than smaller ones and weighting increased precision measures of middle ranking combinations but decreased the ranking of poorer combinations.
The best combinations vary for each collection, and in some collections with the addition of weighting. Finally, with all documents ranked by the all characteristics combination, they take the top 30 documents and calculate the characteristic scores for each term in both the relevant and the non-relevant sets. Then taking for each query term the characteristics whose average was higher for relevant than non-relevant documents the documents are re-ranked. The relevance feedback method of selecting characteristics can select a good set of characteristics for query terms.

Source

Journal of the American Society for Information Science and Technology. 53(2002) no.5, S.378-396
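
The two document characteristics defined in the abstract above, specificity and info-noise, can be illustrated as follows. This is a rough sketch based only on the wording of the abstract. The stopword list, the idf table, and the function names are hypothetical, and stemming is omitted for brevity.

```python
from typing import Dict, List, Set


def specificity(terms: List[str], idf_by_term: Dict[str, float]) -> float:
    """Sum of the idf values of a document's terms over the number of terms."""
    if not terms:
        return 0.0
    return sum(idf_by_term.get(t, 0.0) for t in terms) / len(terms)


def info_noise(tokens_before: List[str], tokens_after: List[str]) -> float:
    """Tokens remaining after stopping (and stemming) over tokens before."""
    if not tokens_before:
        return 0.0
    return len(tokens_after) / len(tokens_before)


# Hypothetical toy data; none of these values come from the paper.
STOPWORDS: Set[str] = {"the", "of"}
IDF = {"retrieval": 2.0, "structured": 3.0, "documents": 1.0}

raw = ["the", "retrieval", "of", "structured", "documents"]
kept = [t for t in raw if t not in STOPWORDS]  # stopping only, no stemming

print(info_noise(raw, kept))   # 3 of 5 tokens survive stopping
print(specificity(kept, IDF))  # mean idf of the surviving terms
```

Whether specificity is computed over all tokens or only over those that survive stopping is not stated in the abstract. The sketch applies it to whichever list it is given.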

Lalmas, M.: Logical models in information retrieval : introduction and overview (1998) 0.00

0.0034169364 = product of:
  0.010250809 = sum of:
    0.010250809 = product of:
      0.020501617 = sum of:
        0.020501617 = weight(_text_:of in 2668) [ClassicSimilarity], result of:
          0.020501617 = score(doc=2668,freq=6.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.2992506 = fieldWeight in 2668, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=2668)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Introduces the formalisms used in logical models for information retrieval. Shows the use of logic to build the models and presents an overview of current logic models in information retrieval.
Footnote: Contribution to an issue devoted to the application of logic to information retrieval.

Lalmas, M.: XML retrieval (2009) 0.00
```
0.0034169364 = product of:
  0.010250809 = sum of:
    0.010250809 = product of:
      0.020501617 = sum of:
        0.020501617 = weight(_text_:of in 4998) [ClassicSimilarity], result of:
          0.020501617 = score(doc=4998,freq=24.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.2992506 = fieldWeight in 4998, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4998)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Documents usually have a content and a structure. The content refers to the text of the document, whereas the structure refers to how a document is logically organized. An increasingly common way to encode the structure is through the use of a mark-up language. Nowadays, the most widely used mark-up language for representing structure is the eXtensible Mark-up Language (XML). XML can be used to provide a focused access to documents, i.e. returning XML elements, such as sections and paragraphs, instead of whole documents in response to a query. Such focused strategies are of particular benefit for information repositories containing long documents, or documents covering a wide variety of topics, where users are directed to the most relevant content within a document. The increased adoption of XML to represent a document structure requires the development of tools to effectively access documents marked-up in XML. This book provides a detailed description of query languages, indexing strategies, ranking algorithms, presentation scenarios developed to access XML documents. Major advances in XML retrieval were seen from 2002 as a result of INEX, the Initiative for Evaluation of XML Retrieval. INEX, also described in this book, provided test sets for evaluating XML retrieval effectiveness. Many of the developments and results described in this book were investigated within INEX.

Content

Table of Contents: Introduction / Basic XML Concepts / Historical Perspectives / Query Languages / Indexing Strategies / Ranking Strategies / Presentation Strategies / Evaluating XML Retrieval Effectiveness / Conclusions
Blanke, T.; Lalmas, M.; Huibers, T.: A framework for the theoretical evaluation of XML retrieval (2012) 0.00
```
0.0033478998 = product of:
  0.010043699 = sum of:
    0.010043699 = product of:
      0.020087399 = sum of:
        0.020087399 = weight(_text_:of in 509) [ClassicSimilarity], result of:
          0.020087399 = score(doc=509,freq=16.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.2932045 = fieldWeight in 509, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=509)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

We present a theoretical framework to evaluate XML retrieval. XML retrieval deals with retrieving those document components (the XML elements) that specifically answer a query. In this article, theoretical evaluation is concerned with the formal representation of qualitative properties of retrieval models. It complements experimental methods by showing the properties of the underlying reasoning assumptions that decide when a document is about a query. We define a theoretical methodology based on the idea of "aboutness" and apply it to current XML retrieval models. This allows comparing and analyzing the reasoning behavior of XML retrieval models experimented within the INEX evaluation campaigns. For each model we derive functional and qualitative properties that qualify its formal behavior. We then use these properties to explain experimental results obtained with some of the XML retrieval models.

Source

Journal of the American Society for Information Science and Technology. 63(2012) no.12, S.2463-2473
Piwowarski, B.; Amini, M.R.; Lalmas, M.: On using a quantum physics formalism for multidocument summarization (2012) 0.00
```
0.0029591531 = product of:
  0.008877459 = sum of:
    0.008877459 = product of:
      0.017754918 = sum of:
        0.017754918 = weight(_text_:of in 236) [ClassicSimilarity], result of:
          0.017754918 = score(doc=236,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.25915858 = fieldWeight in 236, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=236)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Multidocument summarization (MDS) aims for each given query to extract compressed and relevant information with respect to the different query-related themes present in a set of documents. Many approaches operate in two steps. Themes are first identified from the set, and then a summary is formed by extracting salient sentences within the different documents of each of the identified themes. Among these approaches, latent semantic analysis (LSA) based approaches rely on spectral decomposition techniques to identify the themes. In this article, we propose a major extension of these techniques that relies on the quantum information access (QIA) framework. The latter is a framework developed for modeling information access based on the probabilistic formalism of quantum physics. The QIA framework not only points out the limitations of the current LSA-based approaches, but motivates a new principled criterium to tackle multidocument summarization that addresses these limitations. As a byproduct, it also provides a way to enhance the LSA-based approaches. Extensive experiments on the DUC 2005, 2006 and 2007 datasets show that the proposed approach consistently improves over both the LSA-based approaches and the systems that competed in the yearly DUC competitions. This demonstrates the potential impact of quantum-inspired approaches to information access in general, and of the QIA framework in particular.

Source

Journal of the American Society for Information Science and Technology. 63(2012) no.5, S.865-888
Goyal, N.; Bron, M.; Lalmas, M.; Haines, A.; Cramer, H.: Designing for mobile experience beyond the native ad click : exploring landing page presentation style and media usage (2018) 0.00
```
0.0029591531 = product of:
  0.008877459 = sum of:
    0.008877459 = product of:
      0.017754918 = sum of:
        0.017754918 = weight(_text_:of in 4289) [ClassicSimilarity], result of:
          0.017754918 = score(doc=4289,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.25915858 = fieldWeight in 4289, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4289)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Many free mobile applications are supported by advertising. Ads can greatly affect user perceptions and behavior. In mobile apps, ads often follow a "native" format: they are designed to conform in both format and style to the actual content and context of the application. Clicking on the ad leads users to a second destination, outside of the hosting app, where the unified experience provided by native ads within the app is not necessarily reflected by the landing page the user arrives at. Little is known about whether and how this type of mobile ads is impacting user experience. In this paper, we use both quantitative and qualitative methods to study the impact of two design decisions for the landing page of a native ad on the user experience: (i) native ad style (following the style of the application) versus a non-native ad style; and (ii) pages with multimedia versus static pages. We found considerable variability in terms of user experience with mobile ad landing pages when varying presentation style and multimedia usage, especially interaction between presence of video and ad style (native or non-native). We also discuss insights and recommendations for improving the user experience with mobile native ads.

Source

Journal of the Association for Information Science and Technology. 69(2018) no.7, S.913-923
Nikolov, D.; Lalmas, M.; Flammini, A.; Menczer, F.: Quantifying biases in online information exposure (2019) 0.00
```
0.0029591531 = product of:
  0.008877459 = sum of:
    0.008877459 = product of:
      0.017754918 = sum of:
        0.017754918 = weight(_text_:of in 4986) [ClassicSimilarity], result of:
          0.017754918 = score(doc=4986,freq=18.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.25915858 = fieldWeight in 4986, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4986)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Our consumption of online information is mediated by filtering, ranking, and recommendation algorithms that introduce unintentional biases as they attempt to deliver relevant and engaging content. It has been suggested that our reliance on online technologies such as search engines and social media may limit exposure to diverse points of view and make us vulnerable to manipulation by disinformation. In this article, we mine a massive data set of web traffic to quantify two kinds of bias: (i) homogeneity bias, which is the tendency to consume content from a narrow set of information sources, and (ii) popularity bias, which is the selective exposure to content from top sites. Our analysis reveals different bias levels across several widely used web platforms. Search exposes users to a diverse set of sources, while social media traffic tends to exhibit high popularity and homogeneity bias. When we focus our analysis on traffic to news sites, we find higher levels of popularity bias, with smaller differences across applications. Overall, our results quantify the extent to which our choices of online systems confine us inside "social bubbles."

Source

Journal of the Association for Information Science and Technology. 70(2019) no.3, S.218-229

Ruthven, I.; Lalmas, M.: Selective relevance feedback using term characteristics (1999) 0.00

0.0027899165 = product of:
  0.008369749 = sum of:
    0.008369749 = product of:
      0.016739499 = sum of:
        0.016739499 = weight(_text_:of in 3824) [ClassicSimilarity], result of:
          0.016739499 = score(doc=3824,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.24433708 = fieldWeight in 3824, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=3824)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Vocabulary as a central concept in digital libraries: interdisciplinary concepts, challenges, and opportunities : proceedings of the Third International Conference on Conceptions of Library and Information Science (COLIS3), Dubrovnik, Croatia, 23-26 May 1999. Ed. by T. Arpanac et al

Lehmann, J.; Castillo, C.; Lalmas, M.; Baeza-Yates, R.: Story-focused reading in online news and its potential for user engagement (2017) 0.00
```
0.0027899165 = product of:
  0.008369749 = sum of:
    0.008369749 = product of:
      0.016739499 = sum of:
        0.016739499 = weight(_text_:of in 3529) [ClassicSimilarity], result of:
          0.016739499 = score(doc=3529,freq=16.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.24433708 = fieldWeight in 3529, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3529)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
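
The score breakdown above follows a TF-IDF pattern that can be checked by hand. The sketch below is a minimal Python reconstruction, not the search engine's own code: the function name `classic_term_score` and its parameter names are hypothetical, and the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are assumptions inferred from the numbers printed in the listing.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one term's weight from the values shown in an explanation tree.

    All formulas here are assumptions inferred from the printed numbers.
    """
    tf = math.sqrt(freq)                                # 4.0 for freq=16.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))     # about 1.5637573
    query_weight = idf * query_norm                     # "queryWeight" line
    field_weight = tf * idf * field_norm                # "fieldWeight" line
    return query_weight * field_weight


# Values copied from the explanation for doc=3529 and the term "of".
term_score = classic_term_score(
    freq=16.0,
    doc_freq=25162,
    max_docs=44218,
    query_norm=0.043811057,
    field_norm=0.0390625,
)

# The two coord factors in the tree: coord(1/2) and coord(1/3).
total_score = term_score * 0.5 * (1.0 / 3.0)

print(f"term score:  {term_score:.9f}")
print(f"total score: {total_score:.10f}")
```

The listing reports single-precision values, so the recomputed figures should agree with the printed 0.016739499 and 0.0027899165 only up to floating-point rounding.
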
Abstract

We study the news reading behavior of several hundred thousand users on 65 highly visited news sites. We focus on a specific phenomenon: users reading several articles related to a particular news development, which we call story-focused reading. Our goal is to understand the effect of story-focused reading on user engagement and how news sites can support this phenomenon. We found that most users focus on stories that interest them and that even casual news readers engage in story-focused reading. During story-focused reading, users spend more time reading and a larger number of news sites are involved. In addition, readers employ different strategies to find articles related to a story. We also analyze how news sites promote story-focused reading by looking at how they link their articles to related content published by them, or by other sources. The results show that providing links to related content leads to a higher engagement of the users, and that this is the case even for links to external sites. We also show that the performance of links can be affected by their type, their position, and how many of them are present within an article.

Footnote

This work was done while Janette Lehmann was a PhD student at Universitat Pompeu Fabra and it was carried out as part of her PhD internship at Yahoo! Labs Barcelona. This work was carried out while Carlos Castillo was working at Qatar Computing Research Institute.

Source

Journal of the Association for Information Science and Technology. 68(2017) no.4, S.869-883

Reid, J.; Lalmas, M.; Finesilver, K.; Hertzum, M.: Best entry points for structured document retrieval : part I: characteristics (2006) 0.00
```
0.0027618767 = product of:
  0.00828563 = sum of:
    0.00828563 = product of:
      0.01657126 = sum of:
        0.01657126 = weight(_text_:of in 960) [ClassicSimilarity], result of:
          0.01657126 = score(doc=960,freq=8.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.24188137 = fieldWeight in 960, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=960)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Structured document retrieval makes use of document components as the basis of the retrieval process, rather than complete documents. The inherent relationships between these components make it vital to support users' natural browsing behaviour in order to offer effective and efficient access to structured documents. This paper examines the concept of best entry points, which are document components from which the user can browse to obtain optimal access to relevant document components. In particular this paper investigates the basic characteristics of best entry points.

Reid, J.; Lalmas, M.; Finesilver, K.; Hertzum, M.: Best entry points for structured document retrieval : part II: types, usage and effectiveness (2006) 0.00
```
0.0027618767 = product of:
  0.00828563 = sum of:
    0.00828563 = product of:
      0.01657126 = sum of:
        0.01657126 = weight(_text_:of in 961) [ClassicSimilarity], result of:
          0.01657126 = score(doc=961,freq=8.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.24188137 = fieldWeight in 961, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=961)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Structured document retrieval makes use of document components as the basis of the retrieval process, rather than complete documents. The inherent relationships between these components make it vital to support users' natural browsing behaviour in order to offer effective and efficient access to structured documents. This paper examines the concept of best entry points, which are document components from which the user can browse to obtain optimal access to relevant document components. It investigates the types of best entry points in structured document retrieval, and their usage and effectiveness in real information search tasks.

Lalmas, M.: XML information retrieval (2009) 0.00
```
0.0027618767 = product of:
  0.00828563 = sum of:
    0.00828563 = product of:
      0.01657126 = sum of:
        0.01657126 = weight(_text_:of in 3880) [ClassicSimilarity], result of:
          0.01657126 = score(doc=3880,freq=8.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.24188137 = fieldWeight in 3880, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3880)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Documents are increasingly marked up using eXtensible Mark-up Language (XML), the format standard for structured documents. In contrast to HTML, which is mainly layout-oriented, XML follows the fundamental concept of separating the logical structure of a document from its layout. This logical structure can be exploited to allow focused access to documents, where the aim is to return the most relevant fragments within documents as answers to queries, instead of whole documents. This entry describes approaches developed to query, represent, and rank XML fragments.

Source

Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
