Search (4 results, page 1 of 1)

  • × author_ss:"Bar-Ilan, J."
  • × author_ss:"Levene, M."
  • × language_ss:"e"
  1. Zhitomirsky-Geffet, M.; Bar-Ilan, J.; Levene, M.: Testing the stability of "wisdom of crowds" judgments of search results over time and their similarity with the search engine rankings (2016) 0.01
    0.006825361 = product of:
      0.051190205 = sum of:
        0.04261639 = weight(_text_:evaluation in 3071) [ClassicSimilarity], result of:
          0.04261639 = score(doc=3071,freq=6.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.3210899 = fieldWeight in 3071, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.03125 = fieldNorm(doc=3071)
        0.008573813 = product of:
          0.017147627 = sum of:
            0.017147627 = weight(_text_:22 in 3071) [ClassicSimilarity], result of:
              0.017147627 = score(doc=3071,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.15476047 = fieldWeight in 3071, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3071)
          0.5 = coord(1/2)
      0.13333334 = coord(2/15)
    
    Abstract
    Purpose - One of the under-explored aspects in the process of user information seeking behaviour is influence of time on relevance evaluation. It has been shown in previous studies that individual users might change their assessment of search results over time. It is also known that aggregated judgements of multiple individual users can lead to correct and reliable decisions; this phenomenon is known as the "wisdom of crowds". The purpose of this paper is to examine whether aggregated judgements will be more stable and thus more reliable over time than individual user judgements. Design/methodology/approach - In this study two simple measures are proposed to calculate the aggregated judgements of search results and compare their reliability and stability to individual user judgements. In addition, the aggregated "wisdom of crowds" judgements were used as a means to compare the differences between human assessments of search results and search engine's rankings. A large-scale user study was conducted with 87 participants who evaluated two different queries and four diverse result sets twice, with an interval of two months. Two types of judgements were considered in this study: relevance on a four-point scale, and ranking on a ten-point scale without ties. Findings - It was found that aggregated judgements are much more stable than individual user judgements, yet they are quite different from search engine rankings. Practical implications - The proposed "wisdom of crowds"-based approach provides a reliable reference point for the evaluation of search engines. This is also important for exploring the need of personalisation and adapting search engine's ranking over time to changes in users preferences. Originality/value - This is a first study that applies the notion of "wisdom of crowds" to examine an under-explored in the literature phenomenon of "change in time" in user evaluation of relevance.
    Date
    20. 1.2015 18:30:22
  2. Zhitomirsky-Geffet, M.; Bar-Ilan, J.; Levene, M.: Analysis of change in users' assessment of search results over time (2017) 0.01
    0.006582941 = product of:
      0.049372055 = sum of:
        0.030755727 = weight(_text_:evaluation in 3593) [ClassicSimilarity], result of:
          0.030755727 = score(doc=3593,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.23172665 = fieldWeight in 3593, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3593)
        0.01861633 = weight(_text_:web in 3593) [ClassicSimilarity], result of:
          0.01861633 = score(doc=3593,freq=2.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.18028519 = fieldWeight in 3593, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3593)
      0.13333334 = coord(2/15)
    
    Abstract
    We present the first systematic study of the influence of time on user judgements for rankings and relevance grades of web search engine results. The goal of this study is to evaluate the change in user assessment of search results and explore how users' judgements change. To this end, we conducted a large-scale user study with 86 participants who evaluated 2 different queries and 4 diverse result sets twice with an interval of 2 months. To analyze the results we investigate whether 2 types of patterns of user behavior from the theory of categorical thinking hold for the case of evaluation of search results: (a) coarseness and (b) locality. To quantify these patterns we devised 2 new measures of change in user judgements and distinguish between local (when users swap between close ranks and relevance values) and nonlocal changes. Two types of judgements were considered in this study: (a) relevance on a 4-point scale, and (b) ranking on a 10-point scale without ties. We found that users tend to change their judgements of the results over time in about 50% of cases for relevance and in 85% of cases for ranking. However, the majority of these changes were local.
  3. Bar-Ilan, J.; Levene, M.: ¬The hw-rank : an h-index variant for ranking web pages (2015) 0.00
    0.0024821775 = product of:
      0.03723266 = sum of:
        0.03723266 = weight(_text_:web in 1694) [ClassicSimilarity], result of:
          0.03723266 = score(doc=1694,freq=2.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.36057037 = fieldWeight in 1694, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=1694)
      0.06666667 = coord(1/15)
    
  4. Bar-Ilan, J.; Levene, M.; Mat-Hassan, M.: Methods for evaluating dynamic changes in search engine rankings : a case study (2006) 0.00
    0.0014041315 = product of:
      0.021061972 = sum of:
        0.021061972 = weight(_text_:web in 616) [ClassicSimilarity], result of:
          0.021061972 = score(doc=616,freq=4.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.2039694 = fieldWeight in 616, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=616)
      0.06666667 = coord(1/15)
    
    Abstract
    Purpose - The objective of this paper is to characterize the changes in the rankings of the top ten results of major search engines over time and to compare the rankings between these engines. Design/methodology/approach - The papers compare rankings of the top-ten results of the search engines Google and AlltheWeb on ten identical queries over a period of three weeks. Only the top-ten results were considered, since users do not normally inspect more than the first results page returned by a search engine. The experiment was repeated twice, in October 2003 and in January 2004, in order to assess changes to the top-ten results of some of the queries during the three months interval. In order to assess the changes in the rankings, three measures were computed for each data collection point and each search engine. Findings - The findings in this paper show that the rankings of AlltheWeb were highly stable over each period, while the rankings of Google underwent constant yet minor changes, with occasional major ones. Changes over time can be explained by the dynamic nature of the web or by fluctuations in the search engines' indexes. The top-ten results of the two search engines had surprisingly low overlap. With such small overlap, the task of comparing the rankings of the two engines becomes extremely challenging. Originality/value - The paper shows that because of the abundance of information on the web, ranking search results is of extreme importance. The paper compares several measures for computing the similarity between rankings of search tools, and shows that none of the measures is fully satisfactory as a standalone measure. It also demonstrates the apparent differences in the ranking algorithms of two widely used search engines.