Search (24 results, page 1 of 2)

  • × theme_ss:"Suchmaschinen"
  • × type_ss:"a"
  • × type_ss:"el"
  1. Boldi, P.; Santini, M.; Vigna, S.: PageRank as a function of the damping factor (2005) 0.00
    0.004931886 = product of:
      0.022193488 = sum of:
        0.012233539 = product of:
          0.024467077 = sum of:
            0.024467077 = weight(_text_:web in 2564) [ClassicSimilarity], result of:
              0.024467077 = score(doc=2564,freq=4.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.25496176 = fieldWeight in 2564, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2564)
          0.5 = coord(1/2)
        0.009959949 = product of:
          0.019919898 = sum of:
            0.019919898 = weight(_text_:22 in 2564) [ClassicSimilarity], result of:
              0.019919898 = score(doc=2564,freq=2.0), product of:
                0.10297151 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02940506 = queryNorm
                0.19345059 = fieldWeight in 2564, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2564)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
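
The explain tree above can be reproduced by hand. The sketch below is a minimal Python reconstruction, assuming the classic Lucene TF-IDF forms tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures printed in the tree. queryNorm and fieldNorm are copied from the output rather than derived, and the function names are illustrative, not part of any library.

```python
import math

QUERY_NORM = 0.02940506   # copied from the explain output
MAX_DOCS = 44218          # copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed classic TF-IDF form; it matches idf values such as 3.2635105 above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """Single-term score = queryWeight * fieldWeight."""
    query_weight = idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# Result 1 (doc 2564): two matching clauses, each scaled by coord(1/2).
web = term_score(freq=4.0, doc_freq=4597, field_norm=0.0390625) * 0.5
term_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.0390625) * 0.5

# Two of the nine top-level query clauses matched, hence coord(2/9).
total = (web + term_22) * (2 / 9)
print(f"{total:.9f}")
```

The same three steps (per-term score, inner coord, outer coord) apply to every result in the list; only freq, docFreq, fieldNorm and the number of matching clauses change.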
    
    Abstract
    PageRank is defined as the stationary state of a Markov chain. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor alpha that uniformly spreads part of the rank. The choice of alpha is eminently empirical, and in most cases the original suggestion alpha=0.85 by Brin and Page is still used. Recently, however, the behaviour of PageRank with respect to changes in alpha was discovered to be useful in link-spam detection. Moreover, an analytical justification of the value chosen for alpha is still missing. In this paper, we give the first mathematical analysis of PageRank when alpha changes. In particular, we show that, contrary to popular belief, for real-world graphs values of alpha close to 1 do not give a more meaningful ranking. Then, we give closed-form formulae for PageRank derivatives of any order, and an extension of the Power Method that approximates them with convergence O(t**k*alpha**t) for the k-th derivative. Finally, we show a tight connection between iterated computation and analytical behaviour by proving that the k-th iteration of the Power Method gives exactly the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank.
    Date
    16. 1.2016 10:22:28
    Source
    http://vigna.di.unimi.it/ftp/papers/PageRankAsFunction.pdf [Proceedings of the ACM World Wide Web Conference (WWW), 2005]
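
The Power Method with a damping factor, as described in the abstract above, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the graph representation (a dict of successor lists), the function name, the toy graph, and the uniform redistribution of rank from dangling nodes are all assumptions made for the example.

```python
def pagerank(links, alpha=0.85, iterations=50):
    """Power iteration on a small graph given as {node: [successors]}.

    Every successor must itself be a key of `links`.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from the uniform distribution
    for _ in range(iterations):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links[v])
        new_rank = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        # Each node passes a damped, equal share of its rank to its successors.
        for v in nodes:
            for w in links[v]:
                new_rank[w] += alpha * rank[v] / len(links[v])
        rank = new_rank
    return rank


toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_graph)
for node in sorted(ranks):
    print(node, round(ranks[node], 4))
```

Rerunning the function with different values of alpha on the same graph is a simple way to observe how the ranking responds to the damping factor.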
  2. Baeza-Yates, R.; Boldi, P.; Castillo, C.: Generalizing PageRank : damping functions for link-based ranking algorithms (2006) 0.00
    0.004135637 = product of:
      0.018610368 = sum of:
        0.008650418 = product of:
          0.017300837 = sum of:
            0.017300837 = weight(_text_:web in 2565) [ClassicSimilarity], result of:
              0.017300837 = score(doc=2565,freq=2.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.18028519 = fieldWeight in 2565, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2565)
          0.5 = coord(1/2)
        0.009959949 = product of:
          0.019919898 = sum of:
            0.019919898 = weight(_text_:22 in 2565) [ClassicSimilarity], result of:
              0.019919898 = score(doc=2565,freq=2.0), product of:
                0.10297151 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02940506 = queryNorm
                0.19345059 = fieldWeight in 2565, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2565)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Abstract
    This paper introduces a family of link-based ranking algorithms that propagate page importance through links. In these algorithms there is a damping function that decreases with distance, so a direct link implies more endorsement than a link through a long path. PageRank is the most widely known ranking function of this family. The main objective of this paper is to determine whether this family of ranking techniques has some interest per se, and how different choices for the damping function impact on rank quality and on convergence speed. Even though our results suggest that PageRank can be approximated with other simpler forms of rankings that may be computed more efficiently, our focus is of more speculative nature, in that it aims at separating the kernel of PageRank, that is, link-based importance propagation, from the way propagation decays over paths. We focus on three damping functions, having linear, exponential, and hyperbolic decay on the lengths of the paths. The exponential decay corresponds to PageRank, and the other functions are new. Our presentation includes algorithms, analysis, comparisons and experiments that study their behavior under different parameters in real Web graph data. Among other results, we show how to calculate a linear approximation that induces a page ordering that is almost identical to PageRank's using a fixed small number of iterations; comparisons were performed using Kendall's tau on large domain datasets.
    Date
    16. 1.2016 10:22:28
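
The abstract's statement that exponential decay corresponds to PageRank can be illustrated with a small sketch. It assumes a ranking of the form "sum over path lengths t of damping(t) times the t-step random-walk distribution from a uniform start", with the exponential damping (1 - alpha) * alpha**t. The function names, the toy graph, and the truncation at max_len are illustrative assumptions, not the paper's code, and every node is assumed to have at least one out-link.

```python
def functional_rank(links, damping, max_len=200):
    """Rank = sum over t of damping(t) * (walk distribution after t steps).

    `links` maps each node to a non-empty list of successors.
    """
    nodes = sorted(links)
    n = len(nodes)
    walk = {v: 1.0 / n for v in nodes}  # distribution after 0 steps
    rank = {v: 0.0 for v in nodes}
    for t in range(max_len):
        weight = damping(t)
        for v in nodes:
            rank[v] += weight * walk[v]
        # Advance the random walk by one step along the out-links.
        step = {v: 0.0 for v in nodes}
        for v in nodes:
            for w in links[v]:
                step[w] += walk[v] / len(links[v])
        walk = step
    return rank


def exponential(alpha=0.85):
    # Exponential decay over path length; the weights sum to 1 over all t >= 0.
    return lambda t: (1.0 - alpha) * alpha ** t


toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks_exp = functional_rank(toy_graph, exponential())
for node in sorted(ranks_exp):
    print(node, round(ranks_exp[node], 4))
```

Any other decay can be passed as the `damping` argument in the same way, as long as its weights sum to 1 over the path lengths considered.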
  3. Sander-Beuermann, W.: Generationswechsel bei MetaGer : ein Rückblick und Ausblick (2019) 0.00
    0.0033974003 = product of:
      0.030576602 = sum of:
        0.030576602 = product of:
          0.061153203 = sum of:
            0.061153203 = weight(_text_:seite in 4993) [ClassicSimilarity], result of:
              0.061153203 = score(doc=4993,freq=2.0), product of:
                0.16469958 = queryWeight, product of:
                  5.601063 = idf(docFreq=443, maxDocs=44218)
                  0.02940506 = queryNorm
                0.3713015 = fieldWeight in 4993, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.601063 = idf(docFreq=443, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4993)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    You surely know this piece of advice too: every text that is meant to reach its reader must begin with GOOD news. Nevertheless, I am doing it the other way round. For I have always ... hated the day on which, at an age above 70, I would write the text you are now reading. That was the bad news. Now comes the good news: I have noticed that this day of generational change has come. No matter from which side one looks at it, at some point this day comes. Even if one does not notice it, which is not rare either, but would be much worse.
  4. Dunning, A.: Do we still need search engines? (1999) 0.00
    0.0030986508 = product of:
      0.027887857 = sum of:
        0.027887857 = product of:
          0.055775713 = sum of:
            0.055775713 = weight(_text_:22 in 6021) [ClassicSimilarity], result of:
              0.055775713 = score(doc=6021,freq=2.0), product of:
                0.10297151 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02940506 = queryNorm
                0.5416616 = fieldWeight in 6021, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6021)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Source
    Ariadne. 1999, no.22
  5. Ding, L.; Finin, T.; Joshi, A.; Peng, Y.; Cost, R.S.; Sachs, J.; Pan, R.; Reddivari, P.; Doshi, V.: Swoogle : a Semantic Web search and metadata engine (2004) 0.00
    0.0030515809 = product of:
      0.027464228 = sum of:
        0.027464228 = product of:
          0.054928456 = sum of:
            0.054928456 = weight(_text_:web in 4704) [ClassicSimilarity], result of:
              0.054928456 = score(doc=4704,freq=14.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.57238775 = fieldWeight in 4704, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4704)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Swoogle is a crawler-based indexing and retrieval system for the Semantic Web, i.e., for Web documents in RDF or OWL. It extracts metadata for each discovered document, and computes relations between documents. Discovered documents are also indexed by an information retrieval system which can use either character N-Gram or URIrefs as keywords to find relevant documents and to compute the similarity among a set of documents. One of the interesting properties we compute is rank, a measure of the importance of a Semantic Web document.
    Content
    See: http://www.dblab.ntua.gr/~bikakis/LD/5.pdf. See also: http://swoogle.umbc.edu/. See also: http://ebiquity.umbc.edu/paper/html/id/183/. See also: Radhakrishnan, A.: Swoogle : An Engine for the Semantic Web, at: http://www.searchenginejournal.com/swoogle-an-engine-for-the-semantic-web/5469/.
    Theme
    Semantic Web
  6. Bradley, P.: The relevance of underpants to searching the Web (2000) 0.00
    0.0026912412 = product of:
      0.02422117 = sum of:
        0.02422117 = product of:
          0.04844234 = sum of:
            0.04844234 = weight(_text_:web in 3961) [ClassicSimilarity], result of:
              0.04844234 = score(doc=3961,freq=2.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.50479853 = fieldWeight in 3961, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3961)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
  7. Brin, S.; Page, L.: The anatomy of a large-scale hypertextual Web search engine (1998) 0.00
    0.002542984 = product of:
      0.022886856 = sum of:
        0.022886856 = product of:
          0.04577371 = sum of:
            0.04577371 = weight(_text_:web in 947) [ClassicSimilarity], result of:
              0.04577371 = score(doc=947,freq=14.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.47698978 = fieldWeight in 947, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=947)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
  8. Advanced online media use (2023) 0.00
    0.0021748515 = product of:
      0.019573662 = sum of:
        0.019573662 = product of:
          0.039147325 = sum of:
            0.039147325 = weight(_text_:web in 954) [ClassicSimilarity], result of:
              0.039147325 = score(doc=954,freq=4.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.4079388 = fieldWeight in 954, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0625 = fieldNorm(doc=954)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Content
    "1. Use a range of different media 2. Access paywalled media content 3. Use an advertising and tracking blocker 4. Use alternatives to Google Search 5. Use alternatives to YouTube 6. Use alternatives to Facebook and Twitter 7. Caution with Wikipedia 8. Web browser, email, and internet access 9. Access books and scientific papers 10. Access deleted web content"
  9. Hurz, S.: Google verfolgt Nutzer, auch wenn sie explizit widersprechen (2018) 0.00
    0.0019223152 = product of:
      0.017300837 = sum of:
        0.017300837 = product of:
          0.034601673 = sum of:
            0.034601673 = weight(_text_:web in 4404) [ClassicSimilarity], result of:
              0.034601673 = score(doc=4404,freq=2.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.36057037 = fieldWeight in 4404, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4404)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    When Google users switch off Location History, the company still stores movement data. More than two billion people who use Android smartphones or iPhones with Google services are affected. Anyone who wants to prevent the tracking has to deactivate "Web & App Activity" completely.
  10. Ogden, J.; Summers, E.; Walker, S.: Know(ing) Infrastructure : the wayback machine as object and instrument of digital research (2023) 0.00
    0.0019223152 = product of:
      0.017300837 = sum of:
        0.017300837 = product of:
          0.034601673 = sum of:
            0.034601673 = weight(_text_:web in 1084) [ClassicSimilarity], result of:
              0.034601673 = score(doc=1084,freq=8.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.36057037 = fieldWeight in 1084, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1084)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    From documenting human rights abuses to studying online advertising, web archives are increasingly positioned as critical resources for a broad range of scholarly Internet research agendas. In this article, we reflect on the motivations and methodological challenges of investigating the world's largest web archive, the Internet Archive's Wayback Machine (IAWM). Using a mixed methods approach, we report on a pilot project centred around documenting the inner workings of 'Save Page Now' (SPN) - an Internet Archive tool that allows users to initiate the creation and storage of 'snapshots' of web resources. By improving our understanding of SPN and its role in shaping the IAWM, this work examines how the public tool is being used to 'save the Web' and highlights the challenges of operationalising a study of the dynamic sociotechnical processes supporting this knowledge infrastructure. Inspired by existing Science and Technology Studies (STS) approaches, the paper charts our development of methodological interventions to support an interdisciplinary investigation of SPN, including: ethnographic methods, 'experimental blackbox tactics', data tracing, modelling and documentary research. We discuss the opportunities and limitations of our methodology when interfacing with issues associated with temporality, scale and visibility, as well as critically engage with our own positionality in the research process (in terms of expertise and access). We conclude with reflections on the implications of digital STS approaches for 'knowing infrastructure', where the use of these infrastructures is unavoidably intertwined with our ability to study the situated and material arrangements of their creation.
  11. Bensman, S.J.: Eugene Garfield, Francis Narin, and PageRank : the theoretical bases of the Google search engine (2013) 0.00
    0.0017706576 = product of:
      0.015935918 = sum of:
        0.015935918 = product of:
          0.031871837 = sum of:
            0.031871837 = weight(_text_:22 in 1149) [ClassicSimilarity], result of:
              0.031871837 = score(doc=1149,freq=2.0), product of:
                0.10297151 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02940506 = queryNorm
                0.30952093 = fieldWeight in 1149, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1149)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Date
    17.12.2013 11:02:22
  12. Schaat, S.: Von der automatisierten Manipulation zur Manipulation der Automatisierung (2019) 0.00
    0.0017706576 = product of:
      0.015935918 = sum of:
        0.015935918 = product of:
          0.031871837 = sum of:
            0.031871837 = weight(_text_:22 in 4996) [ClassicSimilarity], result of:
              0.031871837 = score(doc=4996,freq=2.0), product of:
                0.10297151 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02940506 = queryNorm
                0.30952093 = fieldWeight in 4996, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4996)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Date
    19. 2.2019 17:22:00
  13. Söhler, M.: Schluss mit Schema F (2011) 0.00
    0.0017193711 = product of:
      0.015474339 = sum of:
        0.015474339 = product of:
          0.030948678 = sum of:
            0.030948678 = weight(_text_:web in 4439) [ClassicSimilarity], result of:
              0.030948678 = score(doc=4439,freq=10.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.32250395 = fieldWeight in 4439, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4439)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    With Schema.org and the Semantic Web, search engines are supposed to learn to understand
    Content
    "Words often have several meanings. Some know the "Kanal" (German for both canal and channel) as an artificial waterway, others from television. The "Waage" (scales, but also the sign Libra) can be useful for measuring weight or for orientation on the horoscope page. Casablanca is a city and a film at once. Where people learn over time to distinguish and process meanings, search engines cannot do so on their own. They always dully list, one after another, everything they find on a topic. To change that, Google, Yahoo and the Microsoft-owned search engine Bing have now joined forces to give search on the net more understanding. This is also referred to as "semantic search". The result is called Schema.org. Anyone who visits the website, clicks a little way into its substructures and has no prior knowledge of programming or of the Semantic Web will turn away again, overwhelmed and bored. But what could emerge here has the potential to change parts of the net, and especially the functions of search engines, in the medium or long term. "Big players are in the process of agreeing on standards," says Daniel Bahls, specialist for semantic technologies at the ZBW Leibniz-Informationszentrum Wirtschaft in Hamburg. "Semantic technologies have been around for years and have so far only been used in smaller contexts." For Schema.org invites developers, researchers, the Semantic Web community and ultimately all website operators to take part in reshaping search on the net. Website content is to be marked up and prepared for the crawlers, the search engines' analysis programs, with a special but uniform vocabulary.
    By embedding keywords, so-called tags, in the part of a website's code that is not visible to ordinary users, search engines no longer depend so heavily on analysing natural language to grasp the content of texts. The blog ZBW Mediatalk calls this "Semantic Web light", a Semantic Web at the lowest level. But even that will "already achieve a lot", says Bahls. "The Semantic Web will continue to develop in an evolutionary way over the coming decades." There will never be a "conclusion", "since a uniform formalisation of concepts at a fine-grained level is hardly possible". The results from Schema.org would be integrated into the search engine "promptly", "because there is no schedule", according to Stefan Keuchel, press spokesman for Google Germany. Until that happens, Daniel Bahls's pointer to the already existing semantic search engine Sig.ma helps. Speed and the number of results returned for a query play no role here. Sig.ma gathers its information solely from the Semantic Web and, after a query, lists everything known in structured form.
  14. Schaer, P.; Mayr, P.; Sünkler, S.; Lewandowski, D.: How relevant is the long tail? : a relevance assessment study on million short (2016) 0.00
    0.0016647738 = product of:
      0.014982964 = sum of:
        0.014982964 = product of:
          0.029965928 = sum of:
            0.029965928 = weight(_text_:web in 3144) [ClassicSimilarity], result of:
              0.029965928 = score(doc=3144,freq=6.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.3122631 = fieldWeight in 3144, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3144)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Users of web search engines are known to mostly focus on the top-ranked results of the search engine result page. While many studies support this well-known information-seeking pattern, only a few studies concentrate on the question of what users are missing by neglecting lower-ranked results. To learn more about the relevance distributions in the so-called long tail, we conducted a relevance assessment study with the Million Short long-tail web search engine. While we see a clear difference in the content between the head and the tail of the search engine result list, we see no statistically significant differences in the binary relevance judgments and weakly significant differences when using graded relevance. The tail contains different but still valuable results. We argue that the long tail can be a rich source for the diversification of web search engine result lists, but it needs more evaluation to clearly describe the differences.
  15. Warnick, W.L.; Lederman, A.; Scott, R.L.; Spence, K.J.; Johnson, L.A.; Allen, V.S.: Searching the deep Web : directed query engine applications at the Department of Energy (2001) 0.00
    0.0016311385 = product of:
      0.014680246 = sum of:
        0.014680246 = product of:
          0.029360492 = sum of:
            0.029360492 = weight(_text_:web in 1215) [ClassicSimilarity], result of:
              0.029360492 = score(doc=1215,freq=4.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.3059541 = fieldWeight in 1215, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1215)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Directed Query Engines, an emerging class of search engine specifically designed to access distributed resources on the deep web, offer the opportunity to create inexpensive digital libraries. Already, one such engine, Distributed Explorer, has been used to select and assemble high quality information resources and incorporate them into publicly available systems for the physical sciences. By nesting Directed Query Engines so that one query launches several other engines in a cascading fashion, enormous virtual collections may soon be assembled to form a comprehensive information infrastructure for the physical sciences. Once a Directed Query Engine has been configured for a set of information resources, distributed alerts tools can provide patrons with personalized, profile-based notices of recent additions to any of the selected resources. Due to the potentially enormous size and scope of Directed Query Engine applications, consideration must be given to issues surrounding the representation of large quantities of information from multiple, heterogeneous sources.
  16. Powell, J.; Fox, E.A.: Multilingual federated searching across heterogeneous collections (1998) 0.00
    0.0015378521 = product of:
      0.013840669 = sum of:
        0.013840669 = product of:
          0.027681338 = sum of:
            0.027681338 = weight(_text_:web in 1250) [ClassicSimilarity], result of:
              0.027681338 = score(doc=1250,freq=2.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.2884563 = fieldWeight in 1250, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1250)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    This article describes a scalable system for searching heterogeneous multilingual collections on the World Wide Web. It details a markup language for describing the characteristics of a search engine and its interface, and a protocol for requesting word translations between languages.
  17. Hodson, H.: Google's fact-checking bots build vast knowledge bank (2014) 0.00
    0.0015378521 = product of:
      0.013840669 = sum of:
        0.013840669 = product of:
          0.027681338 = sum of:
            0.027681338 = weight(_text_:web in 1700) [ClassicSimilarity], result of:
              0.027681338 = score(doc=1700,freq=2.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.2884563 = fieldWeight in 1700, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1700)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    The search giant is automatically building Knowledge Vault, a massive database that could give us unprecedented access to the world's facts. Google is building the largest store of knowledge in human history - and it's doing so without any human help. Instead, Knowledge Vault autonomously gathers and merges information from across the web into a single base of facts about the world, and the people and objects in it.
  18. Zhao, Y.; Ma, F.; Xia, X.: Evaluating the coverage of entities in knowledge graphs behind general web search engines : Poster (2017) 0.00
    0.0013592821 = product of:
      0.012233539 = sum of:
        0.012233539 = product of:
          0.024467077 = sum of:
            0.024467077 = weight(_text_:web in 3854) [ClassicSimilarity], result of:
              0.024467077 = score(doc=3854,freq=4.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.25496176 = fieldWeight in 3854, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3854)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    Web search engines, such as Google and Bing, are constantly employing results from knowledge organization and various visualization features to improve their search services. The knowledge graph, a large repository of structured knowledge represented by formal languages such as RDF (Resource Description Framework), is used to support the entity search feature of Google and Bing (Demartini, 2016). When a user searches for an entity, such as a person, an organization, or a place in Google or Bing, it is likely that a knowledge card will be presented on the right sidebar of the search engine result pages (SERPs). For example, when a user searches for the entity Benedict Cumberbatch on Google, the knowledge card will show the basic structured information about this person, including his date of birth, height, spouse, parents, and his movies, etc. The knowledge card, which is used to present the result of entity search, is generated from knowledge graphs. Therefore, the quality of knowledge graphs is essential to the performance of entity search. However, studies on the quality of knowledge graphs from the angle of entity coverage are scant in the literature. This study aims to investigate the coverage of entities of knowledge graphs behind Google and Bing.
  19. Lossau, N.: Search engine technology and digital libraries : libraries need to discover the academic internet (2004) 0.00
    0.0013456206 = product of:
      0.012110585 = sum of:
        0.012110585 = product of:
          0.02422117 = sum of:
            0.02422117 = weight(_text_:web in 1161) [ClassicSimilarity], result of:
              0.02422117 = score(doc=1161,freq=2.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.25239927 = fieldWeight in 1161, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1161)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Abstract
    With the development of the World Wide Web, the "information search" has grown to be a significant business sector of a global, competitive and commercial market. Powerful players have entered this market, such as commercial internet search engines, information portals, multinational publishers and online content integrators. Will Google, Yahoo or Microsoft be the only portals to global knowledge in 2010? If libraries do not want to become marginalized in a key area of their traditional services, they need to acknowledge the challenges that come with the globalisation of scholarly information, the existence and further growth of the academic internet
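    The score breakdown shown for entry 19 can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function name classic_term_score and the variable names are hypothetical and used only for illustration; all numeric inputs are copied from the listing above.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one weight(...) node of the explain tree (illustrative helper)."""
    tf = math.sqrt(freq)                               # tf(freq=...)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # idf(docFreq=..., maxDocs=...)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight
    return query_weight * field_weight                 # score(doc=..., freq=...)


# Inputs taken from the explain tree of entry 19 (doc 1161, term "web").
term_score = classic_term_score(
    freq=2.0,
    doc_freq=4597,
    max_docs=44218,
    query_norm=0.02940506,
    field_norm=0.0546875,
)

# The tree then applies coord(1/2) and coord(1/9) on the way up.
final_score = term_score * (1 / 2) * (1 / 9)

# The listing reports 0.02422117 for the term and 0.0013456206 overall;
# small differences are expected because Lucene uses 32-bit floats.
print(term_score, final_score)
```

    Because fieldNorm shrinks with field length, entry 20 reaches the same term score with freq=8.0 as entry 19 does with freq=2.0.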
  20. Söhler, M.: "Dumm wie Google" war gestern : semantische Suche im Netz (2011) 0.00
    0.0013456206 = product of:
      0.012110585 = sum of:
        0.012110585 = product of:
          0.02422117 = sum of:
            0.02422117 = weight(_text_:web in 4440) [ClassicSimilarity], result of:
              0.02422117 = score(doc=4440,freq=8.0), product of:
                0.09596372 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02940506 = queryNorm
                0.25239927 = fieldWeight in 4440, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4440)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Content
    - New standards: But what could emerge here has the potential to change parts of the web, and especially the functions of search engines, in the medium or long term. "Major players are in the process of agreeing on standards," says Daniel Bahls, a specialist in semantic technologies at the ZBW Leibniz-Informationszentrum Wirtschaft in Hamburg. "Semantic technologies have been around for years and have so far been used only in smaller contexts." For Schema.org invites developers, researchers, the Semantic Web community and, ultimately, all website operators to take part in reshaping search on the web. "With this, Google, Bing and Yahoo! want to put an end to the information chaos on the WWW," writes André Vatter in the blog ZBW Mediatalk. Website content is to be marked up and prepared for search engine crawlers using a special but uniform vocabulary. Because keywords, so-called tags, are embedded in the code of websites, search engines no longer depend so heavily on analyzing natural language to capture the content of texts. The blog calls this "Semantic Web light": a semantic web at the lowest level. But even that will "already achieve a lot," says Bahls. "The semantic web will continue to evolve over the coming decades." There will never be a "conclusion", "since a uniform formalization of concepts at a fine-grained level is hardly possible."