Search (13 results, page 1 of 1)

  • theme_ss:"Suchmaschinen"
  • type_ss:"el"
  1. Dunning, A.: Do we still need search engines? (1999) 0.02
    0.015257841 = product of:
      0.09154704 = sum of:
        0.09154704 = weight(_text_:22 in 6021) [ClassicSimilarity], result of:
          0.09154704 = score(doc=6021,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.5416616 = fieldWeight in 6021, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6021)
      0.16666667 = coord(1/6)
    
    Source
    Ariadne. 1999, no.22
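    The indented trees shown with each hit are Lucene "explain" output for the ClassicSimilarity (tf-idf) scorer. Reading result 1 bottom-up: fieldWeight = tf × idf × fieldNorm, queryWeight = idf × queryNorm, and the final score is their product times coord(1/6). A minimal Python sketch reproducing that arithmetic (the function and its parameter names are ours, not Lucene's API):

    import math

    def classic_similarity_score(freq, doc_freq, max_docs,
                                 query_norm, field_norm,
                                 matched_clauses, total_clauses):
        # ClassicSimilarity components as they appear in the explain tree:
        idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 3.5018296 above
        tf = math.sqrt(freq)                             # 1.4142135 for freq=2.0
        query_weight = idf * query_norm                  # 0.1690115
        field_weight = tf * idf * field_norm             # 0.5416616
        coord = matched_clauses / total_clauses          # coord(1/6)
        return coord * query_weight * field_weight

    # Values from result 1 (term "22" in doc 6021):
    print(classic_similarity_score(freq=2.0, doc_freq=3622, max_docs=44218,
                                   query_norm=0.04826377, field_norm=0.109375,
                                   matched_clauses=1, total_clauses=6))
    # -> ~0.0152578, matching the displayed 0.015257841
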
  2. Birmingham, J.: Internet search engines (1996) 0.01
    0.013078149 = product of:
      0.0784689 = sum of:
        0.0784689 = weight(_text_:22 in 5664) [ClassicSimilarity], result of:
          0.0784689 = score(doc=5664,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.46428138 = fieldWeight in 5664, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5664)
      0.16666667 = coord(1/6)
    
    Date
    10.11.1996 16:36:22
  3. Bensman, S.J.: Eugene Garfield, Francis Narin, and PageRank : the theoretical bases of the Google search engine (2013) 0.01
    0.008718766 = product of:
      0.052312598 = sum of:
        0.052312598 = weight(_text_:22 in 1149) [ClassicSimilarity], result of:
          0.052312598 = score(doc=1149,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.30952093 = fieldWeight in 1149, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1149)
      0.16666667 = coord(1/6)
    
    Date
    17.12.2013 11:02:22
  4. Schaat, S.: Von der automatisierten Manipulation zur Manipulation der Automatisierung (2019) 0.01
    0.008718766 = product of:
      0.052312598 = sum of:
        0.052312598 = weight(_text_:22 in 4996) [ClassicSimilarity], result of:
          0.052312598 = score(doc=4996,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.30952093 = fieldWeight in 4996, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4996)
      0.16666667 = coord(1/6)
    
    Date
    19. 2.2019 17:22:00
  5. Brin, S.; Page, L.: The anatomy of a large-scale hypertextual Web search engine (1998) 0.01
    0.008005621 = product of:
      0.04803372 = sum of:
        0.04803372 = weight(_text_:problem in 947) [ClassicSimilarity], result of:
          0.04803372 = score(doc=947,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.23447686 = fieldWeight in 947, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=947)
      0.16666667 = coord(1/6)
    
    Abstract
    In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
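    The indexing machinery the abstract alludes to rests on an inverted index: for every term, a posting list of the documents (and term frequencies) containing it. A deliberately simplified Python sketch of that data structure (illustrative only; the paper's actual barrels, hit lists and PageRank are far more elaborate):

    from collections import defaultdict

    def build_index(docs):
        """docs: dict of doc_id -> text; returns term -> {doc_id: term_freq}."""
        index = defaultdict(lambda: defaultdict(int))
        for doc_id, text in docs.items():
            for term in text.lower().split():
                index[term][doc_id] += 1
        return index

    def search(index, query):
        """Return the ids of documents containing every query term (AND query)."""
        postings = [set(index.get(term, {})) for term in query.lower().split()]
        return set.intersection(*postings) if postings else set()

    docs = {1: "anatomy of a large scale search engine",
            2: "hypertextual web search"}
    index = build_index(docs)
    print(search(index, "search engine"))  # -> {1}
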
  6. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.01
    0.007925161 = product of:
      0.047550965 = sum of:
        0.047550965 = weight(_text_:problem in 93) [ClassicSimilarity], result of:
          0.047550965 = score(doc=93,freq=4.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.23212028 = fieldWeight in 93, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
      0.16666667 = coord(1/6)
    
    Abstract
    Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it? Posed in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly-disorganized collection of documents in many different formats. Of course, we're all familiar with search engines (perhaps you found this article using one) so we know that there is a solution. This article will describe Google's PageRank algorithm and how it returns pages from the web's collection of 25 billion documents that match search criteria so well that "google" has become a widely used verb. Most search engines, including Google, continually run an army of computer programs that retrieve pages from the web, index the words in each document, and store this information in an efficient format. Each time a user asks for a web search using a search phrase, such as "search engine," the search engine determines all the pages on the web that contain the words in the search phrase. (Perhaps additional information such as the distance between the words "search" and "engine" will be noted as well.) Here is the problem: Google now claims to index 25 billion pages. Roughly 95% of the text in web pages is composed from a mere 10,000 words. This means that, for most searches, there will be a huge number of pages containing the words in the search phrase. What is needed is a means of ranking the importance of the pages that fit the search criteria so that the pages can be sorted with the most important pages at the top of the list. One way to determine the importance of pages is to use a human-generated ranking. For instance, you may have seen pages that consist mainly of a large number of links to other resources in a particular area of interest. Assuming the person maintaining this page is reliable, the pages referenced are likely to be useful. Of course, the list may quickly fall out of date, and the person maintaining the list may miss some important pages, either unintentionally or as a result of an unstated bias. Google's PageRank algorithm assesses the importance of web pages without human evaluation of the content. In fact, Google feels that the value of its service is largely in its ability to provide unbiased results to search queries; Google claims, "the heart of our software is PageRank." As we'll see, the trick is to ask the web itself to rank the importance of pages.
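    The "ask the web itself" trick the article goes on to develop is PageRank: rank flows along hyperlinks, damped by a factor d. A hedged sketch as plain power iteration in Python (a toy version of the published algorithm, not Google's implementation):

    def pagerank(links, d=0.85, iterations=50):
        """links: dict of page -> list of pages it links to."""
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1.0 - d) / n for p in pages}  # teleportation share
            for page, outs in links.items():
                if not outs:                       # dangling page: spread evenly
                    for q in pages:
                        new_rank[q] += d * rank[page] / n
                else:                              # pass rank along each outlink
                    for q in outs:
                        new_rank[q] += d * rank[page] / len(outs)
            rank = new_rank
        return rank

    links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    print(pagerank(links))  # "C", linked by both A and B, ends up ranked highest
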
  7. Weigert, M.: Horizobu: Webrecherche statt Websuche (2011) 0.01
    0.006792995 = product of:
      0.04075797 = sum of:
        0.04075797 = weight(_text_:problem in 4443) [ClassicSimilarity], result of:
          0.04075797 = score(doc=4443,freq=4.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.19896023 = fieldWeight in 4443, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0234375 = fieldNorm(doc=4443)
      0.16666667 = coord(1/6)
    
    Content
    "Das Problem mit der Suchmaschinen-Optimierung Suchmaschinen sind unser Instrument, um mit der Informationsflut im Internet klar zu kommen. Wie ich in meinem Artikel Die kürzeste Anleitung zur Suchmaschinenoptimierung aller Zeiten ausgeführt habe, gibt es dabei leider das Problem, dass der Platzhirsch Google nicht wirklich die besten Suchresultate liefert: Habt ihr schon mal nach einem Hotel, einem Restaurant oder einer anderen Location gesucht - und die ersten vier Ergebnis-Seiten sind voller Location-Aggregatoren? Wenn ich ganz spezifisch nach einem Hotel soundso in der Soundso-Strasse suche, dann finde ich, das relevanteste Ergebnis ist die Webseite dieses Hotels. Das gehört auf Seite 1 an Platz 1. Dort aber finden sich nur die Webseiten, die ganz besonders dolle suchmaschinenoptimiert sind. Wobei Google Webseiten als am suchmaschinenoptimiertesten einstuft, wenn möglichst viele Links darauf zeigen und der Inhalt relevant sein soll. Die Industrie der Suchmaschinen-Optimierer erreicht dies dadurch, dass sie folgende Dinge machen: - sie lassen Programme und Praktikanten im Web rumschwirren, die sich überall mit hirnlosen Kommentaren verewigen (Hauptsache, die sind verlinkt und zeigen auf ihre zu pushende Webseite) - sie erschaffen geistlose Blogs, in denen hirnlose Texte stehen (Hauptsache, die Keyword-Dichte stimmt) - diese Texte lassen sie durch Schüler und Praktikanten oder gleich durch Software schreiben - Dann kommt es anscheinend noch auf Keywords im Titel, in der URL etc. an.
  8. Place, E.: Internationale Zusammenarbeit bei Internet Subject Gateways (1999) 0.01
    0.0065390747 = product of:
      0.03923445 = sum of:
        0.03923445 = weight(_text_:22 in 4189) [ClassicSimilarity], result of:
          0.03923445 = score(doc=4189,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.23214069 = fieldWeight in 4189, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=4189)
      0.16666667 = coord(1/6)
    
    Date
    22. 6.2002 19:35:09
  9. Boldi, P.; Santini, M.; Vigna, S.: PageRank as a function of the damping factor (2005) 0.01
    0.0054492294 = product of:
      0.032695375 = sum of:
        0.032695375 = weight(_text_:22 in 2564) [ClassicSimilarity], result of:
          0.032695375 = score(doc=2564,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.19345059 = fieldWeight in 2564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2564)
      0.16666667 = coord(1/6)
    
    Date
    16. 1.2016 10:22:28
  10. Baeza-Yates, R.; Boldi, P.; Castillo, C.: Generalizing PageRank : damping functions for linkbased ranking algorithms (2006) 0.01
    0.0054492294 = product of:
      0.032695375 = sum of:
        0.032695375 = weight(_text_:22 in 2565) [ClassicSimilarity], result of:
          0.032695375 = score(doc=2565,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.19345059 = fieldWeight in 2565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2565)
      0.16666667 = coord(1/6)
    
    Date
    16. 1.2016 10:22:28
  11. Maurer, H.; Balke, T.; Kappe, F.; Kulathuramaiyer, N.; Weber, S.; Zaka, B.: Report on dangers and opportunities posed by large search engines, particularly Google (2007) 0.00
    0.0048033725 = product of:
      0.028820235 = sum of:
        0.028820235 = weight(_text_:problem in 754) [ClassicSimilarity], result of:
          0.028820235 = score(doc=754,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.14068612 = fieldWeight in 754, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0234375 = fieldNorm(doc=754)
      0.16666667 = coord(1/6)
    
    Abstract
    The preliminary intended and approved list of sections was: Section 1: To concentrate on Google as a virtual monopoly, and Google's reported support of Wikipedia. To find experimental evidence of this support or show that the reports are no more than rumours. Section 2: To address the copy-paste syndrome and the socio-cultural consequences associated with it. Section 3: To deal with plagiarism and IPR violations as two intertwined topics: how they affect various players (teachers and pupils in school; academia; corporations; governmental studies, etc.). To establish that not enough is done concerning these issues, partially due to just plain ignorance. We will propose some ways to alleviate the problem. Section 4: To discuss the usual tools to fight plagiarism and their shortcomings. Section 5: To propose ways to overcome most of the above problems according to proposals by Maurer/Zaka. To give examples, but to make it clear that to do this more seriously, a pilot project beyond this particular study is necessary. Section 6: To briefly analyze various views of plagiarism, as it is quite different in different fields (journalism, engineering, architecture, painting, ...), and to present a concept that avoids plagiarism from the very beginning. Section 7: To point out the many other dangers of Google or Google-like undertakings: opportunistic ranking, analysis of data as a window into the commercial future. Section 8: To outline the need for new international laws. Section 9: To mention the feeble European attempts to fight Google, despite Google's growing power. Section 10: To argue that there is no way to catch up with Google in a frontal attack.
  12. Gillitzer, B.: Yewno (2017) 0.00
    0.004359383 = product of:
      0.026156299 = sum of:
        0.026156299 = weight(_text_:22 in 3447) [ClassicSimilarity], result of:
          0.026156299 = score(doc=3447,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.15476047 = fieldWeight in 3447, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=3447)
      0.16666667 = coord(1/6)
    
    Date
    22. 2.2017 10:16:49
  13. Sander-Beuermann, W.: Schürfrechte im Informationszeitalter : Google hin, Microsoft her - das Internet braucht eine freie Suchkultur (2005) 0.00
    0.0040028105 = product of:
      0.02401686 = sum of:
        0.02401686 = weight(_text_:problem in 3245) [ClassicSimilarity], result of:
          0.02401686 = score(doc=3245,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.11723843 = fieldWeight in 3245, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.01953125 = fieldNorm(doc=3245)
      0.16666667 = coord(1/6)
    
    Content
    Search engine monopolists can determine or control which information is available, when and on which machines, and in what order results are displayed. By observing what users retrieve, these companies can compile precise profiles of their users. Breaking the dominance of the commercial gatekeepers of knowledge requires a free search culture - just as the open operating system Linux saved the world from a pure Windows monoculture. At least the state, too, appears to have recognized the problem of "information overkill". The public sector funds numerous projects that aim to bring order to the data deluge. But most of them are more visionary than realistic. Of the once so celebrated "Semantic Web", for instance, hardly anything tangible can be seen even after years. No wonder: such undertakings presuppose that the data is first collected and indexed in a search-ready form. For lack of free software, that precondition is missing.
    So what is needed to secure the free availability of resources in the information age? The answer is the same as it once was for coal, iron and oil: a diversity of providers. The best route there leads via free search engine software on which the operators of such engines can draw. An open and dynamic competition would then arise all by itself. Free search engine software, however, is very rare. There are beginnings in Russia and a single project in the USA (nutch.org). Europe, too, is largely a wasteland - except for the bright spot Yacy, a project by the Frankfurt software specialist Michael Christen. Yacy is, to my knowledge, the world's only proof of concept of a strictly decentralized peer-to-peer search engine (suma-lab.de:8080). To enliven the search engine landscape, 13 researchers, politicians and entrepreneurs have now founded the "Gemeinnütziger Verein zur Förderung der Suchmaschinen-Technologie und des freien Wissenszugangs" (SuMa-eV for short, suma-ev.de), a non-profit association for the promotion of search engine technology and free access to knowledge, based in Hannover. The founding members include MP3 inventor Karlheinz Brandenburg, the Vice President for Research of the University of Hannover, Wolfgang Ertmer, and myself. The goal of SuMa-eV is to establish a search engine infrastructure distributed across as many autonomous systems as possible, an infrastructure that by its very design can hardly be monopolized. The core idea of this structure, which can be composed of very many and very different building blocks, lies in the autonomy of the individual systems: societal pluralism is mapped onto the network topology. It would actually be in the interest, and within the power, of the state to do a better job of securing the diversity of opinion on the net. But while the state - apart from a few attentive parliamentarians - is still nursing dreamy visions, initiatives like SuMa-eV have to step in."