Search (92 results, page 1 of 5)

Kruschwitz, U.; Lungley, D.; Albakour, M-D.; Song, D.: Deriving query suggestions for site search (2013) 0.04
```
0.036563005 = product of:
  0.18281503 = sum of:
    0.023806747 = weight(_text_:web in 1085) [ClassicSimilarity], result of:
      0.023806747 = score(doc=1085,freq=4.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.25496176 = fieldWeight in 1085, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1085)
    0.15900828 = weight(_text_:log in 1085) [ClassicSimilarity], result of:
      0.15900828 = score(doc=1085,freq=12.0), product of:
        0.18335998 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.028611459 = queryNorm
        0.86719185 = fieldWeight in 1085, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1085)
  0.2 = coord(2/10)
```
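The explanation tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulation that the displayed numbers are consistent with: `tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`, `queryWeight = idf * queryNorm`, `fieldWeight = tf * idf * fieldNorm`, and a final `coord` factor for the fraction of query clauses matched. The function and variable names are illustrative only; the constants are copied from the tree for document 1085.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as it appears in the tree: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Constants copied from the explanation for doc 1085.
MAX_DOCS = 44218
QUERY_NORM = 0.028611459
FIELD_NORM = 0.0390625

web_score = term_score(4.0, 4597, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "web"
log_score = term_score(12.0, 197, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "log"

coord = 2 / 10                                  # 2 of 10 query clauses matched
total = (web_score + log_score) * coord

print(f"web={web_score:.9f} log={log_score:.9f} total={total:.9f}")
```

Under these assumptions the total should agree with the displayed score of about 0.0366 up to floating-point rounding. The rarer term "log" (docFreq=197) contributes far more than "web" (docFreq=4597), which is why this record ranks first.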
Abstract

Modern search engines have been moving away from simplistic interfaces that aimed at satisfying a user's need with a single-shot query. Interactive features are now integral parts of web search engines. However, generating good query modification suggestions remains a challenging issue. Query log analysis is one of the major strands of work in this direction. Although much research has been performed on query logs collected on the web as a whole, query log analysis to enhance search on smaller and more focused collections has attracted less attention, despite its increasing practical importance. In this article, we report on a systematic study of different query modification methods applied to a substantial query log collected on a local website that already uses an interactive search engine. We conducted experiments in which we asked users to assess the relevance of potential query modification suggestions that have been constructed using a range of log analysis methods and different baseline approaches. The experimental results demonstrate the usefulness of log analysis to extract query modification suggestions. Furthermore, our experiments demonstrate that a more fine-grained approach than grouping search requests into sessions allows for extraction of better refinement terms from query log files.
Web search engine research (2012) 0.03
```
0.025475848 = product of:
  0.12737924 = sum of:
    0.0494814 = weight(_text_:web in 478) [ClassicSimilarity], result of:
      0.0494814 = score(doc=478,freq=12.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.5299281 = fieldWeight in 478, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=478)
    0.07789783 = weight(_text_:log in 478) [ClassicSimilarity], result of:
      0.07789783 = score(doc=478,freq=2.0), product of:
        0.18335998 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.028611459 = queryNorm
        0.42483553 = fieldWeight in 478, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.046875 = fieldNorm(doc=478)
  0.2 = coord(2/10)
```
Abstract

"Web Search Engine Research", edited by Dirk Lewandowski, provides an understanding of Web search engines from the unique perspective of Library and Information Science. The book explores a range of topics including retrieval effectiveness, user satisfaction, the evaluation of search interfaces, the impact of search on society, reliability of search results, query log analysis, user guidance in the search process, and the influence of search engine optimization (SEO) on results quality. While research in computer science has mainly focused on technical aspects of search engines, LIS research is centred on users' behaviour when using search engines and how this interaction can be evaluated. LIS research provides a unique perspective in intermediating between the technical aspects, user aspects and their impact on their role in knowledge acquisition. This book is directly relevant to researchers and practitioners in library and information science, computer science, including Web researchers.

LCSH

Web search engines

Subject

Web search engines
Lewandowski, D.; Drechsler, J.; Mach, S. von: Deriving query intents from web search engine queries (2012) 0.02
```
0.017744321 = product of:
  0.0887216 = sum of:
    0.023806747 = weight(_text_:web in 385) [ClassicSimilarity], result of:
      0.023806747 = score(doc=385,freq=4.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.25496176 = fieldWeight in 385, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=385)
    0.06491486 = weight(_text_:log in 385) [ClassicSimilarity], result of:
      0.06491486 = score(doc=385,freq=2.0), product of:
        0.18335998 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.028611459 = queryNorm
        0.3540296 = fieldWeight in 385, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0390625 = fieldNorm(doc=385)
  0.2 = coord(2/10)
```
Abstract

The purpose of this article is to test the reliability of query intents derived from queries, either by the user who entered the query or by another juror. We report the findings of three studies. First, we conducted a large-scale classification study (~50,000 queries) using a crowdsourcing approach. Next, we used clickthrough data from a search engine log and validated the judgments given by the jurors from the crowdsourcing study. Finally, we conducted an online survey on a commercial search engine's portal. Because we used the same queries for all three studies, we also were able to compare the results and the effectiveness of the different approaches. We found that neither the crowdsourcing approach, using jurors who classified queries originating from other users, nor the questionnaire approach, using searchers who were asked about their own query that they just entered into a Web search engine, led to satisfying results. This leads us to conclude that there was little understanding of the classification tasks, even though both groups of jurors were given detailed instructions. Although we used manual classification, our research also has important implications for automatic classification. We must question the success of approaches using automatic classification and comparing its performance to a baseline from human jurors.
Sarigil, E.; Sengor Altingovde, I.; Blanco, R.; Barla Cambazoglu, B.; Ozcan, R.; Ulusoy, Ö.: Characterizing, predicting, and handling web search queries that match very few or no results (2018) 0.02
```
0.017744321 = product of:
  0.0887216 = sum of:
    0.023806747 = weight(_text_:web in 4039) [ClassicSimilarity], result of:
      0.023806747 = score(doc=4039,freq=4.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.25496176 = fieldWeight in 4039, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4039)
    0.06491486 = weight(_text_:log in 4039) [ClassicSimilarity], result of:
      0.06491486 = score(doc=4039,freq=2.0), product of:
        0.18335998 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.028611459 = queryNorm
        0.3540296 = fieldWeight in 4039, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4039)
  0.2 = coord(2/10)
```
Abstract

A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that users are dissatisfied after submitting a query that matches no results at least 88.5% of the time. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful for scenarios such as mobile or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of bandwidth usage, power consumption, and/or monetary costs. Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) up to 0.95.
Aloteibi, S.; Sanderson, M.: Analyzing geographic query reformulation : an exploratory study (2014) 0.01
```
0.014275125 = product of:
  0.07137562 = sum of:
    0.06491486 = weight(_text_:log in 1177) [ClassicSimilarity], result of:
      0.06491486 = score(doc=1177,freq=2.0), product of:
        0.18335998 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.028611459 = queryNorm
        0.3540296 = fieldWeight in 1177, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1177)
    0.006460763 = product of:
      0.019382289 = sum of:
        0.019382289 = weight(_text_:22 in 1177) [ClassicSimilarity], result of:
          0.019382289 = score(doc=1177,freq=2.0), product of:
            0.10019246 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.028611459 = queryNorm
            0.19345059 = fieldWeight in 1177, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1177)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```
Abstract

Search engine users typically engage in multiquery sessions in their quest to fulfill their information needs. Despite a plethora of research findings suggesting that a significant group of users look for information within a specific geographical scope, existing reformulation studies lack a focused analysis of how users reformulate geographic queries. This study comprehensively investigates the ways in which users reformulate such needs in an attempt to fill this gap in the literature. Reformulated sessions were sampled from a query log of a major search engine to extract 2,400 entries that were manually inspected to filter geo sessions. This filter identified 471 search sessions that included geographical intent, and these sessions were analyzed quantitatively and qualitatively. The results revealed that one in five of the users who reformulated their queries were looking for geographically related information. They reformulated their queries by changing the content of the query rather than the structure. Users were not following a unified sequence of modifications and instead performed a single reformulation action. However, in some cases it was possible to anticipate their next move. A number of tasks in geo modifications were identified, including standard, multi-needs, multi-places, and hybrid approaches. The research concludes that it is important to specialize query reformulation studies to focus on particular query types rather than generically analyzing them, as it is apparent that geographic queries have their special reformulation characteristics.

Date

26. 1.2014 18:48:22
Alqaraleh, S.; Ramadan, O.; Salamah, M.: Efficient watcher based web crawler design (2015) 0.01
```
0.010814852 = product of:
  0.054074258 = sum of:
    0.047613494 = weight(_text_:web in 1627) [ClassicSimilarity], result of:
      0.047613494 = score(doc=1627,freq=16.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.5099235 = fieldWeight in 1627, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1627)
    0.006460763 = product of:
      0.019382289 = sum of:
        0.019382289 = weight(_text_:22 in 1627) [ClassicSimilarity], result of:
          0.019382289 = score(doc=1627,freq=2.0), product of:
            0.10019246 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.028611459 = queryNorm
            0.19345059 = fieldWeight in 1627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1627)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```
Abstract

Purpose: The purpose of this paper is to design a watcher-based crawler (WBC) that can crawl static and dynamic web sites and download only the updated and newly added web pages. Design/methodology/approach: In the proposed WBC, a watcher file, which can be uploaded to the web sites' servers, prepares a report that contains the addresses of the updated and newly added web pages. In addition, the WBC is split into five units, where each unit is responsible for performing a specific crawling process. Findings: Several experiments have been conducted, and it has been observed that the proposed WBC increases the number of uniquely visited static and dynamic web sites compared with existing crawling techniques. In addition, the proposed watcher file not only allows the crawlers to visit the updated and newly added web pages, but also solves the crawlers' overlapping and communication problems. Originality/value: The proposed WBC performs all crawling processes in the sense that it detects all updated and newly added pages automatically, without any explicit human intervention and without downloading entire web sites.

Date

20. 1.2015 18:30:22

Sander-Beuermann, W.: Generationswechsel bei MetaGer : ein Rückblick und Ausblick (2019) 0.01

```
0.00988591 = product of:
  0.098859094 = sum of:
    0.098859094 = weight(_text_:schutz in 4993) [ClassicSimilarity], result of:
      0.098859094 = score(doc=4993,freq=2.0), product of:
        0.20656188 = queryWeight, product of:
          7.2195506 = idf(docFreq=87, maxDocs=44218)
          0.028611459 = queryNorm
        0.4785931 = fieldWeight in 4993, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          7.2195506 = idf(docFreq=87, maxDocs=44218)
          0.046875 = fieldNorm(doc=4993)
  0.1 = coord(1/10)
```

Issue: Part 1: From the first Internet pioneers to meta search engines [https://www.password-online.de/?wysija-page=1&controller=email&action=view&email_id=633&wysijap=subscriptions&user_id=1045]. Part 2: What must continue to apply: free access to knowledge, privacy, and protection from data-hungry corporations! [https://www.password-online.de/?wysija-page=1&controller=email&action=view&email_id=635&wysijap=subscriptions&user_id=1045]

Peters, I.: Folksonomies und kollaborative Informationsdienste : eine Alternative zur Websuche? (2011) 0.01

```
0.00970437 = product of:
  0.04852185 = sum of:
    0.038090795 = weight(_text_:web in 343) [ClassicSimilarity], result of:
      0.038090795 = score(doc=343,freq=4.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.4079388 = fieldWeight in 343, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0625 = fieldNorm(doc=343)
    0.010431055 = product of:
      0.031293165 = sum of:
        0.031293165 = weight(_text_:29 in 343) [ClassicSimilarity], result of:
          0.031293165 = score(doc=343,freq=2.0), product of:
            0.10064617 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.028611459 = queryNorm
            0.31092256 = fieldWeight in 343, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=343)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```

Abstract: Folksonomies give users of collaborative information services access to a variety of information resources. This article discusses in which cases these two components of Web 2.0 are best suited to information retrieval and where they may be able to replace web search. To this end, it examines in detail the reach of social bookmarking systems and sharing systems as well as the retrieval effectiveness of folksonomies within collaborative information services.
Pages: pp. 29-53
Source: Handbuch Internet-Suchmaschinen, 2: Neue Entwicklungen in der Web-Suche. Ed.: D. Lewandowski

Chaudiron, S.; Ihadjadene, M.: Studying Web search engines from a user perspective : key concepts and main approaches (2012) 0.01
```
0.008820508 = product of:
  0.04410254 = sum of:
    0.037641775 = weight(_text_:web in 109) [ClassicSimilarity], result of:
      0.037641775 = score(doc=109,freq=10.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.40312994 = fieldWeight in 109, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=109)
    0.006460763 = product of:
      0.019382289 = sum of:
        0.019382289 = weight(_text_:22 in 109) [ClassicSimilarity], result of:
          0.019382289 = score(doc=109,freq=2.0), product of:
            0.10019246 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.028611459 = queryNorm
            0.19345059 = fieldWeight in 109, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=109)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```
Abstract

This chapter shows that the wider use of Web search engines calls for reconsidering the theoretical and methodological frameworks used to grasp new information practices. Beginning with an overview of the recent challenges implied by the dynamic nature of the Web, this chapter then traces the concepts related to information behavior in order to present the different approaches from the user perspective. The authors pay special attention to the concept of "information practice" and other related concepts such as "use", "activity", and "behavior", which are widely used in the literature but not always strictly defined. The authors provide an overview of user-oriented studies that help in understanding the different contexts of use of electronic information access systems, focusing on five approaches: the system-oriented approaches, the theories of information seeking, the cognitive and psychological approaches, the management science approaches, and the marketing approaches. Future directions of work are then outlined, including social searching and the ethical, cultural, and political dimensions of Web search engines. The authors conclude by considering the importance of critical theory for better understanding the role of Web search engines in our modern society.

Date

20. 4.2012 13:22:37

Lewandowski, D.: Perspektiven eines Open Web Index (2016) 0.01

```
0.008491324 = product of:
  0.04245662 = sum of:
    0.033329446 = weight(_text_:web in 2935) [ClassicSimilarity], result of:
      0.033329446 = score(doc=2935,freq=4.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.35694647 = fieldWeight in 2935, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2935)
    0.009127174 = product of:
      0.027381519 = sum of:
        0.027381519 = weight(_text_:29 in 2935) [ClassicSimilarity], result of:
          0.027381519 = score(doc=2935,freq=2.0), product of:
            0.10064617 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.028611459 = queryNorm
            0.27205724 = fieldWeight in 2935, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2935)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```

Abstract: For many years the search engine market has been dominated by a single search engine, Google. It has by now been recognized that this situation is undesirable, and possible solutions are being discussed. The article discusses different approaches, focusing on the idea of an Open Web Index (OWI) that would be made available as public infrastructure. The basic idea is to separate the data collection (the index) from the services built on top of it, which can be operated in large numbers through private initiative. The aim, in other words, is to create the basis for diversity.
Date: 16. 5.2016 21:53:29

Waller, V.: Not just information : who searches for what on the search engine Google? (2011) 0.01
```
0.007789783 = product of:
  0.07789783 = sum of:
    0.07789783 = weight(_text_:log in 4373) [ClassicSimilarity], result of:
      0.07789783 = score(doc=4373,freq=2.0), product of:
        0.18335998 = queryWeight, product of:
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.028611459 = queryNorm
        0.42483553 = fieldWeight in 4373, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4086204 = idf(docFreq=197, maxDocs=44218)
          0.046875 = fieldNorm(doc=4373)
  0.1 = coord(1/10)
```
Abstract

This paper reports on a transaction log analysis of the type and topic of search queries entered into the search engine Google (Australia). Two aspects, in particular, set this apart from previous studies: the sampling and analysis take account of the distribution of search queries, and lifestyle information of the searcher was matched with each search query. A surprising finding was that there was no observed statistically significant difference in search type or topics for different segments of the online population. It was found that queries about popular culture and Ecommerce accounted for almost half of all search engine queries and that half of the queries were entered with a particular Website in mind. The findings of this study also suggest that the Internet search engine is not only an interface to information or a shortcut to Websites, it is equally a site of leisure. This study has implications for the design and evaluation of search engines as well as our understanding of search engine use.

Jezior, T.: Adaption und Integration von Suchmaschinentechnologie in mor(!)dernen OPACs (2013) 0.01

```
0.0074730627 = product of:
  0.037365314 = sum of:
    0.026934259 = weight(_text_:web in 2222) [ClassicSimilarity], result of:
      0.026934259 = score(doc=2222,freq=2.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.2884563 = fieldWeight in 2222, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0625 = fieldNorm(doc=2222)
    0.010431055 = product of:
      0.031293165 = sum of:
        0.031293165 = weight(_text_:29 in 2222) [ClassicSimilarity], result of:
          0.031293165 = score(doc=2222,freq=2.0), product of:
            0.10064617 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.028611459 = queryNorm
            0.31092256 = fieldWeight in 2222, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=2222)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```

Abstract: Academic libraries today are threatened by universal search engines such as Google. One reason is that libraries rely on search tools that can no longer meet their users' current expectations. If libraries want to continue to play a key role in the future, they must integrate into their products the techniques that have made search engines successful on the web.
Date: 18.10.2015 10:29:56

Lewandowski, D.: Query understanding (2011) 0.01

```
0.007454296 = product of:
  0.03727148 = sum of:
    0.026934259 = weight(_text_:web in 344) [ClassicSimilarity], result of:
      0.026934259 = score(doc=344,freq=2.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.2884563 = fieldWeight in 344, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0625 = fieldNorm(doc=344)
    0.010337221 = product of:
      0.031011663 = sum of:
        0.031011663 = weight(_text_:22 in 344) [ClassicSimilarity], result of:
          0.031011663 = score(doc=344,freq=2.0), product of:
            0.10019246 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.028611459 = queryNorm
            0.30952093 = fieldWeight in 344, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=344)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```

Date: 18. 9.2018 18:22:18
Source: Handbuch Internet-Suchmaschinen, 2: Neue Entwicklungen in der Web-Suche. Ed.: D. Lewandowski

Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01
```
0.007123591 = product of:
  0.035617955 = sum of:
    0.029157192 = weight(_text_:web in 1605) [ClassicSimilarity], result of:
      0.029157192 = score(doc=1605,freq=6.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.3122631 = fieldWeight in 1605, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1605)
    0.006460763 = product of:
      0.019382289 = sum of:
        0.019382289 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
          0.019382289 = score(doc=1605,freq=2.0), product of:
            0.10019246 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.028611459 = queryNorm
            0.19345059 = fieldWeight in 1605, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1605)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```
Abstract

Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.

Source

Journal of the Association for Information Science and Technology. 66(2015) no.1, pp. 13-22
Sachse, J.: The influence of snippet length on user behavior in mobile web search (2019) 0.01
```
0.007123591 = product of:
  0.035617955 = sum of:
    0.029157192 = weight(_text_:web in 5493) [ClassicSimilarity], result of:
      0.029157192 = score(doc=5493,freq=6.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.3122631 = fieldWeight in 5493, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5493)
    0.006460763 = product of:
      0.019382289 = sum of:
        0.019382289 = weight(_text_:22 in 5493) [ClassicSimilarity], result of:
          0.019382289 = score(doc=5493,freq=2.0), product of:
            0.10019246 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.028611459 = queryNorm
            0.19345059 = fieldWeight in 5493, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5493)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```
Abstract

Purpose: Web search is increasingly moving into mobile contexts. However, the screen size of mobile devices is limited, and search engine result pages face a trade-off between offering informative snippets and making optimal use of space. One factor clearly influencing this trade-off is snippet length. The purpose of this paper is to find out what snippet size to use in mobile web search. Design/methodology/approach: For this purpose, an eye-tracking experiment was conducted showing participants search interfaces with snippets of one, three or five lines on a mobile device to analyze 17 dependent variables. In total, 31 participants took part in the study. Each of the participants solved informational and navigational tasks. Findings: Results indicate a strong influence of page fold on scrolling behavior and attention distribution across search results. Regardless of query type, short snippets seem to provide too little information about the result, so that search performance and subjective measures are negatively affected. Long snippets of five lines lead to better performance than medium snippets for navigational queries, but to worse performance for informational queries. Originality/value: Although space in mobile search is limited, this study shows that longer snippets improve usability and user experience. It further emphasizes that page fold plays a stronger role in mobile than in desktop search for attention distribution.

Date

20. 1.2015 18:30:22
Hogan, A.; Harth, A.; Umbrich, J.; Kinsella, S.; Polleres, A.; Decker, S.: Searching and browsing Linked Data with SWSE : the Semantic Web Search Engine (2011) 0.01
```
0.0058314386 = product of:
  0.058314383 = sum of:
    0.058314383 = weight(_text_:web in 438) [ClassicSimilarity], result of:
      0.058314383 = score(doc=438,freq=24.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.6245262 = fieldWeight in 438, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=438)
  0.1 = coord(1/10)
```
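
The arithmetic in the explain output above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity relations that the numbers themselves suggest: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and a term's weight is queryWeight times fieldWeight. The function names `classic_idf` and `classic_term_score` are hypothetical helpers for illustration, not part of any search library.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf consistent with the explain output: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Weight of one term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                    # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm         # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain output for doc 438 (term "web").
web = classic_term_score(freq=24.0, doc_freq=4597, max_docs=44218,
                         query_norm=0.028611459, field_norm=0.0390625)
coord = 1 / 10                              # one of ten query clauses matched
score = coord * web
print(f"term weight = {web:.7f}, final score = {score:.9f}")
```

Because the search engine computes in single precision, the result of this double-precision sketch should agree with the listed values only to about six or seven significant digits.
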
Abstract

In this paper, we discuss the architecture and implementation of the Semantic Web Search Engine (SWSE). Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines, SWSE operates over RDF Web data - loosely also known as Linked Data - which implies unique challenges for the system design, architecture, algorithms, implementation and user interface. In particular, many challenges exist in adopting Semantic Web technologies for Web data: the unique challenges of the Web - in terms of scale, unreliability, inconsistency and noise - are largely overlooked by the current Semantic Web standards. Herein, we describe the current SWSE system, initially detailing the architecture and later elaborating upon the function, design, implementation and performance of each individual component. In so doing, we also give an insight into how current Semantic Web standards can be tailored, in a best-effort manner, for use on Web data. Throughout, we offer evaluation and complementary argumentation to support our design choices, and also offer discussion on future directions and open research questions. Later, we also provide candid discussion relating to the difficulties currently faced in bringing such a search engine into the mainstream, and lessons learnt from roughly six years working on the Semantic Web Search Engine project.

Object

Semantic Web Search Engine

Theme

Semantic Web

Ke, W.: Decentralized search and the clustering paradox in large scale information networks (2012) 0.01
```
0.005604797 = product of:
  0.028023984 = sum of:
    0.020200694 = weight(_text_:web in 94) [ClassicSimilarity], result of:
      0.020200694 = score(doc=94,freq=2.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.21634221 = fieldWeight in 94, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=94)
    0.007823291 = product of:
      0.023469873 = sum of:
        0.023469873 = weight(_text_:29 in 94) [ClassicSimilarity], result of:
          0.023469873 = score(doc=94,freq=2.0), product of:
            0.10064617 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.028611459 = queryNorm
            0.23319192 = fieldWeight in 94, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=94)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```
Abstract

Amid the rapid growth of information today is the increasing challenge for people to navigate its magnitude. Dynamics and heterogeneity of large information spaces such as the Web raise important questions about information retrieval in these environments. Collection of all information in advance and centralization of IR operations are extremely difficult, if not impossible, because systems are dynamic and information is distributed. The chapter discusses some of the key issues facing classic information retrieval models and presents a decentralized, organic view of information systems pertaining to search in large scale networks. It focuses on the impact of network structure on search performance and discusses a phenomenon we refer to as the Clustering Paradox, in which the topology of interconnected systems imposes a scalability limit.

Pages

S.29-46

Fu, T.; Abbasi, A.; Chen, H.: ¬A focused crawler for Dark Web forums (2010) 0.01
```
0.0050501735 = product of:
  0.050501734 = sum of:
    0.050501734 = weight(_text_:web in 3471) [ClassicSimilarity], result of:
      0.050501734 = score(doc=3471,freq=18.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.5408555 = fieldWeight in 3471, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3471)
  0.1 = coord(1/10)
```
Abstract

The unprecedented growth of the Internet has given rise to the Dark Web, the problematic facet of the Web associated with cybercrime, hate, and extremism. Despite the need for tools to collect and analyze Dark Web forums, the covert nature of this part of the Internet makes traditional Web crawling techniques insufficient for capturing such content. In this study, we propose a novel crawling system designed to collect Dark Web forum content. The system uses a human-assisted accessibility approach to gain access to Dark Web forums. Several URL ordering features and techniques enable efficient extraction of forum postings. The system also includes an incremental crawler coupled with a recall-improvement mechanism intended to facilitate enhanced retrieval and updating of collected content. Experiments conducted to evaluate the effectiveness of the human-assisted accessibility approach and the recall-improvement-based, incremental-update procedure yielded favorable results. The human-assisted approach significantly improved access to Dark Web forums while the incremental crawler with recall improvement also outperformed standard periodic- and incremental-update approaches. Using the system, we were able to collect over 100 Dark Web forums from three regions. A case study encompassing link and content analysis of collected forums was used to illustrate the value and importance of gathering and analyzing content from such online communities.
Li, Z.: ¬A domain specific search engine with explicit document relations (2013) 0.01
```
0.0050501735 = product of:
  0.050501734 = sum of:
    0.050501734 = weight(_text_:web in 1210) [ClassicSimilarity], result of:
      0.050501734 = score(doc=1210,freq=18.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.5408555 = fieldWeight in 1210, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1210)
  0.1 = coord(1/10)
```
Abstract

The current web consists of documents that are highly heterogeneous and hard for machines to understand. The Semantic Web is a progressive movement of the World Wide Web, aiming at converting the current web of unstructured documents to the web of data. In the Semantic Web, web documents are annotated with metadata using a standardized ontology language. These annotated documents are directly processable by machines, which greatly improves their usability and usefulness. At Ericsson, similar problems occur. Large numbers of documents are being created with well-defined structures. Though these documents are about domain specific knowledge and can have rich relations, they are currently managed by a traditional search engine, which ignores the rich domain specific information and presents little data to users. Motivated by the Semantic Web, we aim to find standard ways to process these documents, extract rich domain specific information and annotate the documents with these data using formal markup languages. We propose this project to develop a domain specific search engine for processing different documents and building explicit relations for them. This research project has three main focuses: examining different domain specific documents and finding ways to extract their metadata; integrating a text search engine with an ontology server; exploring novel ways to build relations for documents. We implement this system and demonstrate its functions. As a prototype, the system provides the required features and will be extended in the future.

Theme

Semantic Web
Clewley, N.; Chen, S.Y.; Liu, X.: Cognitive styles and search engine preferences : field dependence/independence vs holism/serialism (2010) 0.00
```
0.0046706647 = product of:
  0.023353323 = sum of:
    0.016833913 = weight(_text_:web in 3961) [ClassicSimilarity], result of:
      0.016833913 = score(doc=3961,freq=2.0), product of:
        0.0933738 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.028611459 = queryNorm
        0.18028519 = fieldWeight in 3961, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3961)
    0.00651941 = product of:
      0.019558229 = sum of:
        0.019558229 = weight(_text_:29 in 3961) [ClassicSimilarity], result of:
          0.019558229 = score(doc=3961,freq=2.0), product of:
            0.10064617 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.028611459 = queryNorm
            0.19432661 = fieldWeight in 3961, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3961)
      0.33333334 = coord(1/3)
  0.2 = coord(2/10)
```
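
When a record matches more than one clause, the explain tree nests: each level multiplies the sum of its matching children by a coord factor. The sketch below retraces the tree above for doc 3961, taking the idf, queryNorm and fieldNorm values directly from the listing. The helper `term_score` is a hypothetical name used only for this illustration.

```python
import math


def term_score(freq, idf, query_norm, field_norm):
    """queryWeight * fieldWeight for one term, using the idf printed in the listing."""
    return (idf * query_norm) * (math.sqrt(freq) * idf * field_norm)


QUERY_NORM = 0.028611459   # queryNorm, shared by all terms of the query
FIELD_NORM = 0.0390625     # fieldNorm(doc=3961)

web = term_score(2.0, 3.2635105, QUERY_NORM, FIELD_NORM)   # term "web"
t29 = term_score(2.0, 3.5176873, QUERY_NORM, FIELD_NORM)   # term "29"

# Inner clause: one of its three sub-clauses matched, hence coord(1/3).
inner = (1 / 3) * t29

# Outer query: two of ten clauses matched, hence coord(2/10).
total = (2 / 10) * (web + inner)
print(f"web = {web:.9f}, inner = {inner:.9f}, total = {total:.10f}")
```

The same pattern applies to the other two-clause records in this listing; only the term, its idf and the fieldNorm change.
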
Abstract

Purpose - Cognitive style has been identified as significantly influential in deciding users' preferences of search engines. In particular, Witkin's field dependence/independence has been widely studied in the area of web searching. It has been suggested that this cognitive style has conceptual links with holism/serialism. This study aims to investigate the differences between field dependence/independence and holism/serialism. Design/methodology/approach - An empirical study was conducted with 120 students from a UK university. Riding's cognitive style analysis (CSA) and Ford's study preference questionnaire (SPQ) were used to identify the students' cognitive styles. A questionnaire was designed to identify users' preferences for the design of search engines. Data mining techniques were applied to analyse the data obtained from the empirical study. Findings - The results highlight three findings. First, a fundamental link is confirmed between the two cognitive styles. Second, the relationship between field dependent users and holists is suggested to be more prominent than that of field independent users and serialists. Third, the interface design preferences of field dependent and field independent users can be split more clearly than those of holists and serialists. Originality/value - The contributions of this study include a deeper understanding of the similarities and differences between field dependence/independence and holism/serialism, as well as a novel methodology for data analysis.

Date

29. 8.2010 13:11:47
