Search (44 results, page 1 of 3)

Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.08
```
0.08185689 = product of:
  0.16371378 = sum of:
    0.08959906 = weight(_text_:data in 1605) [ClassicSimilarity], result of:
      0.08959906 = score(doc=1605,freq=24.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.60511017 = fieldWeight in 1605, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1605)
    0.07411472 = sum of:
      0.042392377 = weight(_text_:processing in 1605) [ClassicSimilarity], result of:
        0.042392377 = score(doc=1605,freq=2.0), product of:
          0.18956426 = queryWeight, product of:
            4.048147 = idf(docFreq=2097, maxDocs=44218)
            0.046827413 = queryNorm
          0.22363065 = fieldWeight in 1605, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            4.048147 = idf(docFreq=2097, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1605)
      0.03172234 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
        0.03172234 = score(doc=1605,freq=2.0), product of:
          0.16398162 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046827413 = queryNorm
          0.19345059 = fieldWeight in 1605, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1605)
  0.5 = coord(2/4)
```
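The arithmetic in the score tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that the explain output appears to follow (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names `idf` and `term_score` are hypothetical helpers for illustration, not part of any library API.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency, inferred from the explain output above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Constants copied from the explain output for doc 1605.
MAX_DOCS = 44218
QUERY_NORM = 0.046827413
FIELD_NORM = 0.0390625

data = term_score(24.0, 5088, MAX_DOCS, QUERY_NORM, FIELD_NORM)
processing = term_score(2.0, 2097, MAX_DOCS, QUERY_NORM, FIELD_NORM)
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Two of the four top-level query clauses matched, hence coord(2/4).
total = (data + (processing + twenty_two)) * (2 / 4)
print(f"{total:.6f}")
```

The result should agree with the reported 0.08185689 up to small rounding differences, since the search engine computes in single precision while this sketch uses double precision.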
Abstract

Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.

Source

Journal of the Association for Information Science and Technology. 66(2015) no.1, pp. 13-22

Theme

Data Mining

Das, A.; Jain, A.: Indexing the World Wide Web : the journey so far (2012) 0.04
```
0.03959743 = product of:
  0.07919486 = sum of:
    0.053759433 = weight(_text_:data in 95) [ClassicSimilarity], result of:
      0.053759433 = score(doc=95,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.3630661 = fieldWeight in 95, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=95)
    0.025435425 = product of:
      0.05087085 = sum of:
        0.05087085 = weight(_text_:processing in 95) [ClassicSimilarity], result of:
          0.05087085 = score(doc=95,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.26835677 = fieldWeight in 95, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=95)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```

Abstract

In this chapter, the authors describe the key indexing components of today's web search engines. As the World Wide Web has grown, the systems and methods for indexing have changed significantly. The authors present the data structures used, the features extracted, the infrastructure needed, and the options available for designing a brand new search engine. They highlight techniques that improve the relevance of results, discuss trade-offs to best utilize machine resources, and cover distributed processing concepts in this context. In particular, the authors delve into the topics of indexing phrases instead of terms, storage in memory vs. on disk, and data partitioning. Some thoughts on information organization for the newly emerging data-forms conclude the chapter.

Li, Z.: A domain specific search engine with explicit document relations (2013) 0.03
```
0.032997858 = product of:
  0.065995716 = sum of:
    0.04479953 = weight(_text_:data in 1210) [ClassicSimilarity], result of:
      0.04479953 = score(doc=1210,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.30255508 = fieldWeight in 1210, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1210)
    0.021196188 = product of:
      0.042392377 = sum of:
        0.042392377 = weight(_text_:processing in 1210) [ClassicSimilarity], result of:
          0.042392377 = score(doc=1210,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.22363065 = fieldWeight in 1210, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1210)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
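The tree above applies a coordination factor at two levels: `coord(1/2)` inside the nested clause and `coord(2/4)` at the top. A minimal sketch of that nesting follows, using the term weights copied from the explain output; `coord_sum` is a hypothetical helper introduced only for illustration.

```python
def coord_sum(scores: list[float], total_clauses: int) -> float:
    # Sum the scores of the matching clauses, then scale by the
    # fraction of clauses that matched (the coord factor).
    return sum(scores) * (len(scores) / total_clauses)

# Term weights for doc 1210, copied from the explain output.
data_weight = 0.04479953
processing_weight = 0.042392377

# Inner boolean clause: one of two sub-clauses matched, coord(1/2).
inner = coord_sum([processing_weight], 2)

# Top level: two of four clauses matched, coord(2/4).
outer = coord_sum([data_weight, inner], 4)
print(f"{outer:.6f}")
```

The outer value should match the reported document score of 0.032997858 up to rounding.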
Abstract

The current web consists of documents that are highly heterogeneous and hard for machines to understand. The Semantic Web is a progressive movement of the World Wide Web, aiming at converting the current web of unstructured documents to the web of data. In the Semantic Web, web documents are annotated with metadata using standardized ontology language. These annotated documents are directly processable by machines, which greatly improves their usability and usefulness. In Ericsson, similar problems occur. Massive numbers of documents are being created with well-defined structures. Though these documents are about domain specific knowledge and can have rich relations, they are currently managed by a traditional search engine, which ignores the rich domain specific information and presents little data to users. Motivated by the Semantic Web, we aim to find standard ways to process these documents, extract rich domain specific information and annotate these data to documents with formal markup languages. We propose this project to develop a domain specific search engine for processing different documents and building explicit relations for them. This research project has three main focuses: examining different domain specific documents and finding ways to extract their metadata; integrating a text search engine with an ontology server; exploring novel ways to build relations for documents. We implement this system and demonstrate its functions. As a prototype, the system provides required features and will be extended in the future.
Lewandowski, D.; Sünkler, S.: What does Google recommend when you want to compare insurance offerings? (2019) 0.03
```
0.030330349 = product of:
  0.060660698 = sum of:
    0.04479953 = weight(_text_:data in 5288) [ClassicSimilarity], result of:
      0.04479953 = score(doc=5288,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.30255508 = fieldWeight in 5288, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5288)
    0.01586117 = product of:
      0.03172234 = sum of:
        0.03172234 = weight(_text_:22 in 5288) [ClassicSimilarity], result of:
          0.03172234 = score(doc=5288,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.19345059 = fieldWeight in 5288, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5288)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

Purpose - The purpose of this paper is to describe a new method to improve the analysis of search engine results by considering the provider level as well as the domain level. This approach is tested by conducting a study using queries on the topic of insurance comparisons. Design/methodology/approach - The authors conducted an empirical study that analyses the results of search queries aimed at comparing insurance companies. The authors used a self-developed software system that automatically queries commercial search engines and automatically extracts the content of the returned result pages for further data analysis. The data analysis was carried out using the KNIME Analytics Platform. Findings - Google's top search results are served by only a few providers that frequently appear in these results. The authors show that some providers operate several domains on the same topic and that these domains appear for the same queries in the result lists. Research limitations/implications - The authors demonstrate the feasibility of this approach and draw conclusions for further investigations from the empirical study. However, the study is a limited use case based on a limited number of search queries. Originality/value - The proposed method allows large-scale analysis of the composition of the top results from commercial search engines. It allows using valid empirical data to determine what users actually see on the search engine result pages.

Date

20. 1.2015 18:30:22
Rieh, S.Y.; Kim, Y.-M.; Markey, K.: Amount of invested mental effort (AIME) in online searching (2012) 0.03
```
0.028887425 = product of:
  0.05777485 = sum of:
    0.03657866 = weight(_text_:data in 2726) [ClassicSimilarity], result of:
      0.03657866 = score(doc=2726,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.24703519 = fieldWeight in 2726, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2726)
    0.021196188 = product of:
      0.042392377 = sum of:
        0.042392377 = weight(_text_:processing in 2726) [ClassicSimilarity], result of:
          0.042392377 = score(doc=2726,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.22363065 = fieldWeight in 2726, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2726)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

This research investigates how people's perceptions of information retrieval (IR) systems, their perceptions of search tasks, and their perceptions of self-efficacy influence the amount of invested mental effort (AIME) they put into using two different IR systems: a Web search engine and a library system. It also explores the impact of mental effort on an end user's search experience. To assess AIME in online searching, two experiments were conducted using these methods: Experiment 1 relied on self-reports and Experiment 2 employed the dual-task technique. In both experiments, data were collected through search transaction logs, a pre-search background questionnaire, a post-search questionnaire and an interview. Important findings are these: (1) subjects invested greater mental effort searching a library system than searching the Web; (2) subjects put little effort into Web searching because of their high sense of self-efficacy in their searching ability and their perception of the easiness of the Web; (3) subjects did not recognize that putting mental effort into searching was something needed to improve the search results; and (4) data collected from multiple sources proved to be effective for assessing mental effort in online searching.

Source

Information processing and management. 48(2012) no.6, pp. 1136-1150
Roux, M.: Metadata for search engines : what can be learned from e-Sciences? (2012) 0.02
```
0.020529725 = product of:
  0.0821189 = sum of:
    0.0821189 = weight(_text_:data in 96) [ClassicSimilarity], result of:
      0.0821189 = score(doc=96,freq=14.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.55459267 = fieldWeight in 96, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=96)
  0.25 = coord(1/4)
```
Abstract

E-sciences are data-intensive sciences that make extensive use of the Web to share, collect, and process data. In this context, primary scientific data is becoming a new challenging issue as data must be extensively described (1) to account for empirical conditions and results that allow interpretation and/or analyses and (2) to be understandable by computers used for data storage and information retrieval. In this respect, metadata is a focal point, whether considered from the point of view of the user, who wants to visualize and exploit data, or from that of the search tools that find and retrieve information. Numerous disciplines are concerned with the issues of describing complex observations and addressing pertinent knowledge. In this paper, similarities and differences in data description and exploration strategies among disciplines in e-sciences are examined.
Gossen, T.: Search engines for children : search user interfaces and information-seeking behaviour (2016) 0.02
```
0.016042989 = product of:
  0.064171955 = sum of:
    0.064171955 = sum of:
      0.041966315 = weight(_text_:processing in 2752) [ClassicSimilarity], result of:
        0.041966315 = score(doc=2752,freq=4.0), product of:
          0.18956426 = queryWeight, product of:
            4.048147 = idf(docFreq=2097, maxDocs=44218)
            0.046827413 = queryNorm
          0.22138305 = fieldWeight in 2752, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            4.048147 = idf(docFreq=2097, maxDocs=44218)
            0.02734375 = fieldNorm(doc=2752)
      0.022205638 = weight(_text_:22 in 2752) [ClassicSimilarity], result of:
        0.022205638 = score(doc=2752,freq=2.0), product of:
          0.16398162 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046827413 = queryNorm
          0.1354154 = fieldWeight in 2752, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.02734375 = fieldNorm(doc=2752)
  0.25 = coord(1/4)
```
Content

Contents: Acknowledgments; Abstract; Zusammenfassung; Contents; List of Figures; List of Tables; List of Acronyms; Chapter 1 Introduction; 1.1 Research Questions; 1.2 Thesis Outline; Part I Fundamentals; Chapter 2 Information Retrieval for Young Users; 2.1 Basics of Information Retrieval; 2.1.1 Architecture of an IR System; 2.1.2 Relevance Ranking; 2.1.3 Search User Interfaces; 2.1.4 Targeted Search Engines; 2.2 Aspects of Child Development Relevant for Information Retrieval Tasks; 2.2.1 Human Cognitive Development; 2.2.2 Information Processing Theory; 2.2.3 Psychosocial Development; 2.3 User Studies and Evaluation; 2.3.1 Methods in User Studies; 2.3.2 Types of Evaluation; 2.3.3 Evaluation with Children; 2.4 Discussion; Chapter 3 State of the Art; 3.1 Children's Information-Seeking Behaviour; 3.1.1 Querying Behaviour; 3.1.2 Search Strategy; 3.1.3 Navigation Style; 3.1.4 User Interface; 3.1.5 Relevance Judgement; 3.2 Existing Algorithms and User Interface Concepts for Children; 3.2.1 Query; 3.2.2 Content; 3.2.3 Ranking; 3.2.4 Search Result Visualisation; 3.3 Existing Information Retrieval Systems for Children; 3.3.1 Digital Book Libraries; 3.3.2 Web Search Engines; 3.4 Summary and Discussion; Part II Studying Open Issues; Chapter 4 Usability of Existing Search Engines for Young Users; 4.1 Assessment Criteria; 4.1.1 Criteria for Matching the Motor Skills; 4.1.2 Criteria for Matching the Cognitive Skills; 4.2 Results; 4.2.1 Conformance with Motor Skills; 4.2.2 Conformance with the Cognitive Skills; 4.2.3 Presentation of Search Results; 4.2.4 Browsing versus Searching; 4.2.5 Navigational Style; 4.3 Summary and Discussion; Chapter 5 Large-scale Analysis of Children's Queries and Search Interactions; 5.1 Dataset; 5.2 Results; 5.3 Summary and Discussion; Chapter 6 Differences in Usability and Perception of Targeted Web Search Engines between Children and Adults; 6.1 Related Work; 6.2 User Study; 6.3 Study Results; 6.4 Summary and Discussion; Part III Tackling the Challenges;
Chapter 7 Search User Interface Design for Children; 7.1 Conceptual Challenges and Possible Solutions; 7.2 Knowledge Journey Design; 7.3 Evaluation; 7.3.1 Study Design; 7.3.2 Study Results; 7.4 Voice-Controlled Search: Initial Study; 7.4.1 User Study; 7.5 Summary and Discussion; Chapter 8 Addressing User Diversity; 8.1 Evolving Search User Interface; 8.1.1 Mapping Function; 8.1.2 Evolving Skills; 8.1.3 Detection of User Abilities; 8.1.4 Design Concepts; 8.2 Adaptation of a Search User Interface towards User Needs; 8.2.1 Design & Implementation; 8.2.2 Search Input; 8.2.3 Result Output; 8.2.4 General Properties; 8.2.5 Configuration and Further Details; 8.3 Evaluation; 8.3.1 Study Design; 8.3.2 Study Results; 8.3.3 Preferred UI Settings; 8.3.4 User satisfaction; 8.4 Knowledge Journey Exhibit; 8.4.1 Hardware; 8.4.2 Frontend; 8.4.3 Backend; 8.5 Summary and Discussion; Chapter 9 Supporting Visual Searchers in Processing Search Results; 9.1 Related Work

Date

1. 2.2016 18:25:22
Hogan, A.; Harth, A.; Umbrich, J.; Kinsella, S.; Polleres, A.; Decker, S.: Searching and browsing Linked Data with SWSE : the Semantic Web Search Engine (2011) 0.02
```
0.015839024 = product of:
  0.063356094 = sum of:
    0.063356094 = weight(_text_:data in 438) [ClassicSimilarity], result of:
      0.063356094 = score(doc=438,freq=12.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.4278775 = fieldWeight in 438, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=438)
  0.25 = coord(1/4)
```
Abstract

In this paper, we discuss the architecture and implementation of the Semantic Web Search Engine (SWSE). Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines, SWSE operates over RDF Web data - loosely also known as Linked Data - which implies unique challenges for the system design, architecture, algorithms, implementation and user interface. In particular, many challenges exist in adopting Semantic Web technologies for Web data: the unique challenges of the Web - in terms of scale, unreliability, inconsistency and noise - are largely overlooked by the current Semantic Web standards. Herein, we describe the current SWSE system, initially detailing the architecture and later elaborating upon the function, design, implementation and performance of each individual component. In so doing, we also give an insight into how current Semantic Web standards can be tailored, in a best-effort manner, for use on Web data. Throughout, we offer evaluation and complementary argumentation to support our design choices, and also offer discussion on future directions and open research questions. Later, we also provide candid discussion relating to the difficulties currently faced in bringing such a search engine into the mainstream, and lessons learnt from roughly six years working on the Semantic Web Search Engine project.
What is Schema.org? (2011) 0.02
```
0.015519011 = product of:
  0.062076043 = sum of:
    0.062076043 = weight(_text_:data in 4437) [ClassicSimilarity], result of:
      0.062076043 = score(doc=4437,freq=8.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.4192326 = fieldWeight in 4437, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=4437)
  0.25 = coord(1/4)
```
Abstract

This site provides a collection of schemas, i.e., HTML tags, that webmasters can use to mark up their pages in ways recognized by major search providers. Search engines including Bing, Google and Yahoo! rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, Bing, Google and Yahoo! have come together to provide a shared collection of schemas that webmasters can use.
Vaughan, L.; Romero-Frías, E.: Web search volume as a predictor of academic fame : an exploration of Google trends (2014) 0.01
```
0.013439858 = product of:
  0.053759433 = sum of:
    0.053759433 = weight(_text_:data in 1233) [ClassicSimilarity], result of:
      0.053759433 = score(doc=1233,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.3630661 = fieldWeight in 1233, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=1233)
  0.25 = coord(1/4)
```
Abstract

Searches conducted on web search engines reflect the interests of users and society. Google Trends, which provides information about the queries searched by users of the Google web search engine, is a rich data source from which a wealth of information can be mined. We investigated the possibility of using web search volume data from Google Trends to predict academic fame. As queries are language-dependent, we studied universities from two countries with different languages, the United States and Spain. We found a significant correlation between the search volume of a university name and the university's academic reputation or fame. We also examined the effect of some Google Trends features, namely, limiting the search to a specific country or topic category on the search volume data. Finally, we examined the effect of university sizes on the correlations found to gain a deeper understanding of the nature of the relationships.
Clewley, N.; Chen, S.Y.; Liu, X.: Cognitive styles and search engine preferences : field dependence/independence vs holism/serialism (2010) 0.01
```
0.011199882 = product of:
  0.04479953 = sum of:
    0.04479953 = weight(_text_:data in 3961) [ClassicSimilarity], result of:
      0.04479953 = score(doc=3961,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.30255508 = fieldWeight in 3961, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3961)
  0.25 = coord(1/4)
```
Abstract

Purpose - Cognitive style has been identified to be significantly influential in deciding users' preferences of search engines. In particular, Witkin's field dependence/independence has been widely studied in the area of web searching. It has been suggested that this cognitive style has conceptual links with the holism/serialism. This study aims to investigate the differences between the field dependence/independence and holism/serialism. Design/methodology/approach - An empirical study was conducted with 120 students from a UK university. Riding's cognitive style analysis (CSA) and Ford's study preference questionnaire (SPQ) were used to identify the students' cognitive styles. A questionnaire was designed to identify users' preferences for the design of search engines. Data mining techniques were applied to analyse the data obtained from the empirical study. Findings - The results highlight three findings. First, a fundamental link is confirmed between the two cognitive styles. Second, the relationship between field dependent users and holists is suggested to be more prominent than that of field independent users and serialists. Third, the interface design preferences of field dependent and field independent users can be split more clearly than those of holists and serialists. Originality/value - The contributions of this study include a deeper understanding of the similarities and differences between field dependence/independence and holists/serialists as well as proposing a novel methodology for data analyses.
Tetzchner, J. von: As a monopoly in search and advertising Google is not able to resist the misuse of power : is the Internet turning into a battlefield of propaganda? How Google should be regulated (2017) 0.01
```
0.01012129 = product of:
  0.04048516 = sum of:
    0.04048516 = weight(_text_:data in 3891) [ClassicSimilarity], result of:
      0.04048516 = score(doc=3891,freq=10.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.27341786 = fieldWeight in 3891, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.02734375 = fieldNorm(doc=3891)
  0.25 = coord(1/4)
```
Content

How should Google be regulated? We should limit the amount of information that is being collected. In particular we should look at information that is being collected across sites. It should not be legal to combine data from multiple sites and services. The fact that these sites and services are using the same underlying technology does not change the fact that the user's dealings are with one site at a time and each site should not have the right to share the data with others. I believe this is the cornerstone of laws in many countries today, but these laws need to be enforced. Data about us is ours alone and it should not be possible to sell it. We should also limit the ability to target users individually. In the past, ads on sites were ads on sites. You might know what kind of users visited a site and you would place tech ads on tech sites and fashion ads on fashion sites. Now the ads follow you individually. That should be made illegal as it uses data collected from multiple sources and invades our privacy. I also believe there should be regulation as to how location data is used and any information related to our mobile devices. In addition, regulators need to be vigilant as to how companies that have monopoly power use their power. That kind of goes without saying. Companies with monopoly powers should not be able to use those powers when competing in an open market or using their monopoly services to limit competition.
Sleem-Amer, M.; Bigorgne, I.; Brizard, S.; Santos, L.D.P.D.; Bouhairi, Y. El; Goujon, B.; Lorin, S.; Martineau, C.; Rigouste, L.; Varga, L.: Intelligent semantic search engines for opinion and sentiment mining (2012) 0.01
```
0.009144665 = product of:
  0.03657866 = sum of:
    0.03657866 = weight(_text_:data in 100) [ClassicSimilarity], result of:
      0.03657866 = score(doc=100,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.24703519 = fieldWeight in 100, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=100)
  0.25 = coord(1/4)
```
Abstract

Over recent years, research and industry players have become increasingly interested in analyzing opinions and sentiments expressed on the social media web for product marketing and business intelligence. To adapt to this need, search engines must be able not only to retrieve lists of documents but also to directly access, analyze, and interpret topics and opinions. This article covers an intermediate phase of the ongoing industrial research project 'DoXa' aiming at developing a semantic opinion and sentiment mining search engine for the French language. The DoXa search engine enables topic related opinion and sentiment extraction beyond positive and negative polarity using rich linguistic resources. Centering the work on two distinct business use cases, the authors analyze both unstructured Web 2.0 contents (e.g., blogs and forums) and structured questionnaire data sets. The focus is on discovering hidden patterns in the data. To this end, the authors present work in progress on opinion topic relation extraction and visual analytics, linguistic resource construction as well as the combination of OLAP technology with semantic search.
Makris, C.; Plegas, Y.; Stamou, S.: Web query disambiguation using PageRank (2012) 0.01
```
0.0077595054 = product of:
  0.031038022 = sum of:
    0.031038022 = weight(_text_:data in 378) [ClassicSimilarity], result of:
      0.031038022 = score(doc=378,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.2096163 = fieldWeight in 378, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=378)
  0.25 = coord(1/4)
```
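The score breakdown above is plain arithmetic, so it can be checked by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that the reported values are consistent with: `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`. The function names are hypothetical and are not part of any search library; `queryNorm` and `fieldNorm` are copied from the explain output rather than derived.

```python
import math


def classic_idf(doc_freq: float, max_docs: float) -> float:
    """Inverse document frequency, assuming idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def classic_term_score(freq: float, doc_freq: float, max_docs: float,
                       query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                  # tf(freq) in the explain output
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm       # queryWeight line
    field_weight = tf * idf * field_norm  # fieldWeight line
    return query_weight * field_weight


def total_score(term_scores: list[float], matched: int, clauses: int) -> float:
    """Sum of the matching term scores, scaled by coord(matched/clauses)."""
    return sum(term_scores) * (matched / clauses)


# Values copied from the explain output for doc 378 (term "data").
data_score = classic_term_score(
    freq=2.0,
    doc_freq=5088,
    max_docs=44218,
    query_norm=0.046827413,
    field_norm=0.046875,
)
doc_378_score = total_score([data_score], matched=1, clauses=4)
print(f"term score: {data_score:.9f}")
print(f"total score: {doc_378_score:.9f}")
```

Because the search engine computes in single precision and this sketch uses double precision, the results should agree with the listed figures only to about six decimal places.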
Abstract

In this article, we propose new word sense disambiguation strategies for resolving the senses of polysemous query terms issued to Web search engines, and we explore the application of those strategies when used in a query expansion framework. The novelty of our approach lies in the exploitation of the Web page PageRank values as indicators of the significance the different senses of a term carry when employed in search queries. We also aim at scalable query sense resolution techniques that can be applied without loss of efficiency to large data sets such as those on the Web. Our experimental findings validate that the proposed techniques perform more accurately than do the traditional disambiguation strategies and improve the quality of the search results, when involved in query expansion.
Lewandowski, D.: Evaluating the retrieval effectiveness of web search engines using a representative query sample (2015) 0.01
```
0.0077595054 = product of:
  0.031038022 = sum of:
    0.031038022 = weight(_text_:data in 2157) [ClassicSimilarity], result of:
      0.031038022 = score(doc=2157,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.2096163 = fieldWeight in 2157, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=2157)
  0.25 = coord(1/4)
```
Abstract

Search engine retrieval effectiveness studies are usually small scale, using only limited query samples. Furthermore, queries are selected by the researchers. We address these issues by taking a random representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google's and Bing's results based on this sample. Jurors were found through crowdsourcing, and data were collected using specialized software, the Relevance Assessment Tool (RAT). We found that although Google outperforms Bing in both query types, the difference in the performance for informational queries was rather low. However, for navigational queries, Google found the correct answer in 95.3% of cases, whereas Bing only found the correct answer 76.6% of the time. We conclude that search engine performance on navigational queries is of great importance, because users in this case can clearly identify queries that have returned correct results. So, performance on this query type may contribute to explaining user satisfaction with search engines.
Next generation search engines : advanced models for information retrieval (2012) 0.01
```
0.006466255 = product of:
  0.02586502 = sum of:
    0.02586502 = weight(_text_:data in 357) [ClassicSimilarity], result of:
      0.02586502 = score(doc=357,freq=8.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 357, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.01953125 = fieldNorm(doc=357)
  0.25 = coord(1/4)
```
Abstract

With the rapid growth of web-based applications, such as search engines, Facebook, and Twitter, the development of effective and personalized information retrieval techniques and of user interfaces is essential. The amount of shared information and the number of social networks have also grown considerably, requiring metadata for new sources of information, like Wikipedia and ODP. These metadata have to provide classification information for a wide range of topics, as well as for social networking sites like Twitter and Facebook, each of which provides additional preferences, tagging information and social contexts. Due to the explosion of social networks and other metadata sources, it is an opportune time to identify ways to exploit such metadata in IR tasks such as user modeling, query understanding, and personalization, to name a few. Although the use of traditional metadata such as html text, web page titles, and anchor text is fairly well-understood, the use of category information, user behavior data, and geographical information is just beginning to be studied. This book is intended for scientists and decision-makers who wish to gain working knowledge about search engines in order to evaluate available solutions and to dialogue with software and data providers.

LCSH

Data mining

Subject

Data mining
Lewandowski, D.; Drechsler, J.; Mach, S. von: Deriving query intents from web search engine queries (2012) 0.01
```
0.006466255 = product of:
  0.02586502 = sum of:
    0.02586502 = weight(_text_:data in 385) [ClassicSimilarity], result of:
      0.02586502 = score(doc=385,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 385, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=385)
  0.25 = coord(1/4)
```
Abstract

The purpose of this article is to test the reliability of query intents derived from queries, either by the user who entered the query or by another juror. We report the findings of three studies. First, we conducted a large-scale classification study (~50,000 queries) using a crowdsourcing approach. Next, we used clickthrough data from a search engine log and validated the judgments given by the jurors from the crowdsourcing study. Finally, we conducted an online survey on a commercial search engine's portal. Because we used the same queries for all three studies, we also were able to compare the results and the effectiveness of the different approaches. We found that neither the crowdsourcing approach, using jurors who classified queries originating from other users, nor the questionnaire approach, using searchers who were asked about their own query that they just entered into a Web search engine, led to satisfying results. This leads us to conclude that there was little understanding of the classification tasks, even though both groups of jurors were given detailed instructions. Although we used manual classification, our research also has important implications for automatic classification. We must question the success of approaches using automatic classification and comparing its performance to a baseline from human jurors.
Bhansali, D.; Desai, H.; Deulkar, K.: A study of different ranking approaches for semantic search (2015) 0.01
```
0.006466255 = product of:
  0.02586502 = sum of:
    0.02586502 = weight(_text_:data in 2696) [ClassicSimilarity], result of:
      0.02586502 = score(doc=2696,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 2696, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2696)
  0.25 = coord(1/4)
```
Abstract

Search engines have become an integral part of our day-to-day life. Our reliance on search engines increases with every passing day. With the amount of data available on the Internet increasing exponentially, it becomes important to develop new methods and tools that help to return results relevant to the queries and reduce the time spent on searching. The results should be diverse but at the same time focused on the queries asked. Relation Based Page Rank [4] algorithms are considered to be the next frontier in the improvement of Semantic Web search. The probability of finding relevance in the search results as posited by the user while entering the query is used to measure the relevance. However, its application is limited by the complexity of determining the relation between the terms and assigning explicit meaning to each term. Trust Rank is one of the most widely used ranking algorithms for Semantic Web search. A few other ranking algorithms, such as the HITS and PageRank algorithms, are also used for Semantic Web search. In this paper, we provide a comparison of a few ranking approaches.
White, R.W.: Interactions with search systems (2016) 0.01
```
0.006466255 = product of:
  0.02586502 = sum of:
    0.02586502 = weight(_text_:data in 3612) [ClassicSimilarity], result of:
      0.02586502 = score(doc=3612,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 3612, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3612)
  0.25 = coord(1/4)
```
Abstract

Information seeking is a fundamental human activity. In the modern world, it is frequently conducted through interactions with search systems. The retrieval and comprehension of information returned by these systems is a key part of decision making and action in a broad range of settings. Advances in data availability coupled with new interaction paradigms, and mobile and cloud computing capabilities, have created a broad range of new opportunities for information access and use. In this comprehensive book for professionals, researchers, and students involved in search system design and evaluation, search expert Ryen White discusses how search systems can capitalize on new capabilities and how next-generation systems must support higher order search activities such as task completion, learning, and decision making. He outlines the implications of these changes for the evolution of search evaluation, as well as challenges that extend beyond search systems in areas such as privacy and societal benefit.
Levy, S.: In the plex : how Google thinks, works, and shapes our lives (2011) 0.01
```
0.006401266 = product of:
  0.025605064 = sum of:
    0.025605064 = weight(_text_:data in 9) [ClassicSimilarity], result of:
      0.025605064 = score(doc=9,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17292464 = fieldWeight in 9, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.02734375 = fieldNorm(doc=9)
  0.25 = coord(1/4)
```
Abstract

Few companies in history have ever been as successful and as admired as Google, the company that has transformed the Internet and become an indispensable part of our lives. How has Google done it? Veteran technology reporter Steven Levy was granted unprecedented access to the company, and in this revelatory book he takes readers inside Google headquarters, the Googleplex, to show how Google works. While they were still students at Stanford, Google cofounders Larry Page and Sergey Brin revolutionized Internet search. They followed this brilliant innovation with another, as two of Google's earliest employees found a way to do what no one else had: make billions of dollars from Internet advertising. With this cash cow (until Google's IPO nobody other than Google management had any idea how lucrative the company's ad business was), Google was able to expand dramatically and take on other transformative projects: more efficient data centers, open-source cell phones, free Internet video (YouTube), cloud computing, digitizing books, and much more. The key to Google's success in all these businesses, Levy reveals, is its engineering mind-set and adoption of such Internet values as speed, openness, experimentation, and risk taking. After its unapologetically elitist approach to hiring, Google pampers its engineers (free food and dry cleaning, on-site doctors and masseuses) and gives them all the resources they need to succeed. Even today, with a workforce of more than 23,000, Larry Page signs off on every hire. But has Google lost its innovative edge? It stumbled badly in China (Levy discloses what went wrong and how Brin disagreed with his peers on the China strategy) and now with its newest initiative, social networking, Google is chasing a successful competitor for the first time. Some employees are leaving the company for smaller, nimbler start-ups. Can the company that famously decided not to be evil still compete?
No other book has ever turned Google inside out as Levy does with In the Plex.

Content

The world according to Google: biography of a search engine -- Googlenomics: cracking the code on internet profits -- Don't be evil: how Google built its culture -- Google's cloud: how Google built data centers and killed the hard drive -- Outside the box: the Google phone company and the Google t.v. company -- Guge: Google's moral dilemma in China -- Google.gov: is what's good for Google, good for government or the public? -- Epilogue: chasing tail lights: trying to crack the social code.
