Search (121 results, page 3 of 7)

  • theme_ss:"Suchmaschinen"
  • year_i:[2010 TO 2020}
  1. Ortiz-Cordova, A.; Jansen, B.J.: Classifying web search queries to identify high revenue generating customers (2012) 0.01
    0.006219466 = product of:
      0.015548665 = sum of:
        0.010812371 = weight(_text_:a in 279) [ClassicSimilarity], result of:
          0.010812371 = score(doc=279,freq=14.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.20223314 = fieldWeight in 279, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=279)
        0.0047362936 = product of:
          0.009472587 = sum of:
            0.009472587 = weight(_text_:information in 279) [ClassicSimilarity], result of:
              0.009472587 = score(doc=279,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.116372846 = fieldWeight in 279, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=279)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
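    The indented breakdown above (and the similar blocks under the entries below) is Lucene's ClassicSimilarity "explain" output: each term weight is queryWeight (idf x queryNorm) multiplied by fieldWeight (sqrt(termFreq) x idf x fieldNorm), the partial sums are scaled by the coord factors, and the top line is the entry's final score. A minimal recomputation of entry 1's score from the figures shown above, as an illustrative Python sketch:

      import math

      # Figures copied from the explain output for doc 279 above.
      freq, idf_a, query_norm, field_norm = 14.0, 1.153047, 0.046368346, 0.046875
      tf = math.sqrt(freq)                     # 3.7416575 = tf(freq=14.0)
      query_weight = idf_a * query_norm        # 0.053464882 = queryWeight
      field_weight = tf * idf_a * field_norm   # 0.20223314 = fieldWeight
      weight_a = query_weight * field_weight   # 0.010812371 = weight(_text_:a in 279)
      weight_info = 0.009472587 * 0.5          # "information" clause x coord(1/2)
      score = (weight_a + weight_info) * 0.4   # x coord(2/5)
      print(score)                             # ~0.006219466, the top-line score above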
    
    Abstract
    Traffic from search engines is important for most online businesses, with the majority of visitors to many websites being referred by search engines. Therefore, an understanding of this search engine traffic is critical to the success of these websites. Understanding search engine traffic means understanding the underlying intent of the query terms and the corresponding user behaviors of searchers submitting keywords. In this research, using 712,643 query keywords from a popular Spanish music website relying on contextual advertising as its business model, we use a k-means clustering algorithm to group referral keywords with similar characteristics of onsite customer behavior, including attributes such as clickthrough rate and revenue. We identified 6 clusters of consumer keywords. Clusters range from a large number of low-impact users to a small number of high-impact users. We demonstrate how online businesses can leverage this segmentation clustering approach to provide a more tailored consumer experience. The implication is that businesses can effectively segment customers to develop better business models that increase advertising conversion rates.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.7, S.1426-1441
    Type
    a
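    The clustering approach in entry 1 can be illustrated with a minimal k-means sketch over per-keyword behaviour features such as clickthrough rate and revenue. The keywords, feature values and k below are invented for illustration (the study clustered 712,643 real referral keywords into 6 clusters); assumes scikit-learn and NumPy are available.

      from sklearn.cluster import KMeans
      import numpy as np

      # Hypothetical per-keyword behaviour features: [clickthrough rate, revenue per visit].
      keywords = ["canciones gratis", "letras de canciones", "musica online",
                  "descargar mp3", "radio en vivo", "entradas conciertos"]
      features = np.array([[0.02, 0.01], [0.03, 0.02], [0.05, 0.03],
                           [0.04, 0.02], [0.22, 0.30], [0.30, 0.45]])

      # k=2 keeps the toy example readable; the study itself identified 6 clusters.
      model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
      for kw, label in zip(keywords, model.labels_):
          print(f"{kw:22s} -> cluster {label}")

    In the study the resulting segments are then examined for business value (e.g., which keyword groups convert into revenue), which is outside this sketch.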
  2. Karaman, F.: Artificial intelligence enabled search engines (AIESE) and the implications (2012) 0.01
    0.006112744 = product of:
      0.01528186 = sum of:
        0.007078358 = weight(_text_:a in 110) [ClassicSimilarity], result of:
          0.007078358 = score(doc=110,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.13239266 = fieldWeight in 110, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=110)
        0.008203502 = product of:
          0.016407004 = sum of:
            0.016407004 = weight(_text_:information in 110) [ClassicSimilarity], result of:
              0.016407004 = score(doc=110,freq=6.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.20156369 = fieldWeight in 110, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=110)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Search engines are the major means of information retrieval over the Internet. People's dependence on them increases over time as search engines introduce new and sophisticated technologies. Developments in Artificial Intelligence (AI) will transform current search engines into Artificial Intelligence Enabled Search Engines (AIESE). Search engines already play a critical role in classifying, sorting and delivering information over the Internet. However, as the Internet's mainstream role becomes more apparent and AI technology increases the sophistication of search engine tools, their role will become much more critical. In examining the future of search engines, the concept of technological singularity is analyzed in detail. Second- and third-order indirect side effects are analyzed. A four-stage evolution model is suggested.
    Source
    Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
    Type
    a
  3. Web search engine research (2012) 0.01
    0.006112744 = product of:
      0.01528186 = sum of:
        0.007078358 = weight(_text_:a in 478) [ClassicSimilarity], result of:
          0.007078358 = score(doc=478,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.13239266 = fieldWeight in 478, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=478)
        0.008203502 = product of:
          0.016407004 = sum of:
            0.016407004 = weight(_text_:information in 478) [ClassicSimilarity], result of:
              0.016407004 = score(doc=478,freq=6.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.20156369 = fieldWeight in 478, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=478)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    "Web Search Engine Research", edited by Dirk Lewandowski, provides an understanding of Web search engines from the unique perspective of Library and Information Science. The book explores a range of topics including retrieval effectiveness, user satisfaction, the evaluation of search interfaces, the impact of search on society, reliability of search results, query log analysis, user guidance in the search process, and the influence of search engine optimization (SEO) on results quality. While research in computer science has mainly focused on technical aspects of search engines, LIS research is centred on users' behaviour when using search engines and how this interaction can be evaluated. LIS research provides a unique perspective in intermediating between the technical aspects, user aspects and their impact on their role in knowledge acquisition. This book is directly relevant to researchers and practitioners in library and information science, computer science, including Web researchers.
    Footnote
    Further review in: Journal of Documentation, 69(2013) no.4, S.594-596 (A. MacFarlane)
    Series
    Library and information science; vol. 4
  4. Berget, G.; Sandnes, F.E.: Do autocomplete functions reduce the impact of dyslexia on information-searching behavior? : the case of Google (2016) 0.01
    0.006112744 = product of:
      0.01528186 = sum of:
        0.007078358 = weight(_text_:a in 3112) [ClassicSimilarity], result of:
          0.007078358 = score(doc=3112,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.13239266 = fieldWeight in 3112, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=3112)
        0.008203502 = product of:
          0.016407004 = sum of:
            0.016407004 = weight(_text_:information in 3112) [ClassicSimilarity], result of:
              0.016407004 = score(doc=3112,freq=6.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.20156369 = fieldWeight in 3112, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3112)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Dyslexic users often do not exhibit spelling and reading skills at a level required to perform effective search. To explore whether autocomplete functions reduce the impact of dyslexia on information searching, 20 participants with dyslexia and 20 controls solved 10 predefined tasks in the search engine Google. Eye-tracking and screen-capture documented the searches. There were no significant differences between the dyslexic students and the controls in time usage, number of queries, query lengths, or the use of the autocomplete function. However, participants with dyslexia made more misspellings and looked less at the screen and the autocomplete suggestions lists while entering the queries. The results indicate that although the autocomplete function supported the participants in the search process, a more extensive use of the autocomplete function would have reduced misspellings. Further, the high tolerance for spelling errors considerably reduced the effect of dyslexia, and may be as important as the autocomplete function.
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.10, S.2320-2328
    Type
    a
  5. Truran, M.; Schmakeit, J.-F.; Ashman, H.: The effect of user intent on the stability of search engine results (2011) 0.01
    0.0060245167 = product of:
      0.015061291 = sum of:
        0.009535614 = weight(_text_:a in 4478) [ClassicSimilarity], result of:
          0.009535614 = score(doc=4478,freq=8.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.17835285 = fieldWeight in 4478, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4478)
        0.005525676 = product of:
          0.011051352 = sum of:
            0.011051352 = weight(_text_:information in 4478) [ClassicSimilarity], result of:
              0.011051352 = score(doc=4478,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13576832 = fieldWeight in 4478, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4478)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Previous work has established that search engine queries can be classified according to the intent of the searcher (i.e., why is the user searching, what specifically do they intend to do). In this article, we describe an experiment in which four sets of queries, each set representing a different user intent, are repeatedly submitted to three search engines over a period of 60 days. Using a variety of measurements, we describe the overall stability of the search engine results recorded for each group. Our findings suggest that search engine results for informational queries are significantly more stable than the results obtained using transactional, navigational, or commercial queries.
    Source
    Journal of the American Society for Information Science and Technology. 62(2011) no.7, S.1276-1287
    Type
    a
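    Entry 5 measures how stable ranked result lists remain when the same query is resubmitted over time. The abstract does not name the specific measures used, so the sketch below uses a generic top-k Jaccard overlap as a stand-in, not necessarily one of the authors' measures; the URL lists are hypothetical.

      def top_k_overlap(results_day1, results_day2, k=10):
          """Jaccard overlap between two top-k result lists (1.0 = identical sets)."""
          a, b = set(results_day1[:k]), set(results_day2[:k])
          return len(a & b) / len(a | b) if a | b else 1.0

      # Hypothetical top results for the same query on two consecutive days.
      day1 = ["url1", "url2", "url3", "url4", "url5"]
      day2 = ["url1", "url3", "url6", "url2", "url7"]
      print(top_k_overlap(day1, day2))   # 0.428..., lower values = less stable results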
  6. What is Schema.org? (2011) 0.01
    0.005948606 = product of:
      0.014871514 = sum of:
        0.008173384 = weight(_text_:a in 4437) [ClassicSimilarity], result of:
          0.008173384 = score(doc=4437,freq=8.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.15287387 = fieldWeight in 4437, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=4437)
        0.0066981306 = product of:
          0.013396261 = sum of:
            0.013396261 = weight(_text_:information in 4437) [ClassicSimilarity], result of:
              0.013396261 = score(doc=4437,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.16457605 = fieldWeight in 4437, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4437)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This site provides a collection of schemas, i.e., HTML tags, that webmasters can use to mark up their pages in ways recognized by major search providers. Search engines including Bing, Google and Yahoo! rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results, making it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, Bing, Google and Yahoo! have come together to provide a shared collection of schemas that webmasters can use.
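    As a toy illustration of the on-page markup entry 6 describes, the sketch below assembles a schema.org item using the HTML microdata attributes (itemscope, itemtype, itemprop); the item type and property values are chosen arbitrarily for the example.

      # Assemble a small schema.org microdata snippet (all values invented) and print it.
      item_type = "http://schema.org/Movie"
      props = {"name": "Avatar", "director": "James Cameron", "genre": "Science fiction"}
      body = "\n".join(f'  <span itemprop="{k}">{v}</span>' for k, v in props.items())
      print(f'<div itemscope itemtype="{item_type}">\n{body}\n</div>')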
  7. Das, A.; Jain, A.: Indexing the World Wide Web : the journey so far (2012) 0.01
    0.005948606 = product of:
      0.014871514 = sum of:
        0.008173384 = weight(_text_:a in 95) [ClassicSimilarity], result of:
          0.008173384 = score(doc=95,freq=8.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.15287387 = fieldWeight in 95, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=95)
        0.0066981306 = product of:
          0.013396261 = sum of:
            0.013396261 = weight(_text_:information in 95) [ClassicSimilarity], result of:
              0.013396261 = score(doc=95,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.16457605 = fieldWeight in 95, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=95)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    In this chapter, the authors describe the key indexing components of today's web search engines. As the World Wide Web has grown, the systems and methods for indexing have changed significantly. The authors present the data structures used, the features extracted, the infrastructure needed, and the options available for designing a brand new search engine. They highlight techniques that improve the relevance of results, discuss trade-offs to best utilize machine resources, and cover distributed processing concepts in this context. In particular, the authors delve into the topics of indexing phrases instead of terms, storage in memory vs. on disk, and data partitioning. Some thoughts on information organization for newly emerging data forms conclude the chapter.
    Source
    Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
    Type
    a
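    The core data structure behind the indexing components surveyed in entry 7 is the inverted index. A minimal in-memory positional index as an illustrative sketch (real engines add compression, phrase handling, memory/disk tiering and partitioning, as the chapter discusses):

      from collections import defaultdict

      def build_index(docs):
          """Map each term to a postings list of (doc_id, position) pairs."""
          index = defaultdict(list)
          for doc_id, text in docs.items():
              for pos, term in enumerate(text.lower().split()):
                  index[term].append((doc_id, pos))
          return index

      docs = {1: "web search engine indexing", 2: "indexing phrases instead of terms"}
      index = build_index(docs)
      print(index["indexing"])   # [(1, 3), (2, 0)]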
  8. Kruschwitz, U.; Lungley, D.; Albakour, M-D.; Song, D.: Deriving query suggestions for site search (2013) 0.01
    0.005886516 = product of:
      0.01471629 = sum of:
        0.010769378 = weight(_text_:a in 1085) [ClassicSimilarity], result of:
          0.010769378 = score(doc=1085,freq=20.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.20142901 = fieldWeight in 1085, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1085)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 1085) [ClassicSimilarity], result of:
              0.007893822 = score(doc=1085,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 1085, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1085)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Modern search engines have been moving away from simplistic interfaces that aimed at satisfying a user's need with a single-shot query. Interactive features are now integral parts of web search engines. However, generating good query modification suggestions remains a challenging issue. Query log analysis is one of the major strands of work in this direction. Although much research has been performed on query logs collected on the web as a whole, query log analysis to enhance search on smaller and more focused collections has attracted less attention, despite its increasing practical importance. In this article, we report on a systematic study of different query modification methods applied to a substantial query log collected on a local website that already uses an interactive search engine. We conducted experiments in which we asked users to assess the relevance of potential query modification suggestions that have been constructed using a range of log analysis methods and different baseline approaches. The experimental results demonstrate the usefulness of log analysis to extract query modification suggestions. Furthermore, our experiments demonstrate that a more fine-grained approach than grouping search requests into sessions allows for extraction of better refinement terms from query log files.
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.10, S.1975-1994
    Type
    a
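    One simple log-based idea in the spirit of entry 8 is to count the terms users add when they reformulate a query within a session; frequently added terms become candidate query modification suggestions. The sessions below are invented, and this is a deliberately crude stand-in for the methods evaluated in the article.

      from collections import Counter

      # Hypothetical per-session query sequences from a site search log.
      sessions = [
          ["library opening", "library opening hours"],
          ["exam timetable", "exam timetable 2013"],
          ["library opening", "library opening hours weekend"],
      ]

      refinements = Counter()
      for queries in sessions:
          for prev, nxt in zip(queries, queries[1:]):
              refinements.update(set(nxt.split()) - set(prev.split()))

      # Most frequently added terms are candidate suggestions.
      print(refinements.most_common(3))   # e.g. [('hours', 2), ('2013', 1), ('weekend', 1)]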
  9. Lewandowski, D.: Suchmaschinen (2013) 0.01
    0.00588199 = product of:
      0.014704974 = sum of:
        0.0068111527 = weight(_text_:a in 731) [ClassicSimilarity], result of:
          0.0068111527 = score(doc=731,freq=2.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.12739488 = fieldWeight in 731, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=731)
        0.007893822 = product of:
          0.015787644 = sum of:
            0.015787644 = weight(_text_:information in 731) [ClassicSimilarity], result of:
              0.015787644 = score(doc=731,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.19395474 = fieldWeight in 731, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.078125 = fieldNorm(doc=731)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Source
    Grundlagen der praktischen Information und Dokumentation. Handbuch zur Einführung in die Informationswissenschaft und -praxis. 6th, completely revised edition. Ed. by R. Kuhlen, W. Semar and D. Strauch. Founded by Klaus Laisiepen, Ernst Lutterbeck, Karl-Heinrich Meyer-Uhlenried
    Type
    a
  10. Shapira, B.; Zabar, B.: Personalized search : integrating collaboration and social networks (2011) 0.01
    0.0057805413 = product of:
      0.014451353 = sum of:
        0.0076151006 = weight(_text_:a in 4140) [ClassicSimilarity], result of:
          0.0076151006 = score(doc=4140,freq=10.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.14243183 = fieldWeight in 4140, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4140)
        0.006836252 = product of:
          0.013672504 = sum of:
            0.013672504 = weight(_text_:information in 4140) [ClassicSimilarity], result of:
              0.013672504 = score(doc=4140,freq=6.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.16796975 = fieldWeight in 4140, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4140)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Despite improvements in their capabilities, search engines still fail to provide users with only relevant results. One reason is that most search engines implement a "one size fits all" approach that ignores personal preferences when retrieving the results of a user's query. Recent studies (Smyth, 2010) have elaborated the importance of personalizing search results and have proposed integrating recommender system methods for enhancing results using contextual and extrinsic information that might indicate the user's actual needs. In this article, we review recommender system methods used for personalizing and improving search results and examine the effect of two such methods that are merged for this purpose. One method is based on collaborative users' knowledge; the second integrates information from the user's social network. We propose new methods for collaborative- and social-based search and demonstrate that each of these methods, when separately applied, produces more accurate search results than a purely keyword-based search engine (referred to as a "standard search engine"), with the social search engine being more accurate than the collaborative one. However, separately applied, these methods do not produce a sufficient number of results (low coverage). Nevertheless, merging these methods with those implemented by standard search engines overcomes the low-coverage problem and produces personalized results that are significantly more accurate than those of standard search engines while also providing sufficient coverage. The improvement, however, is significant only for topics for which the diversity of terms used for queries among users is low.
    Source
    Journal of the American Society for Information Science and Technology. 62(2011) no.1, S.146-160
    Type
    a
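    Entry 10 merges collaborative and social signals with a standard keyword-based engine. The sketch below shows one schematic way to combine per-document scores from such components with a weighted sum; the weights, scores and document ids are hypothetical and this is not the authors' actual model.

      def merge_scores(keyword, collaborative, social, weights=(0.5, 0.25, 0.25)):
          """Weighted combination of per-document scores from three components.
          Documents missing from a component simply contribute 0 from it."""
          docs = set(keyword) | set(collaborative) | set(social)
          w_k, w_c, w_s = weights
          return {d: w_k * keyword.get(d, 0.0)
                     + w_c * collaborative.get(d, 0.0)
                     + w_s * social.get(d, 0.0) for d in docs}

      merged = merge_scores({"d1": 0.9, "d2": 0.4}, {"d2": 0.8}, {"d3": 0.7})
      print(sorted(merged.items(), key=lambda x: -x[1]))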
  11. Sarigil, E.; Sengor Altingovde, I.; Blanco, R.; Barla Cambazoglu, B.; Ozcan, R.; Ulusoy, Ö.: Characterizing, predicting, and handling web search queries that match very few or no results (2018) 0.01
    0.0056654564 = product of:
      0.014163641 = sum of:
        0.01021673 = weight(_text_:a in 4039) [ClassicSimilarity], result of:
          0.01021673 = score(doc=4039,freq=18.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.19109234 = fieldWeight in 4039, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4039)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 4039) [ClassicSimilarity], result of:
              0.007893822 = score(doc=4039,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 4039, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4039)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    A non-negligible fraction of user queries end up with very few or even no matching results in leading commercial web search engines. In this work, we provide a detailed characterization of such queries and show that search engines try to improve such queries by showing the results of related queries. Through a user study, we show that these query suggestions are usually perceived as relevant. Also, through a query log analysis, we show that users are dissatisfied after submitting a query that matches no results at least 88.5% of the time. As a first step towards solving these no-answer queries, we devised a large number of features that can be used to identify such queries and built machine-learning models. These models can be useful for scenarios such as mobile or meta-search, where identifying a query that will retrieve no results at the client device (i.e., even before submitting it to the search engine) may yield gains in terms of bandwidth usage, power consumption, and/or monetary costs. Experiments over query logs indicate that, despite the heavy skew in class sizes, our models achieve good prediction quality, with accuracy (in terms of area under the curve) up to 0.95.
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.2, S.256-270
    Type
    a
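    Entry 11 trains machine-learning models over query features to predict queries that will match no results. A minimal hedged sketch with invented features and labels (assumes scikit-learn and NumPy; the paper's feature set and models are far richer):

      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score
      import numpy as np

      # Hypothetical per-query features: [length in terms, avg. term rarity, contains digits]
      X = np.array([[2, 0.10, 0], [7, 0.90, 1], [3, 0.20, 0], [9, 0.80, 1],
                    [1, 0.05, 0], [8, 0.95, 1], [4, 0.30, 0], [6, 0.70, 1]])
      y = np.array([0, 1, 0, 1, 0, 1, 0, 1])   # 1 = query matched (almost) no results

      model = LogisticRegression().fit(X, y)
      auc = roc_auc_score(y, model.predict_proba(X)[:, 1])   # in-sample AUC, illustration only
      print(auc)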
  12. Souza, J.; Carvalho, A.; Cristo, M.; Moura, E.; Calado, P.; Chirita, P.-A.; Nejdl, W.: Using site-level connections to estimate link confidence (2012) 0.01
    0.00556948 = product of:
      0.0139237 = sum of:
        0.008341924 = weight(_text_:a in 498) [ClassicSimilarity], result of:
          0.008341924 = score(doc=498,freq=12.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.15602624 = fieldWeight in 498, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=498)
        0.0055817757 = product of:
          0.011163551 = sum of:
            0.011163551 = weight(_text_:information in 498) [ClassicSimilarity], result of:
              0.011163551 = score(doc=498,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13714671 = fieldWeight in 498, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=498)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is estimated by analyzing the linkage relationships between pages. This information is used by link analysis algorithms as a query-independent feature, to be taken into account when computing the rank of the results. Unfortunately, several types of links found on the web may damage the estimated page reputation and thus cause a negative effect on the quality of search results. This work studies alternatives to reduce the negative impact of such noisy links. More specifically, the authors propose and evaluate new methods that deal with noisy links, considering scenarios where the reputation of pages is computed using the PageRank algorithm. They show, through experiments with real web content, that their methods achieve significant improvements when compared to previous solutions proposed in the literature.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.11, S.2294-2312
    Type
    a
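    The reputation scores that entry 12 is concerned with are computed by PageRank over the link graph. Below is a minimal power-iteration PageRank over a tiny hypothetical graph, purely to show the mechanics (the paper's contribution, handling noisy links, is not implemented here):

      def pagerank(links, damping=0.85, iterations=50):
          """Plain power-iteration PageRank over a dict {page: [outlinked pages]}."""
          pages = list(links)
          rank = {p: 1.0 / len(pages) for p in pages}
          for _ in range(iterations):
              new = {p: (1.0 - damping) / len(pages) for p in pages}
              for p, outs in links.items():
                  share = rank[p] / len(outs) if outs else 0.0
                  for q in outs:
                      new[q] += damping * share
              rank = new
          return rank

      graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "spam": ["c"]}
      print(pagerank(graph))   # "c" accumulates the most reputation in this toy graph

    The hypothetical "spam" node inflates "c"'s score; this is exactly the kind of distortion from noisy links that the authors' methods aim to dampen.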
  13. Li, Z.: A domain specific search engine with explicit document relations (2013) 0.01
    0.00556948 = product of:
      0.0139237 = sum of:
        0.008341924 = weight(_text_:a in 1210) [ClassicSimilarity], result of:
          0.008341924 = score(doc=1210,freq=12.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.15602624 = fieldWeight in 1210, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1210)
        0.0055817757 = product of:
          0.011163551 = sum of:
            0.011163551 = weight(_text_:information in 1210) [ClassicSimilarity], result of:
              0.011163551 = score(doc=1210,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13714671 = fieldWeight in 1210, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1210)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The current web consists of documents that are highly heterogeneous and hard for machines to understand. The Semantic Web is a progressive movement of the World Wide Web, aiming at converting the current web of unstructured documents into a web of data. In the Semantic Web, web documents are annotated with metadata using a standardized ontology language. These annotated documents are directly processable by machines, which greatly improves their usability and usefulness. At Ericsson, similar problems occur. There are massive numbers of documents being created with well-defined structures. Though these documents concern domain specific knowledge and can have rich relations, they are currently managed by a traditional search engine, which ignores the rich domain specific information and presents little of it to users. Motivated by the Semantic Web, we aim to find standard ways to process these documents, extract rich domain specific information and attach these data to the documents using formal markup languages. We propose this project to develop a domain specific search engine for processing different documents and building explicit relations for them. This research project has three main focuses: examining different domain specific documents and finding ways to extract their metadata; integrating a text search engine with an ontology server; and exploring novel ways to build relations for documents. We implement this system and demonstrate its functions. As a prototype, the system provides the required features and will be extended in the future.
  14. Rieh, S.Y.; Kim, Y.-M.; Markey, K.: Amount of invested mental effort (AIME) in online searching (2012) 0.01
    0.00556948 = product of:
      0.0139237 = sum of:
        0.008341924 = weight(_text_:a in 2726) [ClassicSimilarity], result of:
          0.008341924 = score(doc=2726,freq=12.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.15602624 = fieldWeight in 2726, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2726)
        0.0055817757 = product of:
          0.011163551 = sum of:
            0.011163551 = weight(_text_:information in 2726) [ClassicSimilarity], result of:
              0.011163551 = score(doc=2726,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13714671 = fieldWeight in 2726, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2726)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This research investigates how people's perceptions of information retrieval (IR) systems, their perceptions of search tasks, and their perceptions of self-efficacy influence the amount of invested mental effort (AIME) they put into using two different IR systems: a Web search engine and a library system. It also explores the impact of mental effort on an end user's search experience. To assess AIME in online searching, two experiments were conducted using these methods: Experiment 1 relied on self-reports and Experiment 2 employed the dual-task technique. In both experiments, data were collected through search transaction logs, a pre-search background questionnaire, a post-search questionnaire and an interview. Important findings are these: (1) subjects invested greater mental effort searching a library system than searching the Web; (2) subjects put little effort into Web searching because of their high sense of self-efficacy in their searching ability and their perception of the easiness of the Web; (3) subjects did not recognize that putting mental effort into searching was something needed to improve the search results; and (4) data collected from multiple sources proved to be effective for assessing mental effort in online searching.
    Source
    Information processing and management. 48(2012) no.6, S.1136-1150
    Type
    a
  15. Joint, N.: The one-stop shop search engine : a transformational library technology? ANTAEUS (2010) 0.01
    0.005431735 = product of:
      0.013579337 = sum of:
        0.009632425 = weight(_text_:a in 4201) [ClassicSimilarity], result of:
          0.009632425 = score(doc=4201,freq=16.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.18016359 = fieldWeight in 4201, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4201)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 4201) [ClassicSimilarity], result of:
              0.007893822 = score(doc=4201,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 4201, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4201)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Purpose - The purpose of this paper is to form one of a series which will give an overview of so-called "transformational" areas of digital library technology. The aim will be to assess how much real transformation these applications are bringing about, in terms of creating genuine user benefit and also changing everyday library practice.
    Design/methodology/approach - An overview of the present state of development of the one-stop shop library search engine, with particular reference to its relationship with the underlying bibliographic databases to which it provides a simplified single interface.
    Findings - The paper finds that the success of federated searching has proved valuable but limited to date in creating a one-stop shop search engine to rival Google Scholar; but the persistent value of the bibliographic databases sitting underneath a federated search system means that a harvesting search engine could well answer the need for a true one-stop search engine for academic and scholarly information.
    Research limitations/implications - This paper is based on the hypothesis that Google's success in providing such an apparently high degree of access to electronic journal services is not what it seems, and that it does not render library discovery tools obsolete. It argues that Google has not diminished the pre-eminent role of library bibliographic databases in mediating access to e-journal text, although this hypothesis needs further research to validate or disprove it.
    Practical implications - The paper affirms the value of bibliographic databases to practitioner librarians and the potential of single-interface discovery tools in library practice.
    Originality/value - The paper uses statistics from US LIS sources to shed light on UK discovery tool issues.
    Type
    a
  16. Fu, T.; Abbasi, A.; Chen, H.: A focused crawler for Dark Web forums (2010) 0.01
    0.005182888 = product of:
      0.012957219 = sum of:
        0.009010308 = weight(_text_:a in 3471) [ClassicSimilarity], result of:
          0.009010308 = score(doc=3471,freq=14.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.1685276 = fieldWeight in 3471, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3471)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 3471) [ClassicSimilarity], result of:
              0.007893822 = score(doc=3471,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 3471, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3471)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The unprecedented growth of the Internet has given rise to the Dark Web, the problematic facet of the Web associated with cybercrime, hate, and extremism. Despite the need for tools to collect and analyze Dark Web forums, the covert nature of this part of the Internet makes traditional Web crawling techniques insufficient for capturing such content. In this study, we propose a novel crawling system designed to collect Dark Web forum content. The system uses a human-assisted accessibility approach to gain access to Dark Web forums. Several URL ordering features and techniques enable efficient extraction of forum postings. The system also includes an incremental crawler coupled with a recall-improvement mechanism intended to facilitate enhanced retrieval and updating of collected content. Experiments conducted to evaluate the effectiveness of the human-assisted accessibility approach and the recall-improvement-based, incremental-update procedure yielded favorable results. The human-assisted approach significantly improved access to Dark Web forums while the incremental crawler with recall improvement also outperformed standard periodic- and incremental-update approaches. Using the system, we were able to collect over 100 Dark Web forums from three regions. A case study encompassing link and content analysis of collected forums was used to illustrate the value and importance of gathering and analyzing content from such online communities.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.6, S.1213-1231
    Type
    a
  17. Lewandowski, D.; Drechsler, J.; Mach, S. von: Deriving query intents from web search engine queries (2012) 0.01
    0.005182888 = product of:
      0.012957219 = sum of:
        0.009010308 = weight(_text_:a in 385) [ClassicSimilarity], result of:
          0.009010308 = score(doc=385,freq=14.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.1685276 = fieldWeight in 385, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=385)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 385) [ClassicSimilarity], result of:
              0.007893822 = score(doc=385,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 385, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=385)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The purpose of this article is to test the reliability of query intents derived from queries, either by the user who entered the query or by another juror. We report the findings of three studies. First, we conducted a large-scale classification study (~50,000 queries) using a crowdsourcing approach. Next, we used clickthrough data from a search engine log and validated the judgments given by the jurors from the crowdsourcing study. Finally, we conducted an online survey on a commercial search engine's portal. Because we used the same queries for all three studies, we also were able to compare the results and the effectiveness of the different approaches. We found that neither the crowdsourcing approach, using jurors who classified queries originating from other users, nor the questionnaire approach, using searchers who were asked about their own query that they just entered into a Web search engine, led to satisfying results. This leads us to conclude that there was little understanding of the classification tasks, even though both groups of jurors were given detailed instructions. Although we used manual classification, our research also has important implications for automatic classification. We must question the success of approaches using automatic classification and comparing its performance to a baseline from human jurors.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.9, S.1773-1788
    Type
    a
  18. Zhao, Y.; Ma, F.; Xia, X.: Evaluating the coverage of entities in knowledge graphs behind general web search engines : Poster (2017) 0.01
    0.005182888 = product of:
      0.012957219 = sum of:
        0.009010308 = weight(_text_:a in 3854) [ClassicSimilarity], result of:
          0.009010308 = score(doc=3854,freq=14.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.1685276 = fieldWeight in 3854, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3854)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 3854) [ClassicSimilarity], result of:
              0.007893822 = score(doc=3854,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 3854, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3854)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Web search engines, such as Google and Bing, are constantly employing results from knowledge organization and various visualization features to improve their search services. A knowledge graph, a large repository of structured knowledge represented by formal languages such as RDF (Resource Description Framework), is used to support the entity search features of Google and Bing (Demartini, 2016). When a user searches for an entity, such as a person, an organization, or a place, in Google or Bing, it is likely that a knowledge card will be presented on the right side bar of the search engine result pages (SERPs). For example, when a user searches for the entity Benedict Cumberbatch on Google, the knowledge card will show basic structured information about this person, including his date of birth, height, spouse, parents, and his movies, etc. The knowledge card, which is used to present the result of entity search, is generated from knowledge graphs. Therefore, the quality of knowledge graphs is essential to the performance of entity search. However, studies on the quality of knowledge graphs from the angle of entity coverage are scant in the literature. This study aims to investigate the coverage of entities in the knowledge graphs behind Google and Bing.
    Type
    a
  19. Lewandowski, D.: Evaluating the retrieval effectiveness of web search engines using a representative query sample (2015) 0.01
    0.0051638708 = product of:
      0.012909677 = sum of:
        0.008173384 = weight(_text_:a in 2157) [ClassicSimilarity], result of:
          0.008173384 = score(doc=2157,freq=8.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.15287387 = fieldWeight in 2157, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2157)
        0.0047362936 = product of:
          0.009472587 = sum of:
            0.009472587 = weight(_text_:information in 2157) [ClassicSimilarity], result of:
              0.009472587 = score(doc=2157,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.116372846 = fieldWeight in 2157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2157)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Search engine retrieval effectiveness studies are usually small scale, using only limited query samples. Furthermore, queries are selected by the researchers. We address these issues by taking a random representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google's and Bing's results based on this sample. Jurors were found through crowdsourcing, and data were collected using specialized software, the Relevance Assessment Tool (RAT). We found that although Google outperforms Bing in both query types, the difference in the performance for informational queries was rather low. However, for navigational queries, Google found the correct answer in 95.3% of cases, whereas Bing only found the correct answer 76.6% of the time. We conclude that search engine performance on navigational queries is of great importance, because users in this case can clearly identify queries that have returned correct results. So, performance on this query type may contribute to explaining user satisfaction with search engines.
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.9, S.1763-1775
    Type
    a
  20. Roy, R.S.; Agarwal, S.; Ganguly, N.; Choudhury, M.: Syntactic complexity of Web search queries through the lenses of language models, networks and users (2016) 0.01
    0.005093954 = product of:
      0.012734884 = sum of:
        0.005898632 = weight(_text_:a in 3188) [ClassicSimilarity], result of:
          0.005898632 = score(doc=3188,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.11032722 = fieldWeight in 3188, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3188)
        0.006836252 = product of:
          0.013672504 = sum of:
            0.013672504 = weight(_text_:information in 3188) [ClassicSimilarity], result of:
              0.013672504 = score(doc=3188,freq=6.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.16796975 = fieldWeight in 3188, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3188)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Across the world, millions of users interact with search engines every day to satisfy their information needs. As the Web grows bigger over time, such information needs, manifested through user search queries, also become more complex. However, there has been no systematic study that quantifies the structural complexity of Web search queries. In this research, we make an attempt towards understanding and characterizing the syntactic complexity of search queries using a multi-pronged approach. We use traditional statistical language modeling techniques to quantify and compare the perplexity of queries with natural language (NL). We then use complex network analysis for a comparative analysis of the topological properties of queries issued by real Web users and those generated by statistical models. Finally, we conduct experiments to study whether search engine users are able to identify real queries, when presented along with model-generated ones. The three complementary studies show that the syntactic structure of Web queries is more complex than what n-grams can capture, but simpler than NL. Queries, thus, seem to represent an intermediate stage between syntactic and non-syntactic communication.
    Source
    Information processing and management. 52(2016) no.5, S.923-948
    Type
    a
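    Entry 20 compares the perplexity of search queries and natural language under statistical language models. A minimal add-one-smoothed bigram model with perplexity, over invented toy data (the paper's models and corpora are far larger):

      import math
      from collections import Counter

      def bigram_perplexity(train_sents, test_sent):
          """Perplexity of test_sent under an add-one-smoothed bigram model."""
          unigrams, bigrams, vocab = Counter(), Counter(), set()
          for sent in train_sents:
              tokens = ["<s>"] + sent.split() + ["</s>"]
              vocab.update(tokens)
              unigrams.update(tokens[:-1])
              bigrams.update(zip(tokens, tokens[1:]))
          V = len(vocab)
          tokens = ["<s>"] + test_sent.split() + ["</s>"]
          log_prob = sum(math.log((bigrams[(p, c)] + 1) / (unigrams[p] + V))
                         for p, c in zip(tokens, tokens[1:]))
          return math.exp(-log_prob / (len(tokens) - 1))

      train = ["cheap flights to paris", "cheap hotels in paris", "flights to rome"]
      print(bigram_perplexity(train, "cheap flights to rome"))   # lower = less surprising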

Languages

  • e 72
  • d 47

Types

  • a 106
  • el 22
  • m 7
  • s 3
  • r 2
  • x 1