Search (14 results, page 1 of 1)

Barrio, P.; Gravano, L.: Sampling strategies for information extraction over the deep web (2017) 0.03
```
0.029316615 = product of:
  0.05863323 = sum of:
    0.029262928 = weight(_text_:data in 3412) [ClassicSimilarity], result of:
      0.029262928 = score(doc=3412,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.19762816 = fieldWeight in 3412, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=3412)
    0.029370302 = product of:
      0.058740605 = sum of:
        0.058740605 = weight(_text_:processing in 3412) [ClassicSimilarity], result of:
          0.058740605 = score(doc=3412,freq=6.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.30987173 = fieldWeight in 3412, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.03125 = fieldNorm(doc=3412)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

Information extraction systems discover structured information in natural language text. Having information in structured form enables much richer querying and data mining than possible over the natural language text. However, information extraction is a computationally expensive task, and hence improving the efficiency of the extraction process over large text collections is of critical interest. In this paper, we focus on an especially valuable family of text collections, namely, the so-called deep-web text collections, whose contents are not crawlable and are only available via querying. Important steps for efficient information extraction over deep-web text collections (e.g., selecting the collections on which to focus the extraction effort, based on their contents; or learning which documents within these collections, and in which order, to process, based on their words and phrases) require having a representative document sample from each collection. These document samples have to be collected by querying the deep-web text collections, an expensive process that renders impractical the existing sampling approaches developed for other data scenarios. In this paper, we systematically study the space of query-based document sampling techniques for information extraction over the deep web. Specifically, we consider (i) alternative query execution schedules, which vary on how they account for the query effectiveness, and (ii) alternative document retrieval and processing schedules, which vary on how they distribute the extraction effort over documents. We report the results of the first large-scale experimental evaluation of sampling techniques for information extraction over the deep web. Our results show the merits and limitations of the alternative query execution and document retrieval and processing strategies, and provide a roadmap for addressing this critically important building block for efficient, scalable information extraction.

Source

Information processing and management. 53(2017) no.2, S.309-331
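
The score breakdown for the Barrio and Gravano record above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); these assumptions match the figures shown in the listing. The function and variable names are hypothetical, and queryNorm, fieldNorm, and the coord factors are copied from the listing rather than derived.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed classic formula: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, as in each "weight(...)" node
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 3412
MAX_DOCS = 44218
QUERY_NORM = 0.046827413
FIELD_NORM = 0.03125

data = term_score(4.0, 5088, MAX_DOCS, QUERY_NORM, FIELD_NORM)
processing = term_score(6.0, 2097, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "processing" clause sits in a nested query with coord(1/2);
# the outer query matched 2 of 4 clauses, hence coord(2/4).
total = (data + processing * 0.5) * 0.5
print(f"data={data:.9f} processing={processing:.9f} total={total:.9f}")
```

The results agree with the listing only to about seven significant digits, because the listing reports single-precision values while Python computes in double precision.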
Drabenstott, K.M.: Web search strategies (2000) 0.02
```
0.016690476 = product of:
  0.03338095 = sum of:
    0.020692015 = weight(_text_:data in 1188) [ClassicSimilarity], result of:
      0.020692015 = score(doc=1188,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.1397442 = fieldWeight in 1188, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=1188)
    0.012688936 = product of:
      0.025377871 = sum of:
        0.025377871 = weight(_text_:22 in 1188) [ClassicSimilarity], result of:
          0.025377871 = score(doc=1188,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.15476047 = fieldWeight in 1188, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1188)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

Surfing the World Wide Web used to be cool, dude, real cool. But things have gotten hot - so hot that finding something useful on the Web is no longer cool. It is suffocating Web searchers in the smoke and debris of mountain-sized lists of hits, decisions about which search engines they should use, whether they will get lost in the dizzying maze of a subject directory, use the right syntax for the search engine at hand, enter keywords that are likely to retrieve hits on the topics they have in mind, or enlist a browser that has sufficient functionality to display the most promising hits. When it comes to Web searching, in a few short years we have gone from the cool image of surfing the Web into the frying pan of searching the Web. We can turn down the heat by rethinking what Web searchers are doing and introduce some order into the chaos. Web search strategies that are tool-based (oriented to specific Web searching tools such as search engines, subject directories, and meta search engines) have been widely promoted, and these strategies are just not working. It is time to dissect what Web searching tools expect from searchers and adjust our search strategies to these new tools. This discussion offers Web searchers help in the form of search strategies that are based on strategies that librarians have been using for a long time to search commercial information retrieval systems like Dialog, NEXIS, Wilsonline, FirstSearch, and Data-Star.

Date

22. 9.1997 19:16:05

White, M.D.; Iivonen, M.: Questions as a factor in Web search strategy (2001) 0.01

```
0.014837332 = product of:
  0.05934933 = sum of:
    0.05934933 = product of:
      0.11869866 = sum of:
        0.11869866 = weight(_text_:processing in 333) [ClassicSimilarity], result of:
          0.11869866 = score(doc=333,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.6261658 = fieldWeight in 333, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.109375 = fieldNorm(doc=333)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Source

Information processing and management. 37(2001) no.5, S.721-740

Hsieh-Yee, I.: Research on Web-search behavior (2001) 0.01
```
0.012802532 = product of:
  0.051210128 = sum of:
    0.051210128 = weight(_text_:data in 2277) [ClassicSimilarity], result of:
      0.051210128 = score(doc=2277,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.34584928 = fieldWeight in 2277, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2277)
  0.25 = coord(1/4)
```
Abstract

This article reviews studies, conducted between 1995 and 2000, on Web search behavior. These studies reported on children as well as on adults. Most of the studies on children described their interaction with the Web. Research on adult searchers focused on describing search patterns, and many studies investigated effects of selected factors on search behavior, including information organization and presentation, type of search task, Web experience, cognitive abilities, and affective states. What distinguishes the research on adult searchers is the use of multiple data-gathering methods. The research on Web search behavior reflects researchers' commitment to examine users in their information environment and exhibits rigor in design and data analysis. However, many studies lack external validity. Implications of this body of research are discussed.
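
For a record that matches only one of the four query clauses, such as the Hsieh-Yee (2001) entry above, the score reduces to a single product. The worked equation below is a sketch under the assumption that the classic Lucene formulas apply (tf as the square root of the term frequency, idf as one plus the natural log of maxDocs over docFreq plus one); all numbers are taken from the listing.

```latex
\begin{aligned}
\mathrm{idf} &= 1 + \ln\frac{44218}{5088 + 1} \approx 3.1620505 \\
\mathrm{queryWeight} &= \mathrm{idf} \times \mathrm{queryNorm}
  = 3.1620505 \times 0.046827413 \approx 0.14807065 \\
\mathrm{fieldWeight} &= \sqrt{4} \times \mathrm{idf} \times \mathrm{fieldNorm}
  = 2 \times 3.1620505 \times 0.0546875 \approx 0.34584928 \\
\mathrm{score} &= \mathrm{coord}(1/4) \times \mathrm{queryWeight} \times \mathrm{fieldWeight}
  = 0.25 \times 0.051210128 \approx 0.012802532
\end{aligned}
```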
Spink, A.; Danby, S.; Mallan, K.; Butler, C.: Exploring young children's web searching and technoliteracy (2010) 0.01
```
0.009144665 = product of:
  0.03657866 = sum of:
    0.03657866 = weight(_text_:data in 3623) [ClassicSimilarity], result of:
      0.03657866 = score(doc=3623,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.24703519 = fieldWeight in 3623, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3623)
  0.25 = coord(1/4)
```
Abstract

Purpose - This paper aims to report findings from an exploratory study investigating the web interactions and technoliteracy of children in the early childhood years. Previous research has studied aspects of older children's technoliteracy and web searching; however, few studies have analyzed web search data from children younger than six years of age. Design/methodology/approach - The study explored the Google web searching and technoliteracy of young children who are enrolled in a "preparatory classroom" or kindergarten (the year before young children begin compulsory schooling in Queensland, Australia). Young children were video- and audio-taped while conducting Google web searches in the classroom. The data were qualitatively analysed to understand the young children's web search behaviour. Findings - The findings show that young children engage in complex web searches, including keyword searching and browsing, query formulation and reformulation, relevance judgments, successive searches, information multitasking and collaborative behaviours. The study results provide significant initial insights into young children's web searching and technoliteracy. Practical implications - The use of web search engines by young children is an important research area with implications for educators and web technologies developers. Originality/value - This is the first study of young children's interaction with a web search engine.
Notess, G.R.: Searching the hidden Internet (1997) 0.01
```
0.009052756 = product of:
  0.036211025 = sum of:
    0.036211025 = weight(_text_:data in 4802) [ClassicSimilarity], result of:
      0.036211025 = score(doc=4802,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.24455236 = fieldWeight in 4802, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4802)
  0.25 = coord(1/4)
```
Abstract

WWW search engines are not comprehensive in their searches. They do not search Adobe PDF files or other formatted files, registration files, or data sets. Basic search strategies can give access to some of the hidden content. Two databases are also available to provide access to the hidden information. Excite's News Tracker searches a database of selected online publications. ATI databases from PLS, Inc. presents access to a variety of Internet accessible databases that may require membership or the payment of a registration fee.
Ford, N.; Miller, D.; Moss, N.: Web search strategies and human individual differences : cognitive and demographic factors, Internet attitudes, and approaches (2005) 0.01
```
0.0077595054 = product of:
  0.031038022 = sum of:
    0.031038022 = weight(_text_:data in 3475) [ClassicSimilarity], result of:
      0.031038022 = score(doc=3475,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.2096163 = fieldWeight in 3475, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=3475)
  0.25 = coord(1/4)
```
Abstract

The research reported here was an exploratory study that sought to discover the effects of human individual differences on Web search strategy. These differences consisted of (a) study approaches, (b) cognitive and demographic features, and (c) perceptions of and preferred approaches to Web-based information seeking. Sixty-eight master's students used AltaVista to search for information on three assigned search topics graded in terms of complexity. Five hundred seven search queries were factor analyzed to identify relationships between the individual difference variables and Boolean and best-match search strategies. A number of consistent patterns of relationship were found. As task complexity increased, a number of strategic shifts were also observed on the part of searchers possessing particular combinations of characteristics. A second article (published in this issue of JASIST; Ford, Miller, & Moss, 2005) presents a combined analysis of the data including a series of regression analyses.
Ford, N.; Miller, D.; Moss, N.: Web search strategies and human individual differences : a combined analysis (2005) 0.01
```
0.0077595054 = product of:
  0.031038022 = sum of:
    0.031038022 = weight(_text_:data in 3476) [ClassicSimilarity], result of:
      0.031038022 = score(doc=3476,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.2096163 = fieldWeight in 3476, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=3476)
  0.25 = coord(1/4)
```
Abstract

This is the second of two articles published in this issue of JASIST reporting the results of a study investigating relationships between Web search strategies and a range of human individual differences. In this article we provide a combined analysis of the factor analyses previously presented separately in relation to each of three groups of human individual difference (study approaches, cognitive and demographic features, and perceptions of and approaches to Internet-based information seeking). It also introduces two series of regression analyses conducted on data spanning all three individual difference groups. The results are discussed in terms of the extent to which they satisfy the original aim of this exploratory research, namely to identify any relationships between search strategy and individual difference variables for which there is a prima facie case for more focused systematic study. It is argued that a number of such relationships do exist. The results of the project are summarized and suggestions are made for further research.
Lucas, W.; Topi, H.: Form and function : the impact of query term and operator usage on Web search results (2002) 0.01
```
0.006466255 = product of:
  0.02586502 = sum of:
    0.02586502 = weight(_text_:data in 198) [ClassicSimilarity], result of:
      0.02586502 = score(doc=198,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 198, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=198)
  0.25 = coord(1/4)
```
Abstract

Conventional wisdom holds that queries to information retrieval systems will yield more relevant results if they contain multiple topic-related terms and use Boolean and phrase operators to enhance interpretation. Although studies have shown that the users of Web-based search engines typically enter short, term-based queries and rarely use search operators, little information exists concerning the effects of term and operator usage on the relevancy of search results. In this study, search engine users formulated queries on eight search topics. Each query was submitted to the user-specified search engine, and relevancy ratings for the retrieved pages were assigned. Expert-formulated queries were also submitted and provided a basis for comparing relevancy ratings across search engines. Data analysis based on our research model of the term and operator factors affecting relevancy was then conducted. The results show that the difference in the number of terms between expert and nonexpert searches, the percentage of matching terms between those searches, and the erroneous use of nonsupported operators in nonexpert searches explain most of the variation in the relevancy of search results. These findings highlight the need for designing search engine interfaces that provide greater support in the areas of term selection and operator usage
Jansen, B.J.; Resnick, M.: An examination of searcher's perceptions of nonsponsored and sponsored links during ecommerce Web searching (2006) 0.01
```
0.006466255 = product of:
  0.02586502 = sum of:
    0.02586502 = weight(_text_:data in 221) [ClassicSimilarity], result of:
      0.02586502 = score(doc=221,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 221, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=221)
  0.25 = coord(1/4)
```
Abstract

In this article, we report results of an investigation into the effect of sponsored links on ecommerce information seeking on the Web. In this research, 56 participants each engaged in six ecommerce Web searching tasks. We extracted these tasks from the transaction log of a Web search engine, so they represent actual ecommerce searching information needs. Using 60 organic and 30 sponsored Web links, the quality of the Web search engine results was controlled by switching nonsponsored and sponsored links on half of the tasks for each participant. This allowed for investigating the bias toward sponsored links while controlling for quality of content. The study also investigated the relationship between searching self-efficacy, searching experience, types of ecommerce information needs, and the order of links on the viewing of sponsored links. Data included 2,453 interactions with links from result pages and 961 utterances evaluating these links. The results of the study indicate that there is a strong preference for nonsponsored links, with searchers viewing these results first more than 82% of the time. Searching self-efficacy and experience does not increase the likelihood of viewing sponsored links, and the order of the result listing does not appear to affect searcher evaluation of sponsored links. The implications for sponsored links as a long-term business model are discussed.
Mansourian, I.: Web search efficacy : definition and implementation (2008) 0.01
```
0.006466255 = product of:
  0.02586502 = sum of:
    0.02586502 = weight(_text_:data in 2565) [ClassicSimilarity], result of:
      0.02586502 = score(doc=2565,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.17468026 = fieldWeight in 2565, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2565)
  0.25 = coord(1/4)
```
Abstract

Purpose - This paper aims to report a number of factors that are perceived by web users as influential elements in their search procedure. The paper introduces a new conceptual measure called "web search efficacy" (hereafter WSE) to evaluate the performance of searches mainly based on users' perceptions. Design/methodology/approach - A rich dataset of a wider study was inductively re-explored to identify different categories that are perceived influential by web users on the final outcome of their searches. A selective review of the literature was carried out to discover to what extent previous research supports the findings of the current study. Findings - The analysis of the dataset led to the identification of five categories of influential factors. Within each group different factors have been recognized. Accordingly, the concept of WSE has been introduced. The five "Ss" which determine WSE are searcher's performance, search tool's performance, search strategy, search topic, and search situation. Research limitations/implications - The research body is scattered in different areas and it is difficult to carry out a comprehensive review. The WSE table, which is derived from the empirical data and was supported by previous research, can be employed for further research in various groups of web users. Originality/value - The paper contributes to the area of information seeking on the web by providing researchers with a new conceptual framework to evaluate the efficiency of each search session and identify the underlying factors on the final outcome of web searching.

Sanchiz, M.; Chin, J.; Chevalier, A.; Fu, W.T.; Amadieu, F.; He, J.: Searching for information on the web : impact of cognitive aging, prior domain knowledge and complexity of the search problems (2017) 0.01

```
0.0063588563 = product of:
  0.025435425 = sum of:
    0.025435425 = product of:
      0.05087085 = sum of:
        0.05087085 = weight(_text_:processing in 3294) [ClassicSimilarity], result of:
          0.05087085 = score(doc=3294,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.26835677 = fieldWeight in 3294, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=3294)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Source

Information processing and management. 53(2017) no.1, S.281-294

Hupfer, M.E.; Detlor, B.: Gender and Web information seeking : a self-concept orientation model (2006) 0.01
```
0.005299047 = product of:
  0.021196188 = sum of:
    0.021196188 = product of:
      0.042392377 = sum of:
        0.042392377 = weight(_text_:processing in 5119) [ClassicSimilarity], result of:
          0.042392377 = score(doc=5119,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.22363065 = fieldWeight in 5119, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5119)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

Adapting the consumer behavior selectivity model to the Web environment, this paper's key contribution is the introduction of a self-concept orientation model of Web information seeking. This model, which addresses gender, effort, and information content factors, questions the commonly assumed equivalence of sex and gender by specifying the measurement of gender-related self-concept traits known as self- and other-orientation. Regression analyses identified associations between self-orientation, other-orientation, and self-reported search frequencies for content with identical subject domain (e.g., medical information, government information) and differing relevance (i.e., important to the individual personally versus important to someone close to him or her). Self- and other-orientation interacted such that when individuals were highly self-oriented, their frequency of search for both self- and other-relevant information depended on their level of other-orientation. Specifically, high-self/high-other individuals, with a comprehensive processing strategy, searched most often, whereas high-self/low-other respondents, with an effort minimization strategy, reported the lowest search frequencies. This interaction pattern was even more pronounced for other-relevant information seeking. We found no sex differences in search frequency for either self-relevant or other-relevant information.

Hsieh-Yee, I.: Search tactics of Web users in searching for texts, graphics, known items and subjects : a search simulation study (1998) 0.00

```
0.0047583506 = product of:
  0.019033402 = sum of:
    0.019033402 = product of:
      0.038066804 = sum of:
        0.038066804 = weight(_text_:22 in 2404) [ClassicSimilarity], result of:
          0.038066804 = score(doc=2404,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.23214069 = fieldWeight in 2404, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2404)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Date

25.12.1998 19:22:31
