Search (1374 results, page 1 of 69)

  • Active filter: type_ss:"a"
  • Active filter: year_i:[2000 TO 2010}
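  The two filters above use Lucene/Solr field-query syntax; the closing brace in year_i:[2000 TO 2010} marks an exclusive upper bound, so the range covers publication years 2000 through 2009. As a rough sketch only (the core name and parameter layout below are assumptions, not this database's actual interface), an equivalent filtered request against a Solr-style search API could be built like this:

    import urllib.parse

    # Hypothetical Solr-style request reproducing the two active filters.
    # The core name in the printed path is a placeholder, not the real endpoint.
    params = {
        "q": "*:*",                       # the free-text query would go here
        "fq": ['type_ss:"a"',             # facet filter: document type "a"
               "year_i:[2000 TO 2010}"],  # inclusive lower, exclusive upper bound
        "rows": 20,                       # 20 hits per page, as in this listing
        "start": 0,                       # offset 0 = page 1 of 69
    }
    print("/solr/example-core/select?" + urllib.parse.urlencode(params, doseq=True))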
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.23
    0.22707656 = product of:
      0.30276874 = sum of:
        0.071140714 = product of:
          0.21342213 = sum of:
            0.21342213 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.21342213 = score(doc=562,freq=2.0), product of:
                0.3797425 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04479146 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.21342213 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.21342213 = score(doc=562,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.018205874 = product of:
          0.036411747 = sum of:
            0.036411747 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.036411747 = score(doc=562,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
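    The indented breakdown attached to each hit is Lucene "explain" output from the ClassicSimilarity (TF-IDF) scorer. As a minimal sketch, assuming the standard ClassicSimilarity formulas and taking docFreq, maxDocs, freq, fieldNorm and the reported queryNorm from the first breakdown above (term "3a" in doc 562), the listed weights can be reproduced as follows:

      import math

      # Minimal sketch of Lucene ClassicSimilarity, using the figures from the
      # breakdown above; queryNorm is copied as reported rather than derived.
      def idf(doc_freq, max_docs):
          return 1.0 + math.log(max_docs / (doc_freq + 1))   # ~8.478011

      def tf(freq):
          return math.sqrt(freq)                             # ~1.4142135 for freq=2

      doc_freq, max_docs = 24, 44218
      freq, field_norm, query_norm = 2.0, 0.046875, 0.04479146

      query_weight = idf(doc_freq, max_docs) * query_norm               # ~0.3797425
      field_weight = tf(freq) * idf(doc_freq, max_docs) * field_norm    # ~0.56201804
      print(query_weight * field_weight)                                # ~0.21342213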
  2. Schrodt, R.: Tiefen und Untiefen im wissenschaftlichen Sprachgebrauch (2008) 0.19
    0.18970858 = product of:
      0.37941715 = sum of:
        0.09485429 = product of:
          0.28456286 = sum of:
            0.28456286 = weight(_text_:3a in 140) [ClassicSimilarity], result of:
              0.28456286 = score(doc=140,freq=2.0), product of:
                0.3797425 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04479146 = queryNorm
                0.7493574 = fieldWeight in 140, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0625 = fieldNorm(doc=140)
          0.33333334 = coord(1/3)
        0.28456286 = weight(_text_:2f in 140) [ClassicSimilarity], result of:
          0.28456286 = score(doc=140,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.7493574 = fieldWeight in 140, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0625 = fieldNorm(doc=140)
      0.5 = coord(2/4)
    
    Content
    Cf. also: https://studylibde.com/doc/13053640/richard-schrodt. Cf. also: http://www.univie.ac.at/Germanistik/schrodt/vorlesung/wissenschaftssprache.doc.
  3. Vetere, G.; Lenzerini, M.: Models for semantic interoperability in service-oriented architectures (2005) 0.17
    0.165995 = product of:
      0.33199 = sum of:
        0.0829975 = product of:
          0.2489925 = sum of:
            0.2489925 = weight(_text_:3a in 306) [ClassicSimilarity], result of:
              0.2489925 = score(doc=306,freq=2.0), product of:
                0.3797425 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04479146 = queryNorm
                0.65568775 = fieldWeight in 306, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=306)
          0.33333334 = coord(1/3)
        0.2489925 = weight(_text_:2f in 306) [ClassicSimilarity], result of:
          0.2489925 = score(doc=306,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.65568775 = fieldWeight in 306, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0546875 = fieldNorm(doc=306)
      0.5 = coord(2/4)
    
    Content
    Cf.: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5386707&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5386707.
  4. Mas, S.; Marleau, Y.: Proposition of a faceted classification model to support corporate information organization and digital records management (2009) 0.14
    0.14228143 = product of:
      0.28456286 = sum of:
        0.071140714 = product of:
          0.21342213 = sum of:
            0.21342213 = weight(_text_:3a in 2918) [ClassicSimilarity], result of:
              0.21342213 = score(doc=2918,freq=2.0), product of:
                0.3797425 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04479146 = queryNorm
                0.56201804 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2918)
          0.33333334 = coord(1/3)
        0.21342213 = weight(_text_:2f in 2918) [ClassicSimilarity], result of:
          0.21342213 = score(doc=2918,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.56201804 = fieldWeight in 2918, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=2918)
      0.5 = coord(2/4)
    
    Footnote
    Cf.: http://ieeexplore.ieee.org/Xplore/login.jsp?reload=true&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F4755313%2F4755314%2F04755480.pdf%3Farnumber%3D4755480&authDecision=-203.
  5. Donsbach, W.: Wahrheit in den Medien : über den Sinn eines methodischen Objektivitätsbegriffes (2001) 0.12
    0.118567854 = product of:
      0.23713571 = sum of:
        0.059283927 = product of:
          0.17785178 = sum of:
            0.17785178 = weight(_text_:3a in 5895) [ClassicSimilarity], result of:
              0.17785178 = score(doc=5895,freq=2.0), product of:
                0.3797425 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04479146 = queryNorm
                0.46834838 = fieldWeight in 5895, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5895)
          0.33333334 = coord(1/3)
        0.17785178 = weight(_text_:2f in 5895) [ClassicSimilarity], result of:
          0.17785178 = score(doc=5895,freq=2.0), product of:
            0.3797425 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04479146 = queryNorm
            0.46834838 = fieldWeight in 5895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5895)
      0.5 = coord(2/4)
    
    Source
    Politische Meinung. 381(2001) Nr.1, S.65-74 [https://www.dgfe.de/fileadmin/OrdnerRedakteure/Sektionen/Sek02_AEW/KWF/Publikationen_Reihe_1989-2003/Band_17/Bd_17_1994_355-406_A.pdf]
  6. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.11
    0.11066668 = product of:
      0.22133335 = sum of:
        0.17885299 = weight(_text_:engines in 3445) [ClassicSimilarity], result of:
          0.17885299 = score(doc=3445,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.7858995 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.042480372 = product of:
          0.084960744 = sum of:
            0.084960744 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.084960744 = score(doc=3445,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    25. 8.2005 17:42:22
  7. Su, L.T.: ¬A comprehensive and systematic model of user evaluation of Web search engines : II. An evaluation by undergraduates (2003) 0.11
    0.10858273 = product of:
      0.21716546 = sum of:
        0.2019939 = weight(_text_:engines in 2117) [ClassicSimilarity], result of:
          0.2019939 = score(doc=2117,freq=20.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.88758314 = fieldWeight in 2117, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2117)
        0.015171562 = product of:
          0.030343125 = sum of:
            0.030343125 = weight(_text_:22 in 2117) [ClassicSimilarity], result of:
              0.030343125 = score(doc=2117,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.19345059 = fieldWeight in 2117, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2117)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This paper presents an application of the model described in Part I to the evaluation of Web search engines by undergraduates. The study observed how 36 undergraduates used four major search engines to find information for their own individual problems and how they evaluated these engines based on actual interaction with the search engines. User evaluation was based on 16 performance measures representing five evaluation criteria: relevance, efficiency, utility, user satisfaction, and connectivity. Non-performance (user-related) measures were also applied. Each participant searched his/her own topic on all four engines and provided satisfaction ratings for system features and interaction and reasons for satisfaction. Each also made relevance judgements of retrieved items in relation to his/her own information need and participated in post-search interviews to provide reactions to the search results and overall performance. The study found significant differences in precision (PR1), relative recall, user satisfaction with output display, time saving, value of search results, and overall performance among the four engines, and also significant engine-by-discipline interactions on all these measures. In addition, the study found significant differences in user satisfaction with response time among the four engines, and a significant engine-by-discipline interaction in user satisfaction with the search interface. None of the four search engines dominated in every aspect of the multidimensional evaluation. Content analysis of verbal data identified a number of user criteria and users' evaluative comments based on these criteria. Results from both the quantitative analysis and the content analysis provide insight for system design and development, and useful feedback on the strengths and weaknesses of search engines for system improvement.
    Date
    24. 1.2004 18:27:22
  8. Fan, W.; Gordon, M.D.; Pathak, P.: ¬A generic ranking function discovery framework by genetic programming for information retrieval (2004) 0.10
    0.09734263 = product of:
      0.19468527 = sum of:
        0.089426495 = weight(_text_:engines in 2554) [ClassicSimilarity], result of:
          0.089426495 = score(doc=2554,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39294976 = fieldWeight in 2554, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2554)
        0.10525877 = product of:
          0.21051754 = sum of:
            0.21051754 = weight(_text_:programming in 2554) [ClassicSimilarity], result of:
              0.21051754 = score(doc=2554,freq=4.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.7169776 = fieldWeight in 2554, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2554)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Ranking functions play a substantial role in the performance of information retrieval (IR) systems and search engines. Although there are many ranking functions available in the IR literature, various empirical evaluation studies show that ranking functions do not perform consistently well across different contexts (queries, collections, users). Moreover, it is often difficult and very expensive for human beings to design optimal ranking functions that work well in all these contexts. In this paper, we propose a novel ranking function discovery framework based on Genetic Programming and show through various experiments how this new framework helps automate the ranking function design/discovery process.
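    As a toy illustration of the idea summarized above, and not the authors' actual framework, a genetic-programming setup can represent each candidate ranking function as an expression tree over per-document term features and score it with a retrieval-based fitness; a full GP run would then evolve the population through selection, crossover, and mutation rather than the one-shot random sampling shown here:

      import random

      # Toy sketch: candidate ranking functions as expression trees over
      # per-document features, fitness = precision at rank 5 on judged docs.
      FEATURES = ["tf", "idf"]
      OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

      def random_tree(depth=2):
          if depth == 0 or random.random() < 0.3:
              return random.choice(FEATURES)
          return (random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1))

      def evaluate(tree, feats):
          if isinstance(tree, str):
              return feats[tree]
          op, left, right = tree
          return OPS[op](evaluate(left, feats), evaluate(right, feats))

      def fitness(tree, docs, relevant, k=5):
          ranked = sorted(docs, key=lambda d: evaluate(tree, docs[d]), reverse=True)
          return sum(d in relevant for d in ranked[:k]) / k

      # Tiny synthetic collection of aggregated per-document term features.
      docs = {f"d{i}": {"tf": random.random(), "idf": random.random()} for i in range(20)}
      relevant = {"d0", "d1", "d2", "d3", "d4"}
      population = [random_tree() for _ in range(50)]
      best = max(population, key=lambda t: fitness(t, docs, relevant))
      print(best, fitness(best, docs, relevant))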
  9. Carroll, N.: Search engine optimization (2009) 0.09
    0.09363182 = product of:
      0.18726364 = sum of:
        0.102201715 = weight(_text_:engines in 3874) [ClassicSimilarity], result of:
          0.102201715 = score(doc=3874,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 3874, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=3874)
        0.08506193 = product of:
          0.17012386 = sum of:
            0.17012386 = weight(_text_:programming in 3874) [ClassicSimilarity], result of:
              0.17012386 = score(doc=3874,freq=2.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.57940537 = fieldWeight in 3874, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3874)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Search engine optimization (SEO) is the craft of elevating Web sites or individual Web site pages to higher rankings on search engines through programming, marketing, or content acumen. This section covers the origins of SEO, strategies and tactics, history and trends, and the evolution of user behavior in online searching.
  10. Thelwall, M.: Quantitative comparisons of search engine results (2008) 0.09
    0.09045792 = product of:
      0.18091585 = sum of:
        0.12775214 = weight(_text_:engines in 2350) [ClassicSimilarity], result of:
          0.12775214 = score(doc=2350,freq=8.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.5613568 = fieldWeight in 2350, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2350)
        0.053163704 = product of:
          0.10632741 = sum of:
            0.10632741 = weight(_text_:programming in 2350) [ClassicSimilarity], result of:
              0.10632741 = score(doc=2350,freq=2.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.36212835 = fieldWeight in 2350, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2350)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Search engines are normally used to find information or Web sites, but Webometric investigations use them for quantitative data such as the number of pages matching a query and the international spread of those pages. For this type of application, the accuracy of the hit count estimates and range of URLs in the full results are important. Here, we compare the applications programming interfaces of Google, Yahoo!, and Live Search for 1,587 single word searches. The hit count estimates were broadly consistent, but with Yahoo! and Google reporting 5-6 times more hits than Live Search. Yahoo! tended to return slightly more matching URLs than Google, with Live Search returning significantly fewer. Yahoo!'s result URLs included a significantly wider range of domains and sites than the other two, and there was little consistency between the three engines in the number of different domains. In contrast, the three engines were reasonably consistent in the number of different top-level domains represented in the result URLs, although Yahoo! tended to return the most. In conclusion, quantitative results from the three search engines are mostly consistent but with unexpected types of inconsistency that users should be aware of. Google is recommended for hit count estimates but Yahoo! is recommended for all other Webometric purposes.
  11. Morrison, P.J.: Tagging and searching : search retrieval effectiveness of folksonomies on the World Wide Web (2008) 0.08
    0.0754849 = product of:
      0.1509698 = sum of:
        0.13276392 = weight(_text_:engines in 2109) [ClassicSimilarity], result of:
          0.13276392 = score(doc=2109,freq=6.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.58337915 = fieldWeight in 2109, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=2109)
        0.018205874 = product of:
          0.036411747 = sum of:
            0.036411747 = weight(_text_:22 in 2109) [ClassicSimilarity], result of:
              0.036411747 = score(doc=2109,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.23214069 = fieldWeight in 2109, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2109)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen little empirical research. This study compared the search information retrieval (IR) performance of folksonomies from social bookmarking Web sites against search engines and subject directories. Thirty-four participants created 103 queries for various information needs. Results from each IR system were collected and participants judged relevance. Folksonomy search results overlapped with those from the other systems, and documents found by both search engines and folksonomies were significantly more likely to be judged relevant than those returned by any single IR system type. The search engines in the study had the highest precision and recall, but the folksonomies fared surprisingly well. Del.icio.us was statistically indistinguishable from the directories in many cases. Overall the directories were more precise than the folksonomies but they had similar recall scores. Better query handling may enhance folksonomy IR performance further. The folksonomies studied were promising, and may be able to improve Web search performance.
    Date
    1. 8.2008 12:39:22
  12. Hock, R.: Search engines (2009) 0.07
    0.07069787 = product of:
      0.28279147 = sum of:
        0.28279147 = weight(_text_:engines in 3876) [ClassicSimilarity], result of:
          0.28279147 = score(doc=3876,freq=20.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            1.2426164 = fieldWeight in 3876, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3876)
      0.25 = coord(1/4)
    
    Abstract
    This entry provides an overview of Web search engines, looking at the definition, components, leading engines, searching capabilities, and types of engines. It examines the components that make up a search engine and briefly discusses the process involved in identifying content for the engines' databases and the indexing of that content. Typical search options are reviewed and the major Web search engines are identified and described. Also identified and described are various specialty search engines, such as those for special content such as video and images, and engines that take significantly different approaches to the search problem, such as visualization engines and metasearch engines.
  13. Zillmann, H.: OSIRIS und eLib : Information Retrieval und Search Engines in Full-text Databases (2001) 0.06
    0.06323811 = product of:
      0.12647621 = sum of:
        0.102201715 = weight(_text_:engines in 5937) [ClassicSimilarity], result of:
          0.102201715 = score(doc=5937,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 5937, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=5937)
        0.024274498 = product of:
          0.048548996 = sum of:
            0.048548996 = weight(_text_:22 in 5937) [ClassicSimilarity], result of:
              0.048548996 = score(doc=5937,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.30952093 = fieldWeight in 5937, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5937)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    14. 6.2001 12:22:31
  14. McIlwaine, I.C.: Trends in knowledge organization research (2003) 0.06
    0.06323811 = product of:
      0.12647621 = sum of:
        0.102201715 = weight(_text_:engines in 2289) [ClassicSimilarity], result of:
          0.102201715 = score(doc=2289,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 2289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0625 = fieldNorm(doc=2289)
        0.024274498 = product of:
          0.048548996 = sum of:
            0.048548996 = weight(_text_:22 in 2289) [ClassicSimilarity], result of:
              0.048548996 = score(doc=2289,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.30952093 = fieldWeight in 2289, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2289)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This paper looks at current trends in knowledge organization research, concentrating on universal systems, mapping vocabularies and interoperability concerns, problems of bias, the Internet and search engines, resource discovery, thesauri and visual presentation. Some problems facing researchers at the present time are discussed. It is accompanied by a bibliography of recent work in the field.
    Date
    10. 6.2004 19:22:56
  15. Cohen, D.J.: From Babel to knowledge : data mining large digital collections (2006) 0.06
    0.057399243 = product of:
      0.114798486 = sum of:
        0.072267525 = weight(_text_:engines in 1178) [ClassicSimilarity], result of:
          0.072267525 = score(doc=1178,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.31755137 = fieldWeight in 1178, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=1178)
        0.042530965 = product of:
          0.08506193 = sum of:
            0.08506193 = weight(_text_:programming in 1178) [ClassicSimilarity], result of:
              0.08506193 = score(doc=1178,freq=2.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.28970268 = fieldWeight in 1178, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1178)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In Jorge Luis Borges's curious short story The Library of Babel, the narrator describes an endless collection of books stored from floor to ceiling in a labyrinth of countless hexagonal rooms. The pages of the library's books seem to contain random sequences of letters and spaces; occasionally a few intelligible words emerge in the sea of paper and ink. Nevertheless, readers diligently, and exasperatingly, scan the shelves for coherent passages. The narrator himself has wandered numerous rooms in search of enlightenment, but with resignation he simply awaits his death and burial - which Borges explains (with signature dark humor) consists of being tossed unceremoniously over the library's banister. Borges's nightmare, of course, is a cursed vision of the research methods of disciplines such as literature, history, and philosophy, where the careful reading of books, one after the other, is supposed to lead inexorably to knowledge and understanding. Computer scientists would approach Borges's library far differently. Employing the information theory that forms the basis for search engines and other computerized techniques for assessing in one fell swoop large masses of documents, they would quickly realize the collection's incoherence through sampling and statistical methods - and wisely start looking for the library's exit. These computational methods, which allow us to find patterns, determine relationships, categorize documents, and extract information from massive corpuses, will form the basis for new tools for research in the humanities and other disciplines in the coming decade. For the past three years I have been experimenting with how to provide such end-user tools - that is, tools that harness the power of vast electronic collections while hiding much of their complicated technical plumbing. In particular, I have made extensive use of the application programming interfaces (APIs) the leading search engines provide for programmers to query their databases directly (from server to server without using their web interfaces). In addition, I have explored how one might extract information from large digital collections, from the well-curated lexicographic database WordNet to the democratic (and poorly curated) online reference work Wikipedia. While processing these digital corpuses is currently an imperfect science, even now useful tools can be created by combining various collections and methods for searching and analyzing them. And more importantly, these nascent services suggest a future in which information can be gleaned from, and sense can be made out of, even imperfect digital libraries of enormous scale. A brief examination of two approaches to data mining large digital collections hints at this future, while also providing some lessons about how to get there.
  16. Khoo, C.S.G.; Ng, K.; Ou, S.: ¬An exploratory study of human clustering of Web pages (2003) 0.06
    0.057169482 = product of:
      0.114338964 = sum of:
        0.102201715 = weight(_text_:engines in 2741) [ClassicSimilarity], result of:
          0.102201715 = score(doc=2741,freq=8.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.44908544 = fieldWeight in 2741, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
        0.012137249 = product of:
          0.024274498 = sum of:
            0.024274498 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
              0.024274498 = score(doc=2741,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.15476047 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This study seeks to find out how human beings cluster Web pages naturally. Twenty Web pages retrieved by the Northern Light search engine for each of 10 queries were sorted by 3 subjects into categories that were natural or meaningful to them. It was found that different subjects clustered the same set of Web pages quite differently and created different categories. The average inter-subject similarity of the clusters created was a low 0.27. Subjects created an average of 5.4 clusters for each sorting. The categories constructed can be divided into 10 types. About 1/3 of the categories created were topical. Another 20% of the categories relate to the degree of relevance or usefulness. The rest of the categories were subject-independent categories such as format, purpose, authoritativeness and direction to other sources. The authors plan to develop automatic methods for categorizing Web pages using the common categories created by the subjects. It is hoped that the techniques developed can be used by Web search engines to automatically organize Web pages retrieved into categories that are natural to users. 1. Introduction The World Wide Web is an increasingly important source of information for people globally because of its ease of access, the ease of publishing, its ability to transcend geographic and national boundaries, its flexibility and heterogeneity and its dynamic nature. However, Web users also find it increasingly difficult to locate relevant and useful information in this vast information storehouse. Web search engines, despite their scope and power, appear to be quite ineffective. They retrieve too many pages, and though they attempt to rank retrieved pages in order of probable relevance, often the relevant documents do not appear in the top-ranked 10 or 20 documents displayed. Several studies have found that users do not know how to use the advanced features of Web search engines, and do not know how to formulate and re-formulate queries. Users also typically exert minimal effort in performing, evaluating and refining their searches, and are unwilling to scan more than 10 or 20 items retrieved (Jansen, Spink, Bateman & Saracevic, 1998). This suggests that the conventional ranked-list display of search results does not satisfy user requirements, and that better ways of presenting and summarizing search results have to be developed. One promising approach is to group retrieved pages into clusters or categories to allow users to navigate immediately to the "promising" clusters where the most useful Web pages are likely to be located. This approach has been adopted by a number of search engines (notably Northern Light) and search agents.
    Date
    12. 9.2004 9:56:22
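    The inter-subject similarity of 0.27 reported in the abstract above is an agreement score between two subjects' clusterings of the same pages. The exact measure is not stated here, so the sketch below uses one plausible choice purely for illustration: the Jaccard coefficient over co-clustered page pairs.

      from itertools import combinations

      # Sketch: agreement between two clusterings as Jaccard over co-clustered pairs.
      def co_clustered_pairs(clustering):
          pairs = set()
          for cluster in clustering:                 # clustering: list of sets of page ids
              pairs.update(frozenset(p) for p in combinations(sorted(cluster), 2))
          return pairs

      def clustering_similarity(a, b):
          pa, pb = co_clustered_pairs(a), co_clustered_pairs(b)
          return len(pa & pb) / len(pa | pb) if (pa or pb) else 1.0

      subject1 = [{"p1", "p2", "p3"}, {"p4", "p5"}, {"p6"}]
      subject2 = [{"p1", "p2"}, {"p3", "p4", "p5"}, {"p6"}]
      print(round(clustering_similarity(subject1, subject2), 2))   # 0.33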
  17. Rose, D.E.: Reconciling information-seeking behavior with search user interfaces for the Web (2006) 0.06
    0.05533334 = product of:
      0.11066668 = sum of:
        0.089426495 = weight(_text_:engines in 5296) [ClassicSimilarity], result of:
          0.089426495 = score(doc=5296,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39294976 = fieldWeight in 5296, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5296)
        0.021240186 = product of:
          0.042480372 = sum of:
            0.042480372 = weight(_text_:22 in 5296) [ClassicSimilarity], result of:
              0.042480372 = score(doc=5296,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.2708308 = fieldWeight in 5296, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5296)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    User interfaces of Web search engines reflect attributes of the underlying tools used to create them, rather than what we know about how people look for information. In this article, the author examines several characteristics of user search behavior: the variety of information-seeking goals, the cultural and situational context of search, and the iterative nature of the search task. An analysis of these characteristics suggests ways that interfaces can be redesigned to make searching more effective for users.
    Date
    22. 7.2006 17:58:06
  18. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.05
    0.054213837 = product of:
      0.21685535 = sum of:
        0.21685535 = sum of:
          0.1804436 = weight(_text_:programming in 2239) [ClassicSimilarity], result of:
            0.1804436 = score(doc=2239,freq=4.0), product of:
              0.29361802 = queryWeight, product of:
                6.5552235 = idf(docFreq=170, maxDocs=44218)
                0.04479146 = queryNorm
              0.6145522 = fieldWeight in 2239, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.5552235 = idf(docFreq=170, maxDocs=44218)
                0.046875 = fieldNorm(doc=2239)
          0.036411747 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
            0.036411747 = score(doc=2239,freq=2.0), product of:
              0.15685207 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04479146 = queryNorm
              0.23214069 = fieldWeight in 2239, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2239)
      0.25 = coord(1/4)
    
    Abstract
    Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR task, the discovery of ranking functions for Web search, and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is well known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs on GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations on the design of fitness functions for genetic-based information retrieval experiments.
    Date
    31. 5.2004 19:22:06
  19. Hsu, C.-N.; Chang, C.-H.; Hsieh, C.-H.; Lu, J.-J.; Chang, C.-C.: Reconfigurable Web wrapper agents for biological information integration (2005) 0.05
    0.053626906 = product of:
      0.21450762 = sum of:
        0.21450762 = sum of:
          0.1841645 = weight(_text_:programming in 5263) [ClassicSimilarity], result of:
            0.1841645 = score(doc=5263,freq=6.0), product of:
              0.29361802 = queryWeight, product of:
                6.5552235 = idf(docFreq=170, maxDocs=44218)
                0.04479146 = queryNorm
              0.62722474 = fieldWeight in 5263, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.5552235 = idf(docFreq=170, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5263)
          0.030343125 = weight(_text_:22 in 5263) [ClassicSimilarity], result of:
            0.030343125 = score(doc=5263,freq=2.0), product of:
              0.15685207 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04479146 = queryNorm
              0.19345059 = fieldWeight in 5263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5263)
      0.25 = coord(1/4)
    
    Abstract
    A variety of biological data is transferred and exchanged in overwhelming volumes on the World Wide Web. How to rapidly capture, utilize, and integrate the information on the Internet to discover valuable biological knowledge is one of the most critical issues in bioinformatics. Many information integration systems have been proposed for integrating biological data. These systems usually rely on an intermediate software layer called wrappers to access connected information sources. Wrapper construction for Web data sources is often specially hand coded to accommodate the differences between each Web site. However, programming a Web wrapper requires substantial programming skill, and is time-consuming and hard to maintain. In this article we provide a solution for rapidly building software agents that can serve as Web wrappers for biological information integration. We define an XML-based language called Web Navigation Description Language (WNDL), to model a Web-browsing session. A WNDL script describes how to locate the data, extract the data, and combine the data. By executing different WNDL scripts, we can automate virtually all types of Web-browsing sessions. We also describe IEPAD (Information Extraction Based on Pattern Discovery), a data extractor based on pattern discovery techniques. IEPAD allows our software agents to automatically discover the extraction rules to extract the contents of a structurally formatted Web page. With a programming-by-example authoring tool, a user can generate a complete Web wrapper agent by browsing the target Web sites. We built a variety of biological applications to demonstrate the feasibility of our approach.
    Date
    22. 7.2006 14:36:42
  20. Chau, M.; Lu, Y.; Fang, X.; Yang, C.C.: Characteristics of character usage in Chinese Web searching (2009) 0.05
    0.052752987 = product of:
      0.10550597 = sum of:
        0.09033441 = weight(_text_:engines in 2456) [ClassicSimilarity], result of:
          0.09033441 = score(doc=2456,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 2456, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2456)
        0.015171562 = product of:
          0.030343125 = sum of:
            0.030343125 = weight(_text_:22 in 2456) [ClassicSimilarity], result of:
              0.030343125 = score(doc=2456,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.19345059 = fieldWeight in 2456, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2456)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The use of non-English Web search engines has been prevalent. Given the popularity of Chinese Web searching and the unique characteristics of Chinese language, it is imperative to conduct studies with focuses on the analysis of Chinese Web search queries. In this paper, we report our research on the character usage of Chinese search logs from a Web search engine in Hong Kong. By examining the distribution of search query terms, we found that users tended to use more diversified terms and that the usage of characters in search queries was quite different from the character usage of general online information in Chinese. After studying the Zipf distribution of n-grams with different values of n, we found that the curve of unigram is the most curved one of all while the bigram curve follows the Zipf distribution best, and that the curves of n-grams with larger n (n = 3-6) had similar structures with ?-values in the range of 0.66-0.86. The distribution of combined n-grams was also studied. All the analyses are performed on the data both before and after the removal of function terms and incomplete terms and similar findings are revealed. We believe the findings from this study have provided some insights into further research in non-English Web searching and will assist in the design of more effective Chinese Web search engines.
    Date
    22.11.2008 17:57:22
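    As a small sketch of the kind of analysis described in the abstract above (toy query data, whitespace stripped, and a least-squares slope on a log-log scale standing in for the paper's exponent fitting), character n-gram distributions and their Zipf-like behaviour can be examined like this:

      import math
      from collections import Counter

      # Sketch: character n-gram counts from queries and the slope of
      # log(frequency) against log(rank) as a rough Zipf exponent estimate.
      def ngram_counts(queries, n):
          counts = Counter()
          for q in queries:
              q = q.replace(" ", "")
              counts.update(q[i:i + n] for i in range(len(q) - n + 1))
          return counts

      def zipf_slope(counts):
          freqs = sorted(counts.values(), reverse=True)
          xs = [math.log(r) for r in range(1, len(freqs) + 1)]
          ys = [math.log(f) for f in freqs]
          mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
          num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          den = sum((x - mx) ** 2 for x in xs)
          return num / den

      queries = ["search engine", "chinese web search", "web search engine", "engine"]
      for n in (1, 2, 3):
          counts = ngram_counts(queries, n)
          print(n, len(counts), round(zipf_slope(counts), 2))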
