Search (614 results, page 2 of 31)

  • × theme_ss:"Suchmaschinen"
  • × type_ss:"a"
  1. Park, E.-K.; Ra, D.-Y.; Jang, M.-G.: Techniques for improving web retrieval effectiveness (2005) 0.04
    0.04432874 = product of:
      0.11082185 = sum of:
        0.07428389 = weight(_text_:retrieval in 1060) [ClassicSimilarity], result of:
          0.07428389 = score(doc=1060,freq=14.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.5305404 = fieldWeight in 1060, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1060)
        0.03653796 = product of:
          0.07307592 = sum of:
            0.07307592 = weight(_text_:web in 1060) [ClassicSimilarity], result of:
              0.07307592 = score(doc=1060,freq=10.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.48375595 = fieldWeight in 1060, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1060)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This paper discusses several schemes for improving retrieval effectiveness that can be used in the named page finding tasks of web information retrieval (Overview of the TREC-2002 web track. In: Proceedings of the Eleventh Text Retrieval Conference TREC-2002, NIST Special Publication #500-251, 2003). These methods were applied on top of the basic information retrieval model as additional mechanisms to upgrade the system. Use of the titles of web pages was found to be effective. It was confirmed that the anchor texts of incoming links were beneficial, as suggested in other work. Sentence-query similarity, a new type of information proposed by the authors, was identified as the most useful information to exploit. Stratifying and re-ranking the retrieval list based on the maximum count of index terms shared between a sentence and a query resulted in a significant improvement in performance. To demonstrate these findings, a large-scale web information retrieval system was developed and used for experimentation.
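The score breakdown above this entry follows Lucene's ClassicSimilarity: each term contributes queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = √tf × idf × fieldNorm. A minimal Python sketch, using the constants shown in the explain tree, reproduces the `retrieval` term weight:

```python
import math

def classic_similarity(tf, idf, query_norm, field_norm):
    """Lucene ClassicSimilarity per-term score:
    queryWeight * fieldWeight, with tf dampened by sqrt()."""
    query_weight = idf * query_norm
    field_weight = math.sqrt(tf) * idf * field_norm
    return query_weight * field_weight

# idf as Lucene computes it: 1 + ln(maxDocs / (docFreq + 1))
idf = 1 + math.log(44218 / (5836 + 1))   # ≈ 3.024915, as in the explain tree

score = classic_similarity(tf=14.0, idf=idf,
                           query_norm=0.04628742, field_norm=0.046875)
# score ≈ 0.07428389, matching the weight(_text_:retrieval ...) line above
```

The queryNorm and fieldNorm values are taken verbatim from the explain output; only the idf is recomputed from docFreq/maxDocs to show the formula.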
  2. Flores-Herr, N.; Sack, H.; Bossert, K.: Suche in Multimediaarchiven von Kultureinrichtungen (2011) 0.04
    0.04132867 = product of:
      0.10332167 = sum of:
        0.075019486 = weight(_text_:semantic in 346) [ClassicSimilarity], result of:
          0.075019486 = score(doc=346,freq=4.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.38979942 = fieldWeight in 346, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=346)
        0.028302183 = product of:
          0.056604367 = sum of:
            0.056604367 = weight(_text_:web in 346) [ClassicSimilarity], result of:
              0.056604367 = score(doc=346,freq=6.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.37471575 = fieldWeight in 346, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=346)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This chapter presents proposals for new paradigms for searching multimedia content in the archives of cultural institutions. To show the need for integrating these new technologies, it first describes the limitations of classical catalogue-based library search in an age of ever-growing multimedia collections. It then lays out the advantages and disadvantages of two search paradigms that could, in future, ease the search for multimedia content for scholars and cultural practitioners. First, the prospects of semantic search in libraries based on Semantic Web technologies are described. Then, search capabilities for multimedia content based on automatic content-based media analysis are presented. The chapter closes with an outlook on a possible combination of the two new approaches with catalogue-based library search.
    Source
    Handbuch Internet-Suchmaschinen, 2: Neue Entwicklungen in der Web-Suche. Hrsg.: D. Lewandowski
    Theme
    Semantic Web
  3. Naing, M.-M.; Lim, E.-P.; Chiang, R.H.L.: Extracting link chains of relationship instances from a Web site (2006) 0.04
    0.040827036 = product of:
      0.10206759 = sum of:
        0.05304678 = weight(_text_:semantic in 6111) [ClassicSimilarity], result of:
          0.05304678 = score(doc=6111,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.2756298 = fieldWeight in 6111, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=6111)
        0.049020812 = product of:
          0.098041624 = sum of:
            0.098041624 = weight(_text_:web in 6111) [ClassicSimilarity], result of:
              0.098041624 = score(doc=6111,freq=18.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.64902663 = fieldWeight in 6111, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6111)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Web pages from a Web site can often be associated with concepts in an ontology, and pairs of Web pages can also be associated with relationships between concepts. With such associations, the Web site can be searched, browsed, or even reorganized based on the concept and relationship labels of its Web pages. In this article, we study the link chain extraction problem, which is critical to the extraction of related Web pages. A link chain is an ordered list of anchor elements linking two Web pages related by some semantic relationship. We propose a link chain extraction method that derives extraction rules for identifying the anchor elements forming the link chains. We applied the proposed method to two well-structured Web sites and found that its performance in terms of precision and recall is good, even with a small number of training examples.
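A link chain, as defined in the abstract, is the ordered sequence of links connecting two related pages. Under the simplifying assumption that each anchor element is just an edge in the site's link graph (the paper's rule-learning method is considerably richer), the shortest such chain can be sketched as a breadth-first search:

```python
from collections import deque

def link_chain(links, src, dst):
    """BFS for the shortest ordered list of pages linking src to dst."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        p = queue.popleft()
        if p == dst:
            chain = []
            while p is not None:          # walk predecessors back to src
                chain.append(p)
                p = prev[p]
            return chain[::-1]
        for q in links.get(p, []):
            if q not in prev:
                prev[q] = p
                queue.append(q)
    return None

# hypothetical site graph: page -> pages its anchors point to
site = {"home": ["faculty", "news"], "faculty": ["smith"], "news": []}
chain = link_chain(site, "home", "smith")  # ["home", "faculty", "smith"]
```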
  4. Croft, W.B.: Combining approaches to information retrieval (2000) 0.04
    0.038301237 = product of:
      0.09575309 = sum of:
        0.07941282 = weight(_text_:retrieval in 6862) [ClassicSimilarity], result of:
          0.07941282 = score(doc=6862,freq=16.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.5671716 = fieldWeight in 6862, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=6862)
        0.01634027 = product of:
          0.03268054 = sum of:
            0.03268054 = weight(_text_:web in 6862) [ClassicSimilarity], result of:
              0.03268054 = score(doc=6862,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.21634221 = fieldWeight in 6862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6862)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The combination of different text representations and search strategies has become a standard technique for improving the effectiveness of information retrieval. Combination, for example, has been studied extensively in the TREC evaluations and is the basis of the "meta-search" engines used on the Web. This paper examines the development of this technique, including both experimental results and the retrieval models that have been proposed as formal frameworks for combination. We show that combining approaches for information retrieval can be modeled as combining the outputs of multiple classifiers based on one or more representations, and that this simple model can provide explanations for many of the experimental results. We also show that this view of combination is very similar to the inference net model, and that a new approach to retrieval based on language models supports combination and can be integrated with the inference net model.
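The score-fusion idea Croft surveys can be illustrated with the CombSUM method of Fox and Shaw, a common stand-in for combining retrieval runs (the two ranked lists below are hypothetical, not from the paper):

```python
def comb_sum(runs):
    """Fuse several retrieval runs by summing min-max normalized
    scores per document (CombSUM)."""
    fused = {}
    for run in runs:
        lo, hi = min(run.values()), max(run.values())
        for doc, s in run.items():
            norm = (s - lo) / (hi - lo) if hi > lo else 0.0
            fused[doc] = fused.get(doc, 0.0) + norm
    # highest fused score first
    return sorted(fused.items(), key=lambda kv: -kv[1])

# two hypothetical runs over the same collection, e.g. from
# different text representations of the documents
run_a = {"d1": 2.1, "d2": 1.4, "d3": 0.3}
run_b = {"d2": 0.9, "d3": 0.8, "d4": 0.1}
ranking = comb_sum([run_a, run_b])
# d2 rises to the top: it scores well in both runs
```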
    Series
    The Kluwer international series on information retrieval; 7
    Source
    Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft
  5. Castillo, C.; Baeza-Yates, R.: Web retrieval and mining (2009) 0.04
    0.035580706 = product of:
      0.08895176 = sum of:
        0.04632414 = weight(_text_:retrieval in 3904) [ClassicSimilarity], result of:
          0.04632414 = score(doc=3904,freq=4.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33085006 = fieldWeight in 3904, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3904)
        0.042627618 = product of:
          0.085255235 = sum of:
            0.085255235 = weight(_text_:web in 3904) [ClassicSimilarity], result of:
              0.085255235 = score(doc=3904,freq=10.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.5643819 = fieldWeight in 3904, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3904)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The advent of the Web in the mid-1990s, followed by its fast adoption in a relatively short time, posed significant challenges to classical information retrieval methods developed in the 1970s and 1980s. The major challenges include that the Web is massive, dynamic, and distributed. The two main types of tasks carried out on the Web are searching and mining. Searching is locating information given an information need, and mining is extracting information and/or knowledge from a corpus. The metrics for success when carrying out these tasks on the Web include precision, recall (completeness), freshness, and efficiency.
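The first two effectiveness metrics named in the abstract are straightforward to compute for a single query; a minimal set-based sketch (document IDs are illustrative):

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query:
    precision = hits / retrieved, recall = hits / relevant."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved), hits / len(relevant)

# hypothetical result list vs. relevance judgements
p, r = precision_recall(retrieved=["d1", "d2", "d3", "d4"],
                        relevant={"d2", "d4", "d7"})
# p = 0.5 (2 of 4 retrieved are relevant), r = 2/3 (2 of 3 relevant found)
```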
  6. Dominich, S.; Skrop, A.: PageRank and interaction information retrieval (2005) 0.04
    0.035533555 = product of:
      0.08883388 = sum of:
        0.056153342 = weight(_text_:retrieval in 3268) [ClassicSimilarity], result of:
          0.056153342 = score(doc=3268,freq=8.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.40105087 = fieldWeight in 3268, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3268)
        0.03268054 = product of:
          0.06536108 = sum of:
            0.06536108 = weight(_text_:web in 3268) [ClassicSimilarity], result of:
              0.06536108 = score(doc=3268,freq=8.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.43268442 = fieldWeight in 3268, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3268)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The PageRank method is used by the Google Web search engine to compute the importance of Web pages. Two different views have been developed for the interpretation of the PageRank method and values: (a) stochastic (random surfer): the PageRank values can be conceived as the steady-state distribution of a Markov chain, and (b) algebraic: the PageRank values form the eigenvector corresponding to eigenvalue 1 of the Web link matrix. The Interaction Information Retrieval (I²R) method is a nonclassical information retrieval paradigm, which represents a connectionist approach based on dynamic systems. In the present paper, a different interpretation of PageRank is proposed, namely, a dynamic systems viewpoint, by showing that the PageRank method can be formally interpreted as a particular case of the Interaction Information Retrieval method; thus, the PageRank values may be interpreted as neutral equilibrium points of the Web.
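Both views described in the abstract, the steady state of a Markov chain and the eigenvector for eigenvalue 1, are in practice computed by power iteration. A minimal sketch on a toy four-page link graph (the graph and the damping factor 0.85 are illustrative assumptions, not from the paper):

```python
def pagerank(links, d=0.85, iters=100):
    """Power iteration: ranks converge to the steady-state
    distribution of the damped random-surfer Markov chain."""
    pages = sorted(links)
    pr = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        nxt = {p: (1 - d) / len(pages) for p in pages}
        for p, outs in links.items():
            for q in outs:                 # p passes rank to its targets
                nxt[q] += d * pr[p] / len(outs)
        pr = nxt
    return pr

# toy web graph: page -> pages it links to (no dangling pages)
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
pr = pagerank(links)
# the values sum to 1 (a probability distribution); C, with the
# most incoming links, ends up with the highest rank
```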
  7. Agata, T.: ¬A measure for evaluating search engines on the World Wide Web : retrieval test with ESL (Expected Search Length) (1997) 0.04
    0.035533555 = product of:
      0.08883388 = sum of:
        0.056153342 = weight(_text_:retrieval in 3892) [ClassicSimilarity], result of:
          0.056153342 = score(doc=3892,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.40105087 = fieldWeight in 3892, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=3892)
        0.03268054 = product of:
          0.06536108 = sum of:
            0.06536108 = weight(_text_:web in 3892) [ClassicSimilarity], result of:
              0.06536108 = score(doc=3892,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.43268442 = fieldWeight in 3892, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3892)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
  8. Gordon, M.; Pathak, P.: Finding information on the World Wide Web : the retrieval effectiveness of search engines. (1999) 0.04
    0.035533555 = product of:
      0.08883388 = sum of:
        0.056153342 = weight(_text_:retrieval in 3941) [ClassicSimilarity], result of:
          0.056153342 = score(doc=3941,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.40105087 = fieldWeight in 3941, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=3941)
        0.03268054 = product of:
          0.06536108 = sum of:
            0.06536108 = weight(_text_:web in 3941) [ClassicSimilarity], result of:
              0.06536108 = score(doc=3941,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.43268442 = fieldWeight in 3941, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3941)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
  9. Meghabghab, G.: Google's Web page ranking applied to different topological Web graph structures (2001) 0.04
    0.03548062 = product of:
      0.088701546 = sum of:
        0.023397226 = weight(_text_:retrieval in 6028) [ClassicSimilarity], result of:
          0.023397226 = score(doc=6028,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.16710453 = fieldWeight in 6028, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6028)
        0.065304324 = product of:
          0.13060865 = sum of:
            0.13060865 = weight(_text_:web in 6028) [ClassicSimilarity], result of:
              0.13060865 = score(doc=6028,freq=46.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.86461735 = fieldWeight in 6028, product of:
                  6.78233 = tf(freq=46.0), with freq of:
                    46.0 = termFreq=46.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6028)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This research is part of an ongoing study to better understand web page ranking on the web. It looks at a web page as a graph structure, or web graph, and tries to classify different web graphs in the new coordinate space (out-degree, in-degree). The out-degree coordinate od is defined as the number of outgoing links from a given web page. The in-degree coordinate id is the number of web pages that point to a given web page. In this new coordinate space a metric is built to classify how close or far apart different web graphs are. Google's web ranking algorithm (Brin & Page, 1998) is applied in this new coordinate space. The results of the algorithm were modified to fit different topological web graph structures. The algorithm was also not successful in the case of general web graphs, so new web ranking algorithms have to be considered. This study does not look at enhancing web ranking by adding any contextual information; it only considers web links as a source for web page ranking. The author believes that understanding the underlying web page as a graph will help design better ranking algorithms, enhance retrieval and web performance, and recommends using graphs as part of a visual aid for browsing-engine designers.
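The (out-degree, in-degree) coordinate space the study classifies web graphs in is easy to compute from an adjacency list; a small sketch on a toy graph (not Meghabghab's data):

```python
def degree_coordinates(links):
    """Map each page to its (out-degree, in-degree) coordinate,
    the space in which web graphs are classified."""
    pages = set(links) | {q for outs in links.values() for q in outs}
    out_deg = {p: len(links.get(p, [])) for p in pages}
    in_deg = {p: 0 for p in pages}
    for outs in links.values():
        for q in outs:
            in_deg[q] += 1
    return {p: (out_deg[p], in_deg[p]) for p in pages}

# toy web graph: page -> pages it links to
coords = degree_coordinates({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
# A: (2, 1) - two outgoing links, one incoming; C: (1, 2)
```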
  10. Toms, E.G.; Taves, A.R.: Measuring user perceptions of Web site reputation (2004) 0.03
    0.034906417 = product of:
      0.08726604 = sum of:
        0.04420565 = weight(_text_:semantic in 2565) [ClassicSimilarity], result of:
          0.04420565 = score(doc=2565,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.22969149 = fieldWeight in 2565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2565)
        0.043060396 = product of:
          0.08612079 = sum of:
            0.08612079 = weight(_text_:web in 2565) [ClassicSimilarity], result of:
              0.08612079 = score(doc=2565,freq=20.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.5701118 = fieldWeight in 2565, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2565)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    In this study, we compare a search tool, TOPIC, with three other widely used tools that retrieve information from the Web: AltaVista, Google, and Lycos. These tools use different techniques for outputting and ranking Web sites: external link structure (TOPIC and Google) and semantic content analysis (AltaVista and Lycos). TOPIC purports to output, and highly rank within its hit list, reputable Web sites for searched topics. In this study, 80 participants reviewed the output (i.e., highly ranked sites) from each tool and assessed the quality of retrieved sites. The 4800 individual assessments of 240 sites that represent 12 topics indicated that Google tends to identify and highly rank significantly more reputable Web sites than TOPIC, which, in turn, outputs more than AltaVista and Lycos, but this was not consistent from topic to topic. Metrics derived from reputation research were used in the assessment and a factor analysis was employed to identify a key factor, which we call 'repute'. The results of this research include insight into the factors that Web users consider in formulating perceptions of Web site reputation, and insight into which search tools are outputting reputable sites for Web users. Our findings, we believe, have implications for Web users and suggest the need for future research to assess the relationship between Web page characteristics and their perceived reputation.
  11. Höfer, W.: Detektive im Web (1999) 0.03
    0.034357738 = product of:
      0.17178869 = sum of:
        0.17178869 = sum of:
          0.06536108 = weight(_text_:web in 4007) [ClassicSimilarity], result of:
            0.06536108 = score(doc=4007,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.43268442 = fieldWeight in 4007, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.09375 = fieldNorm(doc=4007)
          0.1064276 = weight(_text_:22 in 4007) [ClassicSimilarity], result of:
            0.1064276 = score(doc=4007,freq=4.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.6565931 = fieldWeight in 4007, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.09375 = fieldNorm(doc=4007)
      0.2 = coord(1/5)
    
    Date
    22. 8.1999 20:22:06
  12. Falk, H.: World Wide Web search and retrieval (1997) 0.03
    0.034123536 = product of:
      0.08530884 = sum of:
        0.04679445 = weight(_text_:retrieval in 6815) [ClassicSimilarity], result of:
          0.04679445 = score(doc=6815,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33420905 = fieldWeight in 6815, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=6815)
        0.03851439 = product of:
          0.07702878 = sum of:
            0.07702878 = weight(_text_:web in 6815) [ClassicSimilarity], result of:
              0.07702878 = score(doc=6815,freq=4.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.5099235 = fieldWeight in 6815, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6815)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Lists and briefly describes the range of facilities available for searching the WWW, such as search engines, link-following packages and content delivery services. Includes web site information (URLs) with each product and service reviewed
  13. Poynder, R.: Web research engines? (1996) 0.03
    0.034067273 = product of:
      0.08516818 = sum of:
        0.048630223 = weight(_text_:retrieval in 5698) [ClassicSimilarity], result of:
          0.048630223 = score(doc=5698,freq=6.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.34732026 = fieldWeight in 5698, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5698)
        0.03653796 = product of:
          0.07307592 = sum of:
            0.07307592 = weight(_text_:web in 5698) [ClassicSimilarity], result of:
              0.07307592 = score(doc=5698,freq=10.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.48375595 = fieldWeight in 5698, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5698)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Describes the shortcomings of search engines for the WWW comparing their current capabilities to those of the first generation CD-ROM products. Some allow phrase searching and most are improving their Boolean searching. Few allow truncation, wild cards or nested logic. They are stateless, losing previous search criteria. Unlike the indexing and classification systems for today's CD-ROMs, those for Web pages are random, unstructured and of variable quality. Considers that at best Web search engines can only offer free text searching. Discusses whether automatic data classification systems such as Infoseek Ultra can overcome the haphazard nature of the Web with neural network technology, and whether Boolean search techniques may be redundant when replaced by technology such as the Euroferret search engine. However, artificial intelligence is rarely successful on huge, varied databases. Relevance ranking and automatic query expansion still use the same simple inverted indexes. Most Web search engines do nothing more than word counting. Further complications arise with foreign languages
    Theme
    Verbale Doksprachen im Online-Retrieval
    Klassifikationssysteme im Online-Retrieval
    Semantisches Umfeld in Indexierung u. Retrieval
  14. Tjondronegoro, D.; Spink, A.: Web search engine multimedia functionality (2008) 0.03
    0.03290849 = product of:
      0.082271226 = sum of:
        0.028076671 = weight(_text_:retrieval in 2038) [ClassicSimilarity], result of:
          0.028076671 = score(doc=2038,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.20052543 = fieldWeight in 2038, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2038)
        0.054194555 = product of:
          0.10838911 = sum of:
            0.10838911 = weight(_text_:web in 2038) [ClassicSimilarity], result of:
              0.10838911 = score(doc=2038,freq=22.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.717526 = fieldWeight in 2038, product of:
                  4.690416 = tf(freq=22.0), with freq of:
                    22.0 = termFreq=22.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2038)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Web search engines are beginning to offer access to multimedia searching, including audio, video and image searching. In this paper we report findings from a study examining the state of multimedia search functionality on major general and specialized Web search engines. We investigated 102 Web search engines to examine: (1) how many Web search engines offer multimedia searching, (2) the type of multimedia search functionality and methods offered, such as "query by example", and (3) the support for personalization or customization accessible through advanced search. Findings include: (1) few major Web search engines offer multimedia searching, and (2) multimedia Web search functionality is generally limited. Despite the increasing level of interest in multimedia Web search, the few Web search engines that offer it provide limited functionality. Keywords are still the only means of multimedia retrieval, while other methods such as "query by example" are offered by less than 1% of the Web search engines examined.
  15. Vidmar, D.J.: Darwin on the Web : the evolution of search tools (1999) 0.03
    0.03281058 = product of:
      0.1640529 = sum of:
        0.1640529 = sum of:
          0.076254606 = weight(_text_:web in 3175) [ClassicSimilarity], result of:
            0.076254606 = score(doc=3175,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.50479853 = fieldWeight in 3175, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.109375 = fieldNorm(doc=3175)
          0.08779829 = weight(_text_:22 in 3175) [ClassicSimilarity], result of:
            0.08779829 = score(doc=3175,freq=2.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.5416616 = fieldWeight in 3175, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.109375 = fieldNorm(doc=3175)
      0.2 = coord(1/5)
    
    Source
    Computers in libraries. 19(1999) no.5, S.22-28
  16. Stock, W.G.: Qualitätskriterien von Suchmaschinen : Checkliste für Retrievalsysteme (2000) 0.03
    Abstract
    Search engines on the World Wide Web are said to employ suboptimal methods and tools, especially in comparison with the retrieval software of commercial online archives. Elaborate command-oriented retrieval systems cannot be used by laypersons at all, and by professionals only if they work with them constantly. The search systems of some "independents", i.e. isolated information producers on the Internet, are marked by a minimalism reminiscent of the command repertoire of the early 1970s. Retrieval software in intranets, if it is used at all, relies almost without exception on automatic methods of indexing and retrieval and almost completely ignores documentary know-how. Search engines and retrieval systems - we will use both terms synonymously - thus cause difficulties wherever they occur. Their quality is in doubt. But what does "quality of search engines" actually mean? What distinguishes a good retrieval system? And what is a bad one lacking? We want to develop a list of criteria that are essential for good searching (and finding!). The focus is thus exclusively on the quantity and quality of the search options, not on further performance indicators such as speed or ergonomic user interfaces. Tacitly presupposed, however, is the departure from purely command-oriented systems, i.e. we assume screen designs that present the commands in an intuitively obvious way. Our checklist contains only options that are either already in use in some system (and so partly repeats familiar ground) or whose technical feasibility has already been demonstrated in experimental settings. In this respect the list is a minimum requirement for retrieval systems, and one that is certainly open to extension. The catalogue of criteria is organized by (1) the basic functions for searching individual records, (2) the informetric functions for characterizing certain result sets, and (3) the criteria for the power of automatic indexing and natural-language searching.
    Source
    Password. 2000, H.5, S.22-31
  17. Baeza-Yates, R.; Boldi, P.; Castillo, C.: Generalizing PageRank : damping functions for linkbased ranking algorithms (2006) 0.03
    Abstract
    This paper introduces a family of link-based ranking algorithms that propagate page importance through links. In these algorithms there is a damping function that decreases with distance, so a direct link implies more endorsement than a link through a long path. PageRank is the most widely known ranking function of this family. The main objective of this paper is to determine whether this family of ranking techniques has some interest per se, and how different choices for the damping function impact on rank quality and on convergence speed. Even though our results suggest that PageRank can be approximated with other simpler forms of rankings that may be computed more efficiently, our focus is of more speculative nature, in that it aims at separating the kernel of PageRank, that is, link-based importance propagation, from the way propagation decays over paths. We focus on three damping functions, having linear, exponential, and hyperbolic decay on the lengths of the paths. The exponential decay corresponds to PageRank, and the other functions are new. Our presentation includes algorithms, analysis, comparisons and experiments that study their behavior under different parameters in real Web graph data. Among other results, we show how to calculate a linear approximation that induces a page ordering that is almost identical to PageRank's using a fixed small number of iterations; comparisons were performed using Kendall's tau on large domain datasets.
    Date
    16. 1.2016 10:22:28
    Source
    http://chato.cl/papers/baeza06_general_pagerank_damping_functions_link_ranking.pdf [Proceedings of the ACM Special Interest Group on Information Retrieval (SIGIR) Conference, SIGIR'06, August 6-10, 2006, Seattle, Washington, USA]
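    The damping-function family described in the abstract above can be sketched directly: page importance is propagated along paths, with a path of length t weighted by damping(t). The following minimal NumPy sketch (all names are illustrative, not from the paper) truncates propagation at a fixed path length; with exponential decay damping(t) = (1-a)*a^t it reproduces PageRank, while the linear variant is one of the new decays the paper studies.

    ```python
    import numpy as np

    def damped_rank(adj, damping, t_max=50):
        """Generalized link-based ranking: weight contributions from
        paths of length t by damping(t). With exponential decay this
        reduces to (a truncated) PageRank."""
        n = adj.shape[0]
        out_deg = adj.sum(axis=1, keepdims=True)
        # Row-stochastic transition matrix; dangling nodes jump uniformly.
        M = np.where(out_deg > 0, adj / np.maximum(out_deg, 1), 1.0 / n)
        x = np.full(n, 1.0 / n)        # uniform start vector
        rank = np.zeros(n)
        for t in range(t_max + 1):
            rank += damping(t) * x     # contribution of paths of length t
            x = x @ M                  # advance one step along the links
        return rank

    a = 0.85
    pagerank_like = lambda t: (1 - a) * a**t                   # exponential decay
    linear = lambda t, L=20: max(0, 2 * (L - t) / (L * (L + 1)))  # linear decay, cutoff L
    ```

    Both damping functions are normalized to sum to one over path lengths, so the resulting scores form a distribution over pages; comparing the orderings they induce (e.g. with Kendall's tau, as in the paper) is then straightforward.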
  18. Lewandowski, D.: Web Information Retrieval (2005) 0.03
    Abstract
    Web information retrieval has emerged as a research area in its own right. Beyond the questions treated in classical information retrieval, the particular characteristics of the Web raise new and additional research questions. The differences between information retrieval and Web information retrieval are discussed. The second part of the article surveys the research literature of the past two years. This article gives an overview of the state of research in Web information retrieval. The first part explains the particular problems arising in this area by contrasting it with "classical" information retrieval. The remainder of the text discusses the most important literature on the topic published in recent years, with an emphasis - where available - on German-language literature. The focus is on literature from 2003 and 2004. On the one hand, the field under consideration is developing rapidly, so many older studies retain only historical or methodological value; on the other hand, comprehensive older review articles exist (see above all Rasmussen 2003). Even a survey of the literature makes clear, however, that for some topics little or no German-language literature exists. Unfortunately, this is not only because authors from the German-speaking countries publish their results in English; rather, it becomes apparent that little search-engine research takes place in these countries at all. Studies of the language-specific problems of Web search engines are lacking in particular.
    A further problem of search-engine research is that much of it takes place inside companies, which hesitate to publish their results extensively for fear that competitors might profit from such publications. Comparative figures on individual search engines, for example, are often found only in talks or presentations by company representatives (e.g. Singhal 2004; Dean 2004). The main focus of this article is on the extent to which search engines are able to index the content available on the Web, the methods they use to do so, and whether and how they achieve their goals. Questions of efficiency in crawling the Web and of search-engine scalability are explicitly excluded. Put differently: this overview is oriented toward classical information-science questions and largely leaves aside the questions discussed more within computer science.
    A regularly updated overview of new US patents and US patent applications in the field of information retrieval is provided by the news site Resourceshelf (www.resourceshelf.com).
    Content
    Includes a table contrasting Web retrieval with "classical" information retrieval
  19. Clarke, S.J.; Willett, P.: Estimating the recall performance of Web search engines (1997) 0.03
    Abstract
    Reports a comparison of the retrieval effectiveness of the AltaVista, Excite and Lycos Web search engines. Describes a method for comparing the recall of the 3 sets of searches, despite the fact that they are carried out on non-identical sets of Web pages. It is thus possible, unlike in previous comparative studies of Web search engines, to consider both recall and precision when evaluating the effectiveness of search engines
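    The recall comparison described above rests on pooling: since the full set of relevant Web pages for a query is unknowable, the union of relevant pages retrieved by any of the engines serves as the reference set. A minimal sketch of that pooled "relative recall" computation follows (names and data are hypothetical; the paper's actual procedure is more involved, since the engines index non-identical sets of pages).

    ```python
    def relative_recall(relevant_by_engine):
        """Relative recall against a pooled reference set: the union of
        relevant pages found by any engine approximates the (unknowable)
        full relevant set for the query."""
        pool = set().union(*relevant_by_engine.values())
        if not pool:
            return {name: 0.0 for name in relevant_by_engine}
        return {name: len(found) / len(pool)
                for name, found in relevant_by_engine.items()}
    ```

    For example, if AltaVista finds three relevant pages, Excite two, and Lycos one, with partial overlap, each engine's score is its share of the pooled relevant set rather than an absolute recall figure.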
  20. Munson, K.I.: Internet search engines : understanding their design to improve information retrieval (2000) 0.03
    Abstract
    The relationship between the methods currently used for indexing the World Wide Web and the programs, languages, and protocols on which the World Wide Web is based is examined. Two methods for indexing the Web are described, directories being briefly discussed while search engines are considered in detail. The automated approach used to create these tools is examined with special emphasis on the parts of a document used in indexing. Shortcomings of the approach are described. Suggestions for effective use of Web search engines are given
