Search (681 results, page 1 of 35)

  • Active filter: type_ss:"el"
  1. Hogan, A.; Harth, A.; Umbrich, J.; Kinsella, S.; Polleres, A.; Decker, S.: Searching and browsing Linked Data with SWSE : the Semantic Web Search Engine (2011) 0.22
    0.21735626 = product of:
      0.28980836 = sum of:
        0.100764915 = weight(_text_:web in 438) [ClassicSimilarity], result of:
          0.100764915 = score(doc=438,freq=24.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.6245262 = fieldWeight in 438, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=438)
        0.093319535 = weight(_text_:search in 438) [ClassicSimilarity], result of:
          0.093319535 = score(doc=438,freq=16.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.54307455 = fieldWeight in 438, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=438)
        0.095723905 = product of:
          0.19144781 = sum of:
            0.19144781 = weight(_text_:engine in 438) [ClassicSimilarity], result of:
              0.19144781 = score(doc=438,freq=12.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.72387516 = fieldWeight in 438, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=438)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
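    The explain tree above is standard Lucene ClassicSimilarity (TF-IDF) output. A minimal sketch that reproduces its numbers, assuming the usual Lucene definitions tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), queryWeight = idf * queryNorm and fieldWeight = tf * idf * fieldNorm:

      import math

      def idf(doc_freq, max_docs):
          # Lucene ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          # score(term) = queryWeight * fieldWeight
          i = idf(doc_freq, max_docs)
          query_weight = i * query_norm
          field_weight = math.sqrt(freq) * i * field_norm
          return query_weight * field_weight

      QUERY_NORM, FIELD_NORM, MAX_DOCS = 0.049439456, 0.0390625, 44218

      web    = term_score(24, 4597, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # ~0.1008
      search = term_score(16, 3718, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # ~0.0933
      engine = term_score(12,  570, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # ~0.1914

      # "engine" sits under coord(1/2) because 1 of its 2 subclauses matched;
      # the total applies coord(3/4) because 3 of the 4 query clauses matched.
      print(round((web + search + engine * 0.5) * 0.75, 8))  # ~0.21735626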
    
    Abstract
    In this paper, we discuss the architecture and implementation of the Semantic Web Search Engine (SWSE). Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines, SWSE operates over RDF Web data - loosely also known as Linked Data - which implies unique challenges for the system design, architecture, algorithms, implementation and user interface. In particular, many challenges exist in adopting Semantic Web technologies for Web data: the unique challenges of the Web - in terms of scale, unreliability, inconsistency and noise - are largely overlooked by the current Semantic Web standards. Herein, we describe the current SWSE system, initially detailing the architecture and later elaborating upon the function, design, implementation and performance of each individual component. In so doing, we also give an insight into how current Semantic Web standards can be tailored, in a best-effort manner, for use on Web data. Throughout, we offer evaluation and complementary argumentation to support our design choices, and also offer discussion on future directions and open research questions. Later, we also provide candid discussion relating to the difficulties currently faced in bringing such a search engine into the mainstream, and lessons learnt from roughly six years working on the Semantic Web Search Engine project.
    Object
    Semantic Web Search Engine
    Theme
    Semantic Web
  2. Brin, S.; Page, L.: The anatomy of a large-scale hypertextual Web search engine (1998) 0.20
    0.20150885 = product of:
      0.26867846 = sum of:
        0.07696048 = weight(_text_:web in 947) [ClassicSimilarity], result of:
          0.07696048 = score(doc=947,freq=14.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.47698978 = fieldWeight in 947, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=947)
        0.10433441 = weight(_text_:search in 947) [ClassicSimilarity], result of:
          0.10433441 = score(doc=947,freq=20.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.60717577 = fieldWeight in 947, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=947)
        0.08738356 = product of:
          0.17476712 = sum of:
            0.17476712 = weight(_text_:engine in 947) [ClassicSimilarity], result of:
              0.17476712 = score(doc=947,freq=10.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.66080457 = fieldWeight in 947, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=947)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advances in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. We also look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
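    The core structure behind the indexing described here is the inverted index: a map from each term to the documents that contain it. A toy sketch (the class and the sample documents are invented for illustration):

      from collections import defaultdict

      class InvertedIndex:
          def __init__(self):
              # term -> set of ids of documents containing that term
              self.postings = defaultdict(set)

          def add(self, doc_id, text):
              for term in text.lower().split():
                  self.postings[term].add(doc_id)

          def search(self, *terms):
              # Conjunctive (AND) query: intersect the posting sets.
              sets = [self.postings[t.lower()] for t in terms]
              return set.intersection(*sets) if sets else set()

      idx = InvertedIndex()
      idx.add(1, "large scale hypertextual web search engine")
      idx.add(2, "semantic web search")
      print(idx.search("web", "search"))     # {1, 2}
      print(idx.search("search", "engine"))  # {1}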
  3. Oreskovic, A.: Google introduces new 'Hummingbird' search algorithm (2013) 0.20
    0.20123148 = product of:
      0.26830864 = sum of:
        0.05817665 = weight(_text_:web in 2517) [ClassicSimilarity], result of:
          0.05817665 = score(doc=2517,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.36057037 = fieldWeight in 2517, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=2517)
        0.13197374 = weight(_text_:search in 2517) [ClassicSimilarity], result of:
          0.13197374 = score(doc=2517,freq=8.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.7680234 = fieldWeight in 2517, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.078125 = fieldNorm(doc=2517)
        0.07815824 = product of:
          0.15631647 = sum of:
            0.15631647 = weight(_text_:engine in 2517) [ClassicSimilarity], result of:
              0.15631647 = score(doc=2517,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.59104156 = fieldWeight in 2517, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2517)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Google Inc has overhauled its search algorithm, the foundation of the Internet's dominant search engine, to better cope with the longer, more complex queries it has been getting from Web users.
    Source
    http://www.reuters.com/article/net-us-google-search-idUSBRE98P11O20131002
  4. Radhakrishnan, A.: Swoogle : an engine for the Semantic Web (2007) 0.19
    0.1853866 = product of:
      0.24718213 = sum of:
        0.095947385 = weight(_text_:web in 4709) [ClassicSimilarity], result of:
          0.095947385 = score(doc=4709,freq=34.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.59466785 = fieldWeight in 4709, product of:
              5.8309517 = tf(freq=34.0), with freq of:
                34.0 = termFreq=34.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=4709)
        0.07465562 = weight(_text_:search in 4709) [ClassicSimilarity], result of:
          0.07465562 = score(doc=4709,freq=16.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.43445963 = fieldWeight in 4709, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=4709)
        0.076579124 = product of:
          0.15315825 = sum of:
            0.15315825 = weight(_text_:engine in 4709) [ClassicSimilarity], result of:
              0.15315825 = score(doc=4709,freq=12.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.57910013 = fieldWeight in 4709, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4709)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Content
    "Swoogle, the Semantic web search engine, is a research project carried out by the ebiquity research group in the Computer Science and Electrical Engineering Department at the University of Maryland. It's an engine tailored towards finding documents on the semantic web. The whole research paper is available here. Semantic web is touted as the next generation of online content representation where the web documents are represented in a language that is not only easy for humans but is machine readable (easing the integration of data as never thought possible) as well. And the main elements of the semantic web include data model description formats such as Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, Turtle, N-Triples), and notations such as RDF Schema (RDFS), the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain (Wikipedia). And Swoogle is an attempt to mine and index this new set of web documents. The engine performs crawling of semantic documents like most web search engines and the search is available as web service too. The engine is primarily written in Java with the PHP used for the front-end and MySQL for database. Swoogle is capable of searching over 10,000 ontologies and indexes more that 1.3 million web documents. It also computes the importance of a Semantic Web document. The techniques used for indexing are the more google-type page ranking and also mining the documents for inter-relationships that are the basis for the semantic web. For more information on how the RDF framework can be used to relate documents, read the link here. Being a research project, and with a non-commercial motive, there is not much hype around Swoogle. However, the approach to indexing of Semantic web documents is an approach that most engines will have to take at some point of time. When the Internet debuted, there were no specific engines available for indexing or searching. The Search domain only picked up as more and more content became available. One fundamental question that I've always wondered about it is - provided that the search engines return very relevant results for a query - how to ascertain that the documents are indeed the most relevant ones available. There is always an inherent delay in indexing of document. Its here that the new semantic documents search engines can close delay. Experimenting with the concept of Search in the semantic web can only bore well for the future of search technology."
    Source
    http://www.searchenginejournal.com/swoogle-an-engine-for-the-semantic-web/5469/
    Theme
    Semantic Web
  5. Li, Z.: A domain specific search engine with explicit document relations (2013) 0.17
    0.17355756 = product of:
      0.23141009 = sum of:
        0.08726498 = weight(_text_:web in 1210) [ClassicSimilarity], result of:
          0.08726498 = score(doc=1210,freq=18.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.5408555 = fieldWeight in 1210, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1210)
        0.06598687 = weight(_text_:search in 1210) [ClassicSimilarity], result of:
          0.06598687 = score(doc=1210,freq=8.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.3840117 = fieldWeight in 1210, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1210)
        0.07815824 = product of:
          0.15631647 = sum of:
            0.15631647 = weight(_text_:engine in 1210) [ClassicSimilarity], result of:
              0.15631647 = score(doc=1210,freq=8.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.59104156 = fieldWeight in 1210, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1210)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    The current web consists of documents that are highly heterogeneous and hard for machines to understand. The Semantic Web is a progressive movement of the World Wide Web, aiming at converting the current web of unstructured documents to the web of data. In the Semantic Web, web documents are annotated with metadata using a standardized ontology language. These annotated documents are directly processable by machines, which greatly improves their usability and usefulness. At Ericsson, similar problems occur. There are massive numbers of documents being created with well-defined structures. Though these documents are about domain specific knowledge and can have rich relations, they are currently managed by a traditional search engine, which ignores the rich domain specific information and presents little of it to users. Motivated by the Semantic Web, we aim to find standard ways to process these documents, extract rich domain specific information and annotate these data to documents with formal markup languages. We propose this project to develop a domain specific search engine for processing different documents and building explicit relations for them. This research project consists of three main focuses: examining different domain specific documents and finding ways to extract their metadata; integrating a text search engine with an ontology server; exploring novel ways to build relations for documents. We implement this system and demonstrate its functions. As a prototype, the system provides the required features and will be extended in the future.
    Theme
    Semantic Web
  6. Khare, R.; Cutting, D.; Sitaker, K.; Rifkin, A.: Nutch: a flexible and scalable open-source Web search engine (2004) 0.16
    0.16148674 = product of:
      0.21531567 = sum of:
        0.06981198 = weight(_text_:web in 852) [ClassicSimilarity], result of:
          0.06981198 = score(doc=852,freq=8.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.43268442 = fieldWeight in 852, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=852)
        0.07918424 = weight(_text_:search in 852) [ClassicSimilarity], result of:
          0.07918424 = score(doc=852,freq=8.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.460814 = fieldWeight in 852, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=852)
        0.06631946 = product of:
          0.13263892 = sum of:
            0.13263892 = weight(_text_:engine in 852) [ClassicSimilarity], result of:
              0.13263892 = score(doc=852,freq=4.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.5015154 = fieldWeight in 852, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.046875 = fieldNorm(doc=852)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Nutch is an open-source Web search engine that can be used at global, local, and even personal scale. Its initial design goal was to enable a transparent alternative for global Web search in the public interest - one of its signature features is the ability to "explain" its result rankings. Recent work has emphasized how it can also be used for intranets; by local communities with richer data models, such as the Creative Commons metadata-enabled search for licensed content; on a personal scale to index a user's files, email, and web-surfing history; and we also report on several other research projects built on Nutch. In this paper, we present how the architecture of the Nutch system enables it to be more flexible and scalable than other comparable systems today.
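    The "explain" feature mentioned here lives on in Lucene-based systems; the scoring trees printed throughout this result list are exactly that kind of output. As a rough illustration, Elasticsearch (also Lucene-based) exposes an explain API. A sketch, assuming a hypothetical local node with an index named docs and a document with id 1:

      import json
      import urllib.request

      # Hypothetical node, index name and document id.
      url = "http://localhost:9200/docs/_explain/1"
      body = json.dumps({"query": {"match": {"text": "web search engine"}}}).encode()
      req = urllib.request.Request(url, data=body,
                                   headers={"Content-Type": "application/json"})
      with urllib.request.urlopen(req) as resp:
          answer = json.load(resp)

      # The response nests value/description/details triples analogous to the
      # product-of / sum-of trees shown above.
      print(json.dumps(answer["explanation"], indent=2)[:400])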
  7. Ding, L.; Finin, T.; Joshi, A.; Peng, Y.; Cost, R.S.; Sachs, J.; Pan, R.; Reddivari, P.; Doshi, V.: Swoogle : a Semantic Web search and metadata engine (2004) 0.16
    0.15987685 = product of:
      0.21316913 = sum of:
        0.09235258 = weight(_text_:web in 4704) [ClassicSimilarity], result of:
          0.09235258 = score(doc=4704,freq=14.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.57238775 = fieldWeight in 4704, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4704)
        0.03959212 = weight(_text_:search in 4704) [ClassicSimilarity], result of:
          0.03959212 = score(doc=4704,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.230407 = fieldWeight in 4704, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4704)
        0.08122443 = product of:
          0.16244885 = sum of:
            0.16244885 = weight(_text_:engine in 4704) [ClassicSimilarity], result of:
              0.16244885 = score(doc=4704,freq=6.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.6142285 = fieldWeight in 4704, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4704)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Swoogle is a crawler-based indexing and retrieval system for the Semantic Web, i.e., for Web documents in RDF or OWL. It extracts metadata for each discovered document, and computes relations between documents. Discovered documents are also indexed by an information retrieval system which can use either character N-Gram or URIrefs as keywords to find relevant documents and to compute the similarity among a set of documents. One of the interesting properties we compute is rank, a measure of the importance of a Semantic Web document.
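    The character N-gram indexing mentioned above is easy to make concrete; a minimal sketch, with a toy Jaccard similarity of the kind that could feed a document-similarity computation (both functions are illustrative, not Swoogle's actual code):

      def char_ngrams(text, n=3):
          # All overlapping character n-grams of the input string.
          return [text[i:i + n] for i in range(len(text) - n + 1)]

      def jaccard(a, b):
          # Set overlap of the two n-gram profiles, in [0, 1].
          A, B = set(char_ngrams(a)), set(char_ngrams(b))
          return len(A & B) / len(A | B)

      print(char_ngrams("swoogle"))                  # ['swo', 'woo', 'oog', 'ogl', 'gle']
      print(round(jaccard("swoogle", "google"), 2))  # 0.5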
    Content
    Cf.: http://www.dblab.ntua.gr/~bikakis/LD/5.pdf. See also: http://swoogle.umbc.edu/. See also: http://ebiquity.umbc.edu/paper/html/id/183/. See also: Radhakrishnan, A.: Swoogle : an engine for the Semantic Web, at: http://www.searchenginejournal.com/swoogle-an-engine-for-the-semantic-web/5469/.
    Theme
    Semantic Web
  8. Smith, A.G.: Search features of digital libraries (2000) 0.16
    0.15525165 = product of:
      0.20700221 = sum of:
        0.03490599 = weight(_text_:web in 940) [ClassicSimilarity], result of:
          0.03490599 = score(doc=940,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.21634221 = fieldWeight in 940, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=940)
        0.12520128 = weight(_text_:search in 940) [ClassicSimilarity], result of:
          0.12520128 = score(doc=940,freq=20.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.72861093 = fieldWeight in 940, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=940)
        0.04689494 = product of:
          0.09378988 = sum of:
            0.09378988 = weight(_text_:engine in 940) [ClassicSimilarity], result of:
              0.09378988 = score(doc=940,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.35462496 = fieldWeight in 940, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.046875 = fieldNorm(doc=940)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Traditional on-line search services such as Dialog, DataStar and Lexis provide a wide range of search features (boolean and proximity operators, truncation, etc.). This paper discusses the use of these features for effective searching, and argues that these features are required, regardless of advances in search engine technology. The literature on on-line searching is reviewed, identifying features that searchers find desirable for effective searching. A selective survey of current digital libraries available on the Web was undertaken, identifying which search features are present. The survey indicates that current digital libraries do not implement a wide range of search features. For instance: under half of the examples included controlled vocabulary, under half had proximity searching, only one enabled browsing of term indexes, and none of the digital libraries enabled searchers to refine an initial search. Suggestions are made for enhancing the search effectiveness of digital libraries; for instance, by providing a full range of search operators, enabling browsing of search terms, enhancing records with controlled vocabulary, and enabling the refining of initial searches.
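    A toy sketch of the operator vocabulary the paper surveys - boolean combination plus truncation as a prefix wildcard (the documents and function names are invented):

      import re

      DOCS = {
          1: "online search services provide boolean operators",
          2: "proximity searching in digital libraries",
          3: "controlled vocabulary enhances search effectiveness",
      }

      def match(doc_id, pattern):
          # Truncation: a trailing * matches any word sharing the prefix.
          stem = re.escape(pattern.rstrip("*"))
          regex = r"\b" + stem + (r"\w*" if pattern.endswith("*") else r"\b")
          return re.search(regex, DOCS[doc_id]) is not None

      def AND(p, q): return {d for d in DOCS if match(d, p) and match(d, q)}
      def OR(p, q):  return {d for d in DOCS if match(d, p) or match(d, q)}

      print(AND("search*", "boolean"))      # {1}
      print(OR("proximity", "vocabulary"))  # {2, 3}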
  9. Bensman, S.J.: Eugene Garfield, Francis Narin, and PageRank : the theoretical bases of the Google search engine (2013) 0.15
    0.15254721 = product of:
      0.30509442 = sum of:
        0.07465562 = weight(_text_:search in 1149) [ClassicSimilarity], result of:
          0.07465562 = score(doc=1149,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.43445963 = fieldWeight in 1149, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=1149)
        0.2304388 = sum of:
          0.1768519 = weight(_text_:engine in 1149) [ClassicSimilarity], result of:
            0.1768519 = score(doc=1149,freq=4.0), product of:
              0.26447627 = queryWeight, product of:
                5.349498 = idf(docFreq=570, maxDocs=44218)
                0.049439456 = queryNorm
              0.6686872 = fieldWeight in 1149, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.349498 = idf(docFreq=570, maxDocs=44218)
                0.0625 = fieldNorm(doc=1149)
          0.053586908 = weight(_text_:22 in 1149) [ClassicSimilarity], result of:
            0.053586908 = score(doc=1149,freq=2.0), product of:
              0.17312855 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049439456 = queryNorm
              0.30952093 = fieldWeight in 1149, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=1149)
      0.5 = coord(2/4)
    
    Abstract
    This paper presents a test of the validity of using Google Scholar to evaluate the publications of researchers by comparing the premises on which its search engine, PageRank, is based, to those of Garfield's theory of citation indexing. It finds that the premises are identical and that PageRank and Garfield's theory of citation indexing validate each other.
    Date
    17.12.2013 11:02:22
  10. Schaer, P.; Mayr, P.; Sünkler, S.; Lewandowski, D.: How relevant is the long tail? : a relevance assessment study on million short (2016) 0.15
    0.15173718 = product of:
      0.20231625 = sum of:
        0.050382458 = weight(_text_:web in 3144) [ClassicSimilarity], result of:
          0.050382458 = score(doc=3144,freq=6.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.3122631 = fieldWeight in 3144, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3144)
        0.07377557 = weight(_text_:search in 3144) [ClassicSimilarity], result of:
          0.07377557 = score(doc=3144,freq=10.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.4293381 = fieldWeight in 3144, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3144)
        0.07815824 = product of:
          0.15631647 = sum of:
            0.15631647 = weight(_text_:engine in 3144) [ClassicSimilarity], result of:
              0.15631647 = score(doc=3144,freq=8.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.59104156 = fieldWeight in 3144, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3144)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Users of web search engines are known to mostly focus on the top ranked results of the search engine result page. While many studies support this well-known information seeking pattern, only few studies concentrate on the question of what users are missing by neglecting lower ranked results. To learn more about the relevance distributions in the so-called long tail, we conducted a relevance assessment study with the Million Short long-tail web search engine. While we see a clear difference in content between the head and the tail of the search engine result list, we see no statistically significant differences in the binary relevance judgments and only weakly significant differences when using graded relevance. The tail contains different but still valuable results. We argue that the long tail can be a rich source for the diversification of web search engine result lists, but it needs more evaluation to clearly describe the differences.
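    The binary-versus-graded contrast can be made concrete with a standard graded metric such as nDCG; a minimal sketch (the judgment lists below are invented, not the study's data):

      import math

      def dcg(rels):
          # Discounted cumulative gain of a ranked list of graded judgments.
          return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

      def ndcg(rels):
          ideal = dcg(sorted(rels, reverse=True))
          return dcg(rels) / ideal if ideal else 0.0

      head = [3, 2, 3, 1, 0]  # hypothetical judgments for top-ranked results
      tail = [2, 2, 1, 1, 1]  # hypothetical judgments from the long tail
      print(round(ndcg(head), 3), round(ndcg(tail), 3))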
  11. Warnick, W.L.; Leberman, A.; Scott, R.L.; Spence, K.J.; Johnsom, L.A.; Allen, V.S.: Searching the deep Web : directed query engine applications at the Department of Energy (2001) 0.15
    0.1453627 = product of:
      0.19381693 = sum of:
        0.049364526 = weight(_text_:web in 1215) [ClassicSimilarity], result of:
          0.049364526 = score(doc=1215,freq=4.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.3059541 = fieldWeight in 1215, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=1215)
        0.03959212 = weight(_text_:search in 1215) [ClassicSimilarity], result of:
          0.03959212 = score(doc=1215,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.230407 = fieldWeight in 1215, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=1215)
        0.10486028 = product of:
          0.20972057 = sum of:
            0.20972057 = weight(_text_:engine in 1215) [ClassicSimilarity], result of:
              0.20972057 = score(doc=1215,freq=10.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.79296553 = fieldWeight in 1215, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1215)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Directed Query Engines, an emerging class of search engine specifically designed to access distributed resources on the deep web, offer the opportunity to create inexpensive digital libraries. Already, one such engine, Distributed Explorer, has been used to select and assemble high quality information resources and incorporate them into publicly available systems for the physical sciences. By nesting Directed Query Engines so that one query launches several other engines in a cascading fashion, enormous virtual collections may soon be assembled to form a comprehensive information infrastructure for the physical sciences. Once a Directed Query Engine has been configured for a set of information resources, distributed alerts tools can provide patrons with personalized, profile-based notices of recent additions to any of the selected resources. Due to the potentially enormous size and scope of Directed Query Engine applications, consideration must be given to issues surrounding the representation of large quantities of information from multiple, heterogeneous sources.
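    A toy sketch of the fan-out pattern described here: one query dispatched concurrently to several engines, whose hits are then merged. The engine stubs are invented placeholders, not real Department of Energy services:

      from concurrent.futures import ThreadPoolExecutor

      def physics_engine(q):  return [f"physics:{q}:1", f"physics:{q}:2"]
      def energy_engine(q):   return [f"energy:{q}:1"]
      def preprint_engine(q): return [f"preprint:{q}:1"]

      ENGINES = [physics_engine, energy_engine, preprint_engine]

      def directed_query(q):
          # Fan the query out to every configured engine, then merge the hits.
          with ThreadPoolExecutor() as pool:
              result_lists = pool.map(lambda engine: engine(q), ENGINES)
          return [hit for hits in result_lists for hit in hits]

      print(directed_query("superconductivity"))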
  12. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.14
    0.13873328 = product of:
      0.18497771 = sum of:
        0.05759195 = weight(_text_:web in 93) [ClassicSimilarity], result of:
          0.05759195 = score(doc=93,freq=16.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.35694647 = fieldWeight in 93, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
        0.08000484 = weight(_text_:search in 93) [ClassicSimilarity], result of:
          0.08000484 = score(doc=93,freq=24.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.46558946 = fieldWeight in 93, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
        0.047380917 = product of:
          0.09476183 = sum of:
            0.09476183 = weight(_text_:engine in 93) [ClassicSimilarity], result of:
              0.09476183 = score(doc=93,freq=6.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.35829994 = fieldWeight in 93, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=93)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it? Posed in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly-disorganized collection of documents in many different formats. Of course, we're all familiar with search engines (perhaps you found this article using one) so we know that there is a solution. This article will describe Google's PageRank algorithm and how it returns pages from the web's collection of 25 billion documents that match search criteria so well that "google" has become a widely used verb. Most search engines, including Google, continually run an army of computer programs that retrieve pages from the web, index the words in each document, and store this information in an efficient format. Each time a user asks for a web search using a search phrase, such as "search engine," the search engine determines all the pages on the web that contains the words in the search phrase. (Perhaps additional information such as the distance between the words "search" and "engine" will be noted as well.) Here is the problem: Google now claims to index 25 billion pages. Roughly 95% of the text in web pages is composed from a mere 10,000 words. This means that, for most searches, there will be a huge number of pages containing the words in the search phrase. What is needed is a means of ranking the importance of the pages that fit the search criteria so that the pages can be sorted with the most important pages at the top of the list. One way to determine the importance of pages is to use a human-generated ranking. For instance, you may have seen pages that consist mainly of a large number of links to other resources in a particular area of interest. Assuming the person maintaining this page is reliable, the pages referenced are likely to be useful. Of course, the list may quickly fall out of date, and the person maintaining the list may miss some important pages, either unintentionally or as a result of an unstated bias. Google's PageRank algorithm assesses the importance of web pages without human evaluation of the content. In fact, Google feels that the value of its service is largely in its ability to provide unbiased results to search queries; Google claims, "the heart of our software is PageRank." As we'll see, the trick is to ask the web itself to rank the importance of pages.
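    A minimal power-iteration sketch of the PageRank idea the article goes on to develop (the four-page link graph is invented; 0.85 is the damping factor commonly cited for the original algorithm):

      def pagerank(links, damping=0.85, iterations=50):
          # links: {page: [pages it links to]}; returns a rank score per page.
          pages = list(links)
          rank = {p: 1.0 / len(pages) for p in pages}
          for _ in range(iterations):
              new = {p: (1.0 - damping) / len(pages) for p in pages}
              for p, outs in links.items():
                  for q in outs:
                      # Each page shares its rank equally among its out-links.
                      new[q] += damping * rank[p] / len(outs)
              rank = new
          return rank

      # Toy web: B, C and D all link to A; A links back to D.
      toy_web = {"A": ["D"], "B": ["A"], "C": ["A"], "D": ["A"]}
      for page, r in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
          print(page, round(r, 3))  # A ranks highest, as the most linked-to page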
  13. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.14
    0.13841115 = product of:
      0.1845482 = sum of:
        0.05817665 = weight(_text_:web in 2861) [ClassicSimilarity], result of:
          0.05817665 = score(doc=2861,freq=8.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.36057037 = fieldWeight in 2861, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
        0.08729243 = weight(_text_:search in 2861) [ClassicSimilarity], result of:
          0.08729243 = score(doc=2861,freq=14.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.5079997 = fieldWeight in 2861, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
        0.03907912 = product of:
          0.07815824 = sum of:
            0.07815824 = weight(_text_:engine in 2861) [ClassicSimilarity], result of:
              0.07815824 = score(doc=2861,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.29552078 = fieldWeight in 2861, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2861)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Today's conventional search engines hardly provide the essential content relevant to the user's search query. This is because the context and semantics of the user's request are not analyzed to the full extent. Hence the need for semantic web search arises. Semantic web search (SWS) is an upcoming area of web search which combines Natural Language Processing and Artificial Intelligence. The objective of the work done here is to design, develop and implement a semantic search engine, SIEU (Semantic Information Extraction in University Domain), confined to the university domain. SIEU uses an ontology as a knowledge base for the information retrieval process. It is not a mere keyword search. It is one layer above what Google or any other search engine retrieves by analyzing just the keywords. Here the query is analyzed both syntactically and semantically. The developed system retrieves web results more relevant to the user query through keyword expansion. The results obtained will be accurate enough to satisfy the request made by the user. The level of accuracy is enhanced since the query is analyzed semantically. The system will be of great use to developers and researchers who work on the web. The Google results are re-ranked and optimized to provide the relevant links. For ranking, an algorithm has been applied which fetches more apt results for the user query.
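    The keyword expansion step described above amounts to widening the query with ontology neighbours before retrieval; a toy sketch (the mini-ontology is invented, not SIEU's actual knowledge base):

      # Hypothetical mini-ontology: term -> related terms in a university domain.
      ONTOLOGY = {
          "professor": ["faculty", "lecturer"],
          "course": ["module", "class"],
      }

      def expand_query(query):
          terms = query.lower().split()
          expanded = list(terms)
          for t in terms:
              expanded.extend(ONTOLOGY.get(t, []))  # add ontology neighbours
          return expanded

      print(expand_query("professor course"))
      # ['professor', 'course', 'faculty', 'lecturer', 'module', 'class']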
  14. Binghampton University Libraries: Comparing search engines (1998) 0.14
    0.1371822 = product of:
      0.2743644 = sum of:
        0.14931124 = weight(_text_:search in 1996) [ClassicSimilarity], result of:
          0.14931124 = score(doc=1996,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.86891925 = fieldWeight in 1996, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.125 = fieldNorm(doc=1996)
        0.12505318 = product of:
          0.25010636 = sum of:
            0.25010636 = weight(_text_:engine in 1996) [ClassicSimilarity], result of:
              0.25010636 = score(doc=1996,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.94566655 = fieldWeight in 1996, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.125 = fieldNorm(doc=1996)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    http://library.lib.binghampton.edu/webdocs/search-engine-comparison.html
  15. Zhang, L.; Liu, Q.L.; Zhang, J.; Wang, H.F.; Pan, Y.; Yu, Y.: Semplore: an IR approach to scalable hybrid query of Semantic Web data (2007) 0.14
    0.13666046 = product of:
      0.18221395 = sum of:
        0.09647507 = weight(_text_:web in 231) [ClassicSimilarity], result of:
          0.09647507 = score(doc=231,freq=22.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.59793836 = fieldWeight in 231, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=231)
        0.046659768 = weight(_text_:search in 231) [ClassicSimilarity], result of:
          0.046659768 = score(doc=231,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.27153727 = fieldWeight in 231, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=231)
        0.03907912 = product of:
          0.07815824 = sum of:
            0.07815824 = weight(_text_:engine in 231) [ClassicSimilarity], result of:
              0.07815824 = score(doc=231,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.29552078 = fieldWeight in 231, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=231)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    As an extension to the current Web, the Semantic Web will not only contain structured data with machine understandable semantics but also textual information. While structured queries can be used to find information more precisely on the Semantic Web, keyword searches are still needed to help exploit textual information. It thus becomes very important that we can combine precise structured queries with imprecise keyword searches to have a hybrid query capability. In addition, due to the huge volume of information on the Semantic Web, the hybrid query must be processed in a very scalable way. In this paper, we define such a hybrid query capability that combines unary tree-shaped structured queries with keyword searches. We show how existing information retrieval (IR) index structures and functions can be reused to index semantic web data and its textual information, and how the hybrid query is evaluated on the index structure using IR engines in an efficient and scalable manner. We implemented this IR approach in an engine called Semplore. Comprehensive experiments on its performance show that it is a promising approach. It leads us to believe that it may be possible to evolve current web search engines to query and search the Semantic Web. Finally, we briefly describe how Semplore is used for searching Wikipedia and an IBM customer's product information.
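    The hybrid idea - a structured constraint intersected with a keyword search - can be illustrated in a few lines; a toy sketch (the documents and field names are invented, not Semplore's index structures):

      # Hypothetical annotated documents: a structured type plus free text.
      DOCS = {
          1: {"type": "Actor", "text": "starred in several french films"},
          2: {"type": "Film",  "text": "a musical set in paris france"},
          3: {"type": "Film",  "text": "a thriller set in london"},
      }

      def hybrid_query(doc_type, keyword):
          # Structured part: exact match on type. IR part: keyword containment.
          structured = {d for d, v in DOCS.items() if v["type"] == doc_type}
          keyword_hits = {d for d, v in DOCS.items() if keyword in v["text"]}
          return structured & keyword_hits

      print(hybrid_query("Film", "paris"))  # {2}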
    Source
    ISWC'07/ASWC'07: Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference. Ed.: K. Aberer et al.
    Theme
    Semantic Web
  16. Fife, E.D.; Husch, L.: ¬The Mathematics Archives : making mathematics easy to find on the Web (1999) 0.13
    0.13291532 = product of:
      0.17722042 = sum of:
        0.041137107 = weight(_text_:web in 1239) [ClassicSimilarity], result of:
          0.041137107 = score(doc=1239,freq=4.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.25496176 = fieldWeight in 1239, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1239)
        0.08081709 = weight(_text_:search in 1239) [ClassicSimilarity], result of:
          0.08081709 = score(doc=1239,freq=12.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.47031635 = fieldWeight in 1239, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1239)
        0.05526622 = product of:
          0.11053244 = sum of:
            0.11053244 = weight(_text_:engine in 1239) [ClassicSimilarity], result of:
              0.11053244 = score(doc=1239,freq=4.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.41792953 = fieldWeight in 1239, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1239)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Do a search on AltaVista for "algebra". What do you get? Nearly 700,000 hits, of which AltaVista will allow you to view only what it determines is the top 200. Major search engines such as AltaVista, Excite, HotBot, Lycos, and the like continue to provide a valuable service, but with the recent growth of the Internet, topic-specific sites that provide some organization to the topic are increasingly important. It is the goal of the Mathematics Archives to make it easier for the ordinary user to find useful mathematical information on the Web. The Mathematics Archives (http://archives.math.utk.edu) is a multipurpose site for mathematics on the Internet. The focus is on materials which can be used in mathematics education (primarily at the undergraduate level). Resources available range from shareware and public domain software to electronic proceedings of various conferences, to an extensive collection of annotated links to other mathematical sites. All materials on the Archives are categorized and cross-referenced for the convenience of the user. Several search mechanisms are provided. The Harvest search engine is implemented to provide a full text search of most of the pages on the Archives. The software we house and our list of annotated links to mathematical sites are both categorized by subject matter. Each of these collections has a specialized search engine to assist the user in locating desired material. Services at the Mathematics Archives are divided into five broad topics: * Links organized by Mathematical Topics * Software * Teaching Materials * Other Math Archives Features * Other Links
  17. Lossau, N.: Search engine technology and digital libraries : libraries need to discover the academic internet (2004) 0.13
    0.13157946 = product of:
      0.17543927 = sum of:
        0.04072366 = weight(_text_:web in 1161) [ClassicSimilarity], result of:
          0.04072366 = score(doc=1161,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.25239927 = fieldWeight in 1161, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1161)
        0.08000484 = weight(_text_:search in 1161) [ClassicSimilarity], result of:
          0.08000484 = score(doc=1161,freq=6.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.46558946 = fieldWeight in 1161, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1161)
        0.05471077 = product of:
          0.10942154 = sum of:
            0.10942154 = weight(_text_:engine in 1161) [ClassicSimilarity], result of:
              0.10942154 = score(doc=1161,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.41372913 = fieldWeight in 1161, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1161)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    With the development of the World Wide Web, the "information search" has grown to be a significant business sector of a global, competitive and commercial market. Powerful players have entered this market, such as commercial internet search engines, information portals, multinational publishers and online content integrators. Will Google, Yahoo or Microsoft be the only portals to global knowledge in 2010? If libraries do not want to become marginalized in a key area of their traditional services, they need to acknowledge the challenges that come with the globalisation of scholarly information, the existence and further growth of the academic internet
  18. Mirizzi, R.: Exploratory browsing in the Web of Data (2011) 0.13
    0.12962143 = product of:
      0.17282857 = sum of:
        0.07618698 = weight(_text_:web in 4803) [ClassicSimilarity], result of:
          0.07618698 = score(doc=4803,freq=28.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.47219574 = fieldWeight in 4803, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4803)
        0.06928621 = weight(_text_:search in 4803) [ClassicSimilarity], result of:
          0.06928621 = score(doc=4803,freq=18.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.40321225 = fieldWeight in 4803, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4803)
        0.027355384 = product of:
          0.05471077 = sum of:
            0.05471077 = weight(_text_:engine in 4803) [ClassicSimilarity], result of:
              0.05471077 = score(doc=4803,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.20686457 = fieldWeight in 4803, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4803)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Thanks to the recent Linked Data initiative, the foundations of the Semantic Web have been built. Shared, open and linked RDF datasets give us the possibility of exploiting both the strong theoretical results and the robust technologies and tools developed since the seminal paper on the Semantic Web appeared in 2001. In a simplistic way, we may think of the Semantic Web as an ultra-large distributed database we can query to get information coming from different sources. In fact, every dataset exposes a SPARQL endpoint to make the data accessible through exact queries. If we know the URI of the famous actress Nicole Kidman in DBpedia, we may retrieve all the movies she acted in with a simple SPARQL query. We may then aggregate this information with user ratings and genres from IMDB. Even though these are very exciting results and applications, there is much more behind the curtains. Datasets come with a description of their schema, structured in an ontological way. Resources refer to classes, which are in turn organized in well-structured and rich ontologies. Exploiting this further feature, we go beyond the notion of a distributed database and can refer to the Semantic Web as a distributed knowledge base. If our knowledge base records that Paris is located in France (ontological level) and that Moulin Rouge! is set in Paris (data level), we may query the Semantic Web (interpreted as a set of interconnected datasets and related ontologies) to return all the movies starring Nicole Kidman set in France, and Moulin Rouge! will be in the final result set. The ontological level makes it possible to infer new relations among data.
    The Linked Data initiative and the state of the art in semantic technologies have led to a wave of brand-new search and mash-up applications. The basic idea is to have smarter lookup services for a huge, distributed and social knowledge base. All these applications capture and re-propose, from a semantic data perspective, the view of the classical Web as a distributed collection of documents to retrieve. The interlinked nature of the Web, and consequently of the Semantic Web, is exploited (just) to collect and aggregate data coming from different sources. Of course, this is a big step forward in search and Web technologies, but if we limit our investigation to retrieval tasks, we miss another important feature of the current Web: browsing, and in particular exploratory browsing (a.k.a. exploratory search). Thanks to its hyperlinked nature, the Web defined a new way of browsing documents and knowledge: selection by lookup, navigation and trial-and-error tactics were, and still are, exploited by users to search for relevant information satisfying some initial requirements. The basic assumptions behind a lookup search, typical of Information Retrieval (IR) systems, are no longer valid in an exploratory browsing context. An IR system, such as a search engine, assumes that the user has a clear picture of what she is looking for, and that she knows the terminology of the specific knowledge space. On the other hand, as argued in the literature, the main challenges in exploratory search can be summarized as: support querying and rapid query refinement; offer facets and metadata-based result filtering; leverage search context; support learning and understanding; offer visualization to support insight/decision making; facilitate collaboration. In Section 3 we will show two applications for exploratory search in the Semantic Web addressing some of the above challenges.
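    The facet counting and metadata-based filtering named among these challenges are easy to illustrate. The following toy sketch is an editorial illustration only; the result set, metadata fields and helper functions are invented and do not come from the thesis.

        # Toy sketch of facet counts and one-step query refinement over metadata.
        from collections import Counter

        results = [
            {"title": "Moulin Rouge!", "genre": "musical", "year": 2001},
            {"title": "The Others",    "genre": "horror",  "year": 2001},
            {"title": "Dogville",      "genre": "drama",   "year": 2003},
        ]

        def facet_counts(results, field):
            """Count how many results fall under each value of a metadata field."""
            return Counter(r[field] for r in results)

        def refine(results, field, value):
            """Narrow the result set by one facet selection."""
            return [r for r in results if r[field] == value]

        print(facet_counts(results, "year"))        # Counter({2001: 2, 2003: 1})
        print(refine(results, "genre", "musical"))  # keeps only 'Moulin Rouge!'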
    Theme
    Semantic Web
  19. Summann, F.; Lossau, N.: Search engine technology and digital libraries : moving from theory to practice (2004) 0.13
    0.12927133 = product of:
      0.17236176 = sum of:
        0.023270661 = weight(_text_:web in 1196) [ClassicSimilarity], result of:
          0.023270661 = score(doc=1196,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.14422815 = fieldWeight in 1196, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=1196)
        0.07918424 = weight(_text_:search in 1196) [ClassicSimilarity], result of:
          0.07918424 = score(doc=1196,freq=18.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.460814 = fieldWeight in 1196, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=1196)
        0.06990685 = product of:
          0.1398137 = sum of:
            0.1398137 = weight(_text_:engine in 1196) [ClassicSimilarity], result of:
              0.1398137 = score(doc=1196,freq=10.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.52864367 = fieldWeight in 1196, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1196)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    This article describes the journey from the conception of and vision for a modern search-engine-based search environment to its technological realisation. In doing so, it takes up the thread of an earlier article on this subject, this time from a technical viewpoint. As well as presenting the conceptual considerations of the initial stages, this article will principally elucidate the technological aspects of this journey. The starting point for the deliberations about development of an academic search engine was the experience we gained through the generally successful project "Digital Library NRW", in which, from 1998 to 2000, with Bielefeld University Library in overall charge, we designed a system model for an Internet-based library portal with an improved academic search environment at its core. At the heart of this system was a metasearch with an availability function, to which we added a user interface integrating all relevant source material for study and research. The deficiencies of this approach were felt soon after the system was launched in June 2001. There were problems with the stability and performance of the database retrieval system, with the integration of full-text documents and Internet pages, and with user acceptance, because users increasingly perform searches themselves using search engines rather than turning to the library for help. Since a long list of problems is also encountered when using commercial search engines for academic purposes (in particular the retrieval of academic information and long-term availability), the idea was born of a search engine configured specifically for academic use. We also hoped that, with a single access point founded on improved search engine technology, we could access the heterogeneous academic resources of subject-based bibliographic databases, catalogues, electronic newspapers, document servers and academic web pages.
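    The metasearch at the heart of the system model described above is easy to sketch: fan one query out to several heterogeneous sources in parallel and merge the hits. The source functions below are hypothetical placeholders, not the authors' implementation.

        # Hedged sketch of a metasearch fan-out; both backends are invented stubs.
        from concurrent.futures import ThreadPoolExecutor

        def search_catalogue(query):  # placeholder for a library catalogue backend
            return [{"source": "catalogue", "title": f"Catalogue hit for {query!r}"}]

        def search_document_server(query):  # placeholder for a full-text repository
            return [{"source": "docserver", "title": f"Repository hit for {query!r}"}]

        SOURCES = [search_catalogue, search_document_server]

        def metasearch(query):
            """Query all sources concurrently and flatten the results into one list."""
            with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
                result_lists = pool.map(lambda source: source(query), SOURCES)
            return [hit for hits in result_lists for hit in hits]

        print(metasearch("semantic web"))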
  20. Mayfield, J.; Finin, T.: Information retrieval on the Semantic Web : integrating inference and retrieval 0.13
    0.12704045 = product of:
      0.16938725 = sum of:
        0.09975218 = weight(_text_:web in 4330) [ClassicSimilarity], result of:
          0.09975218 = score(doc=4330,freq=12.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.6182494 = fieldWeight in 4330, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4330)
        0.046190813 = weight(_text_:search in 4330) [ClassicSimilarity], result of:
          0.046190813 = score(doc=4330,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.2688082 = fieldWeight in 4330, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4330)
        0.023444273 = product of:
          0.046888545 = sum of:
            0.046888545 = weight(_text_:22 in 4330) [ClassicSimilarity], result of:
              0.046888545 = score(doc=4330,freq=2.0), product of:
                0.17312855 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049439456 = queryNorm
                0.2708308 = fieldWeight in 4330, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4330)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    One vision of the Semantic Web is that it will be much like the Web we know today, except that documents will be enriched by annotations in machine-understandable markup. These annotations will provide metadata about the documents as well as machine-interpretable statements capturing some of the meaning of document content. We discuss how the information retrieval paradigm might be recast in such an environment. We suggest that retrieval can be tightly bound to inference. Doing so makes today's Web search engines useful to Semantic Web inference engines, and causes improvements in either retrieval or inference to lead directly to improvements in the other.
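    The coupling the authors describe can be illustrated with a toy sketch: a small RDFS-style subclass hierarchy (the inference side) expands a query term before it is looked up in an inverted index (the retrieval side). The hierarchy, index and document ids below are invented for illustration and are not the authors' system.

        # Toy sketch: inference-expanded retrieval over an invented inverted index.
        SUBCLASSES = {"film": ["musical", "documentary"]}  # term -> narrower terms

        INDEX = {  # inverted index: term -> document ids
            "film": {1}, "musical": {2}, "documentary": {3}, "actress": {2},
        }

        def infer_terms(term):
            """Close a query term over the subclass hierarchy (the inference step)."""
            terms = {term}
            for narrower in SUBCLASSES.get(term, []):
                terms |= infer_terms(narrower)
            return terms

        def retrieve(term):
            """Retrieve documents for the term plus everything inference adds to it."""
            docs = set()
            for t in infer_terms(term):
                docs |= INDEX.get(t, set())
            return docs

        print(retrieve("film"))  # {1, 2, 3}: inference widened plain retrieval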
    Date
    12. 2.2011 17:35:22
    Theme
    Semantic Web

Languages

  • e 489
  • d 167
  • a 7
  • el 4
  • f 2
  • i 2
  • es 1
  • nl 1

Types

  • a 292
  • i 17
  • n 13
  • r 13
  • x 11
  • s 10
  • m 9
  • p 4
  • b 2

Themes