Search (314 results, page 1 of 16)

Hooland, S. van; Verborgh, R.; Wilde, M. De; Hercher, J.; Mannens, E.; Wa, R. Van de: Evaluating the success of vocabulary reconciliation for cultural heritage collections (2013) 0.08

0.08257161 = product of:
  0.12385741 = sum of:
    0.00890397 = weight(_text_:a in 662) [ClassicSimilarity], result of:
      0.00890397 = score(doc=662,freq=10.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.1709182 = fieldWeight in 662, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=662)
    0.114953436 = sum of:
      0.07822566 = weight(_text_:de in 662) [ClassicSimilarity], result of:
        0.07822566 = score(doc=662,freq=4.0), product of:
          0.19416152 = queryWeight, product of:
            4.297489 = idf(docFreq=1634, maxDocs=44218)
            0.045180224 = queryNorm
          0.4028896 = fieldWeight in 662, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            4.297489 = idf(docFreq=1634, maxDocs=44218)
            0.046875 = fieldNorm(doc=662)
      0.03672778 = weight(_text_:22 in 662) [ClassicSimilarity], result of:
        0.03672778 = score(doc=662,freq=2.0), product of:
          0.15821345 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.045180224 = queryNorm
          0.23214069 = fieldWeight in 662, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=662)
  0.6666667 = coord(2/3)
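
The tree above can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names `idf` and `term_score` are illustrative and not part of any library. The inputs are copied from the explanation for doc 662.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by the classic TF-IDF similarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single query term."""
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # queryWeight = idf * queryNorm
    field_weight = tf * term_idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight


# Values taken from the explanation for doc 662.
QUERY_NORM = 0.045180224
MAX_DOCS = 44218
FIELD_NORM = 0.046875

score_a = term_score(10.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_de = term_score(4.0, 1634, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Two of the three top-level clauses matched, hence coord(2/3).
total = (score_a + (score_de + score_22)) * (2.0 / 3.0)
print(score_a, score_de, score_22, total)
```

Lucene computes these values in single precision, so the printed numbers should agree with the listing only up to the last one or two digits.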

Abstract: The concept of Linked Data has made its entrance in the cultural heritage sector due to its potential use for the integration of heterogeneous collections and deriving additional value out of existing metadata. However, practitioners and researchers alike need a better understanding of what outcome they can reasonably expect of the reconciliation process between their local metadata and established controlled vocabularies which are already a part of the Linked Data cloud. This paper offers an in-depth analysis of how a locally developed vocabulary can be successfully reconciled with the Library of Congress Subject Headings (LCSH) and the Arts and Architecture Thesaurus (AAT) through the help of a general-purpose tool for interactive data transformation (OpenRefine). Issues negatively affecting the reconciliation process are identified and solutions are proposed in order to derive maximum value from existing metadata and controlled vocabularies in an automated manner.
Date: 22. 3.2013 19:29:20
Type: a

Keyser, P. de: Indexing : from thesauri to the Semantic Web (2012) 0.06

0.06401577 = product of:
  0.09602365 = sum of:
    0.0039819763 = weight(_text_:a in 3197) [ClassicSimilarity], result of:
      0.0039819763 = score(doc=3197,freq=2.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.07643694 = fieldWeight in 3197, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=3197)
    0.09204167 = sum of:
      0.055313893 = weight(_text_:de in 3197) [ClassicSimilarity], result of:
        0.055313893 = score(doc=3197,freq=2.0), product of:
          0.19416152 = queryWeight, product of:
            4.297489 = idf(docFreq=1634, maxDocs=44218)
            0.045180224 = queryNorm
          0.28488597 = fieldWeight in 3197, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            4.297489 = idf(docFreq=1634, maxDocs=44218)
            0.046875 = fieldNorm(doc=3197)
      0.03672778 = weight(_text_:22 in 3197) [ClassicSimilarity], result of:
        0.03672778 = score(doc=3197,freq=2.0), product of:
          0.15821345 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.045180224 = queryNorm
          0.23214069 = fieldWeight in 3197, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=3197)
  0.6666667 = coord(2/3)

Abstract: Indexing consists of both novel and more traditional techniques. Cutting-edge indexing techniques, such as automatic indexing, ontologies, and topic maps, were developed independently of older techniques such as thesauri, but it is now recognized that these older methods also hold expertise. Indexing describes various traditional and novel indexing techniques, giving information professionals and students of library and information sciences a broad and comprehensible introduction to indexing. This title consists of twelve chapters: an Introduction to subject headings and thesauri; Automatic indexing versus manual indexing; Techniques applied in automatic indexing of text material; Automatic indexing of images; The black art of indexing moving images; Automatic indexing of music; Taxonomies and ontologies; Metadata formats and indexing; Tagging; Topic maps; Indexing the web; and The Semantic Web.
Date: 24. 8.2016 14:03:22

OWL Web Ontology Language Test Cases (2004) 0.04

0.040907413 = product of:
  0.12272224 = sum of:
    0.12272224 = sum of:
      0.07375186 = weight(_text_:de in 4685) [ClassicSimilarity], result of:
        0.07375186 = score(doc=4685,freq=2.0), product of:
          0.19416152 = queryWeight, product of:
            4.297489 = idf(docFreq=1634, maxDocs=44218)
            0.045180224 = queryNorm
          0.37984797 = fieldWeight in 4685, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            4.297489 = idf(docFreq=1634, maxDocs=44218)
            0.0625 = fieldNorm(doc=4685)
      0.048970375 = weight(_text_:22 in 4685) [ClassicSimilarity], result of:
        0.048970375 = score(doc=4685,freq=2.0), product of:
          0.15821345 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.045180224 = queryNorm
          0.30952093 = fieldWeight in 4685, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0625 = fieldNorm(doc=4685)
  0.33333334 = coord(1/3)

Date: 14. 8.2011 13:33:22
Editor: Carroll, J.J. and J. de Roo

Stojanovic, N.: Ontology-based Information Retrieval : methods and tools for cooperative query answering (2005) 0.03
```
0.03142788 = product of:
  0.047141816 = sum of:
    0.035879087 = product of:
      0.14351635 = sum of:
        0.14351635 = weight(_text_:3a in 701) [ClassicSimilarity], result of:
          0.14351635 = score(doc=701,freq=2.0), product of:
            0.38303843 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.045180224 = queryNorm
            0.3746787 = fieldWeight in 701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=701)
      0.25 = coord(1/4)
    0.011262729 = weight(_text_:a in 701) [ClassicSimilarity], result of:
      0.011262729 = score(doc=701,freq=36.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.2161963 = fieldWeight in 701, product of:
          6.0 = tf(freq=36.0), with freq of:
            36.0 = termFreq=36.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.03125 = fieldNorm(doc=701)
  0.6666667 = coord(2/3)
```
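
The nesting in the explanation above (a `product of` wrapping a `sum of`, each scaled by a `coord` factor) can be reproduced with a small evaluator. This is only a sketch: the `Node` class and `evaluate` function are hypothetical helpers written for illustration, and the leaf values are copied from the explanation for doc 701 rather than recomputed.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass
class Node:
    """One line of a score explanation: a leaf value, a sum, or a product."""
    op: str                                   # "leaf", "sum", or "product"
    value: float = 0.0                        # used only when op == "leaf"
    children: List["Node"] = field(default_factory=list)


def evaluate(node: Node) -> float:
    """Recursively combine child values according to the node's operator."""
    if node.op == "leaf":
        return node.value
    values = [evaluate(child) for child in node.children]
    return sum(values) if node.op == "sum" else prod(values)


def leaf(value: float) -> Node:
    return Node("leaf", value)


# Clause matching the term "3a": one of four sub-clauses matched, so coord(1/4).
clause_3a = Node("product", children=[
    Node("sum", children=[leaf(0.14351635)]),
    leaf(0.25),
])

# Top level: two of three clauses matched, so coord(2/3).
doc_701 = Node("product", children=[
    Node("sum", children=[clause_3a, leaf(0.011262729)]),
    leaf(2.0 / 3.0),
])

print(evaluate(doc_701))
```

The result should agree with the 0.03142788 shown at the top of the explanation, up to single-precision rounding.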
Abstract

With the explosion of possibilities for ubiquitous content production, the information overload problem has reached a level of complexity that can no longer be managed by traditional modelling approaches. Due to their purely syntactical nature, traditional information retrieval approaches did not succeed in treating content itself (i.e. its meaning, and not its representation). This leads to a very low usefulness of the results of a retrieval process for a user's task at hand. In the last ten years ontologies have emerged from an interesting conceptualisation paradigm to a very promising (semantic) modelling technology, especially in the context of the Semantic Web. From the information retrieval point of view, ontologies enable a machine-understandable form of content description, such that the retrieval process can be driven by the meaning of the content. However, the very ambiguous nature of the retrieval process in which a user, due to the unfamiliarity with the underlying repository and/or query syntax, just approximates his information need in a query, implies a necessity to include the user in the retrieval process more actively in order to close the gap between the meaning of the content and the meaning of a user's query (i.e. his information need). This thesis lays the foundation for such an ontology-based interactive retrieval process, in which the retrieval system interacts with a user in order to conceptually interpret the meaning of his query, whereas the underlying domain ontology drives the conceptualisation process. In that way the retrieval process evolves from a query evaluation process into a highly interactive cooperation between a user and the retrieval system, in which the system tries to anticipate the user's information need and to deliver the relevant content proactively.
Moreover, the notion of content relevance for a user's query evolves from a content dependent artefact to the multidimensional context-dependent structure, strongly influenced by the user's preferences. This cooperation process is realized as the so-called Librarian Agent Query Refinement Process. In order to clarify the impact of an ontology on the retrieval process (regarding its complexity and quality), a set of methods and tools for different levels of content and query formalisation is developed, ranging from pure ontology-based inferencing to keyword-based querying in which semantics automatically emerges from the results. Our evaluation studies have shown that the possibilities to conceptualize a user's information need in the right manner and to interpret the retrieval results accordingly are key issues for realizing much more meaningful information retrieval systems.

Content

Cf.: http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/1627.

Finke, M.; Risch, J.: "Match Me If You Can" : Sammeln und semantisches Aufbereiten von Fußballdaten (2017) 0.03

0.028123489 = product of:
  0.042185232 = sum of:
    0.0053093014 = weight(_text_:a in 3723) [ClassicSimilarity], result of:
      0.0053093014 = score(doc=3723,freq=2.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.10191591 = fieldWeight in 3723, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=3723)
    0.03687593 = product of:
      0.07375186 = sum of:
        0.07375186 = weight(_text_:de in 3723) [ClassicSimilarity], result of:
          0.07375186 = score(doc=3723,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.37984797 = fieldWeight in 3723, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0625 = fieldNorm(doc=3723)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Footnote: Cf.: www.info7.de/info7_2017-2_S-36-51.pdf.
Type: a

Virgilio, R. De; Cappellari, P.; Maccioni, A.; Torlone, R.: Path-oriented keyword search query over RDF (2012) 0.03
```
0.027582306 = product of:
  0.041373458 = sum of:
    0.008779433 = weight(_text_:a in 429) [ClassicSimilarity], result of:
      0.008779433 = score(doc=429,freq=14.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.1685276 = fieldWeight in 429, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=429)
    0.032594025 = product of:
      0.06518805 = sum of:
        0.06518805 = weight(_text_:de in 429) [ClassicSimilarity], result of:
          0.06518805 = score(doc=429,freq=4.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.33574134 = fieldWeight in 429, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=429)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

We are witnessing a smooth evolution of the Web from a worldwide information space of linked documents to a global knowledge base, where resources are identified by means of uniform resource identifiers (URIs, essentially string identifiers) and are semantically described and correlated through resource description framework (RDF, a metadata data model) statements. With the size and availability of data constantly increasing (currently around 7 billion RDF triples and 150 million RDF links), a fundamental problem lies in the difficulty users face to find and retrieve the information they are interested in. In general, to access semantic data, users need to know the organization of data and the syntax of a specific query language (e.g., SPARQL or variants thereof). Clearly, this represents an obstacle to information access for nonexpert users. For this reason, keyword search-based systems are increasingly capturing the attention of researchers. Recently, many approaches to keyword-based search over structured and semistructured data have been proposed. These approaches usually implement IR strategies on top of traditional database management systems with the goal of freeing the users from having to know data organization and query languages.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al.

Weiand, K.; Hartl, A.; Hausmann, S.; Furche, T.; Bry, F.: Keyword-based search over semantic data (2012) 0.03
```
0.027148133 = product of:
  0.0407222 = sum of:
    0.008128175 = weight(_text_:a in 432) [ClassicSimilarity], result of:
      0.008128175 = score(doc=432,freq=12.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.15602624 = fieldWeight in 432, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=432)
    0.032594025 = product of:
      0.06518805 = sum of:
        0.06518805 = weight(_text_:de in 432) [ClassicSimilarity], result of:
          0.06518805 = score(doc=432,freq=4.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.33574134 = fieldWeight in 432, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=432)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

For a long while, the creation of Web content required at least basic knowledge of Web technologies, meaning that for many Web users, the Web was de facto a read-only medium. This changed with the arrival of the "social Web," when Web applications started to allow users to publish Web content without technological expertise. Here, content creation is often an inclusive, iterative, and interactive process. Examples of social Web applications include blogs, social networking sites, as well as many specialized applications, for example, for saving and sharing bookmarks and publishing photos. Social semantic Web applications are social Web applications in which knowledge is expressed not only in the form of text and multimedia but also through informal to formal annotations that describe, reflect, and enhance the content. These annotations often take the shape of RDF graphs backed by ontologies, but less formal annotations such as free-form tags or tags from a controlled vocabulary may also be available. Wikis are one example of social Web applications for collecting and sharing knowledge. They allow users to easily create and edit documents, so-called wiki pages, using a Web browser. The pages in a wiki are often heavily interlinked, which makes it easy to find related information and browse the content.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al.

Papadakis, I. et al.: Highlighting timely information in libraries through social and semantic Web technologies (2016) 0.02

0.02482874 = product of:
  0.03724311 = sum of:
    0.0066366266 = weight(_text_:a in 2090) [ClassicSimilarity], result of:
      0.0066366266 = score(doc=2090,freq=2.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.12739488 = fieldWeight in 2090, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.078125 = fieldNorm(doc=2090)
    0.030606484 = product of:
      0.061212968 = sum of:
        0.061212968 = weight(_text_:22 in 2090) [ClassicSimilarity], result of:
          0.061212968 = score(doc=2090,freq=2.0), product of:
            0.15821345 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045180224 = queryNorm
            0.38690117 = fieldWeight in 2090, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2090)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
Type: a

De Luca, E.W.: Using multilingual lexical resources for extending the linked data cloud (2017) 0.02

0.024608051 = product of:
  0.036912076 = sum of:
    0.0046456386 = weight(_text_:a in 3506) [ClassicSimilarity], result of:
      0.0046456386 = score(doc=3506,freq=2.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.089176424 = fieldWeight in 3506, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3506)
    0.032266438 = product of:
      0.064532876 = sum of:
        0.064532876 = weight(_text_:de in 3506) [ClassicSimilarity], result of:
          0.064532876 = score(doc=3506,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.33236697 = fieldWeight in 3506, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3506)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Type: a

Faaborg, A.; Lagoze, C.: Semantic browsing (2003) 0.02

0.021869322 = product of:
  0.032803982 = sum of:
    0.011379444 = weight(_text_:a in 1026) [ClassicSimilarity], result of:
      0.011379444 = score(doc=1026,freq=12.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.21843673 = fieldWeight in 1026, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1026)
    0.02142454 = product of:
      0.04284908 = sum of:
        0.04284908 = weight(_text_:22 in 1026) [ClassicSimilarity], result of:
          0.04284908 = score(doc=1026,freq=2.0), product of:
            0.15821345 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045180224 = queryNorm
            0.2708308 = fieldWeight in 1026, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1026)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: We have created software applications that allow users to both author and use Semantic Web metadata. To create and use a layer of semantic content on top of the existing Web, we have (1) implemented a user interface that expedites the task of attributing metadata to resources on the Web, and (2) augmented a Web browser to leverage this semantic metadata to provide relevant information and tasks to the user. This project provides a framework for annotating and reorganizing existing files, pages, and sites on the Web that is similar to Vannevar Bush's original concepts of trail blazing and associative indexing.
Source: Research and advanced technology for digital libraries : 7th European Conference, proceedings / ECDL 2003, Trondheim, Norway, August 17-22, 2003
Type: a

Bianchini, D.; Antonellis, V. De: Linked data services and semantics-enabled mashup (2012) 0.02
```
0.021718508 = product of:
  0.03257776 = sum of:
    0.00650254 = weight(_text_:a in 435) [ClassicSimilarity], result of:
      0.00650254 = score(doc=435,freq=12.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.12482099 = fieldWeight in 435, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.03125 = fieldNorm(doc=435)
    0.02607522 = product of:
      0.05215044 = sum of:
        0.05215044 = weight(_text_:de in 435) [ClassicSimilarity], result of:
          0.05215044 = score(doc=435,freq=4.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.26859307 = fieldWeight in 435, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.03125 = fieldNorm(doc=435)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The Web of Linked Data can be seen as a global database, where resources are identified through URIs, are self-described (by means of the URI dereferencing mechanism), and are globally connected through RDF links. According to the Linked Data perspective, research attention is progressively shifting from data organization and representation to linkage and composition of the huge amount of data available on the Web. For example, at the time of this writing, the DBpedia knowledge base describes more than 3.5 million things, conceptualized through 672 million RDF triples, with 6.5 million external links into other RDF datasets. Useful applications have been provided for enabling people to browse this wealth of data, like Tabulator. Other systems have been implemented to collect, index, and provide advanced searching facilities over the Web of Linked Data, such as Watson and Sindice. Besides these applications, domain-specific systems to gather and mash up Linked Data have been proposed, like DBpedia Mobile and Revyu.com. DBpedia Mobile is a location-aware client for the semantic Web that can be used on an iPhone and other mobile devices. Based on the current GPS position of a mobile device, DBpedia Mobile renders a map indicating nearby locations from the DBpedia dataset. Starting from this map, the user can explore background information about his or her surroundings. Revyu.com is a Web site where you can review and rate whatever is possible to identify (through a URI) on the Web. Nevertheless, the potential advantages implicit in the Web of Linked Data are far from being fully exploited. Current applications hardly go beyond presenting together data gathered from different sources. Recently, research on the Web of Linked Data has been devoted to the study of models and languages to add functionalities to the Web of Linked Data by means of Linked Data services.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al.

Boer, V. de; Wielemaker, J.; Gent, J. van; Hildebrand, M.; Isaac, A.; Ossenbruggen, J. van; Schreiber, G.: Supporting linked data production for cultural heritage institutes : the Amsterdam Museum case study (2012) 0.02
```
0.021217927 = product of:
  0.03182689 = sum of:
    0.008779433 = weight(_text_:a in 265) [ClassicSimilarity], result of:
      0.008779433 = score(doc=265,freq=14.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.1685276 = fieldWeight in 265, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=265)
    0.023047457 = product of:
      0.046094913 = sum of:
        0.046094913 = weight(_text_:de in 265) [ClassicSimilarity], result of:
          0.046094913 = score(doc=265,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.23740499 = fieldWeight in 265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=265)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Within the cultural heritage field, proprietary metadata and vocabularies are being transformed into public Linked Data. These efforts have mostly been at the level of large-scale aggregators such as Europeana where the original data is abstracted to a common format and schema. Although this approach ensures a level of consistency and interoperability, the richness of the original data is lost in the process. In this paper, we present a transparent and interactive methodology for ingesting, converting and linking cultural heritage metadata into Linked Data. The methodology is designed to maintain the richness and detail of the original metadata. We introduce the XMLRDF conversion tool and describe how it is integrated in the ClioPatria semantic web toolkit. The methodology and the tools have been validated by converting the Amsterdam Museum metadata to a Linked Data version. In this way, the Amsterdam Museum became the first 'small' cultural heritage institution with a node in the Linked Data cloud.

Type

a

Zenz, G.; Zhou, X.; Minack, E.; Siberski, W.; Nejdl, W.: Interactive query construction for keyword search on the Semantic Web (2012) 0.02
```
0.021217927 = product of:
  0.03182689 = sum of:
    0.008779433 = weight(_text_:a in 430) [ClassicSimilarity], result of:
      0.008779433 = score(doc=430,freq=14.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.1685276 = fieldWeight in 430, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=430)
    0.023047457 = product of:
      0.046094913 = sum of:
        0.046094913 = weight(_text_:de in 430) [ClassicSimilarity], result of:
          0.046094913 = score(doc=430,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.23740499 = fieldWeight in 430, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=430)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

With the advance of the semantic Web, increasing amounts of data are available in a structured and machine-understandable form. This opens opportunities for users to employ semantic queries instead of simple keyword-based ones to accurately express the information need. However, constructing semantic queries is a demanding task for human users [11]. To compose a valid semantic query, a user has to (1) master a query language (e.g., SPARQL) and (2) acquire sufficient knowledge about the ontology or the schema of the data source. While there are systems which support this task with visual tools [21, 26] or natural language interfaces [3, 13, 14, 18], the process of query construction can still be complex and time consuming. According to [24], users prefer keyword search, and struggle with the construction of semantic queries although being supported with a natural language interface. Several keyword search approaches have already been proposed to ease information seeking on semantic data [16, 32, 35] or databases [1, 31]. However, keyword queries lack the expressivity to precisely describe the user's intent. As a result, ranking can at best put query intentions of the majority on top, making it impossible to take the intentions of all users into consideration.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al.

Ioannou, E.; Nejdl, W.; Niederée, C.; Velegrakis, Y.: Embracing uncertainty in entity linking (2012) 0.02
```
0.020783756 = product of:
  0.031175632 = sum of:
    0.008128175 = weight(_text_:a in 433) [ClassicSimilarity], result of:
      0.008128175 = score(doc=433,freq=12.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.15602624 = fieldWeight in 433, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=433)
    0.023047457 = product of:
      0.046094913 = sum of:
        0.046094913 = weight(_text_:de in 433) [ClassicSimilarity], result of:
          0.046094913 = score(doc=433,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.23740499 = fieldWeight in 433, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=433)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
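The score tree above can be recomputed by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (`tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`) and taking `queryNorm` and `fieldNorm` as given from the explain output; the function and constant names are illustrative only and not part of the search system.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as it appears in the explain output: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm           # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm      # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 433 (assumed correct as printed).
QUERY_NORM = 0.045180224
MAX_DOCS = 44218
FIELD_NORM = 0.0390625

score_a = term_score(12.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "a"
score_de = term_score(2.0, 1634, MAX_DOCS, QUERY_NORM, FIELD_NORM)    # term "de"

# Inner group: 1 of 2 clauses matched, so coord(1/2) scales the "de" score.
# Outer query: 2 of 3 clauses matched, so coord(2/3) scales the sum.
total = (score_a + score_de * (1 / 2)) * (2 / 3)
print(f"{total:.9f}")
```

The result should land close to the 0.020783756 reported for this record; small differences in the last digits are expected because the search engine computes in single precision while this sketch uses double precision.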
Abstract

The modern Web has grown from a publishing place of well-structured data and HTML pages for companies and experienced users into a vivid publishing and data exchange community in which everyone can participate, both as a data consumer and as a data producer. Unavoidably, the data available on the Web became highly heterogeneous, ranging from highly structured and semistructured to highly unstructured user-generated content, reflecting different perspectives and structuring principles. The full potential of such data can only be realized by combining information from multiple sources. For instance, the knowledge that is typically embedded in monolithic applications can be outsourced and, thus, used also in other applications. Numerous systems nowadays are already actively utilizing existing content from various sources such as WordNet or Wikipedia. Some well-known examples of such systems include DBpedia, Freebase, Spock, and DBLife. A major challenge during combining and querying information from multiple heterogeneous sources is entity linkage, i.e., the ability to detect whether two pieces of information correspond to the same real-world object. This chapter introduces a novel approach for addressing the entity linkage problem for heterogeneous, uncertain, and volatile data.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al
Calì, A.; Gottlob, G.; Pieris, A.: ¬The return of the entity-relationship model : ontological query answering (2012) 0.02
```
0.020783756 = product of:
  0.031175632 = sum of:
    0.008128175 = weight(_text_:a in 434) [ClassicSimilarity], result of:
      0.008128175 = score(doc=434,freq=12.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.15602624 = fieldWeight in 434, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=434)
    0.023047457 = product of:
      0.046094913 = sum of:
        0.046094913 = weight(_text_:de in 434) [ClassicSimilarity], result of:
          0.046094913 = score(doc=434,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.23740499 = fieldWeight in 434, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=434)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The Entity-Relationship (ER) model is a fundamental formalism for conceptual modeling in database design; it was introduced by Chen in his milestone paper, and it is now widely used, being flexible and easily understood by practitioners. With the rise of the Semantic Web, conceptual modeling formalisms have gained importance again as ontology formalisms, in the Semantic Web parlance. Ontologies and conceptual models are aimed at representing, rather than the structure of data, the domain of interest, that is, the fragment of the real world that is being represented by the data and the schema. Prominent formalisms for modeling ontologies are Description Logics (DLs), which are decidable fragments of first-order logic, particularly suitable for ontological modeling and querying. In particular, DL ontologies are sets of assertions describing sets of objects and (usually binary) relations among such sets, exactly in the same fashion as the ER model. Recently, research on DLs has been focusing on the problem of answering queries under ontologies, that is, given a query q, an instance B, and an ontology X, answering q under B and X amounts to computing the answers that are logically entailed from B by using the assertions of X. In this context, where data size is usually large, a central issue is the data complexity of query answering, i.e., the computational complexity with respect to the data set B only, while the ontology X and the query q are fixed.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al

Heflin, J.; Hendler, J.: Semantic interoperability on the Web (2000) 0.02

0.020477211 = product of:
  0.030715816 = sum of:
    0.009291277 = weight(_text_:a in 759) [ClassicSimilarity], result of:
      0.009291277 = score(doc=759,freq=8.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.17835285 = fieldWeight in 759, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=759)
    0.02142454 = product of:
      0.04284908 = sum of:
        0.04284908 = weight(_text_:22 in 759) [ClassicSimilarity], result of:
          0.04284908 = score(doc=759,freq=2.0), product of:
            0.15821345 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045180224 = queryNorm
            0.2708308 = fieldWeight in 759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=759)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: XML will have a profound impact on the way data is exchanged on the Internet. An important feature of this language is the separation of content from presentation, which makes it easier to select and/or reformat the data. However, due to the likelihood of numerous industry and domain specific DTDs, those who wish to integrate information will still be faced with the problem of semantic interoperability. In this paper we discuss why this problem is not solved by XML, and then discuss why the Resource Description Framework is only a partial solution. We then present the SHOE language, which we feel has many of the features necessary to enable a semantic web, and describe an existing set of tools that make it easy to use the language.
Date: 11. 5.2013 19:22:18
Type: a

Semantic search over the Web (2012) 0.02
```
0.020448808 = product of:
  0.03067321 = sum of:
    0.0045979903 = weight(_text_:a in 411) [ClassicSimilarity], result of:
      0.0045979903 = score(doc=411,freq=6.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.088261776 = fieldWeight in 411, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.03125 = fieldNorm(doc=411)
    0.02607522 = product of:
      0.05215044 = sum of:
        0.05215044 = weight(_text_:de in 411) [ClassicSimilarity], result of:
          0.05215044 = score(doc=411,freq=4.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.26859307 = fieldWeight in 411, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.03125 = fieldNorm(doc=411)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The Web has become the world's largest database, with search being the main tool that allows organizations and individuals to exploit its huge amount of information. Search on the Web has been traditionally based on textual and structural similarities, ignoring to a large degree the semantic dimension, i.e., understanding the meaning of the query and of the document content. Combining search and semantics gives birth to the idea of semantic search. Traditional search engines have already advertised some semantic dimensions. Some of them, for instance, can enhance their generated result sets with documents that are semantically related to the query terms even though they may not include these terms. Nevertheless, the exploitation of semantic search has not yet reached its full potential. In this book, Roberto De Virgilio, Francesco Guerra and Yannis Velegrakis present an extensive overview of the work done in Semantic Search and other related areas. They explore different technologies and solutions in depth, making their collection valuable and stimulating reading for both academic and industrial researchers. The book is divided into three parts. The first introduces the readers to the basic notions of the Web of Data. It describes the different kinds of data that exist, their topology, and their storing and indexing techniques. The second part is dedicated to Web Search. It presents different types of search, like the exploratory or the path-oriented, alongside methods for their efficient and effective implementation. Other related topics included in this part are the use of uncertainty in query answering, the exploitation of ontologies, and the use of semantics in mashup design and operation. The focus of the third part is on linked data, and more specifically, on applying ideas originating in recommender systems to linked data management, and on techniques for efficient query answering on linked data.

Content

Contents: Introduction.- Part I Introduction to Web of Data.- Topology of the Web of Data.- Storing and Indexing Massive RDF Data Sets.- Designing Exploratory Search Applications upon Web Data Sources.- Part II Search over the Web.- Path-oriented Keyword Search query over RDF.- Interactive Query Construction for Keyword Search on the Semantic Web.- Understanding the Semantics of Keyword Queries on Relational Data Without Accessing the Instance.- Keyword-Based Search over Semantic Data.- Semantic Link Discovery over Relational Data.- Embracing Uncertainty in Entity Linking.- The Return of the Entity-Relationship Model: Ontological Query Answering.- Linked Data Services and Semantics-enabled Mashup.- Part III Linked Data Search engines.- A Recommender System for Linked Data.- Flint: from Web Pages to Probabilistic Semantic Data.- Searching and Browsing Linked Data with SWSE.

Editor

Virgilio, R. de
Vocht, L. De: Exploring semantic relationships in the Web of Data : Semantische relaties verkennen in data op het web (2017) 0.02
```
0.02043378 = product of:
  0.030650668 = sum of:
    0.0076032113 = weight(_text_:a in 4232) [ClassicSimilarity], result of:
      0.0076032113 = score(doc=4232,freq=42.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.14594918 = fieldWeight in 4232, product of:
          6.4807405 = tf(freq=42.0), with freq of:
            42.0 = termFreq=42.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.01953125 = fieldNorm(doc=4232)
    0.023047457 = product of:
      0.046094913 = sum of:
        0.046094913 = weight(_text_:de in 4232) [ClassicSimilarity], result of:
          0.046094913 = score(doc=4232,freq=8.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.23740499 = fieldWeight in 4232, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.01953125 = fieldNorm(doc=4232)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

After the launch of the World Wide Web, it became clear that searching documents on the Web would not be trivial. Well-known engines to search the web, like Google, focus on search in web documents using keywords. The documents are structured and indexed to ensure keywords match documents as accurately as possible. However, searching by keywords does not always suffice. It is often the case that users do not know exactly how to formulate the search query or which keywords guarantee retrieving the most relevant documents. Besides that, users sometimes would rather browse information than look up something specific. It turned out that there is a need for systems that enable more interactivity and facilitate the gradual refinement of search queries to explore the Web. Users expect more from the Web because the short keyword-based queries they pose during search do not suffice for all cases. On top of that, the Web is changing structurally. The Web comprises, apart from a collection of documents, more and more linked data, pieces of information structured so they can be processed by machines. The consequently applied semantics allow users to indicate their search intentions to machines exactly. This is made possible by describing data following controlled vocabularies, concept lists composed by experts and published in a uniquely identifiable way on the Web. Even so, it is still not trivial to explore data on the Web. There is a large variety of vocabularies, and various data sources use different terms to identify the same concepts.
This PhD-thesis describes how to effectively explore linked data on the Web. The main focus is on scenarios where users want to discover relationships between resources rather than finding out more about something specific. Searching for a specific document or piece of information fits in the theoretical framework of information retrieval and is associated with exploratory search. Exploratory search goes beyond 'looking up something' when users are seeking more detailed understanding, further investigation or navigation of the initial search results.
The ideas behind exploratory search and querying linked data merge when it comes to the way knowledge is represented and indexed by machines - how data is structured and stored for optimal searchability. Queries and information should be aligned to facilitate that searches also reveal connections between results. This implies that they take into account the same semantic entities, relevant at that moment. To realize this, we research three techniques that are evaluated one by one in an experimental set-up to assess how well they succeed in their goals. In the end, the techniques are applied to a practical use case that focuses on forming a bridge between the Web and the use of digital libraries in scientific research.
Our first technique focuses on the interactive visualization of search results. Linked data resources can be brought in relation with each other at will. This leads to complex and diverse graph structures. Our technique facilitates navigation and supports a workflow that starts from a broad overview of the data, allows narrowing down to the desired level of detail, and then broadens again. To validate the flow, two visualizations were implemented and presented to test users. The users judged the usability of the visualizations, how the visualizations fit in the workflow and to which degree their features seemed useful for the exploration of linked data. There is a difference between the way users interact with resources, visually or textually, and how resources are represented for machines to be processed by algorithms. This difference complicates bridging the users' intents and machine-executable queries. It is important to implement this 'translation' mechanism so that it affects the search as favorably as possible in terms of performance, complexity and accuracy. To do this, we explain a second technique that supports such a bridging component. Our second technique is developed around three features that support the search process: looking up, relating and ranking resources. The main goal is to ensure that resources in the results are as precise and relevant as possible. During the evaluation of this technique, we not only looked at the precision of the search results but also investigated how the effectiveness of the search evolved while the user executed certain actions sequentially.
When we speak about finding relationships between resources, it is necessary to dive deeper into the structure. The graph structure of linked data, where the semantics give meaning to the relationships between resources, enables the execution of pathfinding algorithms. The assigned weights and heuristics are base components of such algorithms and ultimately define which resources are included in a path, and in which order. These paths explain indirect connections between resources. Our third technique proposes an algorithm that optimizes the choice of resources in terms of serendipity. Some optimizations guard the consistency of candidate paths, where the coherence of consecutive connections is maximized to avoid trivial and too arbitrary paths. The implementation uses the A* algorithm, the de facto reference when it comes to heuristically optimized minimal-cost paths. The effectiveness of paths was measured based on common automatic metrics and surveys where the users could indicate their preference for paths, generated each time in a different way. Finally, all our techniques are applied to a use case about publications in digital libraries where they are aligned with information about scientific conferences and researchers. The application to this use case is a practical example because the different aspects of exploratory search come together. In fact, the techniques also evolved from the experiences when implementing the use case. Practical details about the semantic model are explained and the implementation of the search system is clarified module by module. The evaluation positions the result, a prototype of a tool to explore scientific publications, researchers and conferences, next to some important alternatives.

Content

Dissertation submitted to obtain the degree of Doctor of Engineering Sciences: Computer Science. Cf.: https://www.researchgate.net/publication/319667837_Exploring_semantic_relationships_in_the_web_of_data.
Harth, A.; Hogan, A.; Umbrich, J.; Kinsella, S.; Polleres, A.; Decker, S.: Searching and browsing linked data with SWSE* (2012) 0.02
```
0.020311622 = product of:
  0.030467432 = sum of:
    0.0074199745 = weight(_text_:a in 410) [ClassicSimilarity], result of:
      0.0074199745 = score(doc=410,freq=10.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.14243183 = fieldWeight in 410, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=410)
    0.023047457 = product of:
      0.046094913 = sum of:
        0.046094913 = weight(_text_:de in 410) [ClassicSimilarity], result of:
          0.046094913 = score(doc=410,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.23740499 = fieldWeight in 410, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=410)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Web search engines such as Google, Yahoo!, MSN/Bing, and Ask are far from the consummate Web search solution: they do not typically produce direct answers to queries but instead recommend a selection of related documents from the Web. We note that in more recent years, search engines have begun to provide direct answers to prose queries matching certain common templates (for example, "population of china" or "12 euro in dollars"), but again, such functionality is limited to a small subset of popular user queries. Furthermore, search engines now provide individual and focused search interfaces over images, videos, locations, news articles, books, research papers, blogs, and real-time social media; although these tools are inarguably powerful, they are limited to their respective domains. In the general case, search engines are not suitable for complex information gathering tasks requiring aggregation from multiple indexed documents: for such tasks, users must manually aggregate tidbits of pertinent information from various pages. In effect, such limitations are predicated on the lack of machine-interpretable structure in HTML documents, which is often limited to generic markup tags mainly concerned with document rendering and linking. Most of the real content is contained in prose text, which is inherently difficult for machines to interpret.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al
Blanco, L.; Bronzi, M.; Crescenzi, V.; Merialdo, P.; Papotti, P.: Flint: from Web pages to probabilistic semantic data (2012) 0.02
```
0.020311622 = product of:
  0.030467432 = sum of:
    0.0074199745 = weight(_text_:a in 437) [ClassicSimilarity], result of:
      0.0074199745 = score(doc=437,freq=10.0), product of:
        0.05209492 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.045180224 = queryNorm
        0.14243183 = fieldWeight in 437, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=437)
    0.023047457 = product of:
      0.046094913 = sum of:
        0.046094913 = weight(_text_:de in 437) [ClassicSimilarity], result of:
          0.046094913 = score(doc=437,freq=2.0), product of:
            0.19416152 = queryWeight, product of:
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.045180224 = queryNorm
            0.23740499 = fieldWeight in 437, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.297489 = idf(docFreq=1634, maxDocs=44218)
              0.0390625 = fieldNorm(doc=437)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The Web is a surprisingly extensive source of information: it offers a huge number of sites containing data about a disparate range of topics. Although Web pages are built for human fruition, not for automatic processing of the data, we observe that an increasing number of Web sites deliver pages containing structured information about recognizable concepts, relevant to specific application domains, such as movies, finance, sport, products, etc. The development of scalable techniques to discover, extract, and integrate data from fairly structured large corpora available on the Web is a challenging issue, because to face the Web scale, these activities should be accomplished automatically by domain-independent techniques. To cope with the complexity and the heterogeneity of Web data, state-of-the-art approaches focus on information organized according to specific patterns that frequently occur on the Web. Meaningful examples are WebTables, which focuses on data published in HTML tables, and information extraction systems, such as TextRunner, which exploits lexical-syntactic patterns. As noticed by Cafarella et al., even if a small fraction of the Web is organized according to these patterns, due to the Web scale, the amount of data involved is impressive. In this chapter, we focus on methods and techniques to wring out value from the data delivered by large data-intensive Web sites.

Source

Semantic search over the Web. Eds.: R. De Virgilio, et al
