Search (366 results, page 2 of 19)

  • Active filter: language_ss:"e"
  • Active filter: type_ss:"el"
  1. Networked knowledge organization systems (2001) 0.03
    0.028236724 = product of:
      0.05647345 = sum of:
        0.031038022 = weight(_text_:data in 6473) [ClassicSimilarity], result of:
          0.031038022 = score(doc=6473,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2096163 = fieldWeight in 6473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=6473)
        0.025435425 = product of:
          0.05087085 = sum of:
            0.05087085 = weight(_text_:processing in 6473) [ClassicSimilarity], result of:
              0.05087085 = score(doc=6473,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.26835677 = fieldWeight in 6473, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6473)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
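     The nested breakdown above is Lucene's "explain" output for the ClassicSimilarity (TF-IDF) ranking used by this search: each matching term contributes queryWeight (idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), and coord() factors down-weight queries whose clauses only partly match. A minimal Python sketch that recomputes the score of this first hit from the components reported in the tree:

```python
import math

def classic_term_score(freq, idf, field_norm, query_norm):
    """Per-term score in Lucene ClassicSimilarity:
    queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm."""
    tf = math.sqrt(freq)                  # 1.4142135 for freq=2.0
    query_weight = idf * query_norm       # e.g. 0.14807065 for _text_:data
    field_weight = tf * idf * field_norm  # e.g. 0.2096163
    return query_weight * field_weight

query_norm = 0.046827413
field_norm = 0.046875   # fieldNorm(doc=6473)

data = classic_term_score(2.0, 3.1620505, field_norm, query_norm)       # ~0.031038022
processing = classic_term_score(2.0, 4.048147, field_norm, query_norm)  # ~0.05087085

# The 'processing' clause sits in a nested boolean query: coord(1/2) = 0.5
inner = data + 0.5 * processing   # ~0.05647345, the "sum of" line
total = 0.5 * inner               # coord(2/4) at the top level
print(round(total, 9))            # ~0.028236724, the reported score
```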
    
    Abstract
     Knowledge Organization Systems can comprise thesauri and other controlled lists of keywords, ontologies, classification systems, clustering approaches, taxonomies, gazetteers, dictionaries, lexical databases, concept maps/spaces, semantic road maps, etc. These schemas enable knowledge structuring and management, knowledge-based data processing and systematic access to knowledge structures in individual collections and digital libraries. Used as interactive information services on the Internet, they have an increased potential to support the description, discovery and retrieval of heterogeneous information resources and to contribute to an overall resource discovery infrastructure.
  2. Mongin, L.; Fu, Y.Y.; Mostafa, J.: Open Archives data Service prototype and automated subject indexing using D-Lib archive content as a testbed (2003) 0.03
    0.028236724 = product of:
      0.05647345 = sum of:
        0.031038022 = weight(_text_:data in 1167) [ClassicSimilarity], result of:
          0.031038022 = score(doc=1167,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2096163 = fieldWeight in 1167, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=1167)
        0.025435425 = product of:
          0.05087085 = sum of:
            0.05087085 = weight(_text_:processing in 1167) [ClassicSimilarity], result of:
              0.05087085 = score(doc=1167,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.26835677 = fieldWeight in 1167, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1167)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     The Indiana University School of Library and Information Science opened a new research laboratory in January 2003: the Indiana University School of Library and Information Science Information Processing Laboratory (IU IP Lab). The purpose of the new laboratory is to facilitate collaboration between scientists in the department in the areas of information retrieval (IR) and information visualization (IV) research. The lab has several areas of focus. These include grid and cluster computing, and a standard Java-based software platform to support plug-and-play research datasets, a selection of standard IR modules and standard IV algorithms. Future development includes software to enable researchers to contribute datasets, IR algorithms, and visualization algorithms into the standard environment. We decided early on to use OAI-PMH as a resource discovery tool because it is consistent with our mission.
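     The abstract names OAI-PMH as the lab's resource discovery mechanism. As a hedged illustration of that protocol (the endpoint below is a placeholder, not the IU IP Lab's actual service), a minimal Dublin Core harvest in Python looks like this:

```python
import requests
import xml.etree.ElementTree as ET

# Hypothetical endpoint; any OAI-PMH repository exposes the same verbs.
ENDPOINT = "https://example.org/oai"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

resp = requests.get(ENDPOINT, params={"verb": "ListRecords",
                                      "metadataPrefix": "oai_dc"})
root = ET.fromstring(resp.content)

for record in root.findall(".//oai:record", NS):
    identifier = record.findtext(".//oai:identifier", namespaces=NS)
    title = record.findtext(".//dc:title", namespaces=NS)
    print(identifier, "-", title)
```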
  3. Šnajder, J.; Almić, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.03
    0.028236724 = product of:
      0.05647345 = sum of:
        0.031038022 = weight(_text_:data in 2920) [ClassicSimilarity], result of:
          0.031038022 = score(doc=2920,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2096163 = fieldWeight in 2920, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2920)
        0.025435425 = product of:
          0.05087085 = sum of:
            0.05087085 = weight(_text_:processing in 2920) [ClassicSimilarity], result of:
              0.05087085 = score(doc=2920,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.26835677 = fieldWeight in 2920, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2920)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     A distinguishing feature of many multiword expressions (MWEs) is their semantic non-compositionality. Determining the semantic compositionality of MWEs is important for many natural language processing tasks. We address the task of modeling semantic compositionality of Croatian MWEs. We adopt a composition-based approach within the distributional semantics framework. We build and evaluate models based on Latent Semantic Analysis and the recently proposed neural network-based Skip-gram model, and experiment with different composition functions. We show that the compositionality scores predicted by the Skip-gram additive models correlate well with human judgments (a correlation of 0.50). When framed as a classification task, the model achieves an accuracy of 0.64.
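     To make the additive composition model concrete: the MWE's predicted vector is the sum of its components' vectors, and compositionality is scored by how close that prediction is to the vector learned for the MWE itself. A toy sketch with NumPy vectors standing in for trained Skip-gram embeddings (not the authors' code; the example MWE is illustrative):

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings; in the paper these come from a trained Skip-gram model.
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ("morski", "pas", "morski_pas")}

# Additive composition: predicted MWE vector = sum of component vectors.
predicted = emb["morski"] + emb["pas"]
observed = emb["morski_pas"]   # vector learned for the MWE token itself

# High similarity -> compositional; low similarity -> idiomatic MWE.
print(round(cosine(predicted, observed), 3))
```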
    Content
     See: http://takelab.fer.hr/data/cromwesc/. The dataset is available as TakeLab-CroMWEsc.tar.gz. The archive contains one file, which contains a list of 200 Croatian multiword expressions annotated with semantic compositionality scores. Twenty expressions were annotated by 24 annotators (denoted by "*") and the rest of them were annotated by 6 annotators. Besides the median, we provide the mode, mean, and standard deviation for each expression. Consult the above-mentioned paper for details.
  4. Baker, T.; Bermès, E.; Coyle, K.; Dunsire, G.; Isaac, A.; Murray, P.; Panzer, M.; Schneider, J.; Singer, R.; Summers, E.; Waites, W.; Young, J.; Zeng, M.: Library Linked Data Incubator Group Final Report (2011) 0.03
    0.027372966 = product of:
      0.10949186 = sum of:
        0.10949186 = weight(_text_:data in 4796) [ClassicSimilarity], result of:
          0.10949186 = score(doc=4796,freq=56.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.7394569 = fieldWeight in 4796, product of:
              7.483315 = tf(freq=56.0), with freq of:
                56.0 = termFreq=56.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=4796)
      0.25 = coord(1/4)
    
    Abstract
     The mission of the W3C Library Linked Data Incubator Group, chartered from May 2010 through August 2011, has been "to help increase global interoperability of library data on the Web, by bringing together people involved in Semantic Web activities - focusing on Linked Data - in the library community and beyond, building on existing initiatives, and identifying collaboration tracks for the future." In Linked Data [LINKEDDATA], data is expressed using standards such as Resource Description Framework (RDF) [RDF], which specifies relationships between things, and Uniform Resource Identifiers (URIs, or "Web addresses") [URI]. This final report of the Incubator Group examines how Semantic Web standards and Linked Data principles can be used to make the valuable information assets that libraries create and curate - resources such as bibliographic data, authorities, and concept schemes - more visible and re-usable outside of their original library context on the wider Web. The Incubator Group began by eliciting reports on relevant activities from parties ranging from small, independent projects to national library initiatives (see the separate report, Library Linked Data Incubator Group: Use Cases) [USECASE]. These use cases provided the starting point for the work summarized in the report: an analysis of the benefits of library Linked Data, a discussion of current issues with regard to traditional library data, existing library Linked Data initiatives, and legal rights over library data; and recommendations for next steps. The report also summarizes the results of a survey of current Linked Data technologies and an inventory of library Linked Data resources available today (see also the more detailed report, Library Linked Data Incubator Group: Datasets, Value Vocabularies, and Metadata Element Sets) [VOCABDATASET].
    Key recommendations of the report are: - That library leaders identify sets of data as possible candidates for early exposure as Linked Data and foster a discussion about Open Data and rights; - That library standards bodies increase library participation in Semantic Web standardization, develop library data standards that are compatible with Linked Data, and disseminate best-practice design patterns tailored to library Linked Data; - That data and systems designers design enhanced user services based on Linked Data capabilities, create URIs for the items in library datasets, develop policies for managing RDF vocabularies and their URIs, and express library data by re-using or mapping to existing Linked Data vocabularies; - That librarians and archivists preserve Linked Data element sets and value vocabularies and apply library experience in curation and long-term preservation to Linked Data datasets.
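     Two of the recommendations - minting URIs for items in library datasets and re-using existing Linked Data vocabularies - can be shown in a few RDF triples. A minimal sketch with rdflib, using invented URIs and Dublin Core terms (not an excerpt from the report):

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, RDF

g = Graph()

# Hypothetical URI minted for a bibliographic record.
book = URIRef("http://example.org/bib/record/12345")

g.add((book, RDF.type, DCTERMS.BibliographicResource))
g.add((book, DCTERMS.title, Literal("Networked knowledge organization systems")))
g.add((book, DCTERMS.issued, Literal("2001")))
# Re-use an external authority URI (placeholder identifier) instead of a local string.
g.add((book, DCTERMS.creator, URIRef("http://viaf.org/viaf/000000000")))

print(g.serialize(format="turtle"))
```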
  5. Cohen, D.J.: From Babel to knowledge : data mining large digital collections (2006) 0.03
    0.026398288 = product of:
      0.052796576 = sum of:
        0.035839625 = weight(_text_:data in 1178) [ClassicSimilarity], result of:
          0.035839625 = score(doc=1178,freq=6.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24204408 = fieldWeight in 1178, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1178)
        0.016956951 = product of:
          0.033913903 = sum of:
            0.033913903 = weight(_text_:processing in 1178) [ClassicSimilarity], result of:
              0.033913903 = score(doc=1178,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.17890452 = fieldWeight in 1178, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1178)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     In Jorge Luis Borges's curious short story The Library of Babel, the narrator describes an endless collection of books stored from floor to ceiling in a labyrinth of countless hexagonal rooms. The pages of the library's books seem to contain random sequences of letters and spaces; occasionally a few intelligible words emerge in the sea of paper and ink. Nevertheless, readers diligently, and exasperatingly, scan the shelves for coherent passages. The narrator himself has wandered numerous rooms in search of enlightenment, but with resignation he simply awaits his death and burial - which Borges explains (with signature dark humor) consists of being tossed unceremoniously over the library's banister. Borges's nightmare, of course, is a cursed vision of the research methods of disciplines such as literature, history, and philosophy, where the careful reading of books, one after the other, is supposed to lead inexorably to knowledge and understanding. Computer scientists would approach Borges's library far differently. Employing the information theory that forms the basis for search engines and other computerized techniques for assessing in one fell swoop large masses of documents, they would quickly realize the collection's incoherence through sampling and statistical methods - and wisely start looking for the library's exit. These computational methods, which allow us to find patterns, determine relationships, categorize documents, and extract information from massive corpuses, will form the basis for new tools for research in the humanities and other disciplines in the coming decade. For the past three years I have been experimenting with how to provide such end-user tools - that is, tools that harness the power of vast electronic collections while hiding much of their complicated technical plumbing. In particular, I have made extensive use of the application programming interfaces (APIs) the leading search engines provide for programmers to query their databases directly (from server to server without using their web interfaces). In addition, I have explored how one might extract information from large digital collections, from the well-curated lexicographic database WordNet to the democratic (and poorly curated) online reference work Wikipedia. While processing these digital corpuses is currently an imperfect science, even now useful tools can be created by combining various collections and methods for searching and analyzing them. And more importantly, these nascent services suggest a future in which information can be gleaned from, and sense can be made out of, even imperfect digital libraries of enormous scale. A brief examination of two approaches to data mining large digital collections hints at this future, while also providing some lessons about how to get there.
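     As a hedged illustration of the kind of lookup the essay describes against WordNet (NLTK is one common access route, not necessarily the tooling the author used):

```python
import nltk
nltk.download("wordnet", quiet=True)   # one-time download of the WordNet data
from nltk.corpus import wordnet as wn

# Each synset is one sense of the word, with a gloss and hypernym links.
for synset in wn.synsets("library"):
    print(synset.name(), "-", synset.definition())
    for hypernym in synset.hypernyms():
        print("   is-a:", hypernym.name())
```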
    Theme
    Data Mining
  6. Monireh, E.; Sarker, M.K.; Bianchi, F.; Hitzler, P.; Doran, D.; Xie, N.: Reasoning over RDF knowledge bases using deep learning (2018) 0.03
    0.026219916 = product of:
      0.05243983 = sum of:
        0.03657866 = weight(_text_:data in 4553) [ClassicSimilarity], result of:
          0.03657866 = score(doc=4553,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24703519 = fieldWeight in 4553, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4553)
        0.01586117 = product of:
          0.03172234 = sum of:
            0.03172234 = weight(_text_:22 in 4553) [ClassicSimilarity], result of:
              0.03172234 = score(doc=4553,freq=2.0), product of:
                0.16398162 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046827413 = queryNorm
                0.19345059 = fieldWeight in 4553, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4553)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Semantic Web knowledge representation standards, and in particular RDF and OWL, often come endowed with a formal semantics which is considered to be of fundamental importance for the field. Reasoning, i.e., the drawing of logical inferences from knowledge expressed in such standards, is traditionally based on logical deductive methods and algorithms which can be proven to be sound and complete and terminating, i.e. correct in a very strong sense. For various reasons, though, in particular the scalability issues arising from the ever increasing amounts of Semantic Web data available and the inability of deductive algorithms to deal with noise in the data, it has been argued that alternative means of reasoning should be investigated which bear high promise for high scalability and better robustness. From this perspective, deductive algorithms can be considered the gold standard regarding correctness against which alternative methods need to be tested. In this paper, we show that it is possible to train a Deep Learning system on RDF knowledge graphs, such that it is able to perform reasoning over new RDF knowledge graphs, with high precision and recall compared to the deductive gold standard.
    Date
    16.11.2018 14:22:01
  7. Ding, J.: Can data die? : why one of the Internet's oldest images lives on without its subject's consent (2021) 0.03
    0.025747139 = product of:
      0.051494278 = sum of:
        0.040896185 = weight(_text_:data in 423) [ClassicSimilarity], result of:
          0.040896185 = score(doc=423,freq=20.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.27619374 = fieldWeight in 423, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.01953125 = fieldNorm(doc=423)
        0.010598094 = product of:
          0.021196188 = sum of:
            0.021196188 = weight(_text_:processing in 423) [ClassicSimilarity], result of:
              0.021196188 = score(doc=423,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.111815326 = fieldWeight in 423, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=423)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Lena Forsén, the real human behind the Lenna image, was first published in Playboy in 1972. Soon after, USC engineers searching for a suitable test image for their image processing research sought inspiration from the magazine. They deemed Lenna the right fit and scanned the image into digital, RGB existence. From here, the story of the image follows the story of the internet. Lenna was one of the first inhabitants of ARPANet, the internet's predecessor, and then the world wide web. While the image's reach was limited to a few research papers in the '70s and '80s, in 1991, Lenna was featured on the cover of an engineering journal alongside another popular test image, Peppers. This caught the attention of Playboy, which threatened a copyright infringement lawsuit. Engineers who had grown attached to Lenna fought back. Ultimately, they prevailed, and as a Playboy VP reflected on the drama: "We decided we should exploit this because it is a phenomenon." The Playboy controversy canonized Lenna in engineering folklore and prompted an explosion of conversation about the image. Image hits on the internet rose to a peak number in 1995.
    Content
    "Having known Lenna for almost a decade, I have struggled to understand what the story of the image means for what tech culture is and what it is becoming. To me, the crux of the Lenna story is how little power we have over our data and how it is used and abused. This threat seems disproportionately higher for women who are often overrepresented in internet content, but underrepresented in internet company leadership and decision making. Given this reality, engineering and product decisions will continue to consciously (and unconsciously) exclude our needs and concerns. While social norms are changing towards non-consensual data collection and data exploitation, digital norms seem to be moving in the opposite direction. Advancements in machine learning algorithms and data storage capabilities are only making data misuse easier. Whether the outcome is revenge porn or targeted ads, surveillance or discriminatory AI, if we want a world where our data can retire when it's outlived its time, or when it's directly harming our lives, we must create the tools and policies that empower data subjects to have a say in what happens to their data. including allowing their data to die."
  8. Bizer, C.; Cyganiak, R.; Heath, T.: How to publish Linked Data on the Web (2007) 0.03
    0.025605064 = product of:
      0.102420256 = sum of:
        0.102420256 = weight(_text_:data in 3791) [ClassicSimilarity], result of:
          0.102420256 = score(doc=3791,freq=16.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.69169855 = fieldWeight in 3791, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3791)
      0.25 = coord(1/4)
    
    Abstract
    This document provides a tutorial on how to publish Linked Data on the Web. After a general overview of the concept of Linked Data, we describe several practical recipes for publishing information as Linked Data on the Web.
    Content
     This tutorial has been superseded by the book Linked Data: Evolving the Web into a Global Data Space written by Tom Heath and Christian Bizer. This tutorial was published in 2007 and is still online for historical reasons. The Linked Data book was published in 2011 and provides a more detailed and up-to-date introduction to Linked Data.
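     One of the tutorial's practical recipes is serving both a human-readable and a machine-readable representation of the same URI via HTTP content negotiation. A rough sketch of that pattern as a toy Flask handler (not code from the tutorial):

```python
from flask import Flask, Response, request

app = Flask(__name__)

# Toy record; in practice this would come from a triple store.
RECORD_TTL = """@prefix dcterms: <http://purl.org/dc/terms/> .
<http://example.org/resource/1> dcterms:title "An example resource" ."""

@app.route("/resource/1")
def resource():
    accept = request.headers.get("Accept", "")
    if "text/turtle" in accept or "application/rdf+xml" in accept:
        # RDF view for Linked Data clients.
        return Response(RECORD_TTL, mimetype="text/turtle")
    # Default HTML view for browsers.
    return Response("<h1>An example resource</h1>", mimetype="text/html")

if __name__ == "__main__":
    app.run()
```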
  9. Priss, U.: Description logic and faceted knowledge representation (1999) 0.03
    0.025035713 = product of:
      0.050071426 = sum of:
        0.031038022 = weight(_text_:data in 2655) [ClassicSimilarity], result of:
          0.031038022 = score(doc=2655,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2096163 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2655)
        0.019033402 = product of:
          0.038066804 = sum of:
            0.038066804 = weight(_text_:22 in 2655) [ClassicSimilarity], result of:
              0.038066804 = score(doc=2655,freq=2.0), product of:
                0.16398162 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046827413 = queryNorm
                0.23214069 = fieldWeight in 2655, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2655)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The term "facet" was introduced into the field of library classification systems by Ranganathan in the 1930's [Ranganathan, 1962]. A facet is a viewpoint or aspect. In contrast to traditional classification systems, faceted systems are modular in that a domain is analyzed in terms of baseline facets which are then synthesized. In this paper, the term "facet" is used in a broader meaning. Facets can describe different aspects on the same level of abstraction or the same aspect on different levels of abstraction. The notion of facets is related to database views, multicontexts and conceptual scaling in formal concept analysis [Ganter and Wille, 1999], polymorphism in object-oriented design, aspect-oriented programming, views and contexts in description logic and semantic networks. This paper presents a definition of facets in terms of faceted knowledge representation that incorporates the traditional narrower notion of facets and potentially facilitates translation between different knowledge representation formalisms. A goal of this approach is a modular, machine-aided knowledge base design mechanism. A possible application is faceted thesaurus construction for information retrieval and data mining. Reasoning complexity depends on the size of the modules (facets). A more general analysis of complexity will be left for future research.
    Date
    22. 1.2016 17:30:31
  10. Wright, H.: Semantic Web and ontologies (2018) 0.02
    0.023951344 = product of:
      0.09580538 = sum of:
        0.09580538 = weight(_text_:data in 80) [ClassicSimilarity], result of:
          0.09580538 = score(doc=80,freq=14.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.64702475 = fieldWeight in 80, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=80)
      0.25 = coord(1/4)
    
    Abstract
     The Semantic Web and ontologies can help archaeologists combine and share data, making it more open and useful. Archaeologists create diverse types of data, using a wide variety of technologies and methodologies. As in all research domains, these data are increasingly digital. The creation of data that are now openly and persistently available from disparate sources has also inspired efforts to bring archaeological resources together and make them more interoperable. This allows functionality such as federated cross-search across different datasets, and the mapping of heterogeneous data to authoritative structures to build a single data source. Ontologies provide the structure and relationships for Semantic Web data, and have been developed for use in cultural heritage applications generally, and archaeology specifically. A variety of online resources for archaeology now incorporate Semantic Web principles and technologies.
  11. Donahue, J.; Hendricks, L.A.; Guadarrama, S.; Rohrbach, M.; Venugopalan, S.; Saenko, K.; Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description (2014) 0.02
    0.023530604 = product of:
      0.04706121 = sum of:
        0.02586502 = weight(_text_:data in 1873) [ClassicSimilarity], result of:
          0.02586502 = score(doc=1873,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.17468026 = fieldWeight in 1873, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1873)
        0.021196188 = product of:
          0.042392377 = sum of:
            0.042392377 = weight(_text_:processing in 1873) [ClassicSimilarity], result of:
              0.042392377 = score(doc=1873,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.22363065 = fieldWeight in 1873, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1873)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or "temporally deep", are effective for tasks involving sequences, visual and otherwise. We develop a novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable, and demonstrate the value of these models on benchmark video recognition tasks, image description and retrieval problems, and video narration challenges. In contrast to current models which assume a fixed spatio-temporal receptive field or simple temporal averaging for sequential processing, recurrent convolutional models are "doubly deep" in that they can be compositional in spatial and temporal "layers". Such models may have advantages when target concepts are complex and/or training data are limited. Learning long-term dependencies is possible when nonlinearities are incorporated into the network state updates. Long-term RNN models are appealing in that they directly can map variable-length inputs (e.g., video frames) to variable length outputs (e.g., natural language text) and can model complex temporal dynamics; yet they can be optimized with backpropagation. Our recurrent long-term models are directly connected to modern visual convnet models and can be jointly trained to simultaneously learn temporal dynamics and convolutional perceptual representations. Our results show such models have distinct advantages over state-of-the-art models for recognition or generation which are separately defined and/or optimized.
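     A rough PyTorch sketch of the "recurrent convolutional" idea described above - per-frame convolutional features fed into an LSTM over time - with toy dimensions chosen for brevity (this is not the authors' LRCN implementation):

```python
import torch
import torch.nn as nn

class TinyLRCN(nn.Module):
    """Per-frame CNN encoder followed by an LSTM over the frame sequence."""
    def __init__(self, num_classes=10, feat_dim=64, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, video):               # video: (batch, time, 3, H, W)
        b, t = video.shape[:2]
        frames = video.flatten(0, 1)        # (batch*time, 3, H, W)
        feats = self.cnn(frames).view(b, t, -1)
        out, _ = self.lstm(feats)           # temporal modeling over frame features
        return self.head(out[:, -1])        # classify from the last time step

logits = TinyLRCN()(torch.randn(2, 8, 3, 64, 64))
print(logits.shape)                         # torch.Size([2, 10])
```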
  12. Hill, L.L.; Frew, J.; Zheng, Q.: Geographic names : the implementation of a gazetteer in a georeferenced digital library (1999) 0.02
    0.023109939 = product of:
      0.046219878 = sum of:
        0.029262928 = weight(_text_:data in 1240) [ClassicSimilarity], result of:
          0.029262928 = score(doc=1240,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.19762816 = fieldWeight in 1240, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1240)
        0.016956951 = product of:
          0.033913903 = sum of:
            0.033913903 = weight(_text_:processing in 1240) [ClassicSimilarity], result of:
              0.033913903 = score(doc=1240,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.17890452 = fieldWeight in 1240, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1240)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     The Alexandria Digital Library (ADL) Project has developed a content standard for gazetteer objects and a hierarchical type scheme for geographic features. Both of these developments are based on ADL experience with an earlier gazetteer component for the Library, based on two gazetteers maintained by the U.S. federal government. We define the minimum components of a gazetteer entry as (1) a geographic name, (2) a geographic location represented by coordinates, and (3) a type designation. With these attributes, a gazetteer can function as a tool for indirect spatial location identification through names and types. The ADL Gazetteer Content Standard supports contribution and sharing of gazetteer entries with rich descriptions beyond the minimum requirements. This paper describes the content standard, the feature type thesaurus, and the implementation and research issues. A gazetteer is a list of geographic names, together with their geographic locations and other descriptive information. A geographic name is a proper name for a geographic place and feature, such as Santa Barbara County, Mount Washington, St. Francis Hospital, and Southern California. There are many types of printed gazetteers. For example, the New York Times Atlas has a gazetteer section that can be used to look up a geographic name and find the page(s) and grid reference(s) where the corresponding feature is shown. Some gazetteers provide information about places and features; for example, a history of the locale, population data, physical data such as elevation, or the pronunciation of the name. Some lists of geographic names are available as hierarchical term sets (thesauri) designed for information retrieval; these are used to describe bibliographic or museum materials. Examples include the authority files of the U.S. Library of Congress and the GeoRef Thesaurus produced by the American Geological Institute. The Getty Museum has recently made their Thesaurus of Geographic Names available online. This is a major project to develop a controlled vocabulary of current and historical names to describe (i.e., catalog) art and architecture literature. U.S. federal government mapping agencies maintain gazetteers containing the official names of places and/or the names that appear on map series. Examples include the U.S. Geological Survey's Geographic Names Information System (GNIS) and the National Imagery and Mapping Agency's Geographic Names Processing System (GNPS). Both of these are maintained in cooperation with the U.S. Board of Geographic Names (BGN). Many other examples could be cited -- for local areas, for other countries, and for special purposes. There is remarkable diversity in approaches to the description of geographic places and no standardization beyond authoritative sources for the geographic names themselves.
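     The minimum gazetteer entry defined above (a geographic name, a coordinate location, and a type designation) maps naturally onto a small record type; a sketch in Python, with field names of my own rather than the ADL standard's element names:

```python
from dataclasses import dataclass, field

@dataclass
class GazetteerEntry:
    """Minimum components of a gazetteer entry per the ADL description:
    a geographic name, a coordinate location, and a feature-type designation."""
    name: str
    latitude: float
    longitude: float
    feature_type: str                       # drawn from a feature-type thesaurus
    variants: list[str] = field(default_factory=list)   # optional richer description

# Approximate coordinates, for illustration only.
entry = GazetteerEntry("Mount Washington", 44.27, -71.30, "mountains")
print(entry)
```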
  13. Auer, S.; Lehmann, J.: Making the Web a data washing machine : creating knowledge out of interlinked data (2010) 0.02
    0.022399765 = product of:
      0.08959906 = sum of:
        0.08959906 = weight(_text_:data in 112) [ClassicSimilarity], result of:
          0.08959906 = score(doc=112,freq=24.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.60511017 = fieldWeight in 112, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=112)
      0.25 = coord(1/4)
    
    Abstract
     Over the past 3 years, the semantic web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges in the area of the Semantic Web vision: the exploitation of the Web as a platform for data and information integration. To translate this initial success into a world-scale reality, a number of research challenges need to be addressed: the performance gap between relational and RDF data management has to be closed, coherence and quality of data published on the Web have to be improved, provenance and trust on the Linked Data Web must be established and generally the entrance barrier for data publishers and users has to be lowered. In this vision statement we discuss these challenges and argue that research approaches tackling these challenges should be integrated into a mutual refinement cycle. We also present two crucial use-cases for the widespread adoption of linked data.
    Content
     See: http://www.semantic-web-journal.net/content/new-submission-making-web-data-washing-machine-creating-knowledge-out-interlinked-data http://www.semantic-web-journal.net/sites/default/files/swj24_0.pdf.
  14. Stapleton, M.; Adams, M.: Faceted categorisation for the corporate desktop : visualisation and interaction using metadata to enhance user experience (2007) 0.02
    0.022234414 = product of:
      0.088937655 = sum of:
        0.088937655 = sum of:
          0.05087085 = weight(_text_:processing in 718) [ClassicSimilarity], result of:
            0.05087085 = score(doc=718,freq=2.0), product of:
              0.18956426 = queryWeight, product of:
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.046827413 = queryNorm
              0.26835677 = fieldWeight in 718, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.046875 = fieldNorm(doc=718)
          0.038066804 = weight(_text_:22 in 718) [ClassicSimilarity], result of:
            0.038066804 = score(doc=718,freq=2.0), product of:
              0.16398162 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046827413 = queryNorm
              0.23214069 = fieldWeight in 718, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=718)
      0.25 = coord(1/4)
    
    Abstract
    Mark Stapleton and Matt Adamson began their presentation by describing how Dow Jones' Factiva range of information services processed an average of 170,000 documents every day, drawn from over 10,000 sources in 22 languages. These documents are categorized within five facets: Company, Subject, Industry, Region and Language. The digital feeds received from information providers undergo a series of processing stages, initially to prepare them for automatic categorization and then to format them ready for distribution. The categorization stage is able to handle 98% of documents automatically, the remaining 2% requiring some form of human intervention. Depending on the source, categorization can involve any combination of 'Autocoding', 'Dictionary-based Categorizing', 'Rules-based Coding' or 'Manual Coding'
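     To make the five-facet categorization concrete, a toy sketch of counting facet values over a handful of already-categorized documents (the documents and values are invented; this is not Factiva's pipeline):

```python
from collections import Counter, defaultdict

FACETS = ("Company", "Subject", "Industry", "Region", "Language")

docs = [
    {"Company": ["Dow Jones"], "Subject": ["earnings"], "Industry": ["media"],
     "Region": ["US"], "Language": ["en"]},
    {"Company": ["Reuters"], "Subject": ["earnings"], "Industry": ["media"],
     "Region": ["UK"], "Language": ["en"]},
]

# Count how often each value occurs within each facet.
facet_counts = defaultdict(Counter)
for doc in docs:
    for facet in FACETS:
        facet_counts[facet].update(doc.get(facet, []))

for facet in FACETS:
    print(facet, dict(facet_counts[facet]))
```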
  15. Bittner, T.; Donnelly, M.; Winter, S.: Ontology and semantic interoperability (2006) 0.02
    0.022234414 = product of:
      0.088937655 = sum of:
        0.088937655 = sum of:
          0.05087085 = weight(_text_:processing in 4820) [ClassicSimilarity], result of:
            0.05087085 = score(doc=4820,freq=2.0), product of:
              0.18956426 = queryWeight, product of:
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.046827413 = queryNorm
              0.26835677 = fieldWeight in 4820, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.046875 = fieldNorm(doc=4820)
          0.038066804 = weight(_text_:22 in 4820) [ClassicSimilarity], result of:
            0.038066804 = score(doc=4820,freq=2.0), product of:
              0.16398162 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046827413 = queryNorm
              0.23214069 = fieldWeight in 4820, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=4820)
      0.25 = coord(1/4)
    
    Abstract
     One of the major problems facing systems for Computer Aided Design (CAD), Architecture Engineering and Construction (AEC) and Geographic Information Systems (GIS) applications today is the lack of interoperability among the various systems. When integrating software applications, substantial difficulties can arise in translating information from one application to the other. In this paper, we focus on semantic difficulties that arise in software integration. Applications may use different terminologies to describe the same domain. Even when applications use the same terminology, they often associate different semantics with the terms. This obstructs information exchange among applications. To circumvent this obstacle, we need some way of explicitly specifying the semantics for each terminology in an unambiguous fashion. Ontologies can provide such specification. It will be the task of this paper to explain what ontologies are and how they can be used to facilitate interoperability between software systems used in computer aided design, architecture engineering and construction, and geographic information processing.
    Date
    3.12.2016 18:39:22
  16. Woods, E.W.; IFLA Section on classification and Indexing and Indexing and Information Technology; Joint Working Group on a Classification Format: Requirements for a format of classification data : Final report, July 1996 (1996) 0.02
    0.021947198 = product of:
      0.08778879 = sum of:
        0.08778879 = weight(_text_:data in 3008) [ClassicSimilarity], result of:
          0.08778879 = score(doc=3008,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.5928845 = fieldWeight in 3008, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.09375 = fieldNorm(doc=3008)
      0.25 = coord(1/4)
    
    Object
    USMARC for classification data
  17. Hannemann, J.; Kett, J.: Linked data for libraries (2010) 0.02
    0.021947198 = product of:
      0.08778879 = sum of:
        0.08778879 = weight(_text_:data in 3964) [ClassicSimilarity], result of:
          0.08778879 = score(doc=3964,freq=16.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.5928845 = fieldWeight in 3964, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3964)
      0.25 = coord(1/4)
    
    Abstract
    The Semantic Web in general and the Linking Open Data initiative in particular encourage institutions to publish, share and interlink their data. This has considerable potential for libraries, which can complement their data by linking it to other, external data sources. This paper details the first linked open data service of the German National Library. The focus is on the challenges met during the inception of this service. Extrapolating from our experiences, the paper further discusses the German National Library's perspective on the future of library data exchange and the potential for the creation of globally interlinked library data. We outline how this process can be facilitated and how new services can be offered based on these growing metadata collections.
  18. Lamb, I.; Larson, C.: Shining a light on scientific data : building a data catalog to foster data sharing and reuse (2016) 0.02
    0.021947198 = product of:
      0.08778879 = sum of:
        0.08778879 = weight(_text_:data in 3195) [ClassicSimilarity], result of:
          0.08778879 = score(doc=3195,freq=16.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.5928845 = fieldWeight in 3195, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3195)
      0.25 = coord(1/4)
    
    Abstract
    The scientific community's growing eagerness to make research data available to the public provides libraries - with our expertise in metadata and discovery - an interesting new opportunity. This paper details the in-house creation of a "data catalog" which describes datasets ranging from population-level studies like the US Census to small, specialized datasets created by researchers at our own institution. Based on Symfony2 and Solr, the data catalog provides a powerful search interface to help researchers locate the data that can help them, and an administrative interface so librarians can add, edit, and manage metadata elements at will. This paper will outline the successes, failures, and total redos that culminated in the current manifestation of our data catalog.
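     Since the catalog described above is built on Solr, a faceted query against it would look roughly like the following (the core name and field names are invented for illustration, not the authors' schema):

```python
import requests

SOLR = "http://localhost:8983/solr/datacatalog/select"   # hypothetical core

params = {
    "q": "census",                 # free-text query over dataset descriptions
    "fq": "subject:population",    # filter query on a metadata field
    "facet": "true",
    "facet.field": "data_type",
    "rows": 10,
    "wt": "json",
}

result = requests.get(SOLR, params=params).json()
for doc in result["response"]["docs"]:
    print(doc.get("title"))
print(result["facet_counts"]["facet_fields"]["data_type"])
```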
  19. Wongthontham, P.; Abu-Salih, B.: Ontology-based approach for semantic data extraction from social big data : state-of-the-art and research directions (2018) 0.02
    0.021947198 = product of:
      0.08778879 = sum of:
        0.08778879 = weight(_text_:data in 4097) [ClassicSimilarity], result of:
          0.08778879 = score(doc=4097,freq=16.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.5928845 = fieldWeight in 4097, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4097)
      0.25 = coord(1/4)
    
    Abstract
     A challenge of managing and extracting useful knowledge from social media data sources has attracted much attention from academia and industry. To address this challenge, this paper focuses on the semantic analysis of textual data. We propose an ontology-based approach to extract semantics of textual data and define the domain of data. In other words, we semantically analyse the social data at two levels i.e. the entity level and the domain level. We have chosen Twitter as the social channel for a proof of concept. Domain knowledge is captured in ontologies which are then used to enrich the semantics of tweets provided with specific semantic conceptual representation of entities that appear in the tweets. Case studies are used to demonstrate this approach. We experiment and evaluate our proposed approach with a public dataset collected from Twitter and from the politics domain. The ontology-based approach leverages entity extraction and concept mappings in terms of quantity and accuracy of concept identification.
    Theme
    Data Mining
  20. Bahls, D.; Scherp, G.; Tochtermann, K.; Hasselbring, W.: Towards a recommender system for statistical research data (2012) 0.02
    0.021446142 = product of:
      0.08578457 = sum of:
        0.08578457 = weight(_text_:data in 474) [ClassicSimilarity], result of:
          0.08578457 = score(doc=474,freq=22.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.5793489 = fieldWeight in 474, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=474)
      0.25 = coord(1/4)
    
    Abstract
     To effectively promote the exchange of scientific data, retrieval services are required to suit the needs of the research community. A large amount of research in the field of economics is based on statistical data, which is often drawn from external sources like data agencies, statistical offices or affiliated institutes. Since producing such data for a particular research question is expensive in time and money - if possible at all - research activities are often influenced by the availability of suitable data. Researchers choose or adjust their questions, so that the empirical foundation to support their results is given. As a consequence, researchers look out and poll for newly available data in all sorts of directions due to a lacking information infrastructure for this domain. This circumstance and a recent report from the High Level Expert Group on Scientific Data motivate recommendation and notification services for research data sets. In this paper, we elaborate on a case-based recommender system for statistical data, which allows for precise query specification. We discuss required similarity measures on the basis of cross-domain code lists and propose a system architecture. To address the problem of continuous polling, we elaborate on a notification service to inform researchers on newly available data sets based on their personal request.
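     The similarity measures the paper calls for can be illustrated, in a much reduced form, by comparing the code lists two statistical datasets are annotated with. A sketch using a per-dimension Jaccard overlap (dimensions and codes are invented; the real system would draw on cross-domain code list mappings):

```python
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a or b) else 1.0

# Hypothetical dataset descriptions: dimension name -> set of code-list values.
dataset_a = {"region": {"DE", "FR", "IT"}, "year": {"2008", "2009"}, "unit": {"EUR"}}
dataset_b = {"region": {"DE", "FR"}, "year": {"2009", "2010"}, "unit": {"EUR"}}

def dataset_similarity(x, y):
    dims = set(x) | set(y)
    # Average per-dimension overlap of the shared code lists.
    return sum(jaccard(x.get(d, set()), y.get(d, set())) for d in dims) / len(dims)

print(round(dataset_similarity(dataset_a, dataset_b), 3))
```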


Types

  • a 191
  • s 12
  • r 8
  • n 6
  • x 5
  • p 3
  • i 1
  • m 1