Search (397 results, page 1 of 20)

  • year_i:[2010 TO 2020}
  • type_ss:"el"
  1. Kleineberg, M.: Context analysis and context indexing : formal pragmatics in knowledge organization (2014) 0.24
    0.24197838 = product of:
      0.7259351 = sum of:
        0.10370501 = product of:
          0.31111503 = sum of:
            0.31111503 = weight(_text_:3a in 1826) [ClassicSimilarity], result of:
              0.31111503 = score(doc=1826,freq=2.0), product of:
                0.3321406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03917671 = queryNorm
                0.93669677 = fieldWeight in 1826, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1826)
          0.33333334 = coord(1/3)
        0.31111503 = weight(_text_:2f in 1826) [ClassicSimilarity], result of:
          0.31111503 = score(doc=1826,freq=2.0), product of:
            0.3321406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03917671 = queryNorm
            0.93669677 = fieldWeight in 1826, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.078125 = fieldNorm(doc=1826)
        0.31111503 = weight(_text_:2f in 1826) [ClassicSimilarity], result of:
          0.31111503 = score(doc=1826,freq=2.0), product of:
            0.3321406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03917671 = queryNorm
            0.93669677 = fieldWeight in 1826, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.078125 = fieldNorm(doc=1826)
      0.33333334 = coord(3/9)
    
    Source
    http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=5&ved=0CDQQFjAE&url=http%3A%2F%2Fdigbib.ubka.uni-karlsruhe.de%2Fvolltexte%2Fdocuments%2F3131107&ei=HzFWVYvGMsiNsgGTyoFI&usg=AFQjCNE2FHUeR9oQTQlNC4TPedv4Mo3DaQ&sig2=Rlzpr7a3BLZZkqZCXXN_IA&bvm=bv.93564037,d.bGg&cad=rja
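    The relevance figures above are Lucene "explain" output for the ClassicSimilarity (TF-IDF) model. The short Python sketch below is a reconstruction from the numbers shown in the tree for result 1, not code from the search system itself; it reproduces one term weight and the overall document score:

      import math

      # Figures copied from the explain tree for result 1 (doc 1826, one "_text_" term):
      freq       = 2.0          # termFreq within the field
      doc_freq   = 24
      max_docs   = 44218
      query_norm = 0.03917671   # queryNorm
      field_norm = 0.078125     # fieldNorm(doc=1826)

      # ClassicSimilarity factors (Lucene's classic TF-IDF):
      idf          = 1 + math.log(max_docs / (doc_freq + 1))   # 8.478011
      tf           = math.sqrt(freq)                           # 1.4142135
      query_weight = idf * query_norm                          # 0.3321406
      field_weight = tf * idf * field_norm                     # 0.93669677
      term_score   = query_weight * field_weight               # 0.31111503

      # Three of nine query clauses matched, so the summed weights are scaled by coord(3/9).
      doc_score = (0.10370501 + term_score + term_score) * (3 / 9)
      print(round(term_score, 8), round(doc_score, 8))         # ~0.31111503, ~0.24197838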
  2. Shala, E.: Die Autonomie des Menschen und der Maschine : gegenwärtige Definitionen von Autonomie zwischen philosophischem Hintergrund und technologischer Umsetzbarkeit (2014) 0.12
    0.12098919 = product of:
      0.36296755 = sum of:
        0.051852506 = product of:
          0.15555751 = sum of:
            0.15555751 = weight(_text_:3a in 4388) [ClassicSimilarity], result of:
              0.15555751 = score(doc=4388,freq=2.0), product of:
                0.3321406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03917671 = queryNorm
                0.46834838 = fieldWeight in 4388, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4388)
          0.33333334 = coord(1/3)
        0.15555751 = weight(_text_:2f in 4388) [ClassicSimilarity], result of:
          0.15555751 = score(doc=4388,freq=2.0), product of:
            0.3321406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03917671 = queryNorm
            0.46834838 = fieldWeight in 4388, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4388)
        0.15555751 = weight(_text_:2f in 4388) [ClassicSimilarity], result of:
          0.15555751 = score(doc=4388,freq=2.0), product of:
            0.3321406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03917671 = queryNorm
            0.46834838 = fieldWeight in 4388, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4388)
      0.33333334 = coord(3/9)
    
    Footnote
    Cf.: https://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=2ahUKEwizweHljdbcAhVS16QKHXcFD9QQFjABegQICRAB&url=https%3A%2F%2Fwww.researchgate.net%2Fpublication%2F271200105_Die_Autonomie_des_Menschen_und_der_Maschine_-_gegenwartige_Definitionen_von_Autonomie_zwischen_philosophischem_Hintergrund_und_technologischer_Umsetzbarkeit_Redigierte_Version_der_Magisterarbeit_Karls&usg=AOvVaw06orrdJmFF2xbCCp_hL26q.
  3. Assem, M. van: Converting and integrating vocabularies for the Semantic Web (2010) 0.06
    0.06416068 = product of:
      0.14436153 = sum of:
        0.08878562 = weight(_text_:applications in 4639) [ClassicSimilarity], result of:
          0.08878562 = score(doc=4639,freq=14.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.51477134 = fieldWeight in 4639, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03125 = fieldNorm(doc=4639)
        0.011975031 = weight(_text_:of in 4639) [ClassicSimilarity], result of:
          0.011975031 = score(doc=4639,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.19546966 = fieldWeight in 4639, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=4639)
        0.016351866 = weight(_text_:systems in 4639) [ClassicSimilarity], result of:
          0.016351866 = score(doc=4639,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.1358164 = fieldWeight in 4639, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03125 = fieldNorm(doc=4639)
        0.027249003 = weight(_text_:software in 4639) [ClassicSimilarity], result of:
          0.027249003 = score(doc=4639,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.17532499 = fieldWeight in 4639, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03125 = fieldNorm(doc=4639)
      0.44444445 = coord(4/9)
    
    Abstract
    This thesis focuses on the conversion of vocabularies for representation and integration of collections on the Semantic Web. A secondary focus is how to represent metadata schemas (RDF Schemas representing metadata element sets) such that they interoperate with vocabularies. The primary domain in which we operate is that of cultural heritage collections. The background worldview in which a solution is sought is that of the Semantic Web research paradigm with its associated theories, methods, tools and use cases. In other words, we assume the Semantic Web is in principle able to provide the context to realize interoperable collections. Interoperability is dependent on the interplay between representations and the applications that use them. We mean applications in the widest sense, such as "search" and "annotation". These applications or tasks are often present in software applications, such as the E-Culture application. It is therefore necessary that the applications' requirements on the vocabulary representation are met. This leads us to formulate the following problem statement: HOW CAN EXISTING VOCABULARIES BE MADE AVAILABLE TO SEMANTIC WEB APPLICATIONS?
    We refine the problem statement into three research questions. The first two focus on the problem of converting a vocabulary to a Semantic Web representation from its original format. Conversion of a vocabulary to a representation in a Semantic Web language is necessary to make the vocabulary available to Semantic Web applications. In the last question we focus on the integration of collection metadata schemas in a way that allows for vocabulary representations as produced by our methods. Academic dissertation for the degree of Doctor at the Vrije Universiteit Amsterdam, Dutch Research School for Information and Knowledge Systems.
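    As an illustration of the general idea behind such conversions (not the method of the thesis), a minimal rdflib sketch that turns one made-up legacy vocabulary record into SKOS could look like this:

      from rdflib import Graph, Literal, Namespace
      from rdflib.namespace import RDF, SKOS

      # Hypothetical source record from a legacy vocabulary (invented for illustration).
      record = {"id": "c0815", "prefLabel": "watercolors (paintings)",
                "altLabels": ["aquarelles"], "broader": "c0042"}

      EX = Namespace("http://example.org/vocab/")
      g = Graph()
      concept = EX[record["id"]]

      g.add((concept, RDF.type, SKOS.Concept))
      g.add((concept, SKOS.prefLabel, Literal(record["prefLabel"], lang="en")))
      for alt in record["altLabels"]:
          g.add((concept, SKOS.altLabel, Literal(alt, lang="en")))
      g.add((concept, SKOS.broader, EX[record["broader"]]))

      print(g.serialize(format="turtle"))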
  4. Rehurek, R.; Sojka, P.: Software framework for topic modelling with large corpora (2010) 0.05
    0.046727955 = product of:
      0.14018387 = sum of:
        0.050336715 = weight(_text_:applications in 1058) [ClassicSimilarity], result of:
          0.050336715 = score(doc=1058,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2918479 = fieldWeight in 1058, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=1058)
        0.019052157 = weight(_text_:of in 1058) [ClassicSimilarity], result of:
          0.019052157 = score(doc=1058,freq=18.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.3109903 = fieldWeight in 1058, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1058)
        0.070794985 = weight(_text_:software in 1058) [ClassicSimilarity], result of:
          0.070794985 = score(doc=1058,freq=6.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.4555077 = fieldWeight in 1058, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=1058)
      0.33333334 = coord(3/9)
    
    Abstract
    Large corpora are ubiquitous in today's world and memory quickly becomes the limiting factor in practical applications of the Vector Space Model (VSM). In this paper, we identify a gap in existing implementations of many of the popular algorithms, which is their scalability and ease of use. We describe a Natural Language Processing software framework which is based on the idea of document streaming, i.e. processing corpora document after document, in a memory-independent fashion. Within this framework, we implement several popular algorithms for topical inference, including Latent Semantic Analysis and Latent Dirichlet Allocation, in a way that makes them completely independent of the training corpus size. Particular emphasis is placed on straightforward and intuitive framework design, so that modifications and extensions of the methods and/or their application by interested practitioners are effortless. We demonstrate the usefulness of our approach on a real-world scenario of computing document similarities within an existing digital library, DML-CZ.
    Content
    For the software, see: http://radimrehurek.com/gensim/index.html. For a demo, see: http://dml.cz/handle/10338.dmlcz/100785/SimilarArticles.
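    The software referenced above is gensim. A minimal sketch of its public API, with a made-up toy corpus, shows the bag-of-words / TF-IDF / LSI pipeline and the document-similarity step described in the abstract:

      from gensim import corpora, models, similarities

      # Toy corpus standing in for a document stream (the framework's point is that
      # real corpora can be processed document by document, without loading them all).
      docs = [["semantic", "analysis", "of", "documents"],
              ["latent", "dirichlet", "allocation", "for", "topics"],
              ["document", "similarity", "in", "digital", "libraries"]]

      dictionary = corpora.Dictionary(docs)
      bow_corpus = [dictionary.doc2bow(d) for d in docs]

      tfidf = models.TfidfModel(bow_corpus)
      lsi = models.LsiModel(tfidf[bow_corpus], id2word=dictionary, num_topics=2)

      index = similarities.MatrixSimilarity(lsi[tfidf[bow_corpus]])
      query = lsi[tfidf[dictionary.doc2bow(["semantic", "similarity"])]]
      print(list(index[query]))   # cosine similarities against the toy corpus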
  5. Gómez-Pérez, A.; Corcho, O.: Ontology languages for the Semantic Web (2015) 0.04
    0.04371041 = product of:
      0.13113123 = sum of:
        0.08389453 = weight(_text_:applications in 3297) [ClassicSimilarity], result of:
          0.08389453 = score(doc=3297,freq=8.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.4864132 = fieldWeight in 3297, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3297)
        0.011833867 = weight(_text_:of in 3297) [ClassicSimilarity], result of:
          0.011833867 = score(doc=3297,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.19316542 = fieldWeight in 3297, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3297)
        0.03540283 = weight(_text_:systems in 3297) [ClassicSimilarity], result of:
          0.03540283 = score(doc=3297,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.29405114 = fieldWeight in 3297, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3297)
      0.33333334 = coord(3/9)
    
    Abstract
    Ontologies have proven to be an essential element in many applications. They are used in agent systems, knowledge management systems, and e-commerce platforms. They can also generate natural language, integrate intelligent information, provide semantic-based access to the Internet, and extract information from texts, in addition to being used in many other applications to explicitly declare the knowledge embedded in them. However, not only are ontologies useful for applications in which knowledge plays a key role, but they can also trigger a major change in current Web contents. This change is leading to the third generation of the Web, known as the Semantic Web, which has been defined as "the conceptual structuring of the Web in an explicit machine-readable way" [1]. This definition does not differ too much from the one used for defining an ontology: "An ontology is an explicit, machine-readable specification of a shared conceptualization" [2]. In fact, new ontology-based applications and knowledge architectures are being developed for this new Web. A common claim for all of these approaches is the need for languages to represent the semantic information that this Web requires, solving the heterogeneous data exchange in this heterogeneous environment. Here, we don't decide which language is best for the Semantic Web. Rather, our goal is to help developers find the most suitable language for their representation needs. The authors analyze the most representative ontology languages created for the Web and compare them using a common framework.
    Source
    IEEE intelligent systems 2002, Jan./Feb., pp.54-60
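    To make the comparison of language expressivity concrete, a small rdflib sketch (illustrative only; the example ontology is invented) shows statements that stay within RDFS next to constructs that require OWL:

      from rdflib import Graph, Namespace
      from rdflib.namespace import OWL, RDF, RDFS

      EX = Namespace("http://example.org/onto#")
      g = Graph()

      # RDFS level: class hierarchy plus property domain/range.
      g.add((EX.Agent, RDF.type, RDFS.Class))
      g.add((EX.Person, RDFS.subClassOf, EX.Agent))
      g.add((EX.knows, RDF.type, RDF.Property))
      g.add((EX.knows, RDFS.domain, EX.Person))
      g.add((EX.knows, RDFS.range, EX.Person))

      # OWL adds constructs RDFS cannot express, e.g. symmetric and inverse properties.
      g.add((EX.knows, RDF.type, OWL.SymmetricProperty))
      g.add((EX.hasParent, RDF.type, OWL.ObjectProperty))
      g.add((EX.hasChild, OWL.inverseOf, EX.hasParent))

      print(g.serialize(format="turtle"))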
  6. Stoykova, V.; Petkova, E.: Automatic extraction of mathematical terms for precalculus (2012) 0.04
    0.039748333 = product of:
      0.11924499 = sum of:
        0.05872617 = weight(_text_:applications in 156) [ClassicSimilarity], result of:
          0.05872617 = score(doc=156,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34048924 = fieldWeight in 156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.0128330635 = weight(_text_:of in 156) [ClassicSimilarity], result of:
          0.0128330635 = score(doc=156,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.20947541 = fieldWeight in 156, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.047685754 = weight(_text_:software in 156) [ClassicSimilarity], result of:
          0.047685754 = score(doc=156,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.30681872 = fieldWeight in 156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
      0.33333334 = coord(3/9)
    
    Abstract
    In this work, we present the results of research evaluating a methodology for extracting mathematical terms for precalculus using techniques for semantically-oriented statistical search. We use a corpus-based approach and a combination of different statistically-based techniques for extracting keywords, collocations and co-occurrences, incorporated in the Sketch Engine software. We evaluate the collocation candidate terms for the basic concept function(s) and validate the related methodology against precalculus domain conceptual term definitions. Finally, we offer a hierarchical representation of the conceptual terms and discuss the results with respect to their possible applications.
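    The paper itself uses the Sketch Engine; as a rough illustration of statistical collocation extraction in general (a different tool, with a made-up token stream), a sketch with NLTK's collocation finder might look like this:

      from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

      # Toy token stream standing in for a precalculus corpus.
      tokens = ("the graph of the function the domain of the function "
                "inverse function the range of the function composite function").split()

      measures = BigramAssocMeasures()
      finder = BigramCollocationFinder.from_words(tokens)
      finder.apply_freq_filter(2)                        # keep bigrams seen at least twice
      print(finder.nbest(measures.likelihood_ratio, 5))  # candidate multi-word terms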
  7. Ledl, A.: Demonstration of the BAsel Register of Thesauri, Ontologies & Classifications (BARTOC) (2015) 0.04
    0.035035666 = product of:
      0.105106995 = sum of:
        0.050336715 = weight(_text_:applications in 2038) [ClassicSimilarity], result of:
          0.050336715 = score(doc=2038,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2918479 = fieldWeight in 2038, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=2038)
        0.020082738 = weight(_text_:of in 2038) [ClassicSimilarity], result of:
          0.020082738 = score(doc=2038,freq=20.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.32781258 = fieldWeight in 2038, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2038)
        0.034687545 = weight(_text_:systems in 2038) [ClassicSimilarity], result of:
          0.034687545 = score(doc=2038,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.28811008 = fieldWeight in 2038, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=2038)
      0.33333334 = coord(3/9)
    
    Abstract
    The BAsel Register of Thesauri, Ontologies & Classifications (BARTOC, http://bartoc.org) is a bibliographic database aiming to record metadata of as many Knowledge Organization Systems as possible. It has a faceted, responsive web design search interface in 20 EU languages. With more than 1,300 interdisciplinary items in 77 languages, BARTOC is the largest database of its kind, multilingual both by content and features, and it is still growing. This being said, the demonstration of BARTOC would be suitable for topic no. 10 [Multilingual and Interdisciplinary KOS applications and tools]. BARTOC has been developed by the University Library of Basel, Switzerland. It is rooted in the tradition of library and information science of collecting bibliographic records of controlled and structured vocabularies, yet in a more contemporary manner. BARTOC is based on the open source content management system Drupal 7.
    Content
    Lecture given at the 14th European Networked Knowledge Organization Systems (NKOS) Workshop, TPDL 2015 Conference in Poznan, Poland, Friday 18th September 2015. See also: http://bartoc.org/.
  8. Shen, M.; Liu, D.-R.; Huang, Y.-S.: Extracting semantic relations to enrich domain ontologies (2012) 0.03
    0.034636453 = product of:
      0.10390935 = sum of:
        0.05872617 = weight(_text_:applications in 267) [ClassicSimilarity], result of:
          0.05872617 = score(doc=267,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34048924 = fieldWeight in 267, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0546875 = fieldNorm(doc=267)
        0.016567415 = weight(_text_:of in 267) [ClassicSimilarity], result of:
          0.016567415 = score(doc=267,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2704316 = fieldWeight in 267, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=267)
        0.028615767 = weight(_text_:systems in 267) [ClassicSimilarity], result of:
          0.028615767 = score(doc=267,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 267, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=267)
      0.33333334 = coord(3/9)
    
    Abstract
    Domain ontologies facilitate the organization, sharing and reuse of domain knowledge, and enable various vertical domain applications to operate successfully. Most methods for automatically constructing ontologies focus on taxonomic relations, such as is-kind-of and is-part-of relations. However, much of the domain-specific semantics is ignored. This work proposes a semi-unsupervised approach for extracting semantic relations from domain-specific text documents. The approach effectively utilizes text mining and existing taxonomic relations in domain ontologies to discover candidate keywords that can represent semantic relations. A preliminary experiment on the natural science domain (Taiwan K9 education) indicates that the proposed method yields valuable recommendations. This work enriches domain ontologies by adding distilled semantics.
    Source
    Journal of Intelligent Information Systems
  9. Arenas, M.; Cuenca Grau, B.; Kharlamov, E.; Marciuska, S.; Zheleznyakov, D.: Faceted search over ontology-enhanced RDF data (2014) 0.03
    0.029688384 = product of:
      0.08906515 = sum of:
        0.050336715 = weight(_text_:applications in 2207) [ClassicSimilarity], result of:
          0.050336715 = score(doc=2207,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2918479 = fieldWeight in 2207, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=2207)
        0.014200641 = weight(_text_:of in 2207) [ClassicSimilarity], result of:
          0.014200641 = score(doc=2207,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23179851 = fieldWeight in 2207, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2207)
        0.0245278 = weight(_text_:systems in 2207) [ClassicSimilarity], result of:
          0.0245278 = score(doc=2207,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2037246 = fieldWeight in 2207, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=2207)
      0.33333334 = coord(3/9)
    
    Abstract
    An increasing number of applications rely on RDF, OWL2, and SPARQL for storing and querying data. SPARQL, however, is not targeted towards end-users, and suitable query interfaces are needed. Faceted search is a prominent approach for end-user data access, and several RDF-based faceted search systems have been developed. There is, however, a lack of rigorous theoretical underpinning for faceted search in the context of RDF and OWL2. In this paper, we provide such solid foundations. We formalise faceted interfaces for this context, identify a fragment of first-order logic capturing the underlying queries, and study the complexity of answering such queries for RDF and OWL2 profiles. We then study interface generation and update, and devise efficiently implementable algorithms. Finally, we have implemented and tested our faceted search algorithms for scalability, with encouraging results.
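    A toy sketch of the basic idea, compiling a conjunctive facet selection into a SPARQL graph pattern, is shown below; it is not the paper's formalism, and the property and value URIs are invented:

      # Hypothetical facet selections (property URI -> required value).
      facets = {
          "http://example.org/genre":   "http://example.org/Drama",
          "http://example.org/country": "http://example.org/France",
      }

      def facets_to_sparql(facets):
          """Translate a conjunctive facet selection into a SPARQL basic graph pattern."""
          patterns = [f"?item <{prop}> <{value}> ." for prop, value in facets.items()]
          return "SELECT ?item WHERE {\n  " + "\n  ".join(patterns) + "\n}"

      print(facets_to_sparql(facets))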
  10. Mirizzi, R.: Exploratory browsing in the Web of Data (2011) 0.03
    0.029583763 = product of:
      0.088751286 = sum of:
        0.05872617 = weight(_text_:applications in 4803) [ClassicSimilarity], result of:
          0.05872617 = score(doc=4803,freq=8.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34048924 = fieldWeight in 4803, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4803)
        0.015717229 = weight(_text_:of in 4803) [ClassicSimilarity], result of:
          0.015717229 = score(doc=4803,freq=36.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.25655392 = fieldWeight in 4803, product of:
              6.0 = tf(freq=36.0), with freq of:
                36.0 = termFreq=36.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4803)
        0.014307884 = weight(_text_:systems in 4803) [ClassicSimilarity], result of:
          0.014307884 = score(doc=4803,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.118839346 = fieldWeight in 4803, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4803)
      0.33333334 = coord(3/9)
    
    Abstract
    Thanks to the recent Linked Data initiative, the foundations of the Semantic Web have been built. Shared, open and linked RDF datasets give us the possibility to exploit both the strong theoretical results and the robust technologies and tools developed since the seminal paper on the Semantic Web appeared in 2001. In a simplistic way, we may think of the Semantic Web as an ultra-large distributed database we can query to get information coming from different sources. In fact, every dataset exposes a SPARQL endpoint to make the data accessible through exact queries. If we know the URI of the famous actress Nicole Kidman in DBpedia, we may retrieve all the movies she acted in with a simple SPARQL query. We can then aggregate this information with user ratings and genres from IMDB. Even though these are very exciting results and applications, there is much more behind the curtains. Datasets come with a description of their schema structured in an ontological way. Resources refer to classes which are in turn organized in well-structured and rich ontologies. Exploiting this further feature, we go beyond the notion of a distributed database and can refer to the Semantic Web as a distributed knowledge base. If in our knowledge base we have that Paris is located in France (ontological level) and that Moulin Rouge! is set in Paris (data level), we may query the Semantic Web (interpreted as a set of interconnected datasets and related ontologies) to return all the movies starring Nicole Kidman set in France, and Moulin Rouge! will be in the final result set. The ontological level makes it possible to infer new relations among the data.
    The Linked Data initiative and the state of the art in semantic technologies have led to brand-new search and mash-up applications. The basic idea is to have smarter lookup services for a huge, distributed and social knowledge base. All these applications catch and (re)propose, under a semantic data perspective, the view of the classical Web as a distributed collection of documents to retrieve. The interlinked nature of the Web, and consequently of the Semantic Web, is exploited (just) to collect and aggregate data coming from different sources. Of course, this is a big step forward in search and Web technologies, but if we limit our investigation to retrieval tasks, we miss another important feature of the current Web: browsing and in particular exploratory browsing (a.k.a. exploratory search). Thanks to its hyperlinked nature, the Web defined a new way of browsing documents and knowledge: selection by lookup, navigation and trial-and-error tactics were, and still are, exploited by users to search for relevant information satisfying some initial requirements. The basic assumptions behind a lookup search, typical of Information Retrieval (IR) systems, are no longer valid in an exploratory browsing context. An IR system, such as a search engine, assumes that: the user has a clear picture of what she is looking for; she knows the terminology of the specific knowledge space. On the other side, as argued in the literature, the main challenges in exploratory search can be summarized as: support querying and rapid query refinement; offer facets and metadata-based result filtering; leverage search context; support learning and understanding; offer visualization to support insight/decision making; facilitate collaboration. In Section 3 we will show two applications for exploratory search in the Semantic Web addressing some of the above challenges.
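    The Nicole Kidman example from the abstract above can be run against DBpedia's public endpoint; a minimal sketch with SPARQLWrapper follows (the endpoint and property names are DBpedia's current ones and may change over time):

      from SPARQLWrapper import SPARQLWrapper, JSON

      # Retrieve films starring Nicole Kidman from DBpedia.
      sparql = SPARQLWrapper("https://dbpedia.org/sparql")
      sparql.setQuery("""
          PREFIX dbo: <http://dbpedia.org/ontology/>
          PREFIX dbr: <http://dbpedia.org/resource/>
          SELECT ?film WHERE { ?film dbo:starring dbr:Nicole_Kidman . } LIMIT 10
      """)
      sparql.setReturnFormat(JSON)
      for row in sparql.query().convert()["results"]["bindings"]:
          print(row["film"]["value"])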
  11. Balakrishnan, U.; Voß, J.: The Cocoda mapping tool (2015) 0.03
    0.028905444 = product of:
      0.08671633 = sum of:
        0.019949809 = weight(_text_:of in 4205) [ClassicSimilarity], result of:
          0.019949809 = score(doc=4205,freq=58.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.32564276 = fieldWeight in 4205, product of:
              7.615773 = tf(freq=58.0), with freq of:
                58.0 = termFreq=58.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4205)
        0.042923648 = weight(_text_:systems in 4205) [ClassicSimilarity], result of:
          0.042923648 = score(doc=4205,freq=18.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.35651803 = fieldWeight in 4205, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4205)
        0.023842877 = weight(_text_:software in 4205) [ClassicSimilarity], result of:
          0.023842877 = score(doc=4205,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.15340936 = fieldWeight in 4205, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4205)
      0.33333334 = coord(3/9)
    
    Abstract
    Since the 90s, we have seen an explosion of information and, with it, an increase in the need for data and information aggregation systems that store and manage information. However, most information sources apply different Knowledge Organization Systems (KOS) to describe the content of stored data. This heterogeneous mix of KOS in different systems complicates access and seamless sharing of information and knowledge. Concordances, also known as cross-concordances or terminology mappings, map different KOS to each other to improve information retrieval across such a heterogeneous mix of systems (Mayr 2010, Keil 2012). Mappings are also considered a valuable and essential working tool for coherent indexing with different terminologies. However, despite efforts at standardization (e.g. SKOS, ISO 25964-2, Keil 2012, Soergel 2011), there is a significant scarcity of concordances, which has led to an inability to establish uniform exchange formats as well as methods and tools for maintaining mappings and making them easily accessible. This is particularly true in the field of library classification schemes. In essence, there is a lack of infrastructure for the provision and exchange of concordances, their management and quality assessment, as well as tools that would enable semi-automatic generation of mappings. The project "coli-conc" therefore aims to address this gap by creating the necessary infrastructure. This includes the specification of a data format for the exchange of concordances (JSKOS), the specification and implementation of web APIs to query concordance databases (JSKOS-API), and a modular web application to enable uniform access to knowledge organization systems, concordances and concordance assessments (Cocoda).
    The focus of the project "coli-conc" lies in the semi-automatic creation of mappings between different KOS in general, and between two important library classification schemes in particular: the Dewey Decimal Classification (DDC) and the Regensburg classification system (RVK). In the year 2000, the national libraries of Germany, Austria and Switzerland adopted DDC in an endeavor to develop a nation-wide classification scheme. But historically, in the German-speaking regions, the academic libraries have been using their own home-grown systems, the most prominent and popular being the RVK. However, with the launch of DDC, building concordances between DDC and RVK has become an imperative, although it is still rare. The delay in building comprehensive concordances between these two systems is due to the major challenges posed by their sheer size (38,000 classes in DDC and ca. 860,000 classes in RVK), the strong disparity in their respective structures, and the variation in the perception and representation of concepts. The challenge is compounded geometrically for any manual attempt in this direction. Although there have been efforts on automatic mappings (OAEI Library Track 2012-2014 and e.g. Pfeffer 2013) in recent years, such concordances carry the risk of inaccurate mappings, and the approaches are more suitable for mapping suggestions than for automatic generation of concordances (Lauser 2008; Reiner 2010). The project "coli-conc" will facilitate the creation, evaluation, and reuse of mappings with a public collection of concordances and a web application for mapping management. The proposed presentation will give an introduction to the tools and standards created and planned in the project "coli-conc". This includes preliminary work on DDC concordances (Balakrishnan 2013), an overview of the software concept and technical architecture (Voß 2015), and a demonstration of the Cocoda web application.
    Content
    Lecture given at the 14th European Networked Knowledge Organization Systems (NKOS) Workshop, TPDL 2015 Conference in Poznan, Poland, Friday 18th September 2015. See also: http://eprints.rclis.org/28007/. See also: http://coli-conc.gbv.de/.
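    A JSKOS concordance entry is plain JSON. The sketch below shows roughly what a single DDC-to-RVK mapping could look like; the field names follow the JSKOS draft as read here (from/to concept bundles, a SKOS mapping relation as type) and should be checked against the current specification, and the class URIs and notations are invented for illustration:

      import json

      # Roughly one DDC->RVK concordance entry in a JSKOS-like structure (illustrative only).
      mapping = {
          "from": {"memberSet": [{
              "uri": "http://example.org/ddc/025.4",
              "notation": ["025.4"],
          }]},
          "to": {"memberSet": [{
              "uri": "http://example.org/rvk/AN73000",
              "notation": ["AN 73000"],
          }]},
          "type": ["http://www.w3.org/2004/02/skos/core#closeMatch"],
      }
      print(json.dumps(mapping, indent=2))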
  12. Markoff, J.: Researchers announce advance in image-recognition software (2014) 0.03
    0.02862486 = product of:
      0.08587458 = sum of:
        0.014249863 = weight(_text_:of in 1875) [ClassicSimilarity], result of:
          0.014249863 = score(doc=1875,freq=58.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23260197 = fieldWeight in 1875, product of:
              7.615773 = tf(freq=58.0), with freq of:
                58.0 = termFreq=58.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1875)
        0.010219917 = weight(_text_:systems in 1875) [ClassicSimilarity], result of:
          0.010219917 = score(doc=1875,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.08488525 = fieldWeight in 1875, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1875)
        0.061404802 = weight(_text_:software in 1875) [ClassicSimilarity], result of:
          0.061404802 = score(doc=1875,freq=26.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.39508954 = fieldWeight in 1875, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1875)
      0.33333334 = coord(3/9)
    
    Abstract
    Two groups of scientists, working independently, have created artificial intelligence software capable of recognizing and describing the content of photographs and videos with far greater accuracy than ever before, sometimes even mimicking human levels of understanding.
    Content
    "Until now, so-called computer vision has largely been limited to recognizing individual objects. The new software, described on Monday by researchers at Google and at Stanford University, teaches itself to identify entire scenes: a group of young men playing Frisbee, for example, or a herd of elephants marching on a grassy plain. The software then writes a caption in English describing the picture. Compared with human observations, the researchers found, the computer-written descriptions are surprisingly accurate. The advances may make it possible to better catalog and search for the billions of images and hours of video available online, which are often poorly described and archived. At the moment, search engines like Google rely largely on written language accompanying an image or video to ascertain what it contains. "I consider the pixel data in images and video to be the dark matter of the Internet," said Fei-Fei Li, director of the Stanford Artificial Intelligence Laboratory, who led the research with Andrej Karpathy, a graduate student. "We are now starting to illuminate it." Dr. Li and Mr. Karpathy published their research as a Stanford University technical report. The Google team published their paper on arXiv.org, an open source site hosted by Cornell University.
    In the longer term, the new research may lead to technology that helps the blind and robots navigate natural environments. But it also raises chilling possibilities for surveillance. During the past 15 years, video cameras have been placed in a vast number of public and private spaces. In the future, the software operating the cameras will not only be able to identify particular humans via facial recognition, experts say, but also identify certain types of behavior, perhaps even automatically alerting authorities. Two years ago Google researchers created image-recognition software and presented it with 10 million images taken from YouTube videos. Without human guidance, the program trained itself to recognize cats - a testament to the number of cat videos on YouTube. Current artificial intelligence programs in new cars already can identify pedestrians and bicyclists from cameras positioned atop the windshield and can stop the car automatically if the driver does not take action to avoid a collision. But "just single object recognition is not very beneficial," said Ali Farhadi, a computer scientist at the University of Washington who has published research on software that generates sentences from digital pictures. "We've focused on objects, and we've ignored verbs," he said, adding that these programs do not grasp what is going on in an image. Both the Google and Stanford groups tackled the problem by refining software programs known as neural networks, inspired by our understanding of how the brain works. Neural networks can "train" themselves to discover similarities and patterns in data, even when their human creators do not know the patterns exist.
    In living organisms, webs of neurons in the brain vastly outperform even the best computer-based networks in perception and pattern recognition. But by adopting some of the same architecture, computers are catching up, learning to identify patterns in speech and imagery with increasing accuracy. The advances are apparent to consumers who use Apple's Siri personal assistant, for example, or Google's image search. Both groups of researchers employed similar approaches, weaving together two types of neural networks, one focused on recognizing images and the other on human language. In both cases the researchers trained the software with relatively small sets of digital images that had been annotated with descriptive sentences by humans. After the software programs "learned" to see patterns in the pictures and description, the researchers turned them on previously unseen images. The programs were able to identify objects and actions with roughly double the accuracy of earlier efforts, although still nowhere near human perception capabilities. "I was amazed that even with the small amount of training data that we were able to do so well," said Oriol Vinyals, a Google computer scientist who wrote the paper with Alexander Toshev, Samy Bengio and Dumitru Erhan, members of the Google Brain project. "The field is just starting, and we will see a lot of increases."
    Computer vision specialists said that despite the improvements, these software systems had made only limited progress toward the goal of digitally duplicating human vision and, even more elusive, understanding. "I don't know that I would say this is 'understanding' in the sense we want," said John R. Smith, a senior manager at I.B.M.'s T.J. Watson Research Center in Yorktown Heights, N.Y. "I think even the ability to generate language here is very limited." But the Google and Stanford teams said that they expect to see significant increases in accuracy as they improve their software and train these programs with larger sets of annotated images. A research group led by Tamara L. Berg, a computer scientist at the University of North Carolina at Chapel Hill, is training a neural network with one million images annotated by humans. "You're trying to tell the story behind the image," she said. "A natural scene will be very complex, and you want to pick out the most important objects in the image.""
    Footnote
    A version of this article appears in print on November 18, 2014, on page A13 of the New York edition with the headline: Advance Reported in Content-Recognition Software. Cf.: http://cs.stanford.edu/people/karpathy/cvpr2015.pdf. See also: http://googleresearch.blogspot.de/2014/11/a-picture-is-worth-thousand-coherent.html. See also: https://news.ycombinator.com/item?id=8621658.
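    As a rough sketch of the encoder-decoder recipe described in the article (a vision network feeding a language network), and not either group's actual model, a toy PyTorch captioner could be wired up as follows:

      import torch
      import torch.nn as nn

      class CaptionModel(nn.Module):
          """Toy captioner: a CNN summarizes the image, an RNN emits word logits."""
          def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
              super().__init__()
              self.encoder = nn.Sequential(           # stand-in for a pretrained vision CNN
                  nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                  nn.Linear(16, embed_dim),
              )
              self.embed = nn.Embedding(vocab_size, embed_dim)
              self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
              self.out = nn.Linear(hidden_dim, vocab_size)

          def forward(self, images, captions):
              img_feat = self.encoder(images).unsqueeze(1)   # (B, 1, embed_dim)
              tokens = self.embed(captions)                  # (B, T, embed_dim)
              seq = torch.cat([img_feat, tokens], dim=1)     # image acts as the first "word"
              hidden, _ = self.decoder(seq)
              return self.out(hidden)                        # logits over the vocabulary

      model = CaptionModel(vocab_size=1000)
      logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2, 5)))
      print(logits.shape)   # torch.Size([2, 6, 1000])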
  13. Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2012) 0.03
    0.027655158 = product of:
      0.08296547 = sum of:
        0.017962547 = weight(_text_:of in 1967) [ClassicSimilarity], result of:
          0.017962547 = score(doc=1967,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2932045 = fieldWeight in 1967, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
        0.042483397 = weight(_text_:systems in 1967) [ClassicSimilarity], result of:
          0.042483397 = score(doc=1967,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.35286134 = fieldWeight in 1967, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
        0.022519529 = product of:
          0.045039058 = sum of:
            0.045039058 = weight(_text_:22 in 1967) [ClassicSimilarity], result of:
              0.045039058 = score(doc=1967,freq=4.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.32829654 = fieldWeight in 1967, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1967)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    This paper reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The paper discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the DDC (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) the Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
  14. Bozzato, L.; Braghin, S.; Trombetta, A.: A method and guidelines for the cooperation of ontologies and relational databases in Semantic Web applications (2012) 0.03
    0.027396318 = product of:
      0.08218895 = sum of:
        0.041947264 = weight(_text_:applications in 475) [ClassicSimilarity], result of:
          0.041947264 = score(doc=475,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2432066 = fieldWeight in 475, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=475)
        0.019801848 = weight(_text_:of in 475) [ClassicSimilarity], result of:
          0.019801848 = score(doc=475,freq=28.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.32322758 = fieldWeight in 475, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=475)
        0.020439833 = weight(_text_:systems in 475) [ClassicSimilarity], result of:
          0.020439833 = score(doc=475,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.1697705 = fieldWeight in 475, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=475)
      0.33333334 = coord(3/9)
    
    Abstract
    Ontologies are a well-established way of representing complex structured information and they provide a sound conceptual foundation to Semantic Web technologies. On the other hand, a huge amount of information available on the web is stored in legacy relational databases. The issues raised by the collaboration between such worlds are well known and addressed by consolidated mapping languages. Nevertheless, to the best of our knowledge, a best practice for such cooperation is missing: in this work we thus present a method to guide the definition of cooperations between ontology-based and relational database systems. Our method, mainly based on ideas from knowledge reuse and re-engineering, is aimed at the separation of data between database and ontology instances and at the definition of suitable mappings in both directions, taking advantage of the representation possibilities offered by both models. We present the steps of our method along with guidelines for their application. Finally, we propose an example of its deployment in the context of a large repository of bio-medical images we developed.
    Source
    Proceedings of the 2nd International Workshop on Semantic Digital Archives held in conjunction with the 16th Int. Conference on Theory and Practice of Digital Libraries (TPDL) on September 27, 2012 in Paphos, Cyprus [http://ceur-ws.org/Vol-912/proceedings.pdf]. Eds.: A. Mitschik et al
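    One direction of such a cooperation, exposing relational rows as ontology instances, can be illustrated with a minimal sqlite3 + rdflib sketch; the table, URIs and class names are invented, not taken from the paper:

      import sqlite3
      from rdflib import Graph, Literal, Namespace, URIRef
      from rdflib.namespace import RDF

      EX = Namespace("http://example.org/onto#")

      # In-memory stand-in for a legacy relational database.
      db = sqlite3.connect(":memory:")
      db.execute("CREATE TABLE image (id INTEGER PRIMARY KEY, modality TEXT)")
      db.execute("INSERT INTO image VALUES (1, 'MRI'), (2, 'CT')")

      # Map each row to an ontology instance with a type and a datatype property.
      g = Graph()
      for row_id, modality in db.execute("SELECT id, modality FROM image"):
          subject = URIRef(f"http://example.org/image/{row_id}")
          g.add((subject, RDF.type, EX.BioMedicalImage))
          g.add((subject, EX.modality, Literal(modality)))

      print(g.serialize(format="turtle"))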
  15. Mayo, D.; Bowers, K.: The devil's shoehorn : a case study of EAD to ArchivesSpace migration at a large university (2017) 0.03
    0.026281446 = product of:
      0.07884434 = sum of:
        0.015876798 = weight(_text_:of in 3373) [ClassicSimilarity], result of:
          0.015876798 = score(doc=3373,freq=18.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.25915858 = fieldWeight in 3373, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3373)
        0.02890629 = weight(_text_:systems in 3373) [ClassicSimilarity], result of:
          0.02890629 = score(doc=3373,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.24009174 = fieldWeight in 3373, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3373)
        0.034061253 = weight(_text_:software in 3373) [ClassicSimilarity], result of:
          0.034061253 = score(doc=3373,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.21915624 = fieldWeight in 3373, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3373)
      0.33333334 = coord(3/9)
    
    Abstract
    A band of archivists and IT professionals at Harvard took on a project to convert nearly two million descriptions of archival collection components from marked-up text into the ArchivesSpace archival metadata management system. Starting in the mid-1990s, Harvard was an alpha implementer of EAD, an SGML (later XML) text markup language for electronic inventories, indexes, and finding aids that archivists use to wend their way through the sometimes quirky filing systems that bureaucracies establish for their records or the utter chaos in which some individuals keep their personal archives. These pathfinder documents, designed to cope with messy reality, can themselves be difficult to classify. Portions of them are rigorously structured, while other parts are narrative. Early documents predate the establishment of the standard; many feature idiosyncratic encoding that had been through several machine conversions, while others were freshly encoded and fairly consistent. In this paper, we will cover the practical and technical challenges involved in preparing a large (900MiB) corpus of XML for ingest into an open-source archival information system (ArchivesSpace). This case study will give an overview of the project, discuss problem discovery and problem solving, and address the technical challenges, analysis, solutions, and decisions and provide information on the tools produced and lessons learned. The authors of this piece are Kate Bowers, Collections Services Archivist for Metadata, Systems, and Standards at the Harvard University Archive, and Dave Mayo, a Digital Library Software Engineer for Harvard's Library and Technology Services. Kate was heavily involved in both metadata analysis and later problem solving, while Dave was the sole full-time developer assigned to the migration project.
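    A streaming pass over a large EAD file of the kind described here can be sketched with lxml's iterparse; the filename and the checks performed are illustrative, not Harvard's migration code (and some finding aids use numbered <c01>...<c12> components rather than <c>):

      from lxml import etree

      NS = "{urn:isbn:1-931666-22-9}"          # EAD 2002 namespace

      component_count = 0
      for _, elem in etree.iterparse("finding_aid.xml", events=("end",), tag=NS + "c"):
          # Flag components whose <did> lacks a <unittitle>, a typical consistency check.
          if elem.findtext(NS + "did/" + NS + "unittitle") is None:
              print("component without a title:", elem.get("id"))
          component_count += 1
          elem.clear()                         # keep memory flat on a 900MiB corpus

      print(component_count, "components seen")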
  16. Klic, L.; Miller, M.; Nelson, J.K.; Germann, J.E.: Approaching the largest 'API' : extracting information from the Internet with Python (2018) 0.03
    0.025856502 = product of:
      0.11635426 = sum of:
        0.0089812735 = weight(_text_:of in 4239) [ClassicSimilarity], result of:
          0.0089812735 = score(doc=4239,freq=4.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.14660224 = fieldWeight in 4239, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=4239)
        0.107372984 = product of:
          0.21474597 = sum of:
            0.21474597 = weight(_text_:packages in 4239) [ClassicSimilarity], result of:
              0.21474597 = score(doc=4239,freq=6.0), product of:
                0.2706874 = queryWeight, product of:
                  6.9093957 = idf(docFreq=119, maxDocs=44218)
                  0.03917671 = queryNorm
                0.7933357 = fieldWeight in 4239, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  6.9093957 = idf(docFreq=119, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4239)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Abstract
    This article explores the need for libraries to algorithmically access and manipulate the world's largest API: the Internet. The billions of pages on the 'Internet API' (HTTP, HTML, CSS, XPath, DOM, etc.) are easily accessible and manipulable. Libraries can assist in creating meaning through the datafication of information on the World Wide Web. Because most information is created for human consumption, some programming is required for automated extraction. Python is an easy-to-learn programming language with extensive packages and community support for web page automation. Four Python packages (Urllib, Selenium, BeautifulSoup, Scrapy) can automate almost any web page for projects of all sizes. An example project using warrant data illustrates how well Python packages can manipulate web pages to create meaning by assembling custom datasets.
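    As a minimal sketch of the pattern the abstract describes, using two of the four packages it names (Urllib and BeautifulSoup), the following fetches one page and extracts every link; the URL and the link-harvesting task are placeholders, not taken from the article. Selenium and Scrapy would come into play for JavaScript-heavy pages and larger crawls.

      # Minimal urllib + BeautifulSoup sketch: fetch a page and collect
      # (text, href) pairs. The URL below is a placeholder.
      from urllib.request import Request, urlopen

      from bs4 import BeautifulSoup

      def extract_links(url):
          req = Request(url, headers={"User-Agent": "web-extraction-demo/0.1"})
          with urlopen(req, timeout=30) as resp:
              html = resp.read()
          soup = BeautifulSoup(html, "html.parser")
          # Every anchor that actually carries an href attribute
          return [(a.get_text(strip=True), a["href"])
                  for a in soup.find_all("a", href=True)]

      if __name__ == "__main__":
          for text, href in extract_links("https://example.org/"):
              print(f"{text}\t{href}")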
  17. Maltese, V.; Farazi, F.: Towards the integration of knowledge organization systems with the linked data cloud (2011) 0.03
    0.025785297 = product of:
      0.07735589 = sum of:
        0.041947264 = weight(_text_:applications in 602) [ClassicSimilarity], result of:
          0.041947264 = score(doc=602,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2432066 = fieldWeight in 602, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=602)
        0.014968789 = weight(_text_:of in 602) [ClassicSimilarity], result of:
          0.014968789 = score(doc=602,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.24433708 = fieldWeight in 602, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=602)
        0.020439833 = weight(_text_:systems in 602) [ClassicSimilarity], result of:
          0.020439833 = score(doc=602,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.1697705 = fieldWeight in 602, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=602)
      0.33333334 = coord(3/9)
    
    Abstract
    Because it must represent the shared view of all the people involved, building a Knowledge Organization System (KOS) from scratch is extremely costly, and it is therefore fundamental to reuse existing resources. This can be done by progressively extending the KOS with knowledge coming from similar KOS and by promoting interoperability among them. The linked data initiative is indeed encouraging people to share and integrate their datasets into a giant network of interconnected resources, which enables different applications to interoperate and share their data. However, the integration should take into account the purpose of the datasets and make their semantics explicit; in fact, a difference in purpose is reflected in a difference in semantics. With this paper we (a) highlight the potential problems that may arise from not taking purpose and semantics into account, (b) make clear how a difference in purpose is reflected in totally different semantics, and (c) provide an algorithm to translate from one semantics into another as a preliminary step towards the integration of ontologies designed for different purposes. This will allow the ontologies to be reused even in contexts different from those in which they were designed.
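    The translation algorithm itself is not given in the abstract. Purely as an illustration of making intended semantics explicit (not the authors' method; all names and URIs are invented), the sketch below encodes a toy subject hierarchy with SKOS broader/narrower relations, whose "narrower subject" meaning differs from the set-inclusion semantics of rdfs:subClassOf.

      # Illustrative only: express a toy classification hierarchy as SKOS
      # concepts linked by skos:broader, rather than rdfs:subClassOf,
      # so the intended "narrower subject" semantics is explicit.
      from rdflib import Graph, Literal, Namespace
      from rdflib.namespace import RDF, SKOS

      EX = Namespace("http://example.org/kos/")  # invented namespace
      hierarchy = {"Physics": None, "Optics": "Physics", "Lasers": "Optics"}

      g = Graph()
      g.bind("skos", SKOS)
      for label, parent in hierarchy.items():
          concept = EX[label]
          g.add((concept, RDF.type, SKOS.Concept))
          g.add((concept, SKOS.prefLabel, Literal(label, lang="en")))
          if parent:
              g.add((concept, SKOS.broader, EX[parent]))

      print(g.serialize(format="turtle"))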
    Content
    Also: In the proceedings of the UDC seminar 2011.
    Imprint
    Trento : University of Trento / Department of Information Engineering and Computer Science
  18. Harlow, C.: Data munging tools in Preparation for RDF : Catmandu and LODRefine (2015) 0.03
    0.02546304 = product of:
      0.07638912 = sum of:
        0.041947264 = weight(_text_:applications in 2277) [ClassicSimilarity], result of:
          0.041947264 = score(doc=2277,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2432066 = fieldWeight in 2277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2277)
        0.0140020205 = weight(_text_:of in 2277) [ClassicSimilarity], result of:
          0.0140020205 = score(doc=2277,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.22855641 = fieldWeight in 2277, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2277)
        0.020439833 = weight(_text_:systems in 2277) [ClassicSimilarity], result of:
          0.020439833 = score(doc=2277,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.1697705 = fieldWeight in 2277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2277)
      0.33333334 = coord(3/9)
    
    Abstract
    Data munging, or the work of remediating, enhancing and transforming library datasets for new or improved uses, has become more important and more staff-inclusive in many library technology discussions and projects. We often know how we want our data to look, and how we want it to behave in discovery interfaces or when exposed, but we are uncertain how to turn the data we have into the data we want. This article introduces and compares two library data munging tools that can help: LODRefine (OpenRefine with the DERI RDF Extension) and Catmandu. The strengths and best practices of each tool are discussed in the context of metadata munging use cases for an institution's metadata migration workflow. There is a focus on the Linked Open Data modeling and transformation applications of each tool, in particular how metadataists, catalogers, and programmers can create metadata quality reports, enhance existing data with LOD sets, and transform that data to an RDF model. Integration of these tools with other systems and projects, the use of domain-specific transformation languages, and the expansion of vocabulary reconciliation services are also mentioned.
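    Catmandu is a Perl toolkit and LODRefine a graphical application, so neither is shown directly here. As a rough, language-consistent illustration of the final "transform that data to an RDF model" step (the record, field names, and URIs below are invented), a flat metadata record can be mapped to Dublin Core terms like this:

      # Rough illustration, not Catmandu or LODRefine: map a flat metadata
      # record to an RDF model using Dublin Core terms.
      from rdflib import Graph, Literal, URIRef
      from rdflib.namespace import DCTERMS

      record = {"id": "rec-001",
                "title": "Data munging tools in preparation for RDF",
                "creator": "Harlow, C.",
                "date": "2015"}

      g = Graph()
      subject = URIRef(f"http://example.org/resource/{record['id']}")
      g.add((subject, DCTERMS.title, Literal(record["title"])))
      g.add((subject, DCTERMS.creator, Literal(record["creator"])))
      g.add((subject, DCTERMS.date, Literal(record["date"])))
      print(g.serialize(format="turtle"))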
  19. Voß, J.: Classification of knowledge organization systems with Wikidata (2016) 0.03
    0.025069844 = product of:
      0.07520953 = sum of:
        0.016802425 = weight(_text_:of in 3082) [ClassicSimilarity], result of:
          0.016802425 = score(doc=3082,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2742677 = fieldWeight in 3082, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3082)
        0.042483397 = weight(_text_:systems in 3082) [ClassicSimilarity], result of:
          0.042483397 = score(doc=3082,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.35286134 = fieldWeight in 3082, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=3082)
        0.015923709 = product of:
          0.031847417 = sum of:
            0.031847417 = weight(_text_:22 in 3082) [ClassicSimilarity], result of:
              0.031847417 = score(doc=3082,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.23214069 = fieldWeight in 3082, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3082)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    This paper presents a crowd-sourced classification of knowledge organization systems based on the open knowledge base Wikidata. The focus is less on the current, rather preliminary result than on the environment and process of categorization in Wikidata and the extraction of KOS from the collaborative database. Benefits and disadvantages are summarized and discussed for applying Wikidata to the knowledge organization of other subject areas.
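    The paper's extraction procedure is not reproduced in the abstract. As a hedged sketch of pulling KOS candidates out of Wikidata, the query below asks the public SPARQL endpoint for items that are instances (or instances of subclasses) of a "knowledge organization system" item; the QID shown is an assumption and should be verified against Wikidata before use.

      # Hedged sketch: list Wikidata items classified under a KOS item via
      # the public SPARQL endpoint. KOS_ITEM is an assumed QID - verify it.
      import requests

      WDQS = "https://query.wikidata.org/sparql"
      KOS_ITEM = "Q6423319"  # assumed QID for "knowledge organization system"

      query = f"""
      SELECT ?kos ?kosLabel WHERE {{
        ?kos wdt:P31/wdt:P279* wd:{KOS_ITEM} .
        SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
      }}
      LIMIT 50
      """

      resp = requests.get(WDQS, params={"query": query, "format": "json"},
                          headers={"User-Agent": "kos-listing-demo/0.1"})
      resp.raise_for_status()
      for row in resp.json()["results"]["bindings"]:
          print(row["kos"]["value"], row["kosLabel"]["value"])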
    Pages
    S.15-22
    Source
    Proceedings of the 15th European Networked Knowledge Organization Systems Workshop (NKOS 2016), co-located with the 20th International Conference on Theory and Practice of Digital Libraries 2016 (TPDL 2016), Hannover, Germany, September 9, 2016. Ed. by Philipp Mayr et al. [http://ceur-ws.org/Vol-1676/=urn:nbn:de:0074-1676-5]
  20. Surfing versus Drilling for knowledge in science : When should you use your computer? When should you use your brain? (2018) 0.02
    0.024916714 = product of:
      0.07475014 = sum of:
        0.011201616 = weight(_text_:of in 4564) [ClassicSimilarity], result of:
          0.011201616 = score(doc=4564,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.18284513 = fieldWeight in 4564, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=4564)
        0.016351866 = weight(_text_:systems in 4564) [ClassicSimilarity], result of:
          0.016351866 = score(doc=4564,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.1358164 = fieldWeight in 4564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03125 = fieldNorm(doc=4564)
        0.04719666 = weight(_text_:software in 4564) [ClassicSimilarity], result of:
          0.04719666 = score(doc=4564,freq=6.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.3036718 = fieldWeight in 4564, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03125 = fieldNorm(doc=4564)
      0.33333334 = coord(3/9)
    
    Abstract
    For this second Special Issue of Infozine, we have invited students, teachers, researchers, and software developers to share their opinions about one aspect or another of this broad topic: how to balance drilling (for depth) and surfing (for breadth) in scientific learning, teaching, research, and software design, and how the modern digital-liberal system affects our ability to strike this balance. This special issue is meant to provide a wide and unbiased spectrum of possible viewpoints on the topic, helping readers to lucidly define their own position and information-use behavior.
    Content
    Editorial: Surfing versus Drilling for Knowledge in Science: When should you use your computer? When should you use your brain? Blaise Pascal: Les deux infinis - The two infinities / Philippe Hünenberger and Oliver Renn - "Surfing" vs. "drilling" in the modern scientific world / Antonio Loprieno - Of millimeter paper and machine learning / Philippe Hünenberger - From one to many, from breadth to depth - industrializing research / Janne Soetbeer - "Deep drilling" requires "surfing" / Gerd Folkers and Laura Folkers - Surfing vs. drilling in science: A delicate balance / Alzbeta Kubincová - Digital trends in academia - for the sake of critical thinking or comfort? / Leif-Thore Deck - I diagnose, therefore I am a Doctor? Will drilling computer software replace human doctors in the future? / Yi Zheng - Surfing versus drilling in fundamental research / Wilfred van Gunsteren - Using brain vs. brute force in computational studies of biological systems / Arieh Warshel - Laboratory literature boards in the digital age / Jeffrey Bode - Research strategies in computational chemistry / Sereina Riniker - Surfing on the hype waves or drilling deep for knowledge? A perspective from industry / Nadine Schneider and Nikolaus Stiefl - The use and purpose of articles and scientists / Philip Mark Lund - Can you look at papers like artwork? / Oliver Renn - Dynamite fishing in the data swamp / Frank Perabo - Streetlights, augmented intelligence, and information discovery / Jeffrey Saffer and Vicki Burnett - "Yes Dave. Happy to do that for you." Why AI, machine learning, and blockchain will lead to deeper "drilling" / Michiel Kolman and Sjors de Heuvel - Trends in scientific document search / Stefan Geißler - Power tools for text mining / Jane Reed - Publishing and patenting: Navigating the differences to ensure search success / Paul Peters

Languages

  • e 284
  • d 97
  • i 6
  • f 2
  • a 1
  • el 1
  • es 1

Types

  • a 256
  • s 14
  • x 11
  • r 9
  • m 5
  • n 2
  • b 1
  • i 1
  • p 1