Search (568 results, page 2 of 29)

  • Filter: type_ss:"a"
  • Filter: type_ss:"el"
  1. Harlow, C.: Data munging tools in Preparation for RDF : Catmandu and LODRefine (2015) 0.03
    
    Abstract
     Data munging, or the work of remediating, enhancing and transforming library datasets for new or improved uses, has become more important and staff-inclusive in many library technology discussions and projects. Often we know how we want our data to look, and how we want it to act in discovery interfaces or when exposed, but we are uncertain how to turn the data we have into the data we want. This article introduces and compares two library data munging tools that can help: LODRefine (OpenRefine with the DERI RDF Extension) and Catmandu. The strengths and best practices of each tool are discussed in the context of metadata munging use cases for an institution's metadata migration workflow. There is a focus on Linked Open Data modeling and transformation applications of each tool, in particular how metadataists, catalogers, and programmers can create metadata quality reports, enhance existing data with LOD sets, and transform that data into an RDF model. Integration of these tools with other systems and projects, the use of domain-specific transformation languages, and the expansion of vocabulary reconciliation services are also mentioned.
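The kind of transformation the abstract describes (mapping flat metadata fields to RDF triples) can be sketched in a few lines of Python. This is a hedged illustration, not Catmandu or LODRefine code: the field-to-predicate mapping and the example record are hypothetical, with predicates borrowed from Dublin Core Terms for familiarity.

```python
def record_to_ntriples(record, base_uri="http://example.org/item/"):
    """Map a flat dict of metadata fields to N-Triples lines (stdlib only)."""
    subject = f"<{base_uri}{record['id']}>"
    # Hypothetical field-to-predicate mapping; a real workflow would be
    # driven by a configurable vocabulary such as Dublin Core Terms.
    predicates = {
        "title": "<http://purl.org/dc/terms/title>",
        "creator": "<http://purl.org/dc/terms/creator>",
    }
    triples = []
    for field, pred in predicates.items():
        if field in record:
            value = record[field].replace('"', '\\"')  # escape quotes in literals
            triples.append(f'{subject} {pred} "{value}" .')
    return triples

triples = record_to_ntriples({"id": "42", "title": "Data munging tools",
                              "creator": "Harlow, C."})
```

A real migration workflow would add datatype and language tags and validate against the target model, but the shape of the task is the same.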
  2. Soergel, D.; Lauser, B.; Liang, A.; Fisseha, F.; Keizer, J.; Katz, S.: Reengineering thesauri for new applications : the AGROVOC example (2004) 0.03
    
    Source
    Journal of digital information. 4(2004) no.4, art.#257
  3. Chessum, K.; Haiming, L.; Frommholz, I.: A study of search user interface design based on Hofstede's six cultural dimensions (2022) 0.03
    
    Source
    6th International Conference on Computer-Human Interaction Research and Applications, [https://www.researchgate.net/publication/364940444_A_Study_of_Search_User_Interface_Design_based_on_Hofstede's_Six_Cultural_Dimensions]
  4. Voß, J.: Classification of knowledge organization systems with Wikidata (2016) 0.03
    
    Abstract
     This paper presents a crowd-sourced classification of knowledge organization systems based on the open knowledge base Wikidata. The focus is less on the current, rather preliminary result than on the environment and process of categorization in Wikidata and the extraction of KOS from the collaborative database. Benefits and disadvantages are summarized and discussed with respect to applying the approach to knowledge organization of other subject areas with Wikidata.
    Pages
    S.15-22
    Source
     Proceedings of the 15th European Networked Knowledge Organization Systems Workshop (NKOS 2016) co-located with the 20th International Conference on Theory and Practice of Digital Libraries 2016 (TPDL 2016), Hannover, Germany, September 9, 2016. Ed. by Philipp Mayr et al. [http://ceur-ws.org/Vol-1676/=urn:nbn:de:0074-1676-5]
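The extraction of KOS from Wikidata that the paper describes is typically done with SPARQL over the public query service. The sketch below only builds the query string: the property path with P31 (instance of) and P279 (subclass of) is a standard Wikidata idiom, but the item ID used here for "knowledge organization system" (Q6423319) is an assumption that should be verified against Wikidata before use.

```python
def kos_query(kos_item="Q6423319", limit=10):
    """Build a SPARQL query for instances of (subclasses of) a KOS item.

    kos_item is assumed, not verified: check the actual Wikidata item ID
    for "knowledge organization system" before running this query.
    """
    return f"""SELECT ?kos ?kosLabel WHERE {{
  ?kos wdt:P31/wdt:P279* wd:{kos_item} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}"""

query = kos_query()
```

The query string can then be POSTed to the Wikidata Query Service endpoint; the collaborative, continuously edited nature of the underlying data is exactly the "preliminary form" trade-off the abstract notes.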
  5. Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.02
    
    Abstract
     The explosive growth of the Internet and other sources of networked information have made automatic mediation of access to networked information sources an increasingly important problem. Much of this information is expressed as electronic text, and it is becoming practical to automatically convert some printed documents and recorded speech to electronic text as well. Thus, automated systems capable of detecting useful documents are finding widespread application. With even a small number of languages it can be inconvenient to issue the same query repeatedly in every language, so users who are able to read more than one language will likely prefer a multilingual text retrieval system over a collection of monolingual systems. And since reading ability in a language does not always imply fluent writing ability in that language, such users will likely find cross-language text retrieval particularly useful for languages in which they are less confident of their ability to express their information needs effectively. The use of such systems can also be beneficial if the user is able to read only a single language. For example, when only a small portion of the document collection will ever be examined by the user, performing retrieval before translation can be significantly more economical than performing translation before retrieval. So when the application is sufficiently important to justify the time and effort required for translation, those costs can be minimized if an effective cross-language text retrieval system is available. Even when translation is not available, there are circumstances in which cross-language text retrieval could be useful to a monolingual user. For example, a researcher might find a paper published in an unfamiliar language useful if that paper contains references to works by the same author that are in the researcher's native language.
     Multilingual text retrieval can be defined as selection of useful documents from collections that may contain several languages (English, French, Chinese, etc.). This formulation allows for the possibility that individual documents might contain more than one language, a common occurrence in some applications. Both cross-language and within-language retrieval are included in this formulation, but it is the cross-language aspect of the problem which distinguishes multilingual text retrieval from its well-studied monolingual counterpart. At the SIGIR 96 workshop on "Cross-Linguistic Information Retrieval" the participants discussed the proliferation of terminology being used to describe the field and settled on "Cross-Language" as the best single description of the salient aspect of the problem. "Multilingual" was felt to be too broad, since that term has also been used to describe systems able to perform within-language retrieval in more than one language but that lack any cross-language capability. "Cross-lingual" and "cross-linguistic" were felt to be equally good descriptions of the field, but "cross-language" was selected as the preferred term in the interest of standardization. Unfortunately, at about the same time the U.S. Defense Advanced Research Projects Agency (DARPA) introduced "translingual" as their preferred term, so we are still some distance from reaching consensus on this matter.
     I will not attempt to draw a sharp distinction between retrieval and filtering in this survey. Although my own work on adaptive cross-language text filtering has led me to make this distinction fairly carefully in other presentations (cf. Oard 1997b), such an approach does little to help understand the fundamental techniques which have been applied or the results that have been obtained in this case. Since it is still common to view filtering (detection of useful documents in dynamic document streams) as a kind of retrieval, I will simply adopt that perspective here.
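One family of techniques Oard surveys, dictionary-based query translation, can be sketched as: look up each query term in a bilingual term list, then retrieve with the union of the translations. Everything below (the toy English-to-German term list, the bag-of-words overlap scorer) is hypothetical illustration, not code from the survey.

```python
# Toy bilingual term list; a real system would use a full dictionary
# and handle morphology, phrases, and translation ambiguity weighting.
BILINGUAL = {
    "library": ["bibliothek"],
    "catalog": ["katalog", "verzeichnis"],
}

def translate_query(terms):
    """Replace each source-language term by its known translations;
    terms with no dictionary entry pass through untranslated."""
    out = []
    for t in terms:
        out.extend(BILINGUAL.get(t.lower(), [t.lower()]))
    return out

def score(doc_tokens, query_terms):
    """Crude relevance: count distinct terms shared by document and query."""
    return len(set(doc_tokens) & set(query_terms))

q = translate_query(["library", "catalog"])
s = score(["der", "katalog", "der", "bibliothek"], q)
```

The ambiguity introduced by multiple translations per term ("katalog" vs. "verzeichnis") is precisely why the survey also covers corpus-based disambiguation of translated queries.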
  6. Durno, J.: Digital archaeology and/or forensics : working with floppy disks from the 1980s (2016) 0.02
    
    Abstract
    While software originating from the domain of digital forensics has demonstrated utility for data recovery from contemporary storage media, it is not as effective for working with floppy disks from the 1980s. This paper details alternative strategies for recovering data from floppy disks employing software originating from the software preservation and retro-computing communities. Imaging hardware, storage formats and processing workflows are also discussed.
  7. Blanco, E.; Cankaya, H.C.; Moldovan, D.: Composition of semantic relations : model and applications (2010) 0.02
    
    Abstract
    This paper presents a framework for combining semantic relations extracted from text to reveal even more semantics that otherwise would be missed. A set of 26 relations is introduced, with their arguments defined on an ontology of sorts. A semantic parser is used to extract these relations from noun phrases and verb argument structures. The method was successfully used in two applications: rapid customization of semantic relations to arbitrary domains and recognizing entailments.
    Source
    Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), Poster Volume, Beijing, China. Ed.: Chu-Ren Huang and Dan Jurafsky
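The core idea of the paper can be sketched abstractly: given R1(x, y) and R2(y, z) sharing the middle argument, a composition table licenses a new relation R3(x, z). The two table entries below are illustrative only; they are not drawn from the paper's set of 26 relations.

```python
# Illustrative composition table, keyed by (first relation, second relation).
COMPOSE = {
    ("PART-OF", "LOCATED-AT"): "LOCATED-AT",  # wheel PART-OF car, car LOCATED-AT garage
    ("CAUSE", "CAUSE"): "CAUSE",              # causal chains compose transitively
}

def compose(r1, r2):
    """Compose two relation instances (name, arg1, arg2).

    Returns the inferred relation (name, x, z), or None when the middle
    arguments do not match or the table has no entry for the pair.
    """
    name1, x, y1 = r1
    name2, y2, z = r2
    if y1 != y2:
        return None
    name3 = COMPOSE.get((name1, name2))
    return (name3, x, z) if name3 else None

result = compose(("PART-OF", "wheel", "car"), ("LOCATED-AT", "car", "garage"))
```

The paper's contribution is precisely the principled construction of such a table over its 26 relations, with argument sorts constraining which compositions are licensed.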
  8. Belpassi, E.: The application software RIMMF : RDA thinking in action (2016) 0.02
    
    Abstract
     The RIMMF software grew out of the need to visualize and create records according to the RDA guidelines. The article describes the software's structure and features through the creation of an r-ball, that is, a small database populated by records of bibliographic and authority resources enriched by relationships between and among the entities involved. First the need that led to RIMMF is introduced; then a functional analysis of the software follows, with a description of the main steps of building the r-ball and an emphasis on the issues raised. The results highlight some critical aspects, but above all the wide scope of possible developments that open the horizon of Cultural Heritage Institutions to the web perspective. The conclusions present the forthcoming RDF-linked-data developments of RIMMF.
  9. Beppler, F.D.; Fonseca, F.T.; Pacheco, R.C.S.: Hermeneus: an architecture for an ontology-enabled information retrieval (2008) 0.02
    
    Abstract
     Ontologies improve IR systems with regard to their retrieval and presentation of information, which makes the task of finding information more effective, efficient, and interactive. In this paper we argue that ontologies also greatly improve the engineering of such systems. We created a framework that uses ontology to drive the process of engineering an IR system. We developed a prototype that shows how a domain specialist without knowledge in the IR field can build an IR system with interactive components. The resulting system provides support for users not only to find their information needs but also to extend their state of knowledge. This way, our approach to ontology-enabled information retrieval addresses both the engineering aspect described here and also the usability aspect described elsewhere.
    Date
    28.11.2016 12:43:22
  10. Payette, S.; Blanchi, C.; Lagoze, C.; Overly, E.A.: Interoperability for digital objects and repositories : the Cornell/CNRI experiments (1999) 0.02
    
    Abstract
    For several years the Digital Library Research Group at Cornell University and the Corporation for National Research Initiatives (CNRI) have been engaged in research focused on the design and development of infrastructures for open architecture, confederated digital libraries. The goal of this effort is to achieve interoperability and extensibility of digital library systems through the definition of key digital library services and their open interfaces, allowing flexible interaction of existing services and augmentation of the infrastructure with new services. Some aspects of this research have included the development and deployment of the Dienst software, the Handle System®, and the architecture of digital objects and repositories. In this paper, we describe the joint effort by Cornell and CNRI to prototype a rich and deployable architecture for interoperable digital objects and repositories. This effort has challenged us to move theories of interoperability closer to practice. The Cornell/CNRI collaboration builds on two existing projects focusing on the development of interoperable digital libraries. Details relating to the technology of these projects are described elsewhere. Both projects were strongly influenced by the fundamental abstractions of repositories and digital objects as articulated by Kahn and Wilensky in A Framework for Distributed Digital Object Services. Furthermore, both programs were influenced by the container architecture described in the Warwick Framework, and by the notions of distributed dynamic objects presented by Lagoze and Daniel in their Distributed Active Relationship work. With these common roots, one would expect that the CNRI and Cornell repositories would be at least theoretically interoperable. However, the actual test would be the extent to which our independently developed repositories were practically interoperable. 
This paper focuses on the definition of interoperability in the joint Cornell/CNRI work and the set of experiments conducted to formally test it. Our motivation for this work is the eventual deployment of formally tested reference implementations of the repository architecture for experimentation and development by fellow digital library researchers. In Section 2, we summarize the digital object and repository approach that was the focus of our interoperability experiments. In Section 3, we describe the set of experiments that progressively tested interoperability at increasing levels of functionality. In Section 4, we discuss general conclusions, and in Section 5, we give a preview of our future work, including our plans to evolve our experimentation to the point of defining a set of formal metrics for measuring interoperability for repositories and digital objects. This is still a work in progress that is expected to undergo additional refinements during its development.
  11. Paskin, N.: DOI: current status and outlook (1999) 0.02
    
    Abstract
     Over the past few months the International DOI Foundation (IDF) has produced a number of discussion papers and other materials about the Digital Object Identifier (DOI) initiative. They are all available at the DOI web site, including a brief summary of the DOI origins and purpose. The aim of the present paper is to update those papers, reflecting recent progress, and to provide a summary of the current position and context of the DOI. Although much of the material presented here is the result of a consensus by the organisations forming the International DOI Foundation, some of the points discuss work in progress. The paper describes the origin of the DOI as a persistent identifier for managing copyrighted materials and its development under the non-profit International DOI Foundation into a system providing identifiers of intellectual property with a framework for open applications to be built using them. Persistent identification implementations consistent with URN specifications have up to now been hindered by lack of widespread availability of resolution mechanisms, content typology consensus, and sufficiently flexible infrastructure; DOI attempts to overcome these obstacles. Resolution of the DOI uses the Handle System®, which offers the necessary functionality for open applications. The aim of the International DOI Foundation is to promote widespread applications of the DOI, which it is doing by pioneering some early implementations and by providing an extensible framework to ensure interoperability of future DOI uses. Applications of the DOI will require an interoperable scheme of declared metadata with each DOI; the basis of the DOI metadata scheme is a minimal "kernel" of elements supplemented by additional application-specific elements, under an umbrella data model (derived from the INDECS analysis) that promotes convergence of different application metadata sets.
The IDF intends to require declaration of only a minimal set of metadata, sufficient to enable unambiguous look-up of a DOI, but this must be capable of extension by others to create open applications.
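The persistent-identifier machinery the paper describes rests on the DOI's simple syntax: a registrant prefix beginning with "10.", a slash, and a registrant-chosen suffix, resolvable through the doi.org proxy. A small sketch (the example DOI is illustrative, not one from this paper):

```python
def parse_doi(doi):
    """Split a DOI into (prefix, suffix); reject malformed input."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix

def resolution_url(doi):
    """The doi.org proxy resolves any registered DOI over HTTP."""
    return f"https://doi.org/{doi}"

prefix, suffix = parse_doi("10.1000/182")
```

Because resolution is indirected through the Handle System rather than tied to any hosting location, the identifier can outlive the URL of the content it names, which is the point of the whole scheme.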
  12. Lange, C.: Ontologies and languages for representing mathematical knowledge on the Semantic Web (2011) 0.02
    
    Abstract
Mathematics is a ubiquitous foundation of science, technology, and engineering. Specific areas, such as numeric and symbolic computation or logics, enjoy considerable software support. Working mathematicians have recently started to adopt Web 2.0 environments, such as blogs and wikis, but these systems lack machine support for knowledge organization and reuse, and they are disconnected from tools such as computer algebra systems or interactive proof assistants. We argue that such scenarios will benefit from Semantic Web technology. Conversely, mathematics is still underrepresented on the Web of [Linked] Data. There are mathematics-related Linked Data, for example statistical government data or scientific publication databases, but their mathematical semantics has not yet been modeled. We argue that the services for the Web of Data will benefit from a deeper representation of mathematical knowledge. Mathematical knowledge comprises logical and functional structures (formulæ, statements, and theories), a mixture of rigorous natural language and symbolic notation in documents, application-specific metadata, and discussions about conceptualizations, formalizations, proofs, and (counter-)examples. Our review of approaches to representing these structures covers ontologies for mathematical problems, proofs, interlinked scientific publications, and scientific discourse, as well as mathematical metadata vocabularies and domain knowledge from pure and applied mathematics. Many fields of mathematics have not yet been implemented as proper Semantic Web ontologies; however, we show that MathML and OpenMath, the standard XML-based exchange languages for mathematical knowledge, can be fully integrated with RDF representations in order to contribute existing mathematical knowledge to the Web of Data. We conclude with a roadmap for getting the mathematical Web of Data started: what datasets to publish, how to interlink them, and how to take advantage of these new connections.
  13. Priss, U.: Faceted knowledge representation (1999) 0.02
    0.021253616 = product of:
      0.06376085 = sum of:
        0.016567415 = weight(_text_:of in 2654) [ClassicSimilarity], result of:
          0.016567415 = score(doc=2654,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2704316 = fieldWeight in 2654, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2654)
        0.028615767 = weight(_text_:systems in 2654) [ClassicSimilarity], result of:
          0.028615767 = score(doc=2654,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 2654, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2654)
        0.018577661 = product of:
          0.037155323 = sum of:
            0.037155323 = weight(_text_:22 in 2654) [ClassicSimilarity], result of:
              0.037155323 = score(doc=2654,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.2708308 = fieldWeight in 2654, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2654)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
Faceted Knowledge Representation provides a formalism for implementing knowledge systems. The basic notions of faceted knowledge representation are "unit", "relation", "facet" and "interpretation". Units are atomic elements and can be abstract elements or refer to external objects in an application. Relations are sequences or matrices of 0s and 1s (binary matrices). Facets are relational structures that combine units and relations. Each facet represents an aspect or viewpoint of a knowledge system. Interpretations are mappings that can be used to translate between different representations. This paper introduces the basic notions of faceted knowledge representation. The formalism is applied here to an abstract modeling of a faceted thesaurus as used in information retrieval.
    Date
    22. 1.2016 17:30:31
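The notions in the abstract above map naturally onto small data structures. A minimal sketch under a simplified reading: units are atomic labels, a relation is a binary matrix over the units, and a facet bundles units with named relations. The example units and the "broader" relation are invented for illustration:

```python
# Units are atomic labels; a relation is a binary matrix over them;
# a facet combines units and relations into one viewpoint.

units = ["thesaurus", "retrieval", "indexing"]

# broader[i][j] == 1 means unit i has unit j as a broader term.
broader = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
]

facet = {"units": units, "relations": {"broader": broader}}

def related(facet, relation_name, unit):
    """Return the units that `unit` points to under the named relation."""
    i = facet["units"].index(unit)
    row = facet["relations"][relation_name][i]
    return [u for u, bit in zip(facet["units"], row) if bit]

print(related(facet, "broader", "thesaurus"))  # ['retrieval']
```

An "interpretation" in the paper's sense would then be a mapping from these units and relations into another representation, for example a different facet's unit set.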
  14. Bauckhage, C.: Marginalizing over the PageRank damping factor (2014) 0.02
    0.020995347 = product of:
      0.09447906 = sum of:
        0.08389453 = weight(_text_:applications in 928) [ClassicSimilarity], result of:
          0.08389453 = score(doc=928,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.4864132 = fieldWeight in 928, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.078125 = fieldNorm(doc=928)
        0.010584532 = weight(_text_:of in 928) [ClassicSimilarity], result of:
          0.010584532 = score(doc=928,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.17277241 = fieldWeight in 928, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=928)
      0.22222222 = coord(2/9)
    
    Abstract
    In this note, we show how to marginalize over the damping parameter of the PageRank equation so as to obtain a parameter-free version known as TotalRank. Our discussion is meant as a reference and intended to provide a guided tour towards an interesting result that has applications in information retrieval and classification.
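The marginalization the abstract describes can be approximated numerically: TotalRank is the integral of the PageRank vector over the damping factor, which a sketch can estimate by averaging PageRank across a grid of damping values. The three-node graph and the uniform prior over the damping factor are assumptions made here for illustration:

```python
# Numerical sketch of marginalizing PageRank over the damping factor:
# average the PageRank vector over a grid of damping values in [0, 1).

def pagerank(links, alpha, iters=100):
    """Power iteration for PageRank with damping factor alpha.
    links[i] is the list of nodes that node i links to."""
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - alpha) / n] * n
        for src, outs in enumerate(links):
            share = alpha * rank[src] / len(outs)
            for dst in outs:
                new[dst] += share
        rank = new
    return rank

def total_rank(links, steps=200):
    """Approximate the integral of PageRank over alpha in [0, 1)."""
    n = len(links)
    total = [0.0] * n
    for k in range(steps):
        alpha = k / steps
        pr = pagerank(links, alpha)
        for i in range(n):
            total[i] += pr[i] / steps
    return total

# toy graph: 0 -> 1, 1 -> 2, 2 -> 0 and 2 -> 1
links = [[1], [2], [0, 1]]
print(total_rank(links))
```

Node 1, which receives links from both other nodes, ends up with the highest marginalized score; no damping factor had to be chosen by hand.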
  15. Crane, G.; Jones, A.: Text, information, knowledge and the evolving record of humanity (2006) 0.02
    0.020951357 = product of:
      0.06285407 = sum of:
        0.015654733 = weight(_text_:of in 1182) [ClassicSimilarity], result of:
          0.015654733 = score(doc=1182,freq=70.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2555338 = fieldWeight in 1182, product of:
              8.3666 = tf(freq=70.0), with freq of:
                70.0 = termFreq=70.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1182)
        0.017701415 = weight(_text_:systems in 1182) [ClassicSimilarity], result of:
          0.017701415 = score(doc=1182,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.14702557 = fieldWeight in 1182, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1182)
        0.029497914 = weight(_text_:software in 1182) [ClassicSimilarity], result of:
          0.029497914 = score(doc=1182,freq=6.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.18979488 = fieldWeight in 1182, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1182)
      0.33333334 = coord(3/9)
    
    Abstract
Consider a sentence such as "the current price of tea in China is 35 cents per pound." In a library with millions of books we might find many statements of the above form that we could capture today with relatively simple rules: rather than pursuing every variation of a statement, programs can wait, like predators at a water hole, for their informational prey to reappear in a standard linguistic pattern. We can make inferences from sentences such as "NAME1 born at NAME2 in DATE" that NAME1 more likely than not represents a person and NAME2 a place and then convert the statement into a proposition about a person born at a given place and time. The changing price of tea in China, pedestrian birth and death dates, or other basic statements may not be truth and beauty in the Phaedrus, but a digital library that could plot the prices of various commodities in different markets over time, plot the various lifetimes of individuals, or extract and classify many events would be very useful. Services such as the Syllabus Finder and H-Bot (which Dan Cohen describes elsewhere in this issue of D-Lib) represent examples of information extraction already in use. H-Bot, in particular, builds on our evolving ability to extract information from very large corpora such as the billions of web pages available through the Google API. Aside from identifying higher order statements, however, users also want to search and browse named entities: they want to read about "C. P. E. Bach" rather than his father "Johann Sebastian" or about "Cambridge, Maryland", without hearing about "Cambridge, Massachusetts", Cambridge in the UK or any of the other Cambridges scattered around the world. Named entity identification is a well-established area with an ongoing literature.
The Natural Language Processing Research Group at the University of Sheffield has developed its open source Generalized Architecture for Text Engineering (GATE) for years, while IBM's Unstructured Information Analysis and Search (UIMA) is "available as open source software to provide a common foundation for industry and academia." Powerful tools are thus freely available and more demanding users can draw upon published literature to develop their own systems. Major search engines such as Google and Yahoo also integrate increasingly sophisticated tools to categorize and identify places. The software resources are rich and expanding. The reference works on which these systems depend, however, are ill-suited for historical analysis. First, simple gazetteers and similar authority lists quickly grow too big for useful information extraction. They provide us with potential entities against which to match textual references, but existing electronic reference works assume that human readers can use their knowledge of geography and of the immediate context to pick the right Boston from the Bostons in the Getty Thesaurus of Geographic Names (TGN), but, with the crucial exception of geographic location, the TGN records do not provide any machine readable clues: we cannot tell which Bostons are large or small. If we are analyzing a document published in 1818, we cannot filter out those places that did not yet exist or that had different names: "Jefferson Davis" is not the name of a parish in Louisiana (tgn,2000880) or a county in Mississippi (tgn,2001118) until after the Civil War.
    Although the Alexandria Digital Library provides far richer data than the TGN (5.9 vs. 1.3 million names), its added size lowers, rather than increases, the accuracy of most geographic name identification systems for historical documents: most of the extra 4.6 million names cover low frequency entities that rarely occur in any particular corpus. The TGN is sufficiently comprehensive to provide quite enough noise: we find place names that are used over and over (there are almost one hundred Washingtons) and semantically ambiguous (e.g., is Washington a person or a place?). Comprehensive knowledge sources emphasize recall but lower precision. We need data with which to determine which "Tribune" or "John Brown" a particular passage denotes. Secondly and paradoxically, our reference works may not be comprehensive enough. Human actors come and go over time. Organizations appear and vanish. Even places can change their names or vanish. The TGN does associate the obsolete name Siam with the nation of Thailand (tgn,1000142) - but also with towns named Siam in Iowa (tgn,2035651), Tennessee (tgn,2101519), and Ohio (tgn,2662003). Prussia appears but as a general region (tgn,7016786), with no indication when or if it was a sovereign nation. And if places do point to the same object over time, that object may have very different significance over time: in the foundational works of Western historiography, Herodotus reminds us that the great cities of the past may be small today, and the small cities of today great tomorrow (Hdt. 1.5), while Thucydides stresses that we cannot estimate the past significance of a place by its appearance today (Thuc. 1.10). In other words, we need to know the population figures for the various Washingtons in 1870 if we are analyzing documents from 1870. The foundations have been laid for reference works that provide machine actionable information about entities at particular times in history. 
The Alexandria Digital Library Gazetteer Content Standard represents a sophisticated framework with which to create such resources: places can be associated with temporal information about their foundation (e.g., Washington, DC, founded on 16 July 1790), changes in names for the same location (e.g., Saint Petersburg to Leningrad and back again), population figures at various times and similar historically contingent data. But if we have the software and the data structures, we do not yet have substantial amounts of historical content such as plentiful digital gazetteers, encyclopedias, lexica, grammars and other reference works to illustrate many periods and, even if we do, those resources may not be in a useful form: raw OCR output of a complex lexicon or gazetteer may have so many errors and have captured so little of the underlying structure that the digital resource is useless as a knowledge base. Put another way, human beings are still much better at reading and interpreting the contents of page images than machines. While people, places, and dates are probably the most important core entities, we will find a growing set of objects that we need to identify and track across collections, and each of these categories of objects will require its own knowledge sources. The following section enumerates and briefly describes some existing categories of documents that we need to mine for knowledge. This brief survey focuses on the format of print sources (e.g., highly structured textual "database" vs. unstructured text) to illustrate some of the challenges involved in converting our published knowledge into semantically annotated, machine actionable form.
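The water-hole strategy from the first paragraph, waiting for a standard linguistic pattern and converting matches into propositions, might look like the following sketch. The pattern and the example sentence are illustrative, not taken from the article:

```python
import re

# Wait for a fixed "PERSON was born in PLACE in YEAR" pattern and turn
# each match into a (person, place, year) proposition.

BORN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)*) was born (?:at|in) "
    r"(?P<place>[A-Z][a-z]+(?: [A-Z][a-z]+)*) in (?P<year>\d{4})"
)

def extract_birth_facts(text):
    """Return (person, place, year) propositions found in free text."""
    return [
        (m["person"], m["place"], int(m["year"]))
        for m in BORN.finditer(text)
    ]

text = "Frederick Douglass was born in Talbot County in 1818."
print(extract_birth_facts(text))
```

The hard part the article goes on to describe is not this matching step but deciding which real-world entity a matched name denotes, which is where historically aware gazetteers come in.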
  16. Miller, E.: ¬An introduction to the Resource Description Framework (1998) 0.02
    0.020428138 = product of:
      0.09192662 = sum of:
        0.01960283 = weight(_text_:of in 1231) [ClassicSimilarity], result of:
          0.01960283 = score(doc=1231,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.31997898 = fieldWeight in 1231, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1231)
        0.07232379 = product of:
          0.14464758 = sum of:
            0.14464758 = weight(_text_:packages in 1231) [ClassicSimilarity], result of:
              0.14464758 = score(doc=1231,freq=2.0), product of:
                0.2706874 = queryWeight, product of:
                  6.9093957 = idf(docFreq=119, maxDocs=44218)
                  0.03917671 = queryNorm
                0.5343713 = fieldWeight in 1231, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.9093957 = idf(docFreq=119, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1231)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Abstract
The Resource Description Framework (RDF) is an infrastructure that enables the encoding, exchange and reuse of structured metadata. RDF is an application of XML that imposes needed structural constraints to provide unambiguous methods of expressing semantics. RDF additionally provides a means for publishing both human-readable and machine-processable vocabularies designed to encourage the reuse and extension of metadata semantics among disparate information communities. The structural constraints RDF imposes to support the consistent encoding and exchange of standardized metadata provide for the interchangeability of separate packages of metadata defined by different resource description communities.
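The data model behind this can be illustrated without any library: an RDF statement is a (subject, predicate, object) triple, and a vocabulary is a set of shared predicate names. A minimal sketch; the resource URL is invented, while the predicate URIs follow the Dublin Core element set:

```python
# RDF's core model: a graph is a set of (subject, predicate, object)
# triples; shared predicate URIs let different communities' metadata
# packages be merged and queried uniformly.

DC = "http://purl.org/dc/elements/1.1/"

triples = {
    ("https://example.org/report", DC + "title", "Annual Report"),
    ("https://example.org/report", DC + "creator", "Example Org"),
}

def objects(triples, subject, predicate):
    """All objects asserted for a subject/predicate pair."""
    return {o for s, p, o in triples if s == subject and p == predicate}

print(objects(triples, "https://example.org/report", DC + "title"))
```

Because the graph is just a set, metadata from another community can be merged with a set union, which is the interchangeability the abstract refers to.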
  17. Takhirov, N.; Aalberg, T.; Duchateau, F.; Zumer, M.: FRBR-ML: a FRBR-based framework for semantic interoperability (2012) 0.02
    0.020370431 = product of:
      0.061111294 = sum of:
        0.03355781 = weight(_text_:applications in 134) [ClassicSimilarity], result of:
          0.03355781 = score(doc=134,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.19456528 = fieldWeight in 134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03125 = fieldNorm(doc=134)
        0.011201616 = weight(_text_:of in 134) [ClassicSimilarity], result of:
          0.011201616 = score(doc=134,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.18284513 = fieldWeight in 134, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=134)
        0.016351866 = weight(_text_:systems in 134) [ClassicSimilarity], result of:
          0.016351866 = score(doc=134,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.1358164 = fieldWeight in 134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03125 = fieldNorm(doc=134)
      0.33333334 = coord(3/9)
    
    Abstract
Metadata related to cultural items such as literature, music and movies is a valuable resource that is currently exploited in many applications and services based on semantic web technologies. A vast amount of such information has been created by memory institutions in the last decades using different standard or ad hoc schemas, and a main challenge is to make this legacy data accessible as reusable semantic data. On one hand, this is a syntactic problem that can be solved by transforming to formats that are compatible with the tools and services used for semantic aware services. On the other hand, this is a semantic problem. Simply transforming from one format to another does not automatically enable semantic interoperability, and legacy data often needs to be reinterpreted as well as transformed. The conceptual model in the Functional Requirements for Bibliographic Records, initially developed as a conceptual framework for library standards and systems, is a major step towards a shared semantic model of the products of artistic and intellectual endeavor of mankind. The model is generally accepted as sufficiently generic to serve as a conceptual framework for a broad range of cultural heritage metadata. Unfortunately, the existing large body of legacy data makes a transition to this model difficult. For instance, most bibliographic data is still only available in various MARC-based formats, which are hard to render into reusable and meaningful semantic data. Making legacy bibliographic data accessible as semantic data is a complex problem that includes interpreting and transforming the information. In this article, we present our work on transforming and enhancing legacy bibliographic information into a representation where the structure and semantics of the FRBR model are explicit.
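The reinterpretation step mentioned at the end of the abstract can be caricatured in a few lines: a flat, MARC-like record is split into FRBR-style work, expression, and manifestation entities. The field names and splitting rules here are simplified assumptions for illustration, not the FRBR-ML algorithm itself:

```python
# Toy "FRBRization": reinterpret a flat record as linked FRBR entities.

flat_record = {
    "title": "Hamlet",
    "author": "William Shakespeare",
    "language": "eng",
    "publisher": "Example Press",
    "year": "1994",
}

def frbrize(record):
    # Work: the abstract creation; Expression: its realization in a
    # language; Manifestation: the published embodiment.
    work = {"title": record["title"], "creator": record["author"]}
    expression = {"realizes": work, "language": record["language"]}
    manifestation = {
        "embodies": expression,
        "publisher": record["publisher"],
        "date": record["year"],
    }
    return work, expression, manifestation

work, expression, manifestation = frbrize(flat_record)
print(manifestation["embodies"]["realizes"]["title"])
```

The point of the exercise is the structure: once the entities are explicit and linked, several manifestations of the same work can share one work node, which flat MARC records cannot express.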
  18. Aitken, S.; Reid, S.: Evaluation of an ontology-based information retrieval tool (2000) 0.02
    0.019893078 = product of:
      0.08951885 = sum of:
        0.06711562 = weight(_text_:applications in 2862) [ClassicSimilarity], result of:
          0.06711562 = score(doc=2862,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.38913056 = fieldWeight in 2862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0625 = fieldNorm(doc=2862)
        0.022403233 = weight(_text_:of in 2862) [ClassicSimilarity], result of:
          0.022403233 = score(doc=2862,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.36569026 = fieldWeight in 2862, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=2862)
      0.22222222 = coord(2/9)
    
    Abstract
    This paper evaluates the use of an explicit domain ontology in an information retrieval tool. The evaluation compares the performance of ontology-enhanced retrieval with keyword retrieval for a fixed set of queries across several data sets. The robustness of the IR approach is assessed by comparing the performance of the tool on the original data set with that on previously unseen data.
    Content
Contribution to: Workshop on the Applications of Ontologies and Problem-Solving Methods, (eds) Gómez-Pérez, A., Benjamins, V.R., Guarino, N., and Uschold, M. European Conference on Artificial Intelligence 2000, Berlin.
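One common form of ontology-enhanced retrieval, expanding query terms with synonyms and narrower terms before matching, can be sketched as follows. The tiny ontology and document set are invented, and the evaluated tool may well work differently:

```python
# Ontology-enhanced keyword search: expand each query term with its
# ontological neighbours, then match against document words.

ontology = {
    "car": {"synonyms": ["automobile"], "narrower": ["hatchback"]},
    "boat": {"synonyms": ["vessel"], "narrower": ["dinghy"]},
}

documents = {
    1: "second-hand automobile for sale",
    2: "hatchback in good condition",
    3: "wooden dinghy needs repair",
}

def expand(term):
    entry = ontology.get(term, {})
    return {term, *entry.get("synonyms", []), *entry.get("narrower", [])}

def search(query):
    terms = {t for word in query.split() for t in expand(word)}
    return sorted(
        doc_id for doc_id, text in documents.items()
        if terms & set(text.split())
    )

print(search("car"))  # plain keyword search would match nothing
```

This also illustrates what such an evaluation measures: expansion raises recall (documents 1 and 2 are found for "car") and the question is whether precision suffers on unseen data.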
  19. Perovsek, M.; Kranjca, J.; Erjaveca, T.; Cestnika, B.; Lavraca, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016) 0.02
    0.01981098 = product of:
      0.08914941 = sum of:
        0.07118686 = weight(_text_:applications in 2697) [ClassicSimilarity], result of:
          0.07118686 = score(doc=2697,freq=4.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.41273528 = fieldWeight in 2697, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
        0.017962547 = weight(_text_:of in 2697) [ClassicSimilarity], result of:
          0.017962547 = score(doc=2697,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2932045 = fieldWeight in 2697, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
      0.22222222 = coord(2/9)
    
    Abstract
Text mining and natural language processing are fast-growing areas of research, with numerous applications in business, science and creative industries. This paper presents TextFlows, a web-based text mining and natural language processing platform supporting workflow construction, sharing and execution. The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud. This makes TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications. The paper presents the implemented text mining and language processing modules, and describes some precomposed workflows. Their features are demonstrated on three use cases: comparison of document classifiers and of different part-of-speech taggers on a text categorization problem, and outlier detection in document corpora.
    Source
    Science of computer programming. In Press, 2016
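The workflow model the abstract describes, processing components wired into an executable pipeline, reduces at its core to function composition. A minimal sketch assuming simple text-preprocessing widgets; TextFlows itself is a visual, web-based platform, so this only mirrors the concept:

```python
import re

# Each "widget" is a function from a list of texts to transformed data;
# a workflow is an ordered list of widgets executed in sequence.

def lowercase(texts):
    return [t.lower() for t in texts]

def strip_punctuation(texts):
    return [re.sub(r"[^\w\s]", "", t) for t in texts]

def tokenize(texts):
    return [t.split() for t in texts]

def run_workflow(widgets, data):
    """Execute widgets in order, feeding each one's output to the next."""
    for widget in widgets:
        data = widget(data)
    return data

workflow = [lowercase, strip_punctuation, tokenize]
print(run_workflow(workflow, ["Text mining, in practice!"]))
```

Representing the pipeline as data (a list of widgets) rather than hard-wired calls is what makes such workflows shareable and recomposable, which is the platform's central claim.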
  20. Zhang, L.; Wang, S.; Liu, B.: Deep learning for sentiment analysis : a survey (2018) 0.02
    0.019523773 = product of:
      0.08785698 = sum of:
        0.06711562 = weight(_text_:applications in 4092) [ClassicSimilarity], result of:
          0.06711562 = score(doc=4092,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.38913056 = fieldWeight in 4092, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0625 = fieldNorm(doc=4092)
        0.020741362 = weight(_text_:of in 4092) [ClassicSimilarity], result of:
          0.020741362 = score(doc=4092,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.33856338 = fieldWeight in 4092, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=4092)
      0.22222222 = coord(2/9)
    
    Abstract
Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with its success in many other application domains, deep learning has also been widely applied to sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.
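As the paper is a survey, the following is only an illustrative baseline for the task it surveys: a single sigmoid unit over a bag-of-words (i.e. logistic regression) trained by gradient descent on invented toy data. Real deep sentiment models stack many such layers over learned word embeddings:

```python
import math

# Tiny sentiment classifier: one sigmoid unit over bag-of-words counts,
# trained with gradient descent on log-loss. Vocabulary and training
# data are invented toy examples.

vocab = ["good", "great", "bad", "awful"]
train = [
    ("good great", 1), ("great", 1),
    ("bad awful", 0), ("awful", 0),
]

def featurize(text):
    words = text.split()
    return [float(words.count(w)) for w in vocab]

weights, bias = [0.0] * len(vocab), 0.0
for _ in range(500):
    for text, label in train:
        x = featurize(text)
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        p = 1.0 / (1.0 + math.exp(-z))
        for i in range(len(vocab)):     # gradient step on log-loss
            weights[i] += 0.1 * (label - p) * x[i]
        bias += 0.1 * (label - p)

def predict(text):
    x = featurize(text)
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0

print(predict("good"), predict("awful"))
```

What deep models add over this baseline, per the survey, is learning the feature representation itself (embeddings and layered compositions) instead of relying on fixed word counts.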

Languages

  • e 454
  • d 98
  • i 7
  • f 2
  • a 1
  • el 1
  • es 1
  • sp 1

Types