Search (34 results, page 1 of 2)

Cao, N.; Sun, J.; Lin, Y.-R.; Gotz, D.; Liu, S.; Qu, H.: FacetAtlas : Multifaceted visualization for rich text corpora (2010) 0.02

0.019306172 = product of:
  0.042473577 = sum of:
    0.004532476 = product of:
      0.009064952 = sum of:
        0.009064952 = weight(_text_:h in 3366) [ClassicSimilarity], result of:
          0.009064952 = score(doc=3366,freq=2.0), product of:
            0.0660481 = queryWeight, product of:
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.026584605 = queryNorm
            0.13724773 = fieldWeight in 3366, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3366)
      0.5 = coord(1/2)
    0.0043660053 = weight(_text_:a in 3366) [ClassicSimilarity], result of:
      0.0043660053 = score(doc=3366,freq=10.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.14243183 = fieldWeight in 3366, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3366)
    0.016092705 = weight(_text_:r in 3366) [ClassicSimilarity], result of:
      0.016092705 = score(doc=3366,freq=2.0), product of:
        0.088001914 = queryWeight, product of:
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.026584605 = queryNorm
        0.18286766 = fieldWeight in 3366, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3366)
    0.0017360178 = weight(_text_:s in 3366) [ClassicSimilarity], result of:
      0.0017360178 = score(doc=3366,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.060061958 = fieldWeight in 3366, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3366)
    0.015746372 = weight(_text_:u in 3366) [ClassicSimilarity], result of:
      0.015746372 = score(doc=3366,freq=2.0), product of:
        0.08704981 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.026584605 = queryNorm
        0.1808892 = fieldWeight in 3366, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3366)
  0.45454547 = coord(5/11)

Abstract: Documents in rich text corpora usually contain multiple facets of information. For example, an article about a specific disease often consists of different facets such as symptom, treatment, cause, diagnosis, prognosis, and prevention. Thus, documents may have different relations based on different facets. Powerful search tools have been developed to help users locate lists of individual documents that are most related to specific keywords. However, there is a lack of effective analysis tools that reveal the multifaceted relations of documents within or cross the document clusters. In this paper, we present FacetAtlas, a multifaceted visualization technique for visually analyzing rich text corpora. FacetAtlas combines search technology with advanced visual analytical tools to convey both global and local patterns simultaneously. We describe several unique aspects of FacetAtlas, including (1) node cliques and multifaceted edges, (2) an optimized density map, and (3) automated opacity pattern enhancement for highlighting visual patterns, (4) interactive context switch between facets. In addition, we demonstrate the power of FacetAtlas through a case study that targets patient education in the health care domain. Our evaluation shows the benefits of this work, especially in support of complex multifaceted data analysis.
Theme: Semantisches Umfeld in Indexierung u. Retrieval
Type: a

Eckert, K: ¬The ICE-map visualization (2011) 0.02

0.017093942 = product of:
  0.062677786 = sum of:
    0.0069856085 = weight(_text_:a in 4743) [ClassicSimilarity], result of:
      0.0069856085 = score(doc=4743,freq=10.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.22789092 = fieldWeight in 4743, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=4743)
    0.025748327 = weight(_text_:r in 4743) [ClassicSimilarity], result of:
      0.025748327 = score(doc=4743,freq=2.0), product of:
        0.088001914 = queryWeight, product of:
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.026584605 = queryNorm
        0.29258826 = fieldWeight in 4743, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.0625 = fieldNorm(doc=4743)
    0.02994385 = weight(_text_:k in 4743) [ClassicSimilarity], result of:
      0.02994385 = score(doc=4743,freq=2.0), product of:
        0.09490114 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.026584605 = queryNorm
        0.31552678 = fieldWeight in 4743, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.0625 = fieldNorm(doc=4743)
  0.27272728 = coord(3/11)

Abstract: In this paper, we describe in detail the Information Content Evaluation Map (ICE-Map Visualization, formerly referred to as IC Difference Analysis). The ICE-Map Visualization is a visual data mining approach for all kinds of concept hierarchies that uses statistics about the concept usage to help a user in the evaluation and maintenance of the hierarchy. It consists of a statistical framework that employs the the notion of information content from information theory, as well as a visualization of the hierarchy and the result of the statistical analysis by means of a treemap.
Type: r

Denton, W.: On dentographs, a new method of visualizing library collections (2012) 0.01

0.006288091 = product of:
  0.0345845 = sum of:
    0.008836173 = weight(_text_:a in 580) [ClassicSimilarity], result of:
      0.008836173 = score(doc=580,freq=16.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.28826174 = fieldWeight in 580, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=580)
    0.025748327 = weight(_text_:r in 580) [ClassicSimilarity], result of:
      0.025748327 = score(doc=580,freq=2.0), product of:
        0.088001914 = queryWeight, product of:
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.026584605 = queryNorm
        0.29258826 = fieldWeight in 580, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.0625 = fieldNorm(doc=580)
  0.18181819 = coord(2/11)

Abstract: A dentograph is a visualization of a library's collection built on the idea that a classification scheme is a mathematical function mapping one set of things (books or the universe of knowledge) onto another (a set of numbers and letters). Dentographs can visualize aspects of just one collection or can be used to compare two or more collections. This article describes how to build them, with examples and code using Ruby and R, and discusses some problems and future directions.
Type: a

Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.01
```
0.005812889 = product of:
  0.015985444 = sum of:
    0.002266238 = product of:
      0.004532476 = sum of:
        0.004532476 = weight(_text_:h in 1211) [ClassicSimilarity], result of:
          0.004532476 = score(doc=1211,freq=2.0), product of:
            0.0660481 = queryWeight, product of:
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.026584605 = queryNorm
            0.06862386 = fieldWeight in 1211, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
      0.5 = coord(1/2)
    0.004978012 = weight(_text_:a in 1211) [ClassicSimilarity], result of:
      0.004978012 = score(doc=1211,freq=52.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.16239727 = fieldWeight in 1211, product of:
          7.2111025 = tf(freq=52.0), with freq of:
            52.0 = termFreq=52.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.01953125 = fieldNorm(doc=1211)
    8.680089E-4 = weight(_text_:s in 1211) [ClassicSimilarity], result of:
      8.680089E-4 = score(doc=1211,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.030030979 = fieldWeight in 1211, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.01953125 = fieldNorm(doc=1211)
    0.007873186 = weight(_text_:u in 1211) [ClassicSimilarity], result of:
      0.007873186 = score(doc=1211,freq=2.0), product of:
        0.08704981 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.026584605 = queryNorm
        0.0904446 = fieldWeight in 1211, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.01953125 = fieldNorm(doc=1211)
  0.36363637 = coord(4/11)
```
Abstract

In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have gained more intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly due to the fact that information needs cannot be expressed appropriately in systems terms. It is not unusual for users to input search terms that are different from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Usage of a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus - and without the ability to see the matching terms in the context of other terms - it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as in an alphabetically ordered list. The problem with this approach is that the list may be long, and neither does it show users the global semantic relationship among all the listed terms.
Nevertheless, because thesaurus use has shown to improve retrieval, for our method we integrate functions in the search interface that permit users to explore built-in search vocabularies to improve retrieval from digital libraries. Our method automatically generates the terms and their semantic relationships representing relevant topics covered in a digital library. We call these generated terms the "concepts", and the generated terms and their semantic relationships we call the "concept space". Additionally, we used a visualization technique to display the concept space and allow users to interact with this space. The automatically generated term set is considered to be more representative of subject area in a corpus than an "externally" imposed thesaurus, and our method has the potential of saving a significant amount of time and labor for those who have been manually creating thesauri as well. Information visualization is an emerging discipline and developed very quickly in the last decade. With growing volumes of documents and associated complexities, information visualization has become increasingly important. Researchers have found information visualization to be an effective way to use and understand information while minimizing a user's cognitive load. Our work was based on an algorithmic approach of concept discovery and association. Concepts are discovered using an algorithm based on an automated thesaurus generation procedure. Subsequently, similarities among terms are computed using the cosine measure, and the associations among terms are established using a method known as max-min distance clustering. The concept space is then visualized in a spring embedding graph, which roughly shows the semantic relationships among concepts in a 2-D visual representation. The semantic space of the visualization is used as a medium for users to retrieve the desired documents. In the remainder of this article, we present our algorithmic approach of concept generation and clustering, followed by description of the visualization technique and interactive interface. The paper ends with key conclusions and discussions on future work.

Content

The JAVA applet is available at <http://ella.slis.indiana.edu/~junzhang/dlib/IV.html>. A prototype of this interface has been developed and is available at <http://ella.slis.indiana.edu/~junzhang/dlib/IV.html>. The D-Lib search interface is available at <http://www.dlib.org/Architext/AT-dlib2query.html>.

Source

D-Lib magazine. 8(2002) no.10, x S

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Type

a

Linden, E.J. van der; Vliegen, R.; Wijk, J.J. van: Visual Universal Decimal Classification (2007) 0.01

0.005784713 = product of:
  0.021210615 = sum of:
    0.0033818933 = weight(_text_:a in 548) [ClassicSimilarity], result of:
      0.0033818933 = score(doc=548,freq=6.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.11032722 = fieldWeight in 548, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=548)
    0.016092705 = weight(_text_:r in 548) [ClassicSimilarity], result of:
      0.016092705 = score(doc=548,freq=2.0), product of:
        0.088001914 = queryWeight, product of:
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.026584605 = queryNorm
        0.18286766 = fieldWeight in 548, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.0390625 = fieldNorm(doc=548)
    0.0017360178 = weight(_text_:s in 548) [ClassicSimilarity], result of:
      0.0017360178 = score(doc=548,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.060061958 = fieldWeight in 548, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.0390625 = fieldNorm(doc=548)
  0.27272728 = coord(3/11)

Abstract: UDC aims to be a consistent and complete classification system, that enables practitioners to classify documents swiftly and smoothly. The eventual goal of UDC is to enable the public at large to retrieve documents from large collections of documents that are classified with UDC. The large size of the UDC Master Reference File, MRF with over 66.000 records, makes it difficult to obtain an overview and to understand its structure. Moreover, finding the right classification in MRF turns out to be difficult in practice. Last but not least, retrieval of documents requires insight and understanding of the coding system. Visualization is an effective means to support the development of UDC as well as its use by practitioners. Moreover, visualization offers possibilities to use the classification without use of the coding system as such. MagnaView has developed an application which demonstrates the use of interactive visualization to face these challenges. In our presentation, we discuss these challenges, and we give a demonstration of the way the application helps face these. Examples of visualizations can be found below.
Source: Extensions and corrections to the UDC. 29(2007), S.297-300
Type: a

Maaten, L. van den: Accelerating t-SNE using Tree-Based Algorithms (2014) 0.00

0.004937262 = product of:
  0.018103292 = sum of:
    0.0027335514 = weight(_text_:a in 3886) [ClassicSimilarity], result of:
      0.0027335514 = score(doc=3886,freq=2.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.089176424 = fieldWeight in 3886, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3886)
    0.012939317 = product of:
      0.05175727 = sum of:
        0.05175727 = weight(_text_:o in 3886) [ClassicSimilarity], result of:
          0.05175727 = score(doc=3886,freq=2.0), product of:
            0.13338262 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.026584605 = queryNorm
            0.38803607 = fieldWeight in 3886, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3886)
      0.25 = coord(1/4)
    0.0024304248 = weight(_text_:s in 3886) [ClassicSimilarity], result of:
      0.0024304248 = score(doc=3886,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.08408674 = fieldWeight in 3886, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3886)
  0.27272728 = coord(3/11)

Abstract: The paper investigates the acceleration of t-SNE-an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots-using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N*logN). Our experiments show that the resulting algorithms substantially accelerate t-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
Source: Journal of machine learning research. 15(2014), S.3221-3245
Type: a

Singh, A.; Sinha, U.; Sharma, D.k.: Semantic Web and data visualization (2020) 0.00
```
0.004857842 = product of:
  0.017812086 = sum of:
    0.0038261751 = weight(_text_:a in 79) [ClassicSimilarity], result of:
      0.0038261751 = score(doc=79,freq=12.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.12482099 = fieldWeight in 79, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.03125 = fieldNorm(doc=79)
    0.0013888142 = weight(_text_:s in 79) [ClassicSimilarity], result of:
      0.0013888142 = score(doc=79,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.048049565 = fieldWeight in 79, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.03125 = fieldNorm(doc=79)
    0.012597097 = weight(_text_:u in 79) [ClassicSimilarity], result of:
      0.012597097 = score(doc=79,freq=2.0), product of:
        0.08704981 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.026584605 = queryNorm
        0.14471136 = fieldWeight in 79, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03125 = fieldNorm(doc=79)
  0.27272728 = coord(3/11)
```
Abstract

With the terrific growth of data volume and data being produced every second on millions of devices across the globe, there is a desperate need to manage the unstructured data available on web pages efficiently. Semantic Web or also known as Web of Trust structures the scattered data on the Internet according to the needs of the user. It is an extension of the World Wide Web (WWW) which focuses on manipulating web data on behalf of Humans. Due to the ability of the Semantic Web to integrate data from disparate sources and hence makes it more user-friendly, it is an emerging trend. Tim Berners-Lee first introduced the term Semantic Web and since then it has come a long way to become a more intelligent and intuitive web. Data Visualization plays an essential role in explaining complex concepts in a universal manner through pictorial representation, and the Semantic Web helps in broadening the potential of Data Visualization and thus making it an appropriate combination. The objective of this chapter is to provide fundamental insights concerning the semantic web technologies and in addition to that it also elucidates the issues as well as the solutions regarding the semantic web. The purpose of this chapter is to highlight the semantic web architecture in detail while also comparing it with the traditional search system. It classifies the semantic web architecture into three major pillars i.e. RDF, Ontology, and XML. Moreover, it describes different semantic web tools used in the framework and technology. It attempts to illustrate different approaches of the semantic web search engines. Besides stating numerous challenges faced by the semantic web it also illustrates the solutions.

Pages

S.133-150

Type

a

Keil-Slawik, R.: Konzepte digitaler Medien : semantische Karten, räumliche Strukturierung von Wissen (2005) 0.00

0.004681514 = product of:
  0.051496655 = sum of:
    0.051496655 = weight(_text_:r in 2735) [ClassicSimilarity], result of:
      0.051496655 = score(doc=2735,freq=2.0), product of:
        0.088001914 = queryWeight, product of:
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.026584605 = queryNorm
        0.5851765 = fieldWeight in 2735, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.125 = fieldNorm(doc=2735)
  0.09090909 = coord(1/11)

Fowler, R.H.; Wilson, B.A.; Fowler, W.A.L.: Information navigator : an information system using associative networks for display and retrieval (1992) 0.00
```
0.0045626834 = product of:
  0.025094757 = sum of:
    0.0061991126 = weight(_text_:a in 919) [ClassicSimilarity], result of:
      0.0061991126 = score(doc=919,freq=14.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.20223314 = fieldWeight in 919, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=919)
    0.018895645 = weight(_text_:u in 919) [ClassicSimilarity], result of:
      0.018895645 = score(doc=919,freq=2.0), product of:
        0.08704981 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.026584605 = queryNorm
        0.21706703 = fieldWeight in 919, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.046875 = fieldNorm(doc=919)
  0.18181819 = coord(2/11)
```
Abstract

Document retrieval is a highly interactive process dealing with large amounts of information. Visual representations can provide both a means for managing the complexity of large information structures and an interface style well suited to interactive manipulation. The system we have designed utilizes visually displayed graphic structures and a direct manipulation interface style to supply an integrated environment for retrieval. A common visually displayed network structure is used for query, document content, and term relations. A query can be modified through direct manipulation of its visual form by incorporating terms from any other information structure the system displays. An associative thesaurus of terms and an inter-document network provide information about a document collection that can complement other retrieval aids. Visualization of these large data structures makes use of fisheye views and overview diagrams to help overcome some of the inherent difficulties of orientation and navigation in large information structures.

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Type

a

Wu, Y.; Bai, R.: ¬An event relationship model for knowledge organization and visualization (2017) 0.00

0.0045546377 = product of:
  0.025050508 = sum of:
    0.0057392623 = weight(_text_:a in 3867) [ClassicSimilarity], result of:
      0.0057392623 = score(doc=3867,freq=12.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.18723148 = fieldWeight in 3867, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=3867)
    0.019311246 = weight(_text_:r in 3867) [ClassicSimilarity], result of:
      0.019311246 = score(doc=3867,freq=2.0), product of:
        0.088001914 = queryWeight, product of:
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.026584605 = queryNorm
        0.2194412 = fieldWeight in 3867, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.046875 = fieldNorm(doc=3867)
  0.18181819 = coord(2/11)

Abstract: An event is a specific occurrence involving participants, which is a typed, n-ary association of entities or other events, each identified as a participant in a specific semantic role in the event (Pyysalo et al. 2012; Linguistic Data Consortium 2005). Event types may vary across domains. Representing relationships between events can facilitate the understanding of knowledge in complex systems (such as economic systems, human body, social systems). In the simplest form, an event can be represented as Entity A <Relation> Entity B. This paper evaluates several knowledge organization and visualization models and tools, such as concept maps (Cmap), topic maps (Ontopia), network analysis models (Gephi), and ontology (Protégé), then proposes an event relationship model that aims to integrate the strengths of these models, and can represent complex knowledge expressed in events and their relationships.
Type: a

Palm, F.: QVIZ : Query and context based visualization of time-spatial cultural dynamics (2007) 0.00

0.0029172269 = product of:
  0.016044747 = sum of:
    0.0052392064 = weight(_text_:a in 1289) [ClassicSimilarity], result of:
      0.0052392064 = score(doc=1289,freq=10.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.1709182 = fieldWeight in 1289, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=1289)
    0.010805541 = product of:
      0.021611081 = sum of:
        0.021611081 = weight(_text_:22 in 1289) [ClassicSimilarity], result of:
          0.021611081 = score(doc=1289,freq=2.0), product of:
            0.09309476 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.026584605 = queryNorm
            0.23214069 = fieldWeight in 1289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1289)
      0.5 = coord(1/2)
  0.18181819 = coord(2/11)

Abstract: QVIZ will research and create a framework for visualizing and querying archival resources by a time-space interface based on maps and emergent knowledge structures. The framework will also integrate social software, such as wikis, in order to utilize knowledge in existing and new communities of practice. QVIZ will lead to improved information sharing and knowledge creation, easier access to information in a user-adapted context and innovative ways of exploring and visualizing materials over time, between countries and other administrative units. The common European framework for sharing and accessing archival information provided by the QVIZ project will open a considerably larger commercial market based on archival materials as well as a richer understanding of European history.
Content: Vortrag anlässlich des Workshops: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".

Waechter, U.: Visualisierung von Netzwerkstrukturen (2002) 0.00

0.0022903814 = product of:
  0.025194194 = sum of:
    0.025194194 = weight(_text_:u in 1735) [ClassicSimilarity], result of:
      0.025194194 = score(doc=1735,freq=2.0), product of:
        0.08704981 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.026584605 = queryNorm
        0.28942272 = fieldWeight in 1735, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.0625 = fieldNorm(doc=1735)
  0.09090909 = coord(1/11)

Kraker, P.; Kittel, C,; Enkhbayar, A.: Open Knowledge Maps : creating a visual interface to the world's scientific knowledge based on natural language processing (2016) 0.00

0.0021938365 = product of:
  0.0120661 = sum of:
    0.0054389704 = product of:
      0.010877941 = sum of:
        0.010877941 = weight(_text_:h in 3205) [ClassicSimilarity], result of:
          0.010877941 = score(doc=3205,freq=2.0), product of:
            0.0660481 = queryWeight, product of:
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.026584605 = queryNorm
            0.16469726 = fieldWeight in 3205, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.046875 = fieldNorm(doc=3205)
      0.5 = coord(1/2)
    0.0066271294 = weight(_text_:a in 3205) [ClassicSimilarity], result of:
      0.0066271294 = score(doc=3205,freq=16.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.2161963 = fieldWeight in 3205, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=3205)
  0.18181819 = coord(2/11)

Abstract: The goal of Open Knowledge Maps is to create a visual interface to the world's scientific knowledge. The base for this visual interface consists of so-called knowledge maps, which enable the exploration of existing knowledge and the discovery of new knowledge. Our open source knowledge mapping software applies a mixture of summarization techniques and similarity measures on article metadata, which are iteratively chained together. After processing, the representation is saved in a database for use in a web visualization. In the future, we want to create a space for collective knowledge mapping that brings together individuals and communities involved in exploration and discovery. We want to enable people to guide each other in their discovery by collaboratively annotating and modifying the automatically created maps.
Source: 027.7 Zeitschrift für Bibliothekskultur. 4(2016), H.2
Type: a

Graphic details : a scientific study of the importance of diagrams to science (2016) 0.00
```
0.0021293839 = product of:
  0.0117116105 = sum of:
    0.0063088397 = weight(_text_:a in 3035) [ClassicSimilarity], result of:
      0.0063088397 = score(doc=3035,freq=58.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.20581275 = fieldWeight in 3035, product of:
          7.615773 = tf(freq=58.0), with freq of:
            58.0 = termFreq=58.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3035)
    0.0054027704 = product of:
      0.010805541 = sum of:
        0.010805541 = weight(_text_:22 in 3035) [ClassicSimilarity], result of:
          0.010805541 = score(doc=3035,freq=2.0), product of:
            0.09309476 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.026584605 = queryNorm
            0.116070345 = fieldWeight in 3035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0234375 = fieldNorm(doc=3035)
      0.5 = coord(1/2)
  0.18181819 = coord(2/11)
```
Abstract

A PICTURE is said to be worth a thousand words. That metaphor might be expected to pertain a fortiori in the case of scientific papers, where a figure can brilliantly illuminate an idea that might otherwise be baffling. Papers with figures in them should thus be easier to grasp than those without. They should therefore reach larger audiences and, in turn, be more influential simply by virtue of being more widely read. But are they?

Content

Bill Howe and his colleagues at the University of Washington, in Seattle, decided to find out. First, they trained a computer algorithm to distinguish between various sorts of figures-which they defined as diagrams, equations, photographs, plots (such as bar charts and scatter graphs) and tables. They exposed their algorithm to between 400 and 600 images of each of these types of figure until it could distinguish them with an accuracy greater than 90%. Then they set it loose on the more-than-650,000 papers (containing more than 10m figures) stored on PubMed Central, an online archive of biomedical-research articles. To measure each paper's influence, they calculated its article-level Eigenfactor score-a modified version of the PageRank algorithm Google uses to provide the most relevant results for internet searches. Eigenfactor scoring gives a better measure than simply noting the number of times a paper is cited elsewhere, because it weights citations by their influence. A citation in a paper that is itself highly cited is worth more than one in a paper that is not.
As the team describe in a paper posted (http://arxiv.org/abs/1605.04951) on arXiv, they found that figures did indeed matter-but not all in the same way. An average paper in PubMed Central has about one diagram for every three pages and gets 1.67 citations. Papers with more diagrams per page and, to a lesser extent, plots per page tended to be more influential (on average, a paper accrued two more citations for every extra diagram per page, and one more for every extra plot per page). By contrast, including photographs and equations seemed to decrease the chances of a paper being cited by others. That agrees with a study from 2012, whose authors counted (by hand) the number of mathematical expressions in over 600 biology papers and found that each additional equation per page reduced the number of citations a paper received by 22%. This does not mean that researchers should rush to include more diagrams in their next paper. Dr Howe has not shown what is behind the effect, which may merely be one of correlation, rather than causation. It could, for example, be that papers with lots of diagrams tend to be those that illustrate new concepts, and thus start a whole new field of inquiry. Such papers will certainly be cited a lot. On the other hand, the presence of equations really might reduce citations. Biologists (as are most of those who write and read the papers in PubMed Central) are notoriously mathsaverse. If that is the case, looking in a physics archive would probably produce a different result.
Dr Howe and his colleagues do, however, believe that the study of diagrams can result in new insights. A figure showing new metabolic pathways in a cell, for example, may summarise hundreds of experiments. Since illustrations can convey important scientific concepts in this way, they think that browsing through related figures from different papers may help researchers come up with new theories. As Dr Howe puts it, "the unit of scientific currency is closer to the figure than to the paper." With this thought in mind, the team have created a website (viziometrics.org (http://viziometrics.org/) ) where the millions of images sorted by their program can be searched using key words. Their next plan is to extract the information from particular types of scientific figure, to create comprehensive "super" figures: a giant network of all the known chemical processes in a cell for example, or the best-available tree of life. At just one such superfigure per paper, though, the citation records of articles containing such all-embracing diagrams may very well undermine the correlation that prompted their creation in the first place. Call it the ultimate marriage of chart and science.

Language

a

Type

a

Teutsch, K.: ¬Die Welt ist doch eine Scheibe : Google-Herausforderer eyePlorer (2009) 0.00

0.0018591749 = product of:
  0.010225462 = sum of:
    8.680089E-4 = weight(_text_:s in 2678) [ClassicSimilarity], result of:
      8.680089E-4 = score(doc=2678,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.030030979 = fieldWeight in 2678, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.01953125 = fieldNorm(doc=2678)
    0.009357453 = weight(_text_:k in 2678) [ClassicSimilarity], result of:
      0.009357453 = score(doc=2678,freq=2.0), product of:
        0.09490114 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.026584605 = queryNorm
        0.098602116 = fieldWeight in 2678, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.01953125 = fieldNorm(doc=2678)
  0.18181819 = coord(2/11)

Source: http://www.faz.net/s/Rub4521147CD87A4D9390DA8578416FA2EC/Doc~EE86EE2941DCB4B80833E7F07A08860BE~ATpl~Ecommon~Scontent.html

Maaten, L. van den; Hinton, G.: Visualizing non-metric similarities in multiple maps (2012) 0.00
```
0.0015837002 = product of:
  0.008710351 = sum of:
    0.0066271294 = weight(_text_:a in 3884) [ClassicSimilarity], result of:
      0.0066271294 = score(doc=3884,freq=16.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.2161963 = fieldWeight in 3884, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=3884)
    0.0020832212 = weight(_text_:s in 3884) [ClassicSimilarity], result of:
      0.0020832212 = score(doc=3884,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.072074346 = fieldWeight in 3884, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.046875 = fieldNorm(doc=3884)
  0.18181819 = coord(2/11)
```
Abstract

Techniques for multidimensional scaling visualize objects as points in a low-dimensional metric map. As a result, the visualizations are subject to the fundamental limitations of metric spaces. These limitations prevent multidimensional scaling from faithfully representing non-metric similarity data such as word associations or event co-occurrences. In particular, multidimensional scaling cannot faithfully represent intransitive pairwise similarities in a visualization, and it cannot faithfully visualize "central" objects. In this paper, we present an extension of a recently proposed multidimensional scaling technique called t-SNE. The extension aims to address the problems of traditional multidimensional scaling techniques when these techniques are used to visualize non-metric similarities. The new technique, called multiple maps t-SNE, alleviates these problems by constructing a collection of maps that reveal complementary structure in the similarity data. We apply multiple maps t-SNE to a large data set of word association data and to a data set of NIPS co-authorships, demonstrating its ability to successfully visualize non-metric similarities.

Source

Machine learning. 87(2012) no.1, S.33-55

Type

a

Maaten, L. van den: Learning a parametric embedding by preserving local structure (2009) 0.00

0.0015532422 = product of:
  0.008542832 = sum of:
    0.006112407 = weight(_text_:a in 3883) [ClassicSimilarity], result of:
      0.006112407 = score(doc=3883,freq=10.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.19940455 = fieldWeight in 3883, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3883)
    0.0024304248 = weight(_text_:s in 3883) [ClassicSimilarity], result of:
      0.0024304248 = score(doc=3883,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.08408674 = fieldWeight in 3883, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3883)
  0.18181819 = coord(2/11)

Abstract: The paper presents a new unsupervised dimensionality reduction technique, called parametric t-SNE, that learns a parametric mapping between the high-dimensional data space and the low-dimensional latent space. Parametric t-SNE learns the parametric mapping in such a way that the local structure of the data is preserved as well as possible in the latent space. We evaluate the performance of parametric t-SNE in experiments on three datasets, in which we compare it to the performance of two other unsupervised parametric dimensionality reduction techniques. The results of experiments illustrate the strong performance of parametric t-SNE, in particular, in learning settings in which the dimensionality of the latent space is relatively low.
Source: Proceedings of the Twelfth International Conference on Artificial Intelligence & Statistics (AI-STATS), JMLR W&CP 5, 2009. S.384-391
Type: a

Braun, S.: Manifold: a custom analytics platform to visualize research impact (2015) 0.00

0.0013313505 = product of:
  0.007322428 = sum of:
    0.0052392064 = weight(_text_:a in 2906) [ClassicSimilarity], result of:
      0.0052392064 = score(doc=2906,freq=10.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.1709182 = fieldWeight in 2906, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=2906)
    0.0020832212 = weight(_text_:s in 2906) [ClassicSimilarity], result of:
      0.0020832212 = score(doc=2906,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.072074346 = fieldWeight in 2906, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.046875 = fieldNorm(doc=2906)
  0.18181819 = coord(2/11)

Abstract: The use of research impact metrics and analytics has become an integral component to many aspects of institutional assessment. Many platforms currently exist to provide such analytics, both proprietary and open source; however, the functionality of these systems may not always overlap to serve uniquely specific needs. In this paper, I describe a novel web-based platform, named Manifold, that I built to serve custom research impact assessment needs in the University of Minnesota Medical School. Built on a standard LAMP architecture, Manifold automatically pulls publication data for faculty from Scopus through APIs, calculates impact metrics through automated analytics, and dynamically generates report-like profiles that visualize those metrics. Work on this project has resulted in many lessons learned about challenges to sustainability and scalability in developing a system of such magnitude.
Type: a

Maaten, L. van den; Hinton, G.: Visualizing data using t-SNE (2008) 0.00
```
0.0013197502 = product of:
  0.007258626 = sum of:
    0.0055226083 = weight(_text_:a in 3888) [ClassicSimilarity], result of:
      0.0055226083 = score(doc=3888,freq=16.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.18016359 = fieldWeight in 3888, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3888)
    0.0017360178 = weight(_text_:s in 3888) [ClassicSimilarity], result of:
      0.0017360178 = score(doc=3888,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.060061958 = fieldWeight in 3888, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3888)
  0.18181819 = coord(2/11)
```
Abstract

We present a new technique called "t-SNE" that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large data sets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of data sets and compare it with many other non-parametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the data sets.

Source

Journal of machine learning research. 9(2008), S.2579-2605

Type

a
Beagle, D.: Visualizing keyword distribution across multidisciplinary c-space (2003) 0.00
```
0.001243936 = product of:
  0.006841648 = sum of:
    0.005368588 = weight(_text_:a in 1202) [ClassicSimilarity], result of:
      0.005368588 = score(doc=1202,freq=42.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.17513901 = fieldWeight in 1202, product of:
          6.4807405 = tf(freq=42.0), with freq of:
            42.0 = termFreq=42.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1202)
    0.00147306 = weight(_text_:s in 1202) [ClassicSimilarity], result of:
      0.00147306 = score(doc=1202,freq=4.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.050964262 = fieldWeight in 1202, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1202)
  0.18181819 = coord(2/11)
```
Abstract

The concept of c-space is proposed as a visualization schema relating containers of content to cataloging surrogates and classification structures. Possible applications of keyword vector clusters within c-space could include improved retrieval rates through the use of captioning within visual hierarchies, tracings of semantic bleeding among subclasses, and access to buried knowledge within subject-neutral publication containers. The Scholastica Project is described as one example, following a tradition of research dating back to the 1980's. Preliminary focus group assessment indicates that this type of classification rendering may offer digital library searchers enriched entry strategies and an expanded range of re-entry vocabularies. Those of us who work in traditional libraries typically assume that our systems of classification: Library of Congress Classification (LCC) and Dewey Decimal Classification (DDC), are descriptive rather than prescriptive. In other words, LCC classes and subclasses approximate natural groupings of texts that reflect an underlying order of knowledge, rather than arbitrary categories prescribed by librarians to facilitate efficient shelving. Philosophical support for this assumption has traditionally been found in a number of places, from the archetypal tree of knowledge, to Aristotelian categories, to the concept of discursive formations proposed by Michel Foucault. Gary P. Radford has elegantly described an encounter with Foucault's discursive formations in the traditional library setting: "Just by looking at the titles on the spines, you can see how the books cluster together...You can identify those books that seem to form the heart of the discursive formation and those books that reside on the margins. Moving along the shelves, you see those books that tend to bleed over into other classifications and that straddle multiple discursive formations. You can physically and sensually experience...those points that feel like state borders or national boundaries, those points where one subject ends and another begins, or those magical places where one subject has morphed into another..."
But what happens to this awareness in a digital library? Can discursive formations be represented in cyberspace, perhaps through diagrams in a visualization interface? And would such a schema be helpful to a digital library user? To approach this question, it is worth taking a moment to reconsider what Radford is looking at. First, he looks at titles to see how the books cluster. To illustrate, I scanned one hundred books on the shelves of a college library under subclass HT 101-395, defined by the LCC subclass caption as Urban groups. The City. Urban sociology. Of the first 100 titles in this sequence, fifty included the word "urban" or variants (e.g. "urbanization"). Another thirty-five used the word "city" or variants. These keywords appear to mark their titles as the heart of this discursive formation. The scattering of titles not using "urban" or "city" used related terms such as "town," "community," or in one case "skyscrapers." So we immediately see some empirical correlation between keywords and classification. But we also see a problem with the commonly used search technique of title-keyword. A student interested in urban studies will want to know about this entire subclass, and may wish to browse every title available therein. A title-keyword search on "urban" will retrieve only half of the titles, while a search on "city" will retrieve just over a third. There will be no overlap, since no titles in this sample contain both words. The only place where both words appear in a common string is in the LCC subclass caption, but captions are not typically indexed in library Online Public Access Catalogs (OPACs). In a traditional library, this problem is mitigated when the student goes to the shelf looking for any one of the books and suddenly discovers a much wider selection than the keyword search had led him to expect. But in a digital library, the issue of non-retrieval can be more problematic, as studies have indicated. Micco and Popp reported that, in a study funded partly by the U.S. Department of Education, 65 of 73 unskilled users searching for material on U.S./Soviet foreign relations found some material but never realized they had missed a large percentage of what was in the database.

Source

D-Lib magazine. 9(2003) no.6, x S

Type

a

Search (34 results, page 1 of 2)

Authors

Years

Languages

Types

Themes