Search (14 results, page 1 of 1)

  • × language_ss:"e"
  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  • × theme_ss:"Wissensrepräsentation"
  1. Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.02
    0.018960943 = product of:
      0.05688283 = sum of:
        0.01339484 = weight(_text_:information in 535) [ClassicSimilarity], result of:
          0.01339484 = score(doc=535,freq=10.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.1734784 = fieldWeight in 535, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=535)
        0.04348799 = weight(_text_:networks in 535) [ClassicSimilarity], result of:
          0.04348799 = score(doc=535,freq=2.0), product of:
            0.20804176 = queryWeight, product of:
              4.72992 = idf(docFreq=1060, maxDocs=44218)
              0.043984205 = queryNorm
            0.2090349 = fieldWeight in 535, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.72992 = idf(docFreq=1060, maxDocs=44218)
              0.03125 = fieldNorm(doc=535)
      0.33333334 = coord(2/6)
    
    Abstract
    Literature-based discovery (LBD) research aims at finding effective computational methods for predicting previously unknown connections between clusters of research papers from disparate research areas. Existing methods encompass two general approaches. The first approach searches for these unknown connections by examining the textual contents of research papers. In addition to the existing textual features, the second approach incorporates structural features of scientific literatures, such as citation structures. These approaches, however, have not considered research papers' latent bibliographic metadata structures as important features that can be used for predicting previously unknown relationships between them. This thesis investigates a new graph-based LBD method that exploits the latent bibliographic metadata connections between pairs of research papers. The heterogeneous bibliographic information network is proposed as an efficient graph-based data structure for modeling the complex relationships between these metadata. In contrast to previous approaches, this method seamlessly combines textual and citation information in the form of pathbased metadata features for predicting future co-citation links between research papers from disparate research fields. The results reported in this thesis provide evidence that the method is effective for reconstructing the historical literature-based discovery hypotheses. This thesis also investigates the effects of semantic modeling and topic modeling on the performance of the proposed method. For semantic modeling, a general-purpose word sense disambiguation technique is proposed to reduce the lexical ambiguity in the title and abstract of research papers. The experimental results suggest that the reduced lexical ambiguity did not necessarily lead to a better performance of the method. This thesis discusses some of the possible contributing factors to these results. Finally, topic modeling is used for learning the latent topical relations between research papers. The learned topic model is incorporated into the heterogeneous bibliographic information network graph and allows new predictive features to be learned. The results in this thesis suggest that topic modeling improves the performance of the proposed method by increasing the overall accuracy for predicting the future co-citation links between disparate research papers.
    Footnote
    A thesis submitted in ful llment of the requirements for the degree of Doctor of Philosophy Monash University, Faculty of Information Technology.
  2. Järvelin, K.; Kristensen, J.; Niemi, T.; Sormunen, E.; Keskustalo, H.: ¬A deductive data model for query expansion (1996) 0.01
    0.00895443 = product of:
      0.026863288 = sum of:
        0.0089855315 = weight(_text_:information in 2230) [ClassicSimilarity], result of:
          0.0089855315 = score(doc=2230,freq=2.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.116372846 = fieldWeight in 2230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2230)
        0.017877758 = product of:
          0.035755515 = sum of:
            0.035755515 = weight(_text_:22 in 2230) [ClassicSimilarity], result of:
              0.035755515 = score(doc=2230,freq=2.0), product of:
                0.1540252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043984205 = queryNorm
                0.23214069 = fieldWeight in 2230, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2230)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Source
    Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '96), Zürich, Switzerland, August 18-22, 1996. Eds.: H.P. Frei et al
  3. Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.01
    0.008418022 = product of:
      0.025254063 = sum of:
        0.014825371 = weight(_text_:information in 1633) [ClassicSimilarity], result of:
          0.014825371 = score(doc=1633,freq=16.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.1920054 = fieldWeight in 1633, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1633)
        0.010428692 = product of:
          0.020857384 = sum of:
            0.020857384 = weight(_text_:22 in 1633) [ClassicSimilarity], result of:
              0.020857384 = score(doc=1633,freq=2.0), product of:
                0.1540252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043984205 = queryNorm
                0.1354154 = fieldWeight in 1633, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1633)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Purpose - The purpose of this paper is to improve the conceptual-based search by incorporating structural ontological information such as concepts and relations. Generally, Semantic-based information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms and the performance of semantic information retrieval is carried out through standard measures-precision and recall. Higher precision leads to the (meaningful) relevant documents obtained and lower recall leads to the less coverage of the concepts. Design/methodology/approach - In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al., by incorporating sibling information to the index. The index designed by Kohler et al., contains only super and sub-concepts from the ontology. In addition, in our approach, we focus on two tasks; query expansion and ranking of the expanded queries, to improve the efficiency of the ontology-based search. The aforementioned tasks make use of ontological concepts, and relations existing between those concepts so as to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of concepts that are being populated in the index. Here, we introduce a new measure called index enhancement measure, to estimate the coverage of ontological concepts being indexed. We have evaluated the ontology-based search for the tourism domain with the tourism documents and tourism-specific ontology. The comparison of search results based on the use of ontology "with and without query expansion" is examined to estimate the efficiency of the proposed query expansion task. The ranking is compared with the ORank system to evaluate the performance of our ontology-based search. From these analyses, the ontology-based search results shows better recall when compared to the other concept-based search systems. The mean average precision of the ontology-based search is found to be 0.79 and the recall is found to be 0.65, the ORank system has the mean average precision of 0.62 and the recall is found to be 0.51, while the concept-based search has the mean average precision of 0.56 and the recall is found to be 0.42. Practical implications - When the concept is not present in the domain-specific ontology, the concept cannot be indexed. When the given query term is not available in the ontology then the term-based results are retrieved. Originality/value - In addition to super and sub-concepts, we incorporate the concepts present in same level (siblings) to the ontological index. The structural information from the ontology is determined for the query expansion. The ranking of the documents depends on the type of the query (single concept query, multiple concept queries and concept with relation queries) and the ontological relations that exists in the query and the documents. With this ontological structural information, the search results showed us better coverage of concepts with respect to the query.
    Date
    20. 1.2015 18:30:22
    Source
    Aslib journal of information management. 66(2014) no.6, S.678-696
  4. Baofu, P.: ¬The future of information architecture : conceiving a better way to understand taxonomy, network, and intelligence (2008) 0.00
    0.004991962 = product of:
      0.029951772 = sum of:
        0.029951772 = weight(_text_:information in 2257) [ClassicSimilarity], result of:
          0.029951772 = score(doc=2257,freq=32.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.38790947 = fieldWeight in 2257, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2257)
      0.16666667 = coord(1/6)
    
    Abstract
    The Future of Information Architecture examines issues surrounding why information is processed, stored and applied in the way that it has, since time immemorial. Contrary to the conventional wisdom held by many scholars in human history, the recurrent debate on the explanation of the most basic categories of information (eg space, time causation, quality, quantity) has been misconstrued, to the effect that there exists some deeper categories and principles behind these categories of information - with enormous implications for our understanding of reality in general. To understand this, the book is organised in to four main parts: Part I begins with the vital question concerning the role of information within the context of the larger theoretical debate in the literature. Part II provides a critical examination of the nature of data taxonomy from the main perspectives of culture, society, nature and the mind. Part III constructively invesitgates the world of information network from the main perspectives of culture, society, nature and the mind. Part IV proposes six main theses in the authors synthetic theory of information architecture, namely, (a) the first thesis on the simpleness-complicatedness principle, (b) the second thesis on the exactness-vagueness principle (c) the third thesis on the slowness-quickness principle (d) the fourth thesis on the order-chaos principle, (e) the fifth thesis on the symmetry-asymmetry principle, and (f) the sixth thesis on the post-human stage.
    LCSH
    Information resources
    Information organization
    Information storage and retrieval systems
    RSWK
    Suchmaschine / Information Retrieval
    Subject
    Information resources
    Information organization
    Information storage and retrieval systems
    Suchmaschine / Information Retrieval
  5. Atanassova, I.; Bertin, M.: Semantic facets for scientific information retrieval (2014) 0.00
    0.0034943735 = product of:
      0.020966241 = sum of:
        0.020966241 = weight(_text_:information in 4471) [ClassicSimilarity], result of:
          0.020966241 = score(doc=4471,freq=8.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.27153665 = fieldWeight in 4471, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4471)
      0.16666667 = coord(1/6)
    
    Abstract
    We present an Information Retrieval System for scientific publications that provides the possibility to filter results according to semantic facets. We use sentence-level semantic annotations that identify specific semantic relations in texts, such as methods, definitions, hypotheses, that correspond to common information needs related to scientific literature. The semantic annotations are obtained using a rule-based method that identifies linguistic clues organized into a linguistic ontology. The system is implemented using Solr Search Server and offers efficient search and navigation in scientific papers.
    Series
    Communications in computer and information science; vol.475
  6. Calegari, S.; Sanchez, E.: Object-fuzzy concept network : an enrichment of ontologies in semantic information retrieval (2008) 0.00
    0.0021615832 = product of:
      0.0129694985 = sum of:
        0.0129694985 = weight(_text_:information in 2393) [ClassicSimilarity], result of:
          0.0129694985 = score(doc=2393,freq=6.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.16796975 = fieldWeight in 2393, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2393)
      0.16666667 = coord(1/6)
    
    Abstract
    This article shows how a fuzzy ontology-based approach can improve semantic documents retrieval. After formally defining a fuzzy ontology and a fuzzy knowledge base, a special type of new fuzzy relationship called (semantic) correlation, which links the concepts or entities in a fuzzy ontology, is discussed. These correlations, first assigned by experts, are updated after querying or when a document has been inserted into a database. Moreover, in order to define a dynamic knowledge of a domain adapting itself to the context, it is shown how to handle a tradeoff between the correct definition of an object, taken in the ontology structure, and the actual meaning assigned by individuals. The notion of a fuzzy concept network is extended, incorporating database objects so that entities and documents can similarly be represented in the network. Information retrieval (IR) algorithm, using an object-fuzzy concept network (O-FCN), is introduced and described. This algorithm allows us to derive a unique path among the entities involved in the query to obtain maxima semantic associations in the knowledge domain. Finally, the study has been validated by querying a database using fuzzy recall, fuzzy precision, and coefficient variant measures in the crisp and fuzzy cases.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.13, S.2171-2185
  7. Drexel, G.: Knowledge engineering for intelligent information retrieval (2001) 0.00
    0.0021179102 = product of:
      0.012707461 = sum of:
        0.012707461 = weight(_text_:information in 4043) [ClassicSimilarity], result of:
          0.012707461 = score(doc=4043,freq=4.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.16457605 = fieldWeight in 4043, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4043)
      0.16666667 = coord(1/6)
    
    Abstract
    This paper presents a clustered approach to designing an overall ontological model together with a general rule-based component that serves as a mapping device. By observational criteria, a multi-lingual team of experts excerpts concepts from general communication in the media. The team, then, finds equivalent expressions in English, German, French, and Spanish. On the basis of a set of ontological and lexical relations, a conceptual network is built up. Concepts are thought to be universal. Objects unique in time and space are identified by names and will be explained by the universals as their instances. Our approach relies on multi-relational descriptions of concepts. It provides a powerful tool for documentation and conceptual language learning. First and foremost, our multi-lingual, polyhierarchical ontology fills the gap of semantically-based information retrieval by generating enhanced and improved queries for internet search
  8. Mäkelä, E.; Hyvönen, E.; Saarela, S.; Vilfanen, K.: Application of ontology techniques to view-based semantic serach and browsing (2012) 0.00
    0.0021179102 = product of:
      0.012707461 = sum of:
        0.012707461 = weight(_text_:information in 3264) [ClassicSimilarity], result of:
          0.012707461 = score(doc=3264,freq=4.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.16457605 = fieldWeight in 3264, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3264)
      0.16666667 = coord(1/6)
    
    Abstract
    We scho how the beenfits of the view-based search method, developed within the information retrieval community, can be extended with ontology-based search, developed within the Semantic Web community, and with semantic recommendations. As a proof of the concept, we have implemented an ontology-and view-based search engine and recommendations system Ontogaotr for RDF(S) repositories. Ontogator is innovative in two ways. Firstly, the RDFS.based ontologies used for annotating metadata are used in the user interface to facilitate view-based information retrieval. The views provide the user with an overview of the repositorys contents and a vocabulary for expressing search queries. Secondlyy, a semantic browsing function is provided by a recommender system. This system enriches instance level metadata by ontologies and provides the user with links to semantically related relevant resources. The semantic linkage is specified in terms of logical rules. To illustrate and discuss the ideas, a deployed application of Ontogator to a photo repository of the Helsinki University Museum is presented.
  9. Koopman, B.; Zuccon, G.; Bruza, P.; Sitbon, L.; Lawley, M.: Information retrieval as semantic inference : a graph Inference model applied to medical search (2016) 0.00
    0.001996785 = product of:
      0.011980709 = sum of:
        0.011980709 = weight(_text_:information in 3260) [ClassicSimilarity], result of:
          0.011980709 = score(doc=3260,freq=8.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.1551638 = fieldWeight in 3260, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3260)
      0.16666667 = coord(1/6)
    
    Abstract
    This paper presents a Graph Inference retrieval model that integrates structured knowledge resources, statistical information retrieval methods and inference in a unified framework. Key components of the model are a graph-based representation of the corpus and retrieval driven by an inference mechanism achieved as a traversal over the graph. The model is proposed to tackle the semantic gap problem-the mismatch between the raw data and the way a human being interprets it. We break down the semantic gap problem into five core issues, each requiring a specific type of inference in order to be overcome. Our model and evaluation is applied to the medical domain because search within this domain is particularly challenging and, as we show, often requires inference. In addition, this domain features both structured knowledge resources as well as unstructured text. Our evaluation shows that inference can be effective, retrieving many new relevant documents that are not retrieved by state-of-the-art information retrieval models. We show that many retrieved documents were not pooled by keyword-based search methods, prompting us to perform additional relevance assessment on these new documents. A third of the newly retrieved documents judged were found to be relevant. Our analysis provides a thorough understanding of when and how to apply inference for retrieval, including a categorisation of queries according to the effect of inference. The inference mechanism promoted recall by retrieving new relevant documents not found by previous keyword-based approaches. In addition, it promoted precision by an effective reranking of documents. When inference is used, performance gains can generally be expected on hard queries. However, inference should not be applied universally: for easy, unambiguous queries and queries with few relevant documents, inference did adversely affect effectiveness. These conclusions reflect the fact that for retrieval as inference to be effective, a careful balancing act is involved. Finally, although the Graph Inference model is developed and applied to medical search, it is a general retrieval model applicable to other areas such as web search, where an emerging research trend is to utilise structured knowledge resources for more effective semantic search.
    Source
    Information Retrieval Journal. 19(2016) no.1, S.6-37
  10. Jansen, B.; Browne, G.M.: Navigating information spaces : index / mind map / topic map? (2021) 0.00
    0.001996785 = product of:
      0.011980709 = sum of:
        0.011980709 = weight(_text_:information in 436) [ClassicSimilarity], result of:
          0.011980709 = score(doc=436,freq=2.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.1551638 = fieldWeight in 436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=436)
      0.16666667 = coord(1/6)
    
  11. Smith, D.A.; Shadbolt, N.R.: FacetOntology : expressive descriptions of facets in the Semantic Web (2012) 0.00
    0.0017649251 = product of:
      0.01058955 = sum of:
        0.01058955 = weight(_text_:information in 2208) [ClassicSimilarity], result of:
          0.01058955 = score(doc=2208,freq=4.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.13714671 = fieldWeight in 2208, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2208)
      0.16666667 = coord(1/6)
    
    Abstract
    The formal structure of the information on the Semantic Web lends itself to faceted browsing, an information retrieval method where users can filter results based on the values of properties ("facets"). Numerous faceted browsers have been created to browse RDF and Linked Data, but these systems use their own ontologies for defining how data is queried to populate their facets. Since the source data is the same format across these systems (specifically, RDF), we can unify the different methods of describing how to quer the underlying data, to enable compatibility across systems, and provide an extensible base ontology for future systems. To this end, we present FacetOntology, an ontology that defines how to query data to form a faceted browser, and a number of transformations and filters that can be applied to data before it is shown to users. FacetOntology overcomes limitations in the expressivity of existing work, by enabling the full expressivity of SPARQL when selecting data for facets. By applying a FacetOntology definition to data, a set of facets are specified, each with queries and filters to source RDF data, which enables faceted browsing systems to be created using that RDF data.
  12. Vallet, D.; Fernández, M.; Castells, P.: ¬An ontology-based information retrieval model (2005) 0.00
    0.0014975886 = product of:
      0.0089855315 = sum of:
        0.0089855315 = weight(_text_:information in 4708) [ClassicSimilarity], result of:
          0.0089855315 = score(doc=4708,freq=2.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.116372846 = fieldWeight in 4708, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4708)
      0.16666667 = coord(1/6)
    
  13. Wang, Y.-H.; Jhuo, P.-S.: ¬A semantic faceted search with rule-based inference (2009) 0.00
    0.0014975886 = product of:
      0.0089855315 = sum of:
        0.0089855315 = weight(_text_:information in 540) [ClassicSimilarity], result of:
          0.0089855315 = score(doc=540,freq=2.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.116372846 = fieldWeight in 540, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=540)
      0.16666667 = coord(1/6)
    
    Abstract
    Semantic Search has become an active research of Semantic Web in recent years. The classification methodology plays a pretty critical role in the beginning of search process to disambiguate irrelevant information. However, the applications related to Folksonomy suffer from many obstacles. This study attempts to eliminate the problems resulted from Folksonomy using existing semantic technology. We also focus on how to effectively integrate heterogeneous ontologies over the Internet to acquire the integrity of domain knowledge. A faceted logic layer is abstracted in order to strengthen category framework and organize existing available ontologies according to a series of steps based on the methodology of faceted classification and ontology construction. The result showed that our approach can facilitate the integration of inconsistent or even heterogeneous ontologies. This paper also generalizes the principles of picking appropriate facets with which our facet browser completely complies so that better semantic search result can be obtained.
  14. Cao, N.; Sun, J.; Lin, Y.-R.; Gotz, D.; Liu, S.; Qu, H.: FacetAtlas : Multifaceted visualization for rich text corpora (2010) 0.00
    0.0012479905 = product of:
      0.007487943 = sum of:
        0.007487943 = weight(_text_:information in 3366) [ClassicSimilarity], result of:
          0.007487943 = score(doc=3366,freq=2.0), product of:
            0.0772133 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.043984205 = queryNorm
            0.09697737 = fieldWeight in 3366, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3366)
      0.16666667 = coord(1/6)
    
    Abstract
    Documents in rich text corpora usually contain multiple facets of information. For example, an article about a specific disease often consists of different facets such as symptom, treatment, cause, diagnosis, prognosis, and prevention. Thus, documents may have different relations based on different facets. Powerful search tools have been developed to help users locate lists of individual documents that are most related to specific keywords. However, there is a lack of effective analysis tools that reveal the multifaceted relations of documents within or cross the document clusters. In this paper, we present FacetAtlas, a multifaceted visualization technique for visually analyzing rich text corpora. FacetAtlas combines search technology with advanced visual analytical tools to convey both global and local patterns simultaneously. We describe several unique aspects of FacetAtlas, including (1) node cliques and multifaceted edges, (2) an optimized density map, and (3) automated opacity pattern enhancement for highlighting visual patterns, (4) interactive context switch between facets. In addition, we demonstrate the power of FacetAtlas through a case study that targets patient education in the health care domain. Our evaluation shows the benefits of this work, especially in support of complex multifaceted data analysis.