Search (124 results, page 2 of 7)

  • theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Mlodzka-Stybel, A.: Towards continuous improvement of users' access to a library catalogue (2014) 0.01
    0.011348485 = product of:
      0.04539394 = sum of:
        0.028138565 = weight(_text_:data in 1466) [ClassicSimilarity], result of:
          0.028138565 = score(doc=1466,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24455236 = fieldWeight in 1466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1466)
        0.017255375 = product of:
          0.03451075 = sum of:
            0.03451075 = weight(_text_:22 in 1466) [ClassicSimilarity], result of:
              0.03451075 = score(doc=1466,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.2708308 = fieldWeight in 1466, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1466)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The paper discusses the issue of increasing users' access to library records by their publication in Google. Data from the records, converted into html format, have been indexed by Google. The process covered basic formal description fields of the records, description of the content, supported with a thesaurus, as well as an abstract, if present in the record. In addition to monitoring the end users' statistics, the pilot testing covered visibility of library records in Google search results.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
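The scoring tree for this hit can be reproduced by hand. The following is a minimal Python sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any library; all inputs are copied from the explanation above. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term clause: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the explain output for doc 1466.
QUERY_NORM = 0.03638826
MAX_DOCS = 44218
FIELD_NORM = 0.0546875

data_score = term_score(2.0, 5088, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 sub-clauses matched.
term_22_score = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# 2 of the 8 top-level clauses matched, hence coord(2/8) = 0.25.
doc_1466_score = (data_score + term_22_score) * 0.25

print(f"data clause: {data_score:.9f}")
print(f"22 clause:   {term_22_score:.9f}")
print(f"doc 1466:    {doc_1466_score:.9f}")
```

The result should agree with the reported 0.011348485 to within floating-point rounding.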
  2. Quiroga, L.M.; Mostafa, J.: An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.01
    0.0112238005 = product of:
      0.044895202 = sum of:
        0.028424243 = weight(_text_:data in 2579) [ClassicSimilarity], result of:
          0.028424243 = score(doc=2579,freq=4.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24703519 = fieldWeight in 2579, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2579)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 2579) [ClassicSimilarity], result of:
              0.03294192 = score(doc=2579,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 2579, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2579)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    An experiment was conducted to see how relevance feedback could be used to build and adjust profiles to improve the performance of filtering systems. Data was collected during the system interaction of 18 graduate students with SIFTER (Smart Information Filtering Technology for Electronic Resources), a filtering system that ranks incoming information based on users' profiles. The data set came from a collection of 6000 records concerning consumer health. In the first phase of the study, three different modes of profile acquisition were compared. The explicit mode allowed users to directly specify the profile; the implicit mode utilized relevance feedback to create and refine the profile; and the combined mode allowed users to initialize the profile and to continuously refine it using relevance feedback. Filtering performance, measured in terms of Normalized Precision, showed that the three approaches were significantly different (α = 0.05 and p = 0.012). The explicit mode of profile acquisition consistently produced superior results. Exclusive reliance on relevance feedback in the implicit mode resulted in inferior performance. The low performance obtained by the implicit acquisition mode motivated the second phase of the study, which aimed to clarify the role of context in relevance feedback judgments. An inductive content analysis of thinking aloud protocols showed dimensions that were highly situational, establishing the importance context plays in feedback relevance assessments. Results suggest the need for better representation of documents, profiles, and relevance feedback mechanisms that incorporate dimensions identified in this research.
    Source
    Information processing and management. 38(2002) no.5, pp.671-694
  3. Melucci, M.: Contextual search : a computational framework (2012) 0.01
    0.0112238005 = product of:
      0.044895202 = sum of:
        0.028424243 = weight(_text_:data in 4913) [ClassicSimilarity], result of:
          0.028424243 = score(doc=4913,freq=4.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24703519 = fieldWeight in 4913, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4913)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 4913) [ClassicSimilarity], result of:
              0.03294192 = score(doc=4913,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 4913, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4913)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The growing availability of data in electronic form, the expansion of the World Wide Web and the accessibility of computational methods for large-scale data processing have allowed researchers in Information Retrieval (IR) to design systems which can effectively and efficiently constrain search within the boundaries given by context, thus transforming classical search into contextual search. Contextual Search: A Computational Framework introduces contextual search within a computational framework based on contextual variables, contextual factors and statistical models. It describes how statistical models can process contextual variables to infer the contextual factors underlying the current search context. It also provides background to the subject by: placing it among other surveys on relevance, interaction, context, and behaviour; providing a description of the contextual variables used for implementing the statistical models which represent and predict relevance and contextual factors; and providing an overview of the evaluation methodologies and findings relevant to this subject. Contextual Search: A Computational Framework is a highly recommended read, both for beginners who are embarking on research in this area and as a useful reference for established IR researchers.
  4. Hendahewa, C.; Shah, C.: Implicit search feature based approach to assist users in exploratory search tasks (2015) 0.01
    0.0112238005 = product of:
      0.044895202 = sum of:
        0.028424243 = weight(_text_:data in 2678) [ClassicSimilarity], result of:
          0.028424243 = score(doc=2678,freq=4.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24703519 = fieldWeight in 2678, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2678)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 2678) [ClassicSimilarity], result of:
              0.03294192 = score(doc=2678,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 2678, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2678)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Analyzing and modeling users' online search behaviors when conducting exploratory search tasks could be instrumental in discovering search behavior patterns that can then be leveraged to assist users in reaching their search task goals. We propose a framework for evaluating exploratory search based on implicit features and user search action sequences extracted from the transactional log data to model different aspects of exploratory search, namely uncertainty, creativity, exploration, and knowledge discovery. We show the effectiveness of the proposed framework by demonstrating how it can be used to understand and evaluate user search performance and thereby make meaningful recommendations to improve the overall search performance of users. We used data collected from a user study consisting of 18 users conducting an exploratory search task for two sessions with two different topics in the experimental analysis. With this analysis we show that we can effectively model their behavior using implicit features to predict the user's future performance level with above 70% accuracy in most cases. Further, using simulations we demonstrate that our search process based recommendations improve the search performance of low performing users over time and validate these findings using both qualitative and quantitative approaches.
    Source
    Information processing and management. 51(2015) no.5, pp.643-661
  5. Lin, J.; DiCuccio, M.; Grigoryan, V.; Wilbur, W.J.: Navigating information spaces : a case study of related article search in PubMed (2008) 0.01
    0.01097098 = product of:
      0.04388392 = sum of:
        0.02411877 = weight(_text_:data in 2124) [ClassicSimilarity], result of:
          0.02411877 = score(doc=2124,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.2096163 = fieldWeight in 2124, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2124)
        0.01976515 = product of:
          0.0395303 = sum of:
            0.0395303 = weight(_text_:processing in 2124) [ClassicSimilarity], result of:
              0.0395303 = score(doc=2124,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.26835677 = fieldWeight in 2124, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2124)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The concept of an "information space" provides a powerful metaphor for guiding the design of interactive retrieval systems. We present a case study of related article search, a browsing tool designed to help users navigate the information space defined by results of the PubMed® search engine. This feature leverages content-similarity links that tie MEDLINE® citations together in a vast document network. We examine the effectiveness of related article search from two perspectives: a topological analysis of networks generated from information needs represented in the TREC 2005 genomics track and a query log analysis of real PubMed users. Together, data suggest that related article search is a useful feature and that browsing related articles has become an integral part of how users interact with PubMed.
    Source
    Information processing and management. 44(2008) no.5, pp.1771-1783
  6. Blanco, R.; Matthews, M.; Mika, P.: Ranking of daily deals with concept expansion (2015) 0.01
    0.01097098 = product of:
      0.04388392 = sum of:
        0.02411877 = weight(_text_:data in 2663) [ClassicSimilarity], result of:
          0.02411877 = score(doc=2663,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.2096163 = fieldWeight in 2663, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2663)
        0.01976515 = product of:
          0.0395303 = sum of:
            0.0395303 = weight(_text_:processing in 2663) [ClassicSimilarity], result of:
              0.0395303 = score(doc=2663,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.26835677 = fieldWeight in 2663, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2663)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Daily deals have emerged in the last three years as a successful form of online advertising. The downside of this success is that users are increasingly overloaded by the many thousands of deals offered each day by dozens of deal providers and aggregators. The challenge is thus offering the right deals to the right users, i.e., the relevance ranking of deals. This is the problem we address in our paper. Exploiting the characteristics of deals data, we propose a combination of a term- and a concept-based retrieval model that closes the semantic gap between queries and documents by expanding both of them with category information. The method consistently outperforms state-of-the-art methods based on term-matching alone and existing approaches for ad classification and ranking.
    Source
    Information processing and management. 51(2015) no.4, pp.359-372
  7. Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.01
    0.010187378 = product of:
      0.040749513 = sum of:
        0.028424243 = weight(_text_:data in 56) [ClassicSimilarity], result of:
          0.028424243 = score(doc=56,freq=4.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24703519 = fieldWeight in 56, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=56)
        0.012325268 = product of:
          0.024650536 = sum of:
            0.024650536 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
              0.024650536 = score(doc=56,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.19345059 = fieldWeight in 56, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=56)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.
    Date
    22. 7.2006 16:32:43
  8. Kim, H.H.: Toward video semantic search based on a structured folksonomy (2011) 0.01
    0.009804734 = product of:
      0.07843787 = sum of:
        0.07843787 = weight(_text_:higher in 4350) [ClassicSimilarity], result of:
          0.07843787 = score(doc=4350,freq=4.0), product of:
            0.19113865 = queryWeight, product of:
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03638826 = queryNorm
            0.41037157 = fieldWeight in 4350, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4350)
      0.125 = coord(1/8)
    
    Abstract
    This study investigated the effectiveness of query expansion using synonymous and co-occurrence tags in users' video searches as well as the effect of visual storyboard surrogates on users' relevance judgments when browsing videos. To do so, we designed a structured folksonomy-based system in which tag queries can be expanded via synonyms or co-occurrence words, based on the use of WordNet 2.1 synonyms and Flickr's related tags. To evaluate the structured folksonomy-based system, we conducted an experiment, the results of which suggest that the mean recall rate in the structured folksonomy-based system is statistically higher than that in a tag-based system without query expansion; however, the mean precision rate in the structured folksonomy-based system is not statistically higher than that in the tag-based system. Next, we compared the precision rates of the proposed system with storyboards (SB), in which SB and text metadata are shown to users when they browse video search results, with those of the proposed system without SB, in which only text metadata are shown. Our result showed that browsing only text surrogates, including tags without multimedia surrogates, is not sufficient for users' relevance judgments.
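This hit matches only the term "higher", which is much rarer in the index than "data" and therefore carries a larger idf. As a rough illustration, the sketch below recomputes the idf values shown on this page from their docFreq and maxDocs figures, assuming the ClassicSimilarity formula idf = 1 + ln(maxDocs / (docFreq + 1)). The dictionary and function names are hypothetical.

```python
import math

MAX_DOCS = 44218
# Document frequencies copied from the explain output on this page.
DOC_FREQS = {"data": 5088, "22": 3622, "processing": 2097, "higher": 628}

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as defined by Lucene's ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

idf_by_term = {term: classic_idf(df, MAX_DOCS) for term, df in DOC_FREQS.items()}

# Print terms from most common (lowest idf) to rarest (highest idf).
for term, value in sorted(idf_by_term.items(), key=lambda kv: kv[1]):
    print(f"{term:<12}docFreq={DOC_FREQS[term]:<6}idf={value:.6f}")
```

Because idf enters both queryWeight and fieldWeight, a term's contribution grows with the square of its idf.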
  9. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.009727273 = product of:
      0.038909093 = sum of:
        0.02411877 = weight(_text_:data in 2419) [ClassicSimilarity], result of:
          0.02411877 = score(doc=2419,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.2096163 = fieldWeight in 2419, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
        0.014790321 = product of:
          0.029580642 = sum of:
            0.029580642 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.029580642 = score(doc=2419,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection over information seeking to the representation, organisation and reuse of information. By embedding high level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil followed by a qualitative evaluation. The evaluation has been conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
  10. Kruschwitz, U.; Al-Bakour, H.: Users want more sophisticated search assistants : results of a task-based evaluation (2005) 0.01
    0.009142484 = product of:
      0.036569934 = sum of:
        0.020098975 = weight(_text_:data in 4575) [ClassicSimilarity], result of:
          0.020098975 = score(doc=4575,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.17468026 = fieldWeight in 4575, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4575)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 4575) [ClassicSimilarity], result of:
              0.03294192 = score(doc=4575,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 4575, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4575)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The Web provides a massive knowledge source, as do intranets and other electronic document collections. However, much of that knowledge is encoded implicitly and cannot be applied directly without processing into some more appropriate structures. Searching, browsing, question answering, for example, could all benefit from domain-specific knowledge contained in the documents, and in applications such as simple search we do not actually need very "deep" knowledge structures such as ontologies, but we can get a long way with a model of the domain that consists of term hierarchies. We combine domain knowledge automatically acquired by exploiting the documents' markup structure with knowledge extracted on the fly to assist a user with ad hoc search requests. Such a search system can suggest query modification options derived from the actual data and thus guide a user through the space of documents. This article gives a detailed account of a task-based evaluation that compares a search system that uses the outlined domain knowledge with a standard search system. We found that users do use the query modification suggestions proposed by the system. The main conclusion we can draw from this evaluation, however, is that users prefer a system that can suggest query modifications over a standard search engine, which simply presents a ranked list of documents. Most interestingly, we observe this user preference despite the fact that the baseline system even performs slightly better under certain criteria.
  11. Wongthongtham, P.; Abu-Salih, B.: Ontology-based approach for semantic data extraction from social big data : state-of-the-art and research directions (2018) 0.01
    0.008527273 = product of:
      0.06821819 = sum of:
        0.06821819 = weight(_text_:data in 4097) [ClassicSimilarity], result of:
          0.06821819 = score(doc=4097,freq=16.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.5928845 = fieldWeight in 4097, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4097)
      0.125 = coord(1/8)
    
    Abstract
    The challenge of managing and extracting useful knowledge from social media data sources has attracted much attention from academia and industry. To address this challenge, this paper focuses on semantic analysis of textual data. We propose an ontology-based approach to extract semantics of textual data and define the domain of data. In other words, we semantically analyse the social data at two levels, i.e. the entity level and the domain level. We have chosen Twitter as the social channel for the purpose of a proof of concept. Domain knowledge is captured in ontologies which are then used to enrich the semantics of tweets provided with specific semantic conceptual representation of entities that appear in the tweets. Case studies are used to demonstrate this approach. We experiment and evaluate our proposed approach with a public dataset collected from Twitter and from the politics domain. The ontology-based approach leverages entity extraction and concept mappings in terms of quantity and accuracy of concept identification.
    Theme
    Data Mining
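Although "data" occurs 16 times in this record, the hit ranks below records in which it occurs only twice, because only one of the eight query clauses matched and the clause score is scaled by coord(1/8). The sketch below illustrates this with the figures from the explanation above; it assumes the ClassicSimilarity definitions tf = sqrt(freq) and score = queryWeight * fieldWeight, and the helper name is hypothetical.

```python
import math

# Values copied from the explain output for the term "data".
QUERY_NORM = 0.03638826
IDF_DATA = 3.1620505  # idf(docFreq=5088, maxDocs=44218)

def data_clause_score(freq: float, field_norm: float) -> float:
    """queryWeight * fieldWeight for the term 'data'."""
    query_weight = IDF_DATA * QUERY_NORM
    field_weight = math.sqrt(freq) * IDF_DATA * field_norm
    return query_weight * field_weight

# Doc 4097: freq=16, fieldNorm=0.046875, one of eight clauses matched.
raw_4097 = data_clause_score(16.0, 0.046875)
final_4097 = raw_4097 * (1 / 8)  # coord(1/8)

# Top hit on this page (doc 1466): clause sum 0.04539394, coord(2/8).
final_1466 = 0.04539394 * (2 / 8)

print(f"doc 4097 before coord: {raw_4097:.8f}")
print(f"doc 4097 after coord:  {final_4097:.9f}")
print(f"doc 1466 after coord:  {final_1466:.9f}")
```

Before the coord factor is applied, doc 4097 has the larger clause sum; after it, doc 1466 comes out ahead.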
  12. Berry, M.W.; Dumais, S.T.; O'Brien, G.W.: Using linear algebra for intelligent information retrieval (1995) 0.01
    0.008319592 = product of:
      0.06655674 = sum of:
        0.06655674 = weight(_text_:higher in 2206) [ClassicSimilarity], result of:
          0.06655674 = score(doc=2206,freq=2.0), product of:
            0.19113865 = queryWeight, product of:
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03638826 = queryNorm
            0.34821182 = fieldWeight in 2206, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.046875 = fieldNorm(doc=2206)
      0.125 = coord(1/8)
    
    Abstract
    Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. Because of the tremendous diversity in the words people use to describe the same document, lexical methods are necessarily incomplete and imprecise. Using the singular value decomposition (SVD), one can take advantage of the implicit higher-order structure in the association of terms with documents by determining the SVD of large sparse term by document matrices. Terms and documents represented by 200-300 of the largest singular vectors are then matched against user queries. We call this retrieval method Latent Semantic Indexing (LSI) because the subspace represents important associative relationships between terms and documents that are not evident in individual documents. LSI is a completely automatic yet intelligent indexing method, widely applicable, and a promising way to improve users...
  13. Semantic search over the Web (2012) 0.01
    0.008287019 = product of:
      0.06629615 = sum of:
        0.06629615 = weight(_text_:data in 411) [ClassicSimilarity], result of:
          0.06629615 = score(doc=411,freq=34.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.5761801 = fieldWeight in 411, product of:
              5.8309517 = tf(freq=34.0), with freq of:
                34.0 = termFreq=34.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=411)
      0.125 = coord(1/8)
    
    Abstract
    The Web has become the world's largest database, with search being the main tool that allows organizations and individuals to exploit its huge amount of information. Search on the Web has been traditionally based on textual and structural similarities, ignoring to a large degree the semantic dimension, i.e., understanding the meaning of the query and of the document content. Combining search and semantics gives birth to the idea of semantic search. Traditional search engines have already advertised some semantic dimensions. Some of them, for instance, can enhance their generated result sets with documents that are semantically related to the query terms even though they may not include these terms. Nevertheless, the exploitation of the semantic search has not yet reached its full potential. In this book, Roberto De Virgilio, Francesco Guerra and Yannis Velegrakis present an extensive overview of the work done in Semantic Search and other related areas. They explore different technologies and solutions in depth, making their collection a valuable and stimulating read for both academic and industrial researchers. The book is divided into three parts. The first introduces the readers to the basic notions of the Web of Data. It describes the different kinds of data that exist, their topology, and their storing and indexing techniques. The second part is dedicated to Web Search. It presents different types of search, like the exploratory or the path-oriented, alongside methods for their efficient and effective implementation. Other related topics included in this part are the use of uncertainty in query answering, the exploitation of ontologies, and the use of semantics in mashup design and operation. The focus of the third part is on linked data, and more specifically, on applying ideas originating in recommender systems to linked data management, and on techniques for efficient query answering on linked data.
    Content
    Contents: Introduction.- Part I Introduction to Web of Data.- Topology of the Web of Data.- Storing and Indexing Massive RDF Data Sets.- Designing Exploratory Search Applications upon Web Data Sources.- Part II Search over the Web.- Path-oriented Keyword Search query over RDF.- Interactive Query Construction for Keyword Search on the Semantic Web.- Understanding the Semantics of Keyword Queries on Relational Data Without Accessing the Instance.- Keyword-Based Search over Semantic Data.- Semantic Link Discovery over Relational Data.- Embracing Uncertainty in Entity Linking.- The Return of the Entity-Relationship Model: Ontological Query Answering.- Linked Data Services and Semantics-enabled Mashup.- Part III Linked Data Search engines.- A Recommender System for Linked Data.- Flint: from Web Pages to Probabilistic Semantic Data.- Searching and Browsing Linked Data with SWSE.
    Series
    Data-centric systems and applications
  14. Smith, D.A.; Shadbolt, N.R.: FacetOntology : expressive descriptions of facets in the Semantic Web (2012) 0.01
    0.007944818 = product of:
      0.06355854 = sum of:
        0.06355854 = weight(_text_:data in 2208) [ClassicSimilarity], result of:
          0.06355854 = score(doc=2208,freq=20.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.5523875 = fieldWeight in 2208, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2208)
      0.125 = coord(1/8)
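
The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library, and the inputs are copied from the listing for doc 2208.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # Hypothetical helper mirroring one weight(...) node of the explain tree
    tf = math.sqrt(freq)                   # tf(freq=20.0) in the listing
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq=5088, maxDocs=44218)
    query_weight = idf * query_norm        # queryWeight
    field_weight = tf * idf * field_norm   # fieldWeight
    return query_weight * field_weight

# Values taken from the explain output for the term "data" in doc 2208
term_score = classic_term_score(
    freq=20.0, doc_freq=5088, max_docs=44218,
    query_norm=0.03638826, field_norm=0.0390625,
)
# coord(1/8): one of the eight query clauses matched
final_score = term_score * (1 / 8)

print(f"term score  ~ {term_score:.6f}")
print(f"final score ~ {final_score:.6f}")
```

Small differences in the last digits are expected, because the listing was presumably computed in single precision while Python uses double precision.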
    
    Abstract
    The formal structure of the information on the Semantic Web lends itself to faceted browsing, an information retrieval method where users can filter results based on the values of properties ("facets"). Numerous faceted browsers have been created to browse RDF and Linked Data, but these systems use their own ontologies for defining how data is queried to populate their facets. Since the source data is in the same format across these systems (specifically, RDF), we can unify the different methods of describing how to query the underlying data, to enable compatibility across systems, and provide an extensible base ontology for future systems. To this end, we present FacetOntology, an ontology that defines how to query data to form a faceted browser, and a number of transformations and filters that can be applied to data before it is shown to users. FacetOntology overcomes limitations in the expressivity of existing work, by enabling the full expressivity of SPARQL when selecting data for facets. By applying a FacetOntology definition to data, a set of facets is specified, each with queries and filters to source RDF data, which enables faceted browsing systems to be created using that RDF data.
  15. Efthimiadis, E.N.: User choices : a new yardstick for the evaluation of ranking algorithms for interactive query expansion (1995) 0.01
    0.007199057 = product of:
      0.057592455 = sum of:
        0.057592455 = sum of:
          0.03294192 = weight(_text_:processing in 5697) [ClassicSimilarity], result of:
            0.03294192 = score(doc=5697,freq=2.0), product of:
              0.14730503 = queryWeight, product of:
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.03638826 = queryNorm
              0.22363065 = fieldWeight in 5697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048147 = idf(docFreq=2097, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5697)
          0.024650536 = weight(_text_:22 in 5697) [ClassicSimilarity], result of:
            0.024650536 = score(doc=5697,freq=2.0), product of:
              0.12742549 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03638826 = queryNorm
              0.19345059 = fieldWeight in 5697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5697)
      0.125 = coord(1/8)
    
    Date
    22. 2.1996 13:14:10
    Source
    Information processing and management. 31(1995) no.4, p.605-620
  16. Jarvelin, K.: ¬A deductive data model for thesaurus navigation and query expansion (1996) 0.01
    0.0069624893 = product of:
      0.055699915 = sum of:
        0.055699915 = weight(_text_:data in 5625) [ClassicSimilarity], result of:
          0.055699915 = score(doc=5625,freq=6.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.48408815 = fieldWeight in 5625, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=5625)
      0.125 = coord(1/8)
    
    Abstract
    Describes a deductive data model based on 3 abstraction levels for representing vocabularies for information retrieval: conceptual level; expression level; and occurrence level. The proposed data model can be used for the representation and navigation of indexing and retrieval thesauri and as a vocabulary source for concept based query expansion in heterogeneous retrieval environments
  17. Lehtokangas, R.; Järvelin, K.: Consistency of textual expression in newspaper articles : an argument for semantically based query expansion (2001) 0.01
    0.0069329934 = product of:
      0.055463947 = sum of:
        0.055463947 = weight(_text_:higher in 4485) [ClassicSimilarity], result of:
          0.055463947 = score(doc=4485,freq=2.0), product of:
            0.19113865 = queryWeight, product of:
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03638826 = queryNorm
            0.2901765 = fieldWeight in 4485, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4485)
      0.125 = coord(1/8)
    
    Abstract
    This article investigates how consistent different newspapers are in their choice of words when writing about the same news events. News articles on the same news events were taken from three Finnish newspapers and compared in regard to their central concepts and words representing the concepts in the news texts. Consistency figures were calculated for each set of three articles (the total number of sets was sixty). Inconsistency in words and concepts was found between news articles from different newspapers. The mean value of consistency calculated on the basis of words was 65 per cent; this however depended on the article length. For short news wires consistency was 83 per cent while for long articles it was only 47 per cent. At the concept level, consistency was considerably higher, ranging from 92 per cent to 97 per cent between short and long articles. The articles also represented three categories of topic (event, process and opinion). Statistically significant differences in consistency were found in regard to length but not in regard to the categories of topic. We argue that the expression inconsistency is a clear sign of a retrieval problem and that query expansion based on semantic relationships can significantly improve retrieval performance on free-text sources.
  18. Bradford, R.B.: Relationship discovery in large text collections using Latent Semantic Indexing (2006) 0.01
    0.0064848484 = product of:
      0.025939394 = sum of:
        0.01607918 = weight(_text_:data in 1163) [ClassicSimilarity], result of:
          0.01607918 = score(doc=1163,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.1397442 = fieldWeight in 1163, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
        0.009860214 = product of:
          0.019720428 = sum of:
            0.019720428 = weight(_text_:22 in 1163) [ClassicSimilarity], result of:
              0.019720428 = score(doc=1163,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.15476047 = fieldWeight in 1163, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1163)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
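
The nested coord factors in the tree above can be checked the same way. This is a sketch under the assumption that each coord(m/n) simply scales the sum of its matching clauses by m/n; the helper name is hypothetical, and the idf, queryNorm and fieldNorm values are copied from the listing for doc 1163 rather than recomputed.

```python
import math

QUERY_NORM = 0.03638826   # queryNorm from the listing
FIELD_NORM = 0.03125      # fieldNorm(doc=1163)

def term_score(freq, idf, query_norm, field_norm):
    # queryWeight * fieldWeight for a single term (illustrative helper)
    query_weight = idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# Top-level clause: term "data", freq=2.0, idf=3.1620505
data_part = term_score(2.0, 3.1620505, QUERY_NORM, FIELD_NORM)

# Nested clause: term "22", freq=2.0, idf=3.5018296, scaled by coord(1/2)
nested_part = term_score(2.0, 3.5018296, QUERY_NORM, FIELD_NORM) * (1 / 2)

# Outer coord(2/8): two of the eight top-level clauses matched
total = (data_part + nested_part) * (2 / 8)

print(f"data clause   ~ {data_part:.6f}")
print(f"nested clause ~ {nested_part:.6f}")
print(f"total         ~ {total:.6f}")
```

The inner coord(1/2) is applied before the outer sum, which is why the "22" clause contributes only half of its raw term score.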
    
    Source
    Proceedings of the Fourth Workshop on Link Analysis, Counterterrorism, and Security, SIAM Data Mining Conference, Bethesda, MD, 20-22 April, 2006. [http://www.siam.org/meetings/sdm06/workproceed/Link%20Analysis/15.pdf]
  19. Ekmekcioglu, F.C.; Robertson, A.M.; Willett, P.: Effectiveness of query expansion in ranked-output document retrieval systems (1992) 0.01
    0.005684849 = product of:
      0.04547879 = sum of:
        0.04547879 = weight(_text_:data in 5689) [ClassicSimilarity], result of:
          0.04547879 = score(doc=5689,freq=4.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.3952563 = fieldWeight in 5689, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=5689)
      0.125 = coord(1/8)
    
    Abstract
    Reports an evaluation of 3 methods for the expansion of natural language queries in ranked output retrieval systems. The methods are based on term co-occurrence data, on Soundex codes, and on a string similarity measure. Searches for 110 queries in a database of 26,280 titles and abstracts suggest that there is no significant difference in retrieval effectiveness between any of these methods and unexpanded searches.
  20. Brambilla, M.; Ceri, S.: Designing exploratory search applications upon Web data sources (2012) 0.01
    0.005684849 = product of:
      0.04547879 = sum of:
        0.04547879 = weight(_text_:data in 428) [ClassicSimilarity], result of:
          0.04547879 = score(doc=428,freq=16.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.3952563 = fieldWeight in 428, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=428)
      0.125 = coord(1/8)
    
    Abstract
    Search is the preferred method to access information in today's computing systems. The Web, accessed through search engines, is universally recognized as the source for answering users' information needs. However, offering a link to a Web page does not cover all information needs. Even simple problems, such as "Which theater offers an at least three-stars action movie in London close to a good Italian restaurant," can only be solved by searching the Web multiple times, e.g., by extracting a list of the recent action movies filtered by ranking, then looking for movie theaters, then looking for Italian restaurants close to them. While search engines hint at useful information, the user's brain is the fundamental platform for information integration. An important trend is the availability of new, specialized data sources, the so-called "long tail" of the Web of data. Such carefully collected and curated data sources can be much more valuable than information currently available in Web pages; however, many sources remain hidden or insulated for lack of software solutions for bringing them to surface and making them usable in the search context. A new class of tailor-made systems, designed to satisfy the needs of users with specific aims, will support the publishing and integration of data sources for vertical domains; the user will be able to select sources based on individual or collective trust, and systems will be able to route queries to such sources and to provide easy-to-use interfaces for combining them within search strategies, at the same time rewarding the data source owners for each contribution to effective search. Efforts such as Google's Fusion Tables show that the technology for bringing hidden data sources to surface is feasible.
    Series
    Data-centric systems and applications

Languages

  • e 118
  • d 5
  • f 1

Types

  • a 112
  • el 11
  • m 7
  • x 2