Search (162 results, page 1 of 9)

  • theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Song, D.; Bruza, P.D.: Towards context sensitive information inference (2003) 0.04
    0.040570874 = product of:
      0.14199805 = sum of:
        0.017665405 = weight(_text_:with in 1428) [ClassicSimilarity], result of:
          0.017665405 = score(doc=1428,freq=4.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.18826336 = fieldWeight in 1428, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1428)
        0.124332644 = sum of:
          0.097954325 = weight(_text_:humans in 1428) [ClassicSimilarity], result of:
            0.097954325 = score(doc=1428,freq=2.0), product of:
              0.26276368 = queryWeight, product of:
                6.7481275 = idf(docFreq=140, maxDocs=44218)
                0.038938753 = queryNorm
              0.37278488 = fieldWeight in 1428, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7481275 = idf(docFreq=140, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1428)
          0.026378317 = weight(_text_:22 in 1428) [ClassicSimilarity], result of:
            0.026378317 = score(doc=1428,freq=2.0), product of:
              0.13635688 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.038938753 = queryNorm
              0.19345059 = fieldWeight in 1428, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1428)
      0.2857143 = coord(2/7)
    
    Abstract
    Humans can make hasty, but generally robust, judgements about what a text fragment is, or is not, about. Such judgements are termed information inference. This article furnishes an account of information inference from a psychologistic stance. By drawing on theories from nonclassical logic and applied cognition, an information inference mechanism is proposed that makes inferences via computations of information flow through an approximation of a conceptual space. Within a conceptual space information is represented geometrically. In this article, geometric representations of words are realized as vectors in a high-dimensional semantic space, which is automatically constructed from a text corpus. Two approaches are presented for priming vector representations according to context. The first approach uses a concept combination heuristic to adjust the vector representation of a concept in the light of the representation of another concept. The second approach computes a prototypical concept on the basis of exemplar trace texts and moves it in the dimensional space according to the context. Information inference is evaluated by measuring the effectiveness of query models derived by information flow computations. Results show that information flow contributes significantly to query model effectiveness, particularly with respect to precision. Moreover, retrieval effectiveness compares favorably with two probabilistic query models, and another based on semantic association. More generally, this article can be seen as a contribution towards realizing operational systems that mimic text-based human reasoning.
    Date
    22. 3.2003 19:35:46
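
    For orientation: the numeric breakdown attached to each hit is a Lucene "explain" tree for the ClassicSimilarity (TF-IDF) scoring model. The sketch below is a reading aid, not part of the source; it re-derives entry 1's ranking value under the standard ClassicSimilarity formula (per-term score = queryWeight x fieldWeight, with queryWeight = idf x queryNorm and fieldWeight = sqrt(tf) x idf x fieldNorm, summed over matching terms and multiplied by the coordination factor). All variable names are assumed.

      import math

      def term_score(freq, idf, query_norm, field_norm):
          query_weight = idf * query_norm                  # e.g. 2.409771 * 0.038938753 = 0.09383348
          field_weight = math.sqrt(freq) * idf * field_norm
          return query_weight * field_weight

      qn = 0.038938753                                     # queryNorm, shared by all terms
      s_with   = term_score(4.0, 2.409771, qn, 0.0390625)  # 0.017665405, as in the tree above
      s_humans = term_score(2.0, 6.7481275, qn, 0.0390625) # 0.097954325
      s_22     = term_score(2.0, 3.5018296, qn, 0.0390625) # 0.026378317

      score = (2.0 / 7.0) * (s_with + s_humans + s_22)     # coord(2/7) * sum of term scores
      print(score)  # ~0.0405709, matching the displayed 0.040570874 up to float rounding
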
  2. Cool, C.; Spink, A.: Issues of context in information retrieval (IR) : an introduction to the special issue (2002) 0.03
    0.031710386 = product of:
      0.11098635 = sum of:
        0.08978786 = weight(_text_:interactions in 2587) [ClassicSimilarity], result of:
          0.08978786 = score(doc=2587,freq=2.0), product of:
            0.22965278 = queryWeight, product of:
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.038938753 = queryNorm
            0.39097226 = fieldWeight in 2587, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.046875 = fieldNorm(doc=2587)
        0.021198487 = weight(_text_:with in 2587) [ClassicSimilarity], result of:
          0.021198487 = score(doc=2587,freq=4.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.22591603 = fieldWeight in 2587, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.046875 = fieldNorm(doc=2587)
      0.2857143 = coord(2/7)
    
    Abstract
    The subject of context has received a great deal of attention in the information retrieval (IR) literature over the past decade, primarily in studies of information seeking and IR interactions. Recently, attention to context in IR has expanded to address new problems in new environments. In this paper we outline five overlapping dimensions of context which we believe to be important constituent elements, and we discuss how they relate to different issues in IR research. The papers in this special issue are summarized with respect to how they represent work being conducted within these dimensions of context. We conclude with future areas of research needed to fully understand the multidimensional nature of context in IR.
  3. Koike, A.; Takagi, T.: Knowledge discovery based on an implicit and explicit conceptual network (2007) 0.03
    0.027559668 = product of:
      0.09645883 = sum of:
        0.074823216 = weight(_text_:interactions in 85) [ClassicSimilarity], result of:
          0.074823216 = score(doc=85,freq=2.0), product of:
            0.22965278 = queryWeight, product of:
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.038938753 = queryNorm
            0.3258102 = fieldWeight in 85, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.0390625 = fieldNorm(doc=85)
        0.021635616 = weight(_text_:with in 85) [ClassicSimilarity], result of:
          0.021635616 = score(doc=85,freq=6.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.2305746 = fieldWeight in 85, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0390625 = fieldNorm(doc=85)
      0.2857143 = coord(2/7)
    
    Abstract
    The amount of knowledge accumulated in published scientific papers has increased due to the continuing progress of scientific research. Since numerous papers report only fragments of scientific facts, there are possibilities for discovering new knowledge by connecting these facts. We therefore developed a system called BioTermNet to draft a conceptual network with hybrid methods of information extraction and information retrieval. Two concepts are regarded as related in this system if (a) their relationship is clearly described in MEDLINE abstracts or (b) they have distinctively co-occurred in abstracts. PRIME data, including protein interactions and functions extracted by NLP techniques, are used for the former, and the Singhal measure for information retrieval is used for the latter. Relationships that are not clearly or directly described in an abstract can be extracted by connecting multiple concepts. To evaluate how well this system performs, Swanson's association between Raynaud's disease and fish oil, and that between migraine and magnesium, were tested with abstracts that had been published before these associations were discovered. The result was that when start and end concepts were given, plausible and understandable intermediate concepts connecting them could be detected. When only the start concept was given, not only the focal concepts (magnesium and fish oil) but also other probable concepts could be detected as related-concept candidates. Finally, the system was applied to find diseases related to the BRCA1 gene: some potentially related new diseases were detected, along with diseases whose relation to BRCA1 was already known.
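
    The closed-discovery step this abstract describes, finding intermediate concepts that link a given start and end concept (Swanson's A-B-C pattern), can be sketched as follows. This is an illustrative sketch, not BioTermNet's actual implementation; cooccurs is an assumed helper returning co-occurrence counts from the abstract collection, and the scoring is a simple stand-in.

      def intermediate_concepts(start, end, cooccurs):
          # cooccurs(c) is assumed to return {concept: co-occurrence count}
          # over abstracts mentioning concept c.
          via_start = cooccurs(start)
          via_end = cooccurs(end)
          shared = set(via_start) & set(via_end)
          # Rank candidates by how strongly they link to both endpoints.
          return sorted(shared, key=lambda c: via_start[c] * via_end[c], reverse=True)

      # e.g. intermediate_concepts("Raynaud's disease", "fish oil", cooccurs)
      # should surface mediating concepts such as blood viscosity, provided
      # the corpus predates and supports Swanson's original association.
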
  4. Bergamaschi, S.; Domnori, E.; Guerra, F.; Rota, S.; Lado, R.T.; Velegrakis, Y.: Understanding the semantics of keyword queries on relational data without accessing the instance (2012) 0.02
    0.024947014 = product of:
      0.087314546 = sum of:
        0.074823216 = weight(_text_:interactions in 431) [ClassicSimilarity], result of:
          0.074823216 = score(doc=431,freq=2.0), product of:
            0.22965278 = queryWeight, product of:
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.038938753 = queryNorm
            0.3258102 = fieldWeight in 431, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.0390625 = fieldNorm(doc=431)
        0.012491328 = weight(_text_:with in 431) [ClassicSimilarity], result of:
          0.012491328 = score(doc=431,freq=2.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.1331223 = fieldWeight in 431, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0390625 = fieldNorm(doc=431)
      0.2857143 = coord(2/7)
    
    Abstract
    The birth of the Web has brought an exponential growth in the amount of information that is freely available to the Internet population, overloading users and entangling their efforts to satisfy their information needs. Web search engines such as Google, Yahoo, or Bing have become popular mainly because they offer an easy-to-use query interface (i.e., based on keywords) and an effective and efficient query execution mechanism. The majority of these search engines do not consider information stored on the deep or hidden Web [9,28], despite the fact that the size of the deep Web is estimated to be much bigger than the surface Web [9,47]. There have been a number of systems that record interactions with deep Web sources or automatically submit queries to them (mainly through their Web form interfaces) in order to index their content. Unfortunately, this technique only partially indexes the data instance. Moreover, it is not possible to take advantage of the query capabilities of the data sources, for example their relational query features, because their interface is often restricted to the Web form. Besides, Web search engines focus on retrieving documents rather than on querying structured sources, so they are unable to access information based on concepts.
  5. Hu, K.; Luo, Q.; Qi, K.; Yang, S.; Mao, J.; Fu, X.; Zheng, J.; Wu, H.; Guo, Y.; Zhu, Q.: Understanding the topic evolution of scientific literatures like an evolving city : using Google Word2Vec model and spatial autocorrelation analysis (2019) 0.02
    0.019957611 = product of:
      0.06985164 = sum of:
        0.059858575 = weight(_text_:interactions in 5102) [ClassicSimilarity], result of:
          0.059858575 = score(doc=5102,freq=2.0), product of:
            0.22965278 = queryWeight, product of:
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.038938753 = queryNorm
            0.26064816 = fieldWeight in 5102, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.03125 = fieldNorm(doc=5102)
        0.009993061 = weight(_text_:with in 5102) [ClassicSimilarity], result of:
          0.009993061 = score(doc=5102,freq=2.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.10649783 = fieldWeight in 5102, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.03125 = fieldNorm(doc=5102)
      0.2857143 = coord(2/7)
    
    Abstract
    Topic evolution has been described by many approaches, from the macro level to the detail level, by extracting topic dynamics from text in literature and other media types. However, why the evolution happens is less studied. In this paper, we focus on whether and how keyword semantics can invoke or affect topic evolution. We assume that the semantic relatedness among keywords can affect topic popularity during the literature surveying and citing process, thus invoking evolution. However, this assumption needs to be confirmed by an approach that fully considers the semantic interactions among topics. Traditional topic evolution analyses in scientometric domains cannot provide such support because they use limited semantic information. To address this problem, we apply Google Word2Vec, a deep learning language model, to enrich the keywords with more complete semantic information. We further develop the semantic space as an urban geographic space, and analyze the topic evolution geographically using measures of spatial autocorrelation, as if keywords were the changing parcels of land in an evolving city. Keyword citations (a keyword citation is counted each time a paper containing the keyword obtains a citation) are used as an indicator of keyword popularity. Using bibliographical datasets from the geographical natural hazard field, experimental results demonstrate that in some local areas the popularity of keywords affects that of the surrounding keywords, although there are no significant impacts on the evolution of all keywords. The spatial autocorrelation analysis identifies interaction patterns (including High-High leading and High-Low suppressing) among the keywords in local areas. The approach can be regarded as an analysis framework borrowed from geospatial modeling. Moreover, predictions in local areas are demonstrated to be more accurate when spatial autocorrelation is considered.
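
    The spatial-autocorrelation step described above can be pictured with global Moran's I computed over keyword popularity at 2-D positions (e.g. a projection of the Word2Vec vectors). A minimal sketch under assumed inverse-distance weights; the paper's exact weighting scheme and the local variant behind the High-High/High-Low patterns are not spelled out in the abstract.

      import numpy as np

      def morans_i(coords, values):
          x = np.asarray(values, dtype=float)
          n = len(x)
          d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
          w = np.where(d > 0, 1.0 / np.maximum(d, 1e-12), 0.0)  # inverse distance, no self-weight
          z = x - x.mean()
          return (n / w.sum()) * (w * np.outer(z, z)).sum() / (z ** 2).sum()

      coords = np.random.rand(60, 2)           # toy keyword positions in the semantic "city"
      cites = np.random.poisson(5.0, size=60)  # toy keyword citation counts
      print(morans_i(coords, cites))           # near 0 for random data; > 0 means popularity clusters
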
  6. Shah, C.: Collaborative information seeking : the art and science of making the whole greater than the sum of all (2012) 0.02
    0.016905103 = product of:
      0.059167854 = sum of:
        0.019986123 = weight(_text_:with in 360) [ClassicSimilarity], result of:
          0.019986123 = score(doc=360,freq=8.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.21299566 = fieldWeight in 360, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.03125 = fieldNorm(doc=360)
        0.03918173 = product of:
          0.07836346 = sum of:
            0.07836346 = weight(_text_:humans in 360) [ClassicSimilarity], result of:
              0.07836346 = score(doc=360,freq=2.0), product of:
                0.26276368 = queryWeight, product of:
                  6.7481275 = idf(docFreq=140, maxDocs=44218)
                  0.038938753 = queryNorm
                0.2982279 = fieldWeight in 360, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.7481275 = idf(docFreq=140, maxDocs=44218)
                  0.03125 = fieldNorm(doc=360)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Today's complex, information-intensive problems often require people to work together. Mostly, these tasks go far beyond simply searching together; they include information lookup, sharing, synthesis, and decision-making. In addition, they all have an end-goal that is mutually beneficial to all parties involved. Such "collaborative information seeking" (CIS) projects typically last several sessions, and the participants all share an intention to contribute and benefit. Not surprisingly, these processes are highly interactive. Shah focuses on two individually well-understood notions, collaboration and information seeking, with the goal of bringing them together to show how it is a natural tendency for humans to work together on complex tasks. The first part of his book introduces the general notions of collaboration and information seeking, as well as related concepts, terminology, and frameworks, and thus provides the reader with a comprehensive treatment of the concepts underlying CIS. The second part of the book details CIS as a standalone domain. A series of frameworks, theories, and models are introduced to provide a conceptual basis for CIS. The final part describes several systems and applications of CIS, along with their broader implications for other fields such as computer-supported cooperative work (CSCW) and human-computer interaction (HCI). With this first comprehensive overview of an exciting new research field, Shah delivers to graduate students and researchers in academia and industry an encompassing description of the technologies involved, state-of-the-art results, and open challenges as well as research opportunities.
  7. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.010578708 = product of:
      0.037025474 = sum of:
        0.021198487 = weight(_text_:with in 2419) [ClassicSimilarity], result of:
          0.021198487 = score(doc=2419,freq=4.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.22591603 = fieldWeight in 2419, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
        0.015826989 = product of:
          0.031653978 = sum of:
            0.031653978 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.031653978 = score(doc=2419,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects, it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper, evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection through information seeking to the representation, organisation and reuse of information. By embedding high-level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil and followed by a qualitative evaluation. The evaluation was conducted with 28 participants, ranging from information-seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
  8. Hannech, A.: Système de recherche d'information étendue basé sur une projection multi-espaces (2018) 0.01
    0.010570128 = product of:
      0.03699545 = sum of:
        0.029929288 = weight(_text_:interactions in 4472) [ClassicSimilarity], result of:
          0.029929288 = score(doc=4472,freq=2.0), product of:
            0.22965278 = queryWeight, product of:
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.038938753 = queryNorm
            0.13032408 = fieldWeight in 4472, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.8977947 = idf(docFreq=329, maxDocs=44218)
              0.015625 = fieldNorm(doc=4472)
        0.0070661623 = weight(_text_:with in 4472) [ClassicSimilarity], result of:
          0.0070661623 = score(doc=4472,freq=4.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.07530534 = fieldWeight in 4472, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.015625 = fieldNorm(doc=4472)
      0.2857143 = coord(2/7)
    
    Abstract
    Since its appearance in the early 1990s, the World Wide Web (WWW or Web) has provided universal access to knowledge, and the world of information has witnessed a great revolution (the digital revolution). The Web quickly became very popular, and the amount and diversity of data it contains make it the largest and most comprehensive database and knowledge base. However, the considerable increase and evolution of these data raise important problems for users, in particular for accessing the documents most relevant to their search queries. In order to cope with this exponential explosion of data volume and to facilitate access by users, information retrieval systems (IRSs) offer various models for the representation and retrieval of web documents. Traditional IRSs use simple keywords, not semantically linked, to index and retrieve these documents. This creates limitations in terms of the relevance and ease of exploration of results. To overcome these limitations, existing techniques enrich documents by integrating external keywords from different sources. However, these systems still suffer from limitations related to how these sources of enrichment are exploited. When the different sources are used in a way that the system cannot distinguish, this limits the flexibility of the exploration models that can be applied to the results returned by the system. Users then feel lost in these results and find themselves forced to filter them manually to select the relevant information. If they want to go further, they must reformulate and narrow their search queries until they reach the documents that best meet their expectations. In this way, even if the systems manage to find more relevant results, their presentation remains problematic. In order to target search toward more user-specific information needs and to improve the relevance and exploration of results, advanced IRSs adopt different data personalization techniques, which assume that a user's current search is directly related to his profile and/or his previous browsing and search experiences.
    However, this assumption does not hold in all cases: the needs of the user evolve over time and can move away from the previous interests stored in his profile. In other cases, the user's profile may be misused to extract or infer new information needs. This problem is much more accentuated with ambiguous queries. When multiple points of interest linked to a search query are identified in the user's profile, the system is unable to select the relevant data from that profile to respond to the query, which has a direct impact on the quality of the results provided to the user. In order to overcome some of these limitations, this research thesis develops techniques aimed mainly at improving the relevance of the results of current IRSs and at facilitating the exploration of large collections of documents. To do this, we propose a solution based on a new concept and model of indexing and information retrieval called multi-spaces projection. This proposal rests on the exploitation of different categories of semantic and social information, which enrich the universe of document representation and search queries with several dimensions of interpretation. The originality of this representation lies in its ability to distinguish between the different interpretations used for describing and searching documents. This gives better visibility of the returned results and provides greater flexibility of search and exploration, giving the user the ability to navigate the view or views of the data that interest him most. In addition, the proposed multidimensional representation universes for document description and search-query interpretation help to improve the relevance of the user's results by offering a diversity of search and exploration paths that meet his varied needs and those of other, different users. This study also exploits several aspects of personalized search and aims to solve the problems caused by the evolution of the user's information needs. Thus, when the user's profile is used by our system, a technique is proposed to identify the interests most representative of his current needs within that profile. This technique combines three influential factors: the contextual, frequency, and temporal factors of the data. The ability of users to interact, exchange ideas and opinions, and form social networks on the Web has also led systems to consider the types of interactions among these users as well as their social roles in the system. This social information is discussed and integrated into this research work, and its impact on the IR process, and how it is integrated into that process, is studied in order to improve the relevance of the results.
  9. Brunetti, J.M.; Roberto García, R.: User-centered design and evaluation of overview components for semantic data exploration (2014) 0.01
    0.010568711 = product of:
      0.036990486 = sum of:
        0.026439158 = weight(_text_:with in 1626) [ClassicSimilarity], result of:
          0.026439158 = score(doc=1626,freq=14.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.2817668 = fieldWeight in 1626, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.03125 = fieldNorm(doc=1626)
        0.010551327 = product of:
          0.021102654 = sum of:
            0.021102654 = weight(_text_:22 in 1626) [ClassicSimilarity], result of:
              0.021102654 = score(doc=1626,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.15476047 = fieldWeight in 1626, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1626)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Purpose - The growing volumes of semantic data available on the web result in the need for handling the information overload phenomenon. The potential of this amount of data is enormous, but in most cases it is very difficult for users to visualize, explore and use this data, especially for lay-users without experience with Semantic Web technologies. The paper aims to discuss these issues. Design/methodology/approach - The Visual Information-Seeking Mantra "Overview first, zoom and filter, then details-on-demand" proposed by Shneiderman describes how data should be presented in different stages to achieve an effective exploration. The overview is the first user task when dealing with a data set; the objective is that the user is capable of getting an idea of the overall structure of the data set. Different information architecture (IA) components supporting the overview task have been developed, so that they are automatically generated from semantic data, and evaluated with end-users. Findings - The chosen IA components are well known to web users, as they are present in most web pages: navigation bars, site maps and site indexes. The authors complement them with treemaps, a visualization technique for displaying hierarchical data. These components have been developed following an iterative user-centered design methodology. Evaluations with end-users have shown that users get accustomed to them easily, despite the fact that they are generated automatically from structured data and require no knowledge of the underlying semantic technologies, and that the different overview components complement each other, as they focus on different information search needs. Originality/value - Overviews of semantic data sets cannot easily be obtained with current semantic web browsers. Overviews become difficult to achieve with large heterogeneous data sets, which are typical in the Semantic Web, because traditional IA techniques do not easily scale to large data sets. There is little or no support for obtaining overview information quickly and easily at the beginning of the exploration of a new data set. This can be a serious limitation when exploring a data set for the first time, especially for lay-users. The proposal is to reuse and adapt existing IA components to provide this overview to users, and to show that they can be generated automatically from the thesauri and ontologies that structure semantic data, while providing a user experience comparable to traditional web sites.
    Date
    20. 1.2015 18:30:22
  10. Mlodzka-Stybel, A.: Towards continuous improvement of users' access to a library catalogue (2014) 0.01
    0.010272195 = product of:
      0.03595268 = sum of:
        0.017487857 = weight(_text_:with in 1466) [ClassicSimilarity], result of:
          0.017487857 = score(doc=1466,freq=2.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.1863712 = fieldWeight in 1466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1466)
        0.018464822 = product of:
          0.036929645 = sum of:
            0.036929645 = weight(_text_:22 in 1466) [ClassicSimilarity], result of:
              0.036929645 = score(doc=1466,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.2708308 = fieldWeight in 1466, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1466)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The paper discusses the issue of increasing users' access to library records by publishing them in Google. Data from the records, converted into HTML format, have been indexed by Google. The process covered the basic formal description fields of the records and the description of the content, supported by a thesaurus, as well as an abstract where present in the record. In addition to monitoring end-user statistics, the pilot testing covered the visibility of library records in Google search results.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
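
    The publication pipeline the abstract outlines, catalogue records rendered as crawlable HTML pages for Google to index, amounts to something like the following sketch; the field names are hypothetical, not the library's actual record schema.

      def record_to_html(record):
          # record: dict of catalogue fields, e.g. title, descriptors, abstract.
          rows = "\n".join(f"<dt>{field}</dt><dd>{value}</dd>"
                           for field, value in record.items())
          return (f"<html><head><title>{record.get('title', '')}</title></head>"
                  f"<body><dl>\n{rows}\n</dl></body></html>")

      page = record_to_html({"title": "Example record",
                             "descriptors": "thesaurus terms here",
                             "abstract": "abstract text, if present"})
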
  11. Lund, K.; Burgess, C.; Atchley, R.A.: Semantic and associative priming in high-dimensional semantic space (1995) 0.01
    0.010272195 = product of:
      0.03595268 = sum of:
        0.017487857 = weight(_text_:with in 2151) [ClassicSimilarity], result of:
          0.017487857 = score(doc=2151,freq=2.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.1863712 = fieldWeight in 2151, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2151)
        0.018464822 = product of:
          0.036929645 = sum of:
            0.036929645 = weight(_text_:22 in 2151) [ClassicSimilarity], result of:
              0.036929645 = score(doc=2151,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.2708308 = fieldWeight in 2151, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2151)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    We present a model of semantic memory that utilizes a high-dimensional semantic space constructed from a co-occurrence matrix. This matrix was formed by analyzing a multimillion-word corpus. Word vectors were then obtained by extracting rows and columns of this matrix. These vectors were subjected to multidimensional scaling. Words were found to cluster semantically, suggesting that interword distance may be interpretable as a measure of semantic similarity. In attempting to replicate with our simulation the semantic and ...
    Source
    Proceedings of the Seventeenth Annual Conference of the Cognitive Science Society: July 22 - 25, 1995, University of Pittsburgh / ed. by Johanna D. Moore and Jill Fain Lehmann
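
    The construction sketched in this abstract, a co-occurrence matrix accumulated over a sliding window with word vectors taken from its rows and columns, can be illustrated as follows. The ramped window weighting (closer words count more) is an assumption in the style of this family of models; the paper's exact parameters are not given here.

      import numpy as np

      def cooccurrence_vectors(tokens, window=5):
          vocab = sorted(set(tokens))
          idx = {w: j for j, w in enumerate(vocab)}
          m = np.zeros((len(vocab), len(vocab)))
          for i, w in enumerate(tokens):
              for k in range(1, window + 1):       # words up to `window` positions ahead
                  if i + k < len(tokens):
                      m[idx[w], idx[tokens[i + k]]] += window - k + 1  # closer = heavier
          # A word's vector: its row and its column, concatenated (co-occurrence
          # as predecessor and as successor); suitable input for scaling.
          return vocab, np.hstack([m, m.T])

      vocab, vectors = cooccurrence_vectors("the cat sat on the mat while the dog sat".split())
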
  12. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.01
    0.009949936 = product of:
      0.034824774 = sum of:
        0.021635616 = weight(_text_:with in 1343) [ClassicSimilarity], result of:
          0.021635616 = score(doc=1343,freq=6.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.2305746 = fieldWeight in 1343, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1343)
        0.013189158 = product of:
          0.026378317 = sum of:
            0.026378317 = weight(_text_:22 in 1343) [ClassicSimilarity], result of:
              0.026378317 = score(doc=1343,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.19345059 = fieldWeight in 1343, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1343)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. Recently, methods that exploit named entities have been shown to be more effective for query expansion than traditional pseudorelevance feedback methods. In this article, we introduce a supervised learning approach that exploits named entities for query expansion using Wikipedia as a repository of high-quality feedback documents. In contrast with existing entity-oriented pseudorelevance feedback approaches, we tackle query expansion as a learning-to-rank problem. As a result, not only do we select effective expansion terms, but we also weigh these terms according to their predicted effectiveness. To this end, we exploit the rich structure of Wikipedia articles to devise discriminative term features, including each candidate term's proximity to the original query terms, as well as its frequency across multiple article fields and in category and infobox descriptors. Experiments on three Text REtrieval Conference web test collections attest to the effectiveness of our approach, with gains of up to 23.32% in terms of mean average precision, 19.49% in terms of precision at 10, and 7.86% in terms of normalized discounted cumulative gain compared with a state-of-the-art approach for entity-oriented query expansion.
    Date
    22. 8.2014 17:07:50
  13. Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.01
    0.00881559 = product of:
      0.030854564 = sum of:
        0.017665405 = weight(_text_:with in 56) [ClassicSimilarity], result of:
          0.017665405 = score(doc=56,freq=4.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.18826336 = fieldWeight in 56, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0390625 = fieldNorm(doc=56)
        0.013189158 = product of:
          0.026378317 = sum of:
            0.026378317 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
              0.026378317 = score(doc=56,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.19345059 = fieldWeight in 56, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=56)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.
    Date
    22. 7.2006 16:32:43
  14. Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.01
    0.008757308 = product of:
      0.030650577 = sum of:
        0.021418165 = weight(_text_:with in 1633) [ClassicSimilarity], result of:
          0.021418165 = score(doc=1633,freq=12.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.2282572 = fieldWeight in 1633, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1633)
        0.009232411 = product of:
          0.018464822 = sum of:
            0.018464822 = weight(_text_:22 in 1633) [ClassicSimilarity], result of:
              0.018464822 = score(doc=1633,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.1354154 = fieldWeight in 1633, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1633)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Purpose - The purpose of this paper is to improve conceptual-based search by incorporating structural ontological information such as concepts and relations. Generally, semantic-based information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms, and its performance is measured through the standard measures of precision and recall: higher precision means that more of the retrieved documents are (meaningfully) relevant, while lower recall means less coverage of the concepts. Design/methodology/approach - In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al. by incorporating sibling information into the index. The index designed by Kohler et al. contains only super- and sub-concepts from the ontology. In addition, our approach focuses on two tasks, query expansion and ranking of the expanded queries, to improve the efficiency of ontology-based search. Both tasks make use of ontological concepts and the relations existing between those concepts so as to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of the concepts populated in the index. Here, we introduce a new measure, the index enhancement measure, to estimate the coverage of the ontological concepts being indexed. We have evaluated ontology-based search for the tourism domain with tourism documents and a tourism-specific ontology. Search results based on the use of the ontology with and without query expansion are compared to estimate the efficiency of the proposed query expansion task, and the ranking is compared with the ORank system to evaluate the performance of our ontology-based search. In these analyses, the ontology-based search shows better recall than the other concept-based search systems: its mean average precision is 0.79 and its recall 0.65, while the ORank system has a mean average precision of 0.62 and a recall of 0.51, and the concept-based search has a mean average precision of 0.56 and a recall of 0.42. Practical implications - When a concept is not present in the domain-specific ontology, it cannot be indexed. When a given query term is not available in the ontology, term-based results are retrieved. Originality/value - In addition to super- and sub-concepts, we incorporate the concepts present at the same level (siblings) into the ontological index. The structural information from the ontology is used for query expansion. The ranking of the documents depends on the type of the query (single-concept queries, multiple-concept queries, and concept-with-relation queries) and on the ontological relations that exist in the query and the documents. With this ontological structural information, the search results show better coverage of concepts with respect to the query.
    Date
    20. 1.2015 18:30:22
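
    The index enhancement described above, adding same-level concepts to the usual super- and sub-concept entries, reduces to something like this toy construction; the helper structures are hypothetical, not the authors' implementation.

      def index_entry(concept, parents, children):
          supers = parents.get(concept, [])
          subs = children.get(concept, [])
          # Siblings: the other children of the same parent concepts.
          siblings = [c for p in supers for c in children.get(p, []) if c != concept]
          return {"super": supers, "sub": subs, "sibling": siblings}

      parents = {"hotel": ["accommodation"], "hostel": ["accommodation"]}
      children = {"accommodation": ["hotel", "hostel"], "hotel": ["boutique hotel"]}
      print(index_entry("hotel", parents, children))
      # {'super': ['accommodation'], 'sub': ['boutique hotel'], 'sibling': ['hostel']}
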
  15. Landauer, T.K.; Foltz, P.W.; Laham, D.: ¬An introduction to Latent Semantic Analysis (1998) 0.01
    0.008396085 = product of:
      0.058772597 = sum of:
        0.058772597 = product of:
          0.117545195 = sum of:
            0.117545195 = weight(_text_:humans in 1162) [ClassicSimilarity], result of:
              0.117545195 = score(doc=1162,freq=2.0), product of:
                0.26276368 = queryWeight, product of:
                  6.7481275 = idf(docFreq=140, maxDocs=44218)
                  0.038938753 = queryNorm
                0.44734186 = fieldWeight in 1162, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.7481275 = idf(docFreq=140, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1162)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Abstract
    Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other. The adequacy of LSA's reflection of human knowledge has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word-word and passage-word lexical priming data; and as reported in 3 following articles in this issue, it accurately estimates passage coherence, learnability of passages by individual students, and the quality and quantity of knowledge contained in an essay.
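
    The core LSA computation this abstract summarizes, a truncated SVD of a term-by-context matrix with similarity read off in the reduced space, looks roughly like the sketch below. The toy documents and the choice k=2 are illustrative only.

      import numpy as np

      docs = ["human interface computer", "user interface system",
              "graph tree system", "graph minors tree"]
      vocab = sorted({w for d in docs for w in d.split()})
      A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

      U, s, Vt = np.linalg.svd(A, full_matrices=False)
      k = 2                                  # keep only the strongest latent dimensions
      terms = U[:, :k] * s[:k]               # term vectors in the latent space

      def cos(a, b):
          return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

      print(cos(terms[vocab.index("human")], terms[vocab.index("user")]))
      # high: "human" and "user" never co-occur in a document, but the
      # reduced space places them together via their shared contexts
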
  16. Bradford, R.B.: Relationship discovery in large text collections using Latent Semantic Indexing (2006) 0.01
    0.007959949 = product of:
      0.027859818 = sum of:
        0.017308492 = weight(_text_:with in 1163) [ClassicSimilarity], result of:
          0.017308492 = score(doc=1163,freq=6.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.18445967 = fieldWeight in 1163, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
        0.010551327 = product of:
          0.021102654 = sum of:
            0.021102654 = weight(_text_:22 in 1163) [ClassicSimilarity], result of:
              0.021102654 = score(doc=1163,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.15476047 = fieldWeight in 1163, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1163)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper addresses the problem of information discovery in large collections of text. For users, one of the key problems in working with such collections is determining where to focus their attention. In selecting documents for examination, users must be able to formulate reasonably precise queries. Queries that are too broad will greatly reduce the efficiency of information discovery efforts by overwhelming the users with peripheral information. In order to formulate efficient queries, a mechanism is needed to automatically alert users regarding potentially interesting information contained within the collection. This paper presents the results of an experiment designed to test one approach to generation of such alerts. The technique of latent semantic indexing (LSI) is used to identify relationships among entities of interest. Entity extraction software is used to pre-process the text of the collection so that the LSI space contains representation vectors for named entities in addition to those for individual terms. In the LSI space, the cosine of the angle between the representation vectors for two entities captures important information regarding the degree of association of those two entities. For appropriate choices of entities, determining the entity pairs with the highest mutual cosine values yields valuable information regarding the contents of the text collection. The test database used for the experiment consists of 150,000 news articles. The proposed approach for alert generation is tested using a counterterrorism analysis example. The approach is shown to have significant potential for aiding users in rapidly focusing on information of potential importance in large text collections. The approach also has value in identifying possible use of aliases.
    Source
    Proceedings of the Fourth Workshop on Link Analysis, Counterterrorism, and Security, SIAM Data Mining Conference, Bethesda, MD, 20-22 April, 2006. [http://www.siam.org/meetings/sdm06/workproceed/Link%20Analysis/15.pdf]
  17. Efthimiadis, E.N.: User choices : a new yardstick for the evaluation of ranking algorithms for interactive query expansion (1995) 0.01
    0.007337282 = product of:
      0.025680486 = sum of:
        0.012491328 = weight(_text_:with in 5697) [ClassicSimilarity], result of:
          0.012491328 = score(doc=5697,freq=2.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.1331223 = fieldWeight in 5697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5697)
        0.013189158 = product of:
          0.026378317 = sum of:
            0.026378317 = weight(_text_:22 in 5697) [ClassicSimilarity], result of:
              0.026378317 = score(doc=5697,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.19345059 = fieldWeight in 5697, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5697)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The performance of 8 ranking algorithms was evaluated with respect to their effectiveness in ranking terms for query expansion. The evaluation was conducted within an investigation of interactive query expansion and relevance feedback in a real operational environment, and focuses on the identification of algorithms that most effectively take cognizance of user preferences. User choices (i.e. the terms selected by the searchers for the query expansion search) provided the yardstick for the evaluation of the 8 ranking algorithms. This methodology introduces a user-oriented approach to evaluating ranking algorithms for query expansion, in contrast to the standard, system-oriented approaches. Similarities in the performance of the 8 algorithms, and in the ways these algorithms rank terms, were the main focus of this evaluation. The findings demonstrate that the r-lohi, wpq, emim, and porter algorithms have similar performance in bringing good terms to the top of a ranked list of terms for query expansion. However, further evaluation of the algorithms in different (e.g. full-text) environments is needed before these results can be generalized beyond the context of the present study.
    Date
    22. 2.1996 13:14:10
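
    For reference, the wpq algorithm named in the findings is commonly stated (following Robertson/Sparck Jones relevance weighting) as in the sketch below. Treat this as the literature's usual form, not necessarily the exact variant evaluated in the paper.

      import math

      def wpq(N, n, R, r):
          # N: docs in the collection; n: docs containing term t;
          # R: known relevant docs; r: relevant docs containing t.
          w1 = math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                        ((n - r + 0.5) * (R - r + 0.5)))
          return w1 * (r / R - (n - r) / (N - R))

      # Candidate expansion terms would then be offered highest-wpq first, e.g.
      # sorted(candidates, key=lambda t: wpq(N, df[t], R, rel_df[t]), reverse=True)
      # where df and rel_df are assumed document-frequency lookups.
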
  18. Boyack, K.W.; Wylie,B.N.; Davidson, G.S.: Information Visualization, Human-Computer Interaction, and Cognitive Psychology : Domain Visualizations (2002) 0.01
    0.0053292247 = product of:
      0.037304573 = sum of:
        0.037304573 = product of:
          0.074609146 = sum of:
            0.074609146 = weight(_text_:22 in 1352) [ClassicSimilarity], result of:
              0.074609146 = score(doc=1352,freq=4.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.54716086 = fieldWeight in 1352, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1352)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    22. 2.2003 17:25:39
    22. 2.2003 18:17:40
  19. Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.01
    0.005275664 = product of:
      0.036929645 = sum of:
        0.036929645 = product of:
          0.07385929 = sum of:
            0.07385929 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.07385929 = score(doc=2134,freq=2.0), product of:
                0.13635688 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.038938753 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    30. 3.2001 13:32:22
  20. Kulyukin, V.A.; Settle, A.: Ranked retrieval with semantic networks and vector spaces (2001) 0.00
    0.0049452838 = product of:
      0.034616984 = sum of:
        0.034616984 = weight(_text_:with in 6934) [ClassicSimilarity], result of:
          0.034616984 = score(doc=6934,freq=6.0), product of:
            0.09383348 = queryWeight, product of:
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.038938753 = queryNorm
            0.36891934 = fieldWeight in 6934, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.409771 = idf(docFreq=10797, maxDocs=44218)
              0.0625 = fieldNorm(doc=6934)
      0.14285715 = coord(1/7)
    
    Abstract
    The equivalence of semantic networks with spreading activation and vector spaces with dot product is investigated under ranked retrieval. Semantic networks are viewed as networks of concepts organized in terms of abstraction and packaging relations. It is shown that the two models can be effectively constructed from each other. A formal method is suggested to analyze the models in terms of their relative performance in the same universe of objects
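
    One way to picture the equivalence result: a single pass of spreading activation from the query's concept nodes, read off at each object node, is exactly a matrix-vector product, i.e. a dot-product ranking. A toy sketch; the paper's construction (abstraction and packaging relations, and the converse direction) is richer.

      import numpy as np

      # Link weights from concept nodes (rows) to retrievable objects (columns).
      W = np.array([[0.9, 0.0],
                    [0.4, 0.7],
                    [0.0, 0.8]])
      q = np.array([1.0, 0.5, 0.0])  # initial activation on the query's concepts

      spread = q @ W                 # activation arriving at each object node
      dots = np.array([q @ W[:, j] for j in range(W.shape[1])])
      assert np.allclose(spread, dots)  # same scores, hence the same ranking
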

Languages

  • e 155
  • d 5
  • chi 1
  • f 1

Types

  • a 147
  • el 19
  • m 9
  • r 3
  • x 1
