Search (124 results, page 1 of 7)

  • Filter: theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Adhikari, A.; Dutta, B.; Dutta, A.; Mondal, D.; Singh, S.: An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology (2018) 0.03
    0.034512706 = product of:
      0.092033885 = sum of:
        0.055463947 = weight(_text_:higher in 4372) [ClassicSimilarity], result of:
          0.055463947 = score(doc=4372,freq=2.0), product of:
            0.19113865 = queryWeight, product of:
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03638826 = queryNorm
            0.2901765 = fieldWeight in 4372, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4372)
        0.020098975 = weight(_text_:data in 4372) [ClassicSimilarity], result of:
          0.020098975 = score(doc=4372,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.17468026 = fieldWeight in 4372, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4372)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 4372) [ClassicSimilarity], result of:
              0.03294192 = score(doc=4372,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 4372, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4372)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
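    The indented trees under each hit are Lucene "explain" output for the ClassicSimilarity (TF-IDF) scorer. As a check on how the numbers compose, here is a minimal sketch that recomputes the "higher" weight of hit 1; the formulas are Lucene's documented ClassicSimilarity, and the inputs are copied from the tree above, not assumptions.

```python
import math

# Recompute the "higher" weight in hit 1 from the explain tree's constants.
def classic_idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)                    # tf = sqrt(termFreq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm         # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

print(classic_idf(628, 44218))                                     # ~5.252756
print(classic_term_score(2.0, 628, 44218, 0.03638826, 0.0390625))  # ~0.055463947
```

    The per-document totals then apply coord(m/n), the fraction of query clauses that matched, exactly as the coord lines in the trees show.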
    
    Abstract
    Finding similarity between concepts based on semantics has become a new trend in many applications (e.g., biomedical informatics, natural language processing). Measuring Semantic Similarity (SS) with high accuracy is a challenging task. In this context, the Information Content (IC)-based SS measure has gained popularity over the others. The notion of IC derives from information theory, which has great potential to characterize the semantics of concepts. Designing an IC-based SS framework comprises (i) an IC calculator and (ii) an SS calculator. In this article, we propose a generic intrinsic IC-based SS calculator. We also introduce a new structural aspect of an ontology called DCS (Disjoint Common Subsumers) that plays a significant role in deciding the similarity between two concepts. We evaluated our proposed similarity calculator against the existing intrinsic IC-based similarity calculators, as well as corpus-dependent similarity calculators, using several benchmark data sets. The experimental results show that the proposed similarity calculator correlates more highly with human evaluation than the existing state-of-the-art IC-based similarity calculators.
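    The DCS-based measure itself is defined in the paper; as background, here is a sketch of the intrinsic-IC family it builds on, using Seco et al.'s intrinsic IC and Lin's similarity. Both are standard baselines, not the authors' method.

```python
import math

# Intrinsic IC (Seco et al.): IC(c) = 1 - log(|hyponyms(c)| + 1) / log(|C|),
# i.e. a concept with no hyponyms is maximally informative.
def intrinsic_ic(num_hyponyms: int, num_concepts: int) -> float:
    return 1.0 - math.log(num_hyponyms + 1) / math.log(num_concepts)

# Lin similarity: sim(a, b) = 2 * IC(lcs) / (IC(a) + IC(b)),
# where lcs is the least common subsumer of the two concepts.
def lin_similarity(ic_a: float, ic_b: float, ic_lcs: float) -> float:
    return 2.0 * ic_lcs / (ic_a + ic_b) if (ic_a + ic_b) else 0.0

# Toy ontology of 100 concepts: two leaves whose subsumer has 9 hyponyms.
leaf = intrinsic_ic(0, 100)   # 1.0: maximally specific
lcs = intrinsic_ic(9, 100)    # 0.5
print(lin_similarity(leaf, leaf, lcs))  # 0.5
```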
  2. Cool, C.; Spink, A.: Issues of context in information retrieval (IR) : an introduction to the special issue (2002) 0.02
    0.024061583 = product of:
      0.09624633 = sum of:
        0.07648118 = weight(_text_:great in 2587) [ClassicSimilarity], result of:
          0.07648118 = score(doc=2587,freq=2.0), product of:
            0.20489426 = queryWeight, product of:
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.03638826 = queryNorm
            0.37327147 = fieldWeight in 2587, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.046875 = fieldNorm(doc=2587)
        0.01976515 = product of:
          0.0395303 = sum of:
            0.0395303 = weight(_text_:processing in 2587) [ClassicSimilarity], result of:
              0.0395303 = score(doc=2587,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.26835677 = fieldWeight in 2587, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2587)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The subject of context has received a great deal of attention in the information retrieval (IR) literature over the past decade, primarily in studies of information seeking and IR interactions. Recently, attention to context in IR has expanded to address new problems in new environments. In this paper we outline five overlapping dimensions of context which we believe to be important constituent elements and we discuss how they are related to different issues in IR research. The papers in this special issue are summarized with respect to how they represent work that is being conducted within these dimensions of context. We conclude with future areas of research which are needed in order to fully understand the multidimensional nature of context in IR.
    Source
    Information processing and management. 38(2002) no.5, S.605-611
  3. Salaba, A.; Zeng, M.L.: Extending the "Explore" user task beyond subject authority data into the linked data sphere (2014) 0.02
    0.021545123 = product of:
      0.08618049 = sum of:
        0.06892512 = weight(_text_:data in 1465) [ClassicSimilarity], result of:
          0.06892512 = score(doc=1465,freq=12.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.59902847 = fieldWeight in 1465, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1465)
        0.017255375 = product of:
          0.03451075 = sum of:
            0.03451075 = weight(_text_:22 in 1465) [ClassicSimilarity], result of:
              0.03451075 = score(doc=1465,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.2708308 = fieldWeight in 1465, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1465)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    "Explore" is a user task introduced in the Functional Requirements for Subject Authority Data (FRSAD) final report. Through various case scenarios, the authors discuss how structured data, presented based on Linked Data principles and using knowledge organisation systems (KOS) as the backbone, extend the explore task within and beyond subject authority data.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  4. Kelly, D.: Measuring online information seeking context : Part 1: background and method (2006) 0.02
    0.020958323 = product of:
      0.08383329 = sum of:
        0.063734315 = weight(_text_:great in 206) [ClassicSimilarity], result of:
          0.063734315 = score(doc=206,freq=2.0), product of:
            0.20489426 = queryWeight, product of:
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.03638826 = queryNorm
            0.31105953 = fieldWeight in 206, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=206)
        0.020098975 = weight(_text_:data in 206) [ClassicSimilarity], result of:
          0.020098975 = score(doc=206,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.17468026 = fieldWeight in 206, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=206)
      0.25 = coord(2/8)
    
    Abstract
    Context is one of the most important concepts in information seeking and retrieval research. However, the challenges of studying context are great; thus, it is more common for researchers to use context as a post hoc explanatory factor, rather than as a concept that drives inquiry. The purposes of this study were to develop a method for collecting data about information seeking context in natural online environments, and identify which aspects of context should be considered when studying online information seeking. The study is reported in two parts. In this, the first part, the background and method are presented. Results and implications of this research are presented in Part 2 (Kelly, in press). Part 1 discusses previous literature on information seeking context and behavior and situates the current work within this literature. This part further describes the naturalistic, longitudinal research design that was used to examine and measure the online information seeking contexts of users during a 14-week period. In this design, information seeking context was characterized by a user's self-identified tasks and topics, and several attributes of these, such as the length of time the user expected to work on a task and the user's familiarity with a topic. At weekly intervals, users evaluated the usefulness of the documents that they viewed, and classified these documents according to their tasks and topics. At the end of the study, users provided feedback about the study method.
  5. Kelly, D.: Measuring online information seeking context : Part 2: Findings and discussion (2006) 0.02
    0.020958323 = product of:
      0.08383329 = sum of:
        0.063734315 = weight(_text_:great in 215) [ClassicSimilarity], result of:
          0.063734315 = score(doc=215,freq=2.0), product of:
            0.20489426 = queryWeight, product of:
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.03638826 = queryNorm
            0.31105953 = fieldWeight in 215, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=215)
        0.020098975 = weight(_text_:data in 215) [ClassicSimilarity], result of:
          0.020098975 = score(doc=215,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.17468026 = fieldWeight in 215, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=215)
      0.25 = coord(2/8)
    
    Abstract
    Context is one of the most important concepts in information seeking and retrieval research. However, the challenges of studying context are great; thus, it is more common for researchers to use context as a post hoc explanatory factor, rather than as a concept that drives inquiry. The purpose of this study was to develop a method for collecting data about information seeking context in natural online environments, and identify which aspects of context should be considered when studying online information seeking. The study is reported in two parts. In this, the second part, results and implications of this research are presented. Part 1 (Kelly, 2006) discussed previous literature on information seeking context and behavior, situated the current study within this literature, and described the naturalistic, longitudinal research design that was used to examine and measure the online information seeking context of seven users during a 14-week period. Results provide support for the value of the method in studying online information seeking context, the relative importance of various measures of context, how these measures change over time, and, finally, the relationship between these measures. In particular, results demonstrate significant differences in distributions of usefulness ratings according to task and topic.
  6. Brunetti, J.M.; García, R.: User-centered design and evaluation of overview components for semantic data exploration (2014) 0.02
    0.018544234 = product of:
      0.07417694 = sum of:
        0.06431672 = weight(_text_:data in 1626) [ClassicSimilarity], result of:
          0.06431672 = score(doc=1626,freq=32.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.5589768 = fieldWeight in 1626, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1626)
        0.009860214 = product of:
          0.019720428 = sum of:
            0.019720428 = weight(_text_:22 in 1626) [ClassicSimilarity], result of:
              0.019720428 = score(doc=1626,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.15476047 = fieldWeight in 1626, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1626)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Purpose - The growing volumes of semantic data available in the web result in the need for handling the information overload phenomenon. The potential of this amount of data is enormous but in most cases it is very difficult for users to visualize, explore and use this data, especially for lay-users without experience with Semantic Web technologies. The paper aims to discuss these issues. Design/methodology/approach - The Visual Information-Seeking Mantra "Overview first, zoom and filter, then details-on-demand" proposed by Shneiderman describes how data should be presented in different stages to achieve an effective exploration. The overview is the first user task when dealing with a data set. The objective is that the user is capable of getting an idea about the overall structure of the data set. Different information architecture (IA) components supporting the overview tasks have been developed, so they are automatically generated from semantic data, and evaluated with end-users. Findings - The chosen IA components are well known to web users, as they are present in most web pages: navigation bars, site maps and site indexes. The authors complement them with Treemaps, a visualization technique for displaying hierarchical data. These components have been developed following an iterative User-Centered Design methodology. Evaluations with end-users have shown that they get easily used to them despite the fact that they are generated automatically from structured data, without requiring knowledge about the underlying semantic technologies, and that the different overview components complement each other as they focus on different information search needs. Originality/value - Obtaining semantic data sets overviews cannot be easily done with the current semantic web browsers. Overviews become difficult to achieve with large heterogeneous data sets, which is typical in the Semantic Web, because traditional IA techniques do not easily scale to large data sets. There is little or no support to obtain overview information quickly and easily at the beginning of the exploration of a new data set. This can be a serious limitation when exploring a data set for the first time, especially for lay-users. The proposal is to reuse and adapt existing IA components to provide this overview to users and show that they can be generated automatically from the thesaurus and ontologies that structure semantic data while providing a comparable user experience to traditional web sites.
    Date
    20. 1.2015 18:30:22
  7. Olmos, R.; Jorge-Botana, G.; Luzón, J.M.; Martín-Cordero, J.I.; León, J.A.: Transforming LSA space dimensions into a rubric for an automatic assessment and feedback system (2016) 0.02
    0.017983727 = product of:
      0.07193491 = sum of:
        0.055463947 = weight(_text_:higher in 2878) [ClassicSimilarity], result of:
          0.055463947 = score(doc=2878,freq=2.0), product of:
            0.19113865 = queryWeight, product of:
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03638826 = queryNorm
            0.2901765 = fieldWeight in 2878, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2878)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 2878) [ClassicSimilarity], result of:
              0.03294192 = score(doc=2878,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 2878, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2878)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The purpose of this article is to validate, through two empirical studies, a new method for automatic evaluation of written texts, called Inbuilt Rubric, based on the Latent Semantic Analysis (LSA) technique, which constitutes an innovative and distinct turn with respect to LSA application so far. In the first empirical study, evidence of the validity of the method to identify and evaluate the conceptual axes of a text in a sample of 78 summaries by secondary school students is sought. Results show that the proposed method has a significantly higher degree of reliability than classic LSA methods of text evaluation, and displays very high sensitivity to identify which conceptual axes are included or not in each summary. A second study evaluates the method's capacity to interact and provide feedback about quality in a real online system on a sample of 924 discursive texts written by university students. Results show that students improved the quality of their written texts using this system, and also rated the experience very highly. The final conclusion is that this new method opens a very interesting way regarding the role of automatic assessors in the identification of presence/absence and quality of elaboration of relevant conceptual information in texts written by students with lower time costs than the usual LSA-based methods.
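    The Inbuilt Rubric transformation of the LSA space is defined in the paper; for orientation, here is a sketch of the classic LSA scoring it modifies: build a latent space with a truncated SVD and grade a summary by cosine similarity against a reference text. The corpus and texts are toy placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Tiny training corpus standing in for the real document collection.
corpus = [
    "photosynthesis converts light energy into chemical energy",
    "plants use chlorophyll to capture light for photosynthesis",
    "cells release energy from glucose during respiration",
    "mitochondria are the site of cellular respiration",
]
vec = TfidfVectorizer()
X = vec.fit_transform(corpus)
lsa = TruncatedSVD(n_components=2, random_state=0).fit(X)  # the latent space

reference = lsa.transform(vec.transform(["plants capture light energy"]))
summary = lsa.transform(vec.transform(["chlorophyll captures light in plants"]))
print(cosine_similarity(reference, summary)[0, 0])  # classic-LSA quality score
```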
    Source
    Information processing and management. 52(2016) no.3, S.359-373
  8. Goslin, K.; Hofmann, M.: A Wikipedia powered state-based approach to automatic search query enhancement (2018) 0.02
    0.017000671 = product of:
      0.068002686 = sum of:
        0.04823754 = weight(_text_:data in 5083) [ClassicSimilarity], result of:
          0.04823754 = score(doc=5083,freq=8.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.4192326 = fieldWeight in 5083, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=5083)
        0.01976515 = product of:
          0.0395303 = sum of:
            0.0395303 = weight(_text_:processing in 5083) [ClassicSimilarity], result of:
              0.0395303 = score(doc=5083,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.26835677 = fieldWeight in 5083, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5083)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    This paper describes the development and testing of a novel Automatic Search Query Enhancement (ASQE) algorithm, the Wikipedia N Sub-state Algorithm (WNSSA), which utilises Wikipedia as the sole data source for prior knowledge. This algorithm is built upon the concept of iterative states and sub-states, harnessing the power of Wikipedia's data set and link information to identify and utilise reoccurring terms to aid term selection and weighting during enhancement. This algorithm is designed to prevent query drift by making callbacks to the user's original search intent by persisting the original query between internal states with additional selected enhancement terms. The developed algorithm has shown to improve both short and long queries by providing a better understanding of the query and available data. The proposed algorithm was compared against five existing ASQE algorithms that utilise Wikipedia as the sole data source, showing an average Mean Average Precision (MAP) improvement of 0.273 over the tested existing ASQE algorithms.
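    The WNSSA's iterative states and sub-states are specified in the paper; as a rough illustration of the ASQE idea only, here is a naive sketch that persists the original query (to limit drift) and appends terms that recur across Wikipedia article texts already fetched for the query. The stop list and the term-selection rule are assumptions.

```python
from collections import Counter
import re

STOP = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for"}

def expand_query(query: str, article_texts: list[str], k: int = 3) -> str:
    counts = Counter()
    for text in article_texts:
        # Count in how many articles each content term recurs.
        counts.update(set(re.findall(r"[a-z]+", text.lower())) - STOP)
    original = set(query.lower().split())
    expansion = [t for t, _ in counts.most_common() if t not in original][:k]
    return query + " " + " ".join(expansion)   # original intent is persisted

articles = [
    "Information retrieval is the task of finding relevant documents.",
    "A search engine ranks documents relevant to a user query.",
]
print(expand_query("document search", articles))
```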
    Source
    Information processing and management. 54(2018) no.4, S.726-739
  9. Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.02
    0.01621212 = product of:
      0.06484848 = sum of:
        0.04019795 = weight(_text_:data in 2751) [ClassicSimilarity], result of:
          0.04019795 = score(doc=2751,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.34936053 = fieldWeight in 2751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.078125 = fieldNorm(doc=2751)
        0.024650536 = product of:
          0.049301073 = sum of:
            0.049301073 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
              0.049301073 = score(doc=2751,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.38690117 = fieldWeight in 2751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2751)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Date
    1. 2.2016 18:25:22
    Source
    Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al
  10. Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.02
    0.01621212 = product of:
      0.06484848 = sum of:
        0.04019795 = weight(_text_:data in 2754) [ClassicSimilarity], result of:
          0.04019795 = score(doc=2754,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.34936053 = fieldWeight in 2754, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.078125 = fieldNorm(doc=2754)
        0.024650536 = product of:
          0.049301073 = sum of:
            0.049301073 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
              0.049301073 = score(doc=2754,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.38690117 = fieldWeight in 2754, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2754)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Date
    1. 2.2016 18:25:22
    Source
    Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al
  11. Bilal, D.; Kirby, J.: Differences and similarities in information seeking : children and adults as Web users (2002) 0.01
    0.014386982 = product of:
      0.057547927 = sum of:
        0.044371158 = weight(_text_:higher in 2591) [ClassicSimilarity], result of:
          0.044371158 = score(doc=2591,freq=2.0), product of:
            0.19113865 = queryWeight, product of:
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03638826 = queryNorm
            0.23214121 = fieldWeight in 2591, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03125 = fieldNorm(doc=2591)
        0.013176767 = product of:
          0.026353534 = sum of:
            0.026353534 = weight(_text_:processing in 2591) [ClassicSimilarity], result of:
              0.026353534 = score(doc=2591,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.17890452 = fieldWeight in 2591, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2591)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    This study examined the success and information seeking behaviors of seventh-grade science students and graduate students in information science in using Yahooligans! Web search engine/directory. It investigated these users' cognitive, affective, and physical behaviors as they sought the answer for a fact-finding task. It analyzed and compared the overall patterns of children's and graduate students' Web activities, including searching moves, browsing moves, backtracking moves, looping moves, screen scrolling, target location and deviation moves, and the time they took to complete the task. The authors applied Bilal's Web Traversal Measure to quantify these users' effectiveness, efficiency, and quality of moves they made. Results were based on 14 children's Web sessions and nine graduate students' sessions. Both groups' Web activities were captured online using Lotus ScreenCam, a software package that records and replays online activities in Web browsers. Children's affective states were captured via exit interviews. Graduate students' affective states were extracted from the journal writings they kept during the traversal process. The study findings reveal that 89% of the graduate students found the correct answer to the search task as opposed to 50% of the children. Based on the Measure, graduate students' weighted effectiveness, efficiency, and quality of the Web moves they made were much higher than those of the children. Regardless of success and weighted scores, however, similarities and differences in information seeking were found between the two groups. Yahooligans! poor structure of keyword searching was a major factor that contributed to the "breakdowns" children and graduate students experienced. Unlike children, graduate students were able to recover from "breakdowns" quickly and effectively. Three main factors influenced these users' performance: ability to recover from "breakdowns", navigational style, and focus on task. Children and graduate students made recommendations for improving Yahooligans! interface design. Implications for Web user training and system design improvements are made.
    Source
    Information processing and management. 38(2002) no.5, S.649-670
  12. Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.01
    0.014167227 = product of:
      0.056668907 = sum of:
        0.04019795 = weight(_text_:data in 5055) [ClassicSimilarity], result of:
          0.04019795 = score(doc=5055,freq=8.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.34936053 = fieldWeight in 5055, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5055)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 5055) [ClassicSimilarity], result of:
              0.03294192 = score(doc=5055,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 5055, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5055)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Distant supervision (DS) has the advantage of automatically generating large amounts of labelled training data and has been widely used for relation extraction. However, there are usually many wrong labels in the automatically labelled data in distant supervision (Riedel, Yao, & McCallum, 2010). This paper presents a novel method to reduce the wrong labels. The proposed method uses the semantic Jaccard with word embedding to measure the semantic similarity between the relation phrase in the knowledge base and the dependency phrases between two entities in a sentence to filter the wrong labels. In the process of reducing wrong labels, the semantic Jaccard algorithm selects a core dependency phrase to represent the candidate relation in a sentence, which can capture features for relation classification and avoid the negative impact from irrelevant term sequences that previous neural network models of relation extraction often suffer. In the process of relation classification, the core dependency phrases are also used as the input of a convolutional neural network (CNN) for relation classification. The experimental results show that compared with the methods using original DS data, the methods using filtered DS data performed much better in relation extraction. It indicates that the semantic similarity based method is effective in reducing wrong labels. The relation extraction performance of the CNN model using the core dependency phrases as input is the best of all, which indicates that using the core dependency phrases as input of CNN is enough to capture the features for relation classification and could avoid negative impact from irrelevant terms.
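    The paper's exact semantic-Jaccard definition is given there; one plausible reading, sketched here, is a soft Jaccard in which a term matches the other phrase if its best embedding cosine clears a threshold. The `emb` lookup and the 0.7 threshold are assumptions.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def semantic_jaccard(a_terms, b_terms, emb, threshold=0.7):
    # A term counts as "matched" if its best cosine against the other phrase
    # clears the threshold; the match count plays the role of |A intersect B|.
    def matched(xs, ys):
        return sum(1 for x in xs
                   if max(cosine(emb[x], emb[y]) for y in ys) >= threshold)
    m = (matched(a_terms, b_terms) + matched(b_terms, a_terms)) / 2
    return m / (len(a_terms) + len(b_terms) - m)

# Toy 2-d vectors standing in for real word embeddings.
emb = {"born": np.array([1.0, 0.1]),
       "birthplace": np.array([0.9, 0.2]),
       "lives": np.array([0.1, 1.0])}
print(semantic_jaccard(["born"], ["birthplace", "lives"], emb))  # 0.5
```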
    Source
    Information processing and management. 54(2018) no.4, S.593-608
  13. Järvelin, K.; Kristensen, J.; Niemi, T.; Sormunen, E.; Keskustalo, H.: ¬A deductive data model for query expansion (1996) 0.01
    0.014141314 = product of:
      0.056565255 = sum of:
        0.041774936 = weight(_text_:data in 2230) [ClassicSimilarity], result of:
          0.041774936 = score(doc=2230,freq=6.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.3630661 = fieldWeight in 2230, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2230)
        0.014790321 = product of:
          0.029580642 = sum of:
            0.029580642 = weight(_text_:22 in 2230) [ClassicSimilarity], result of:
              0.029580642 = score(doc=2230,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.23214069 = fieldWeight in 2230, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2230)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    We present a deductive data model for concept-based query expansion. It is based on three abstraction levels: the conceptual, linguistic and occurrence levels. Concepts and relationships among them are represented at the conceptual level. The linguistic level represents natural language expressions for concepts. Each expression has one or more matching models at the occurrence level. Each model specifies the matching of the expression in database indices built in varying ways. The data model supports a concept-based query expansion and formulation tool, the ExpansionTool, for environments providing heterogeneous IR systems. Expansion is controlled by adjustable matching reliability.
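    A sketch of the three abstraction levels as plain data structures; the names are illustrative, not the ExpansionTool's actual schema. Expansion walks from a concept through its expressions to the matching models that are reliable enough.

```python
from dataclasses import dataclass, field

@dataclass
class MatchingModel:          # occurrence level: how an expression matches an index
    index_name: str
    pattern: str              # e.g. "exact", "truncated", "phrase"
    reliability: float        # the adjustable matching reliability

@dataclass
class Expression:             # linguistic level: NL expressions for a concept
    text: str
    models: list = field(default_factory=list)

@dataclass
class Concept:                # conceptual level: concepts and their relations
    name: str
    related: list = field(default_factory=list)
    expressions: list = field(default_factory=list)

def expand(concept, min_reliability):
    # Emit every expression of the concept and its related concepts that has
    # at least one sufficiently reliable matching model.
    return [e.text
            for c in [concept, *concept.related]
            for e in c.expressions
            if any(m.reliability >= min_reliability for m in e.models)]

ir = Concept("information retrieval", expressions=[
    Expression("information retrieval", [MatchingModel("title_idx", "exact", 0.9)]),
    Expression("document retrieval", [MatchingModel("title_idx", "truncated", 0.4)])])
print(expand(ir, 0.5))   # ['information retrieval']
```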
    Source
    Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '96), Zürich, Switzerland, August 18-22, 1996. Eds.: H.P. Frei et al
  14. Hazrina, S.; Sharef, N.M.; Ibrahim, H.; Murad, M.A.A.; Noah, S.A.M.: Review on the advancements of disambiguation in semantic question answering system (2017) 0.01
    0.013550873 = product of:
      0.05420349 = sum of:
        0.027849957 = weight(_text_:data in 3292) [ClassicSimilarity], result of:
          0.027849957 = score(doc=3292,freq=6.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24204408 = fieldWeight in 3292, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=3292)
        0.026353534 = product of:
          0.05270707 = sum of:
            0.05270707 = weight(_text_:processing in 3292) [ClassicSimilarity], result of:
              0.05270707 = score(doc=3292,freq=8.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.35780904 = fieldWeight in 3292, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3292)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Ambiguity is a potential problem in any semantic question answering (SQA) system due to the nature of idiosyncrasy in composing natural language (NL) question and semantic resources. Thus, disambiguation of SQA systems is a field of ongoing research. Ambiguity occurs in SQA because a word or a sentence can have more than one meaning or multiple words in the same language can share the same meaning. Therefore, an SQA system needs disambiguation solutions to select the correct meaning when the linguistic triples matched with multiple KB concepts, and enumerate similar words especially when linguistic triples do not match with any KB concept. The latest development in this field is a solution for SQA systems that is able to process a complex NL question while accessing open-domain data from linked open data (LOD). The contributions in this paper include (1) formulating an SQA conceptual framework based on an in-depth study of existing SQA processes; (2) identifying the ambiguity types, specifically in English based on an interdisciplinary literature review; (3) highlighting the ambiguity types that had been resolved by the previous SQA studies; and (4) analysing the results of the existing SQA disambiguation solutions, the complexity of NL question processing, and the complexity of data retrieval from KB(s) or LOD. The results of this review demonstrated that out of thirteen types of ambiguity identified in the literature, only six types had been successfully resolved by the previous studies. Efforts to improve the disambiguation are in progress for the remaining unresolved ambiguity types to improve the accuracy of the formulated answers by the SQA system. The remaining ambiguity types are potentially resolved in the identified SQA process based on ambiguity scenarios elaborated in this paper. The results of this review also demonstrated that most existing research on SQA systems have treated the processing of the NL question complexity separate from the processing of the KB structure complexity.
    Source
    Information processing and management. 53(2017) no.1, S.52-69
  15. Fidel, R.; Efthimiadis, E.N.: Terminological knowledge structure for intermediary expert systems (1995) 0.01
    0.013017729 = product of:
      0.052070916 = sum of:
        0.02411877 = weight(_text_:data in 5695) [ClassicSimilarity], result of:
          0.02411877 = score(doc=5695,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.2096163 = fieldWeight in 5695, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=5695)
        0.027952146 = product of:
          0.05590429 = sum of:
            0.05590429 = weight(_text_:processing in 5695) [ClassicSimilarity], result of:
              0.05590429 = score(doc=5695,freq=4.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.3795138 = fieldWeight in 5695, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5695)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    To provide advice for online searching about term selection and query expansion, an intermediary expert system should include a terminological knowledge structure. Terminological attributes could provide the foundation of a knowledge base, and knowledge acquisition could rely on knowledge base techniques coupled with statistical techniques. The strategies of expert searchers would provide one source of knowledge. The knowledge structure would include three constructs for each term: frequency data, a hedge, and a position in a classification scheme. Switching vocabularies could provide a meta-scheme and facilitate the interoperability of databases in similar subjects. To develop such a knowledge structure, research should focus on terminological attributes, word and phrase disambiguation, automated text processing, and the role of thesauri and classification schemes in indexing and retrieval. It should develop techniques that combine knowledge base and statistical methods and that consider user preferences.
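    An illustrative record for the three constructs named above (frequency data, a hedge, a position in a classification scheme); the field names and the advice rule are assumptions, not Fidel and Efthimiadis's design.

```python
from dataclasses import dataclass

@dataclass
class TermRecord:
    term: str
    doc_freq: int            # frequency data from the database
    hedge: list              # related terms to OR with the term
    class_position: str      # notation in a classification scheme

def advice(rec: TermRecord, broad_threshold: int = 10000) -> str:
    # Toy term-selection advice of the kind an intermediary system could give.
    if rec.doc_freq > broad_threshold and rec.hedge:
        return f"'{rec.term}' is broad; consider the hedge: {' OR '.join(rec.hedge)}"
    return f"'{rec.term}' (class {rec.class_position}) can be searched directly"

rec = TermRecord("retrieval", 52000,
                 ["information retrieval", "document retrieval"], "025.04")
print(advice(rec))
```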
    Source
    Information processing and management. 31(1995) no.1, S.15-27
  16. Wolfram, D.; Xie, H.I.: Traditional IR for web users : a context for general audience digital libraries (2002) 0.01
    0.012820851 = product of:
      0.051283404 = sum of:
        0.034812447 = weight(_text_:data in 2589) [ClassicSimilarity], result of:
          0.034812447 = score(doc=2589,freq=6.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.30255508 = fieldWeight in 2589, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2589)
        0.01647096 = product of:
          0.03294192 = sum of:
            0.03294192 = weight(_text_:processing in 2589) [ClassicSimilarity], result of:
              0.03294192 = score(doc=2589,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.22363065 = fieldWeight in 2589, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2589)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The emergence of general audience digital libraries (GADLs) defines a context that represents a hybrid of both "traditional" IR, using primarily bibliographic resources provided by database vendors, and "popular" IR, exemplified by public search systems available on the World Wide Web. Findings of a study investigating end-user searching and response to a GADL are reported. Data collected from a Web-based end-user survey and data logs of resource usage for a Web-based GADL were analyzed for user characteristics, patterns of access and use, and user feedback. Cross-tabulations using respondent demographics revealed several key differences in how the system was used and valued by users of different age groups. Older users valued the service more than younger users and engaged in different searching and viewing behaviors. The GADL more closely resembles traditional retrieval systems in terms of content and purpose of use, but is more similar to popular IR systems in terms of user behavior and accessibility. A model that defines the dual context of the GADL environment is derived from the data analysis and existing IR models in general and other specific contexts. The authors demonstrate the distinguishing characteristics of this IR context, and discuss implications for the development and evaluation of future GADLs to accommodate a variety of user needs and expectations.
    Source
    Information processing and management. 38(2002) no.5, S.627-648
  17. Robertson, S.E.; Walker, S.; Hancock-Beaulieu, M.M.: Large test collection experiments of an operational, interactive system : OKAPI at TREC (1995) 0.01
    0.012799477 = product of:
      0.05119791 = sum of:
        0.028138565 = weight(_text_:data in 6964) [ClassicSimilarity], result of:
          0.028138565 = score(doc=6964,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24455236 = fieldWeight in 6964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6964)
        0.023059342 = product of:
          0.046118684 = sum of:
            0.046118684 = weight(_text_:processing in 6964) [ClassicSimilarity], result of:
              0.046118684 = score(doc=6964,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.3130829 = fieldWeight in 6964, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6964)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The Okapi system has been used in a series of experiments on the TREC collections, investigating probabilistic methods, relevance feedback and query expansion, and interaction issues. Some new probabilistic models have been developed, resulting in simple weighting functions that take account of document length and within-document and within-query term frequency. All have been shown to be beneficial when based on large quantities of relevance data, as in the routing task. Interaction issues are much more difficult to evaluate in the TREC framework, and no benefits have yet been demonstrated from feedback based on small numbers of 'relevant' items identified by intermediary searchers.
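    This line of work led to the BM25 family of weights; a standard modern form is sketched below. The k1 and b values are conventional defaults, and the linear query-frequency factor is a simplification; neither is taken from the paper.

```python
import math

def bm25_weight(tf_d, tf_q, df, n_docs, doc_len, avg_len, k1=1.2, b=0.75):
    # idf with the +0.5 smoothing used in the BM25 family
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    # document-length normalisation of the within-document term frequency
    norm = k1 * ((1 - b) + b * doc_len / avg_len)
    return idf * (tf_d * (k1 + 1)) / (tf_d + norm) * tf_q

# One query term occurring twice in a document of average length:
print(bm25_weight(tf_d=2, tf_q=1, df=628, n_docs=44218,
                  doc_len=100, avg_len=100))
```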
    Source
    Information processing and management. 31(1995) no.3, S.345-360
  18. Gao, J.; Zhang, J.: Clustered SVD strategies in latent semantic indexing (2005) 0.01
    0.012799477 = product of:
      0.05119791 = sum of:
        0.028138565 = weight(_text_:data in 1166) [ClassicSimilarity], result of:
          0.028138565 = score(doc=1166,freq=2.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.24455236 = fieldWeight in 1166, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1166)
        0.023059342 = product of:
          0.046118684 = sum of:
            0.046118684 = weight(_text_:processing in 1166) [ClassicSimilarity], result of:
              0.046118684 = score(doc=1166,freq=2.0), product of:
                0.14730503 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03638826 = queryNorm
                0.3130829 = fieldWeight in 1166, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1166)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The text retrieval method using latent semantic indexing (LSI) technique with truncated singular value decomposition (SVD) has been intensively studied in recent years. The SVD reduces the noise contained in the original representation of the term-document matrix and improves the information retrieval accuracy. Recent studies indicate that SVD is mostly useful for small homogeneous data collections. For large inhomogeneous datasets, the performance of the SVD based text retrieval technique may deteriorate. We propose to partition a large inhomogeneous dataset into several smaller ones with clustered structure, on which we apply the truncated SVD. Our experimental results show that the clustered SVD strategies may enhance the retrieval accuracy and reduce the computing and storage costs.
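    A small sketch of the strategy on toy data: partition the collection with k-means, fit a truncated SVD per cluster, and answer a query inside the latent space of its nearest cluster. The cluster count and SVD rank are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "stock markets and equity trading", "bond yields and interest rates",
    "protein folding in cells", "gene expression and dna sequencing",
]
vec = TfidfVectorizer()
X = vec.fit_transform(docs)

# Partition the (inhomogeneous) collection, then one truncated SVD per cluster.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
svds = {c: TruncatedSVD(n_components=1, random_state=0).fit(X[km.labels_ == c])
        for c in range(2)}

q = vec.transform(["dna and genes"])
c = km.predict(q)[0]                          # route the query to its cluster
scores = cosine_similarity(svds[c].transform(q),
                           svds[c].transform(X[km.labels_ == c]))
print(scores)  # similarities within the chosen cluster's latent space
```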
    Source
    Information processing and management. 41(2005) no.5, S.1051-1064
  19. Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.01
    0.011863112 = product of:
      0.04745245 = sum of:
        0.038824763 = weight(_text_:higher in 1633) [ClassicSimilarity], result of:
          0.038824763 = score(doc=1633,freq=2.0), product of:
            0.19113865 = queryWeight, product of:
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.03638826 = queryNorm
            0.20312355 = fieldWeight in 1633, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.252756 = idf(docFreq=628, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1633)
        0.008627688 = product of:
          0.017255375 = sum of:
            0.017255375 = weight(_text_:22 in 1633) [ClassicSimilarity], result of:
              0.017255375 = score(doc=1633,freq=2.0), product of:
                0.12742549 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03638826 = queryNorm
                0.1354154 = fieldWeight in 1633, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1633)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Purpose - The purpose of this paper is to improve the conceptual-based search by incorporating structural ontological information such as concepts and relations. Generally, Semantic-based information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms and the performance of semantic information retrieval is carried out through standard measures-precision and recall. Higher precision leads to the (meaningful) relevant documents obtained and lower recall leads to the less coverage of the concepts. Design/methodology/approach - In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al., by incorporating sibling information to the index. The index designed by Kohler et al., contains only super and sub-concepts from the ontology. In addition, in our approach, we focus on two tasks; query expansion and ranking of the expanded queries, to improve the efficiency of the ontology-based search. The aforementioned tasks make use of ontological concepts, and relations existing between those concepts so as to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of concepts that are being populated in the index. Here, we introduce a new measure called index enhancement measure, to estimate the coverage of ontological concepts being indexed. We have evaluated the ontology-based search for the tourism domain with the tourism documents and tourism-specific ontology. The comparison of search results based on the use of ontology "with and without query expansion" is examined to estimate the efficiency of the proposed query expansion task. The ranking is compared with the ORank system to evaluate the performance of our ontology-based search. From these analyses, the ontology-based search results shows better recall when compared to the other concept-based search systems. The mean average precision of the ontology-based search is found to be 0.79 and the recall is found to be 0.65, the ORank system has the mean average precision of 0.62 and the recall is found to be 0.51, while the concept-based search has the mean average precision of 0.56 and the recall is found to be 0.42. Practical implications - When the concept is not present in the domain-specific ontology, the concept cannot be indexed. When the given query term is not available in the ontology then the term-based results are retrieved. Originality/value - In addition to super and sub-concepts, we incorporate the concepts present in same level (siblings) to the ontological index. The structural information from the ontology is determined for the query expansion. The ranking of the documents depends on the type of the query (single concept query, multiple concept queries and concept with relation queries) and the ontological relations that exists in the query and the documents. With this ontological structural information, the search results showed us better coverage of concepts with respect to the query.
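    A sketch of the sibling-enhanced index described above: alongside the super- and sub-concepts of Kohler et al.'s index, each entry also records the concepts sharing the same parent. The toy tourism fragment and its encoding are assumptions.

```python
PARENT = {                      # child -> parent, a toy tourism fragment
    "hotel": "accommodation", "hostel": "accommodation",
    "campsite": "accommodation", "accommodation": "tourism",
    "museum": "attraction", "attraction": "tourism",
}

def children(concept):
    return [c for c, p in PARENT.items() if p == concept]

def index_entry(concept):
    parent = PARENT.get(concept)
    siblings = [s for s in children(parent) if s != concept] if parent else []
    return {
        "super": [parent] if parent else [],
        "sub": children(concept),
        "siblings": siblings,   # the addition over the super/sub-only index
    }

print(index_entry("hotel"))
# {'super': ['accommodation'], 'sub': [], 'siblings': ['hostel', 'campsite']}
```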
    Date
    20. 1.2015 18:30:22
  20. Hannech, A.: Système de recherche d'information étendue basé sur une projection multi-espaces (2018) 0.01
    0.011691121 = product of:
      0.046764486 = sum of:
        0.025493726 = weight(_text_:great in 4472) [ClassicSimilarity], result of:
          0.025493726 = score(doc=4472,freq=2.0), product of:
            0.20489426 = queryWeight, product of:
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.03638826 = queryNorm
            0.12442382 = fieldWeight in 4472, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6307793 = idf(docFreq=430, maxDocs=44218)
              0.015625 = fieldNorm(doc=4472)
        0.021270758 = weight(_text_:data in 4472) [ClassicSimilarity], result of:
          0.021270758 = score(doc=4472,freq=14.0), product of:
            0.115061514 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03638826 = queryNorm
            0.18486422 = fieldWeight in 4472, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.015625 = fieldNorm(doc=4472)
      0.25 = coord(2/8)
    
    Abstract
    Since its appearance in the early 1990s, the World Wide Web (WWW, or Web) has provided universal access to knowledge, and the world of information has witnessed a major revolution (the digital revolution). The Web quickly became very popular, and the amount and diversity of data it contains have made it the largest and most comprehensive database and knowledge base. However, the considerable growth and evolution of these data raise serious problems for users, in particular for accessing the documents most relevant to their search queries. In order to cope with this exponential explosion of data volume and to facilitate access for users, information retrieval systems (IRSs) offer various models for representing and retrieving web documents. Traditional IRSs index and retrieve these documents using simple keywords that are not semantically linked, which limits the relevance of results and the ease of exploring them. To overcome these limitations, existing techniques enrich documents with external keywords drawn from different sources. These systems, however, still suffer from limitations related to the way those enrichment sources are exploited. When the different sources are used in a way the system cannot distinguish, the flexibility of the exploration models that can be applied to the returned results is limited. Users then feel lost among these results and find themselves forced to filter them manually to select the relevant information. To go further, they must reformulate and narrow their search queries until they reach the documents that best meet their expectations. So even when the systems manage to find more relevant results, their presentation remains problematic. In order to target searches to more user-specific information needs and to improve the relevance and exploration of the results, advanced IRSs adopt various personalization techniques, which assume that a user's current search is directly related to their profile and/or previous browsing and search history.
    However, this assumption does not hold in all cases: users' needs evolve over time and can drift away from the previous interests stored in their profiles. In other cases, a user's profile may be wrongly used to extract or infer new information needs. This problem is even more pronounced with ambiguous queries: when multiple points of interest linked to a search query are identified in the user's profile, the system is unable to select the relevant data from that profile to respond to the request, which directly affects the quality of the results provided to this user. To overcome some of these limitations, this thesis develops techniques aimed mainly at improving the relevance of the results of current IRSs and at facilitating the exploration of large document collections. To do so, it proposes a solution based on a new concept and model of indexing and information retrieval called multi-space projection. This proposal exploits different categories of semantic and social information, which enrich the universe of document representation and search queries with several dimensions of interpretation. The originality of this representation is that it can distinguish between the different interpretations used to describe and to search for documents. This gives better visibility of the returned results and allows greater flexibility of search and exploration, letting users navigate the view or views of the data that interest them most. In addition, the proposed multidimensional representation universes for document description and query interpretation improve the relevance of a user's results by offering a diversity of search and exploration that helps meet both their varied needs and those of other users. The study draws on several aspects of personalized search and aims to address the problems caused by evolving user information needs. Thus, when the user's profile is used by the system, a technique identifies the interests in that profile most representative of current needs, combining three influential factors: the contextual, frequency, and temporal factors of the data. Finally, the ability of users to interact, exchange ideas and opinions, and form social networks on the Web has led systems to take into account the interactions between users as well as their social roles in the system. This social information is discussed and integrated into the work, and its impact on, and integration into, the IR process are studied in order to improve the relevance of the results.

Languages

  • e 118
  • d 5
  • f 1

Types

  • a 112
  • el 11
  • m 7
  • x 2