Search (81 results, page 1 of 5)

  • × language_ss:"e"
  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  • × year_i:[2010 TO 2020}
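  The three filter chips above are Solr/Lucene field filters: language_ss and theme_ss are string facet fields, and year_i:[2010 TO 2020} is a range with an inclusive lower and an exclusive upper bound (the closing brace is Solr syntax, not a typo). As a rough sketch, a page like this one could be requested from a Solr endpoint as below; the host, core name, and use of the requests library are assumptions, while the field names, values, and the 20-results-per-page layout come from this page.

    import requests  # assumed HTTP client; any client would do

    # Hypothetical endpoint; the core name "literature" is an assumption.
    SOLR_URL = "http://localhost:8983/solr/literature/select"

    params = {
        "q": "*:*",            # the free-text query itself is not shown on this page
        "fq": [                # one filter query per active facet chip
            'language_ss:"e"',
            'theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"',
            "year_i:[2010 TO 2020}",   # inclusive lower, exclusive upper bound
        ],
        "rows": 20,            # 20 results per page, so 81 results span 5 pages
        "start": 0,            # offset 0 = page 1
        "debugQuery": "true",  # asks Solr for the score explanations shown below
        "wt": "json",
    }

    response = requests.get(SOLR_URL, params=params)
    print(response.json()["response"]["numFound"])  # expected: 81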
  1. Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.03
    0.029486848 = product of:
      0.07371712 = sum of:
        0.05623238 = weight(_text_:philosophy in 535) [ClassicSimilarity], result of:
          0.05623238 = score(doc=535,freq=2.0), product of:
            0.23055021 = queryWeight, product of:
              5.5189433 = idf(docFreq=481, maxDocs=44218)
              0.04177434 = queryNorm
            0.24390514 = fieldWeight in 535, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5189433 = idf(docFreq=481, maxDocs=44218)
              0.03125 = fieldNorm(doc=535)
        0.017484732 = weight(_text_:of in 535) [ClassicSimilarity], result of:
          0.017484732 = score(doc=535,freq=30.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.26765788 = fieldWeight in 535, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=535)
      0.4 = coord(2/5)
    
    Abstract
    Literature-based discovery (LBD) research aims at finding effective computational methods for predicting previously unknown connections between clusters of research papers from disparate research areas. Existing methods encompass two general approaches. The first approach searches for these unknown connections by examining the textual contents of research papers. In addition to the existing textual features, the second approach incorporates structural features of the scientific literature, such as citation structures. These approaches, however, have not considered research papers' latent bibliographic metadata structures as important features that can be used for predicting previously unknown relationships between them. This thesis investigates a new graph-based LBD method that exploits the latent bibliographic metadata connections between pairs of research papers. The heterogeneous bibliographic information network is proposed as an efficient graph-based data structure for modeling the complex relationships between these metadata. In contrast to previous approaches, this method seamlessly combines textual and citation information in the form of path-based metadata features for predicting future co-citation links between research papers from disparate research fields. The results reported in this thesis provide evidence that the method is effective for reconstructing historical literature-based discovery hypotheses. This thesis also investigates the effects of semantic modeling and topic modeling on the performance of the proposed method. For semantic modeling, a general-purpose word sense disambiguation technique is proposed to reduce the lexical ambiguity in the title and abstract of research papers. The experimental results suggest that the reduced lexical ambiguity did not necessarily lead to better performance of the method. This thesis discusses some of the possible contributing factors to these results. Finally, topic modeling is used for learning the latent topical relations between research papers. The learned topic model is incorporated into the heterogeneous bibliographic information network graph and allows new predictive features to be learned. The results in this thesis suggest that topic modeling improves the performance of the proposed method by increasing the overall accuracy for predicting the future co-citation links between disparate research papers.
    Footnote
    A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy, Monash University, Faculty of Information Technology.
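    The indented scoring tree attached to this and the following records is Lucene's explain output for ClassicSimilarity (TF-IDF): each leaf term weight is queryWeight × fieldWeight, with queryWeight = idf × queryNorm and fieldWeight = sqrt(termFreq) × idf × fieldNorm; for result 1 the two leaf weights are summed and multiplied by the coordination factor coord(2/5), i.e. 2 of 5 query clauses matched. A minimal sketch reproducing result 1's headline score from the numbers above (plain arithmetic, no Lucene dependency assumed):

      import math

      QUERY_NORM = 0.04177434  # queryNorm shared by every term of this query

      def term_score(freq, idf, field_norm):
          """One leaf of the explain tree: queryWeight * fieldWeight."""
          tf = math.sqrt(freq)                  # ClassicSimilarity: tf = sqrt(termFreq)
          query_weight = idf * QUERY_NORM       # e.g. 5.5189433 * 0.04177434 ~ 0.23055021
          field_weight = tf * idf * field_norm  # e.g. 1.4142135 * 5.5189433 * 0.03125 ~ 0.24390514
          return query_weight * field_weight

      # The two matching leaves of doc 535 (result 1): _text_:philosophy and _text_:of
      philosophy = term_score(freq=2.0, idf=5.5189433, field_norm=0.03125)  # ~0.05623238
      of_term = term_score(freq=30.0, idf=1.5637573, field_norm=0.03125)    # ~0.017484732

      coord = 2 / 5  # only 2 of the 5 query clauses matched this document
      print((philosophy + of_term) * coord)  # ~0.029486848, the headline score above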
  2. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.02
    0.024185965 = product of:
      0.04030994 = sum of:
        0.008315044 = product of:
          0.041575223 = sum of:
            0.041575223 = weight(_text_:problem in 1343) [ClassicSimilarity], result of:
              0.041575223 = score(doc=1343,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.23447686 = fieldWeight in 1343, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1343)
          0.2 = coord(1/5)
        0.017845279 = weight(_text_:of in 1343) [ClassicSimilarity], result of:
          0.017845279 = score(doc=1343,freq=20.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.27317715 = fieldWeight in 1343, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1343)
        0.0141496165 = product of:
          0.028299233 = sum of:
            0.028299233 = weight(_text_:22 in 1343) [ClassicSimilarity], result of:
              0.028299233 = score(doc=1343,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.19345059 = fieldWeight in 1343, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1343)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Abstract
    A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. Recently, methods that exploit named entities have been shown to be more effective for query expansion than traditional pseudo-relevance feedback methods. In this article, we introduce a supervised learning approach that exploits named entities for query expansion using Wikipedia as a repository of high-quality feedback documents. In contrast with existing entity-oriented pseudo-relevance feedback approaches, we tackle query expansion as a learning-to-rank problem. As a result, not only do we select effective expansion terms, but we also weigh these terms according to their predicted effectiveness. To this end, we exploit the rich structure of Wikipedia articles to devise discriminative term features, including each candidate term's proximity to the original query terms, as well as its frequency across multiple article fields and in category and infobox descriptors. Experiments on three Text REtrieval Conference web test collections attest the effectiveness of our approach, with gains of up to 23.32% in terms of mean average precision, 19.49% in terms of precision at 10, and 7.86% in terms of normalized discounted cumulative gain compared with a state-of-the-art approach for entity-oriented query expansion.
    Date
    22. 8.2014 17:07:50
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.9, S.1870-1883
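    The approach summarized above treats candidate expansion terms drawn from Wikipedia as instances to be ranked by a supervised model, so each selected term also receives a weight from its predicted effectiveness. A toy sketch of that general idea follows; the three features, the training data, and the use of scikit-learn's LogisticRegression are illustrative assumptions, not the authors' actual feature set or learner.

      from sklearn.linear_model import LogisticRegression

      # Illustrative features per candidate term: (proximity to the original query
      # terms, frequency in the article body, frequency in category/infobox fields).
      X_train = [
          [0.9, 12.0, 3.0],   # a term that proved effective for past training queries
          [0.1, 1.0, 0.0],    # a term that did not
          [0.7, 8.0, 2.0],
          [0.2, 2.0, 0.0],
      ]
      y_train = [1, 0, 1, 0]  # 1 = effective expansion term, 0 = ineffective

      model = LogisticRegression().fit(X_train, y_train)

      # Candidate expansion terms for a new query, with the same feature layout.
      candidates = {"retrieval": [0.8, 10.0, 2.0], "banana": [0.1, 1.0, 0.0]}
      weights = {t: model.predict_proba([f])[0][1] for t, f in candidates.items()}

      # Keep the top-weighted terms and reuse the predicted probability as the
      # expansion weight when rewriting the query.
      expansion = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:1]
      print(expansion)  # e.g. [('retrieval', 0.9...)]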
  3. Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.02
    0.015834233 = product of:
      0.03958558 = sum of:
        0.011286346 = weight(_text_:of in 2754) [ClassicSimilarity], result of:
          0.011286346 = score(doc=2754,freq=2.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.17277241 = fieldWeight in 2754, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=2754)
        0.028299233 = product of:
          0.056598466 = sum of:
            0.056598466 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
              0.056598466 = score(doc=2754,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.38690117 = fieldWeight in 2754, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2754)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Date
    1. 2.2016 18:25:22
  4. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.02
    0.015834233 = product of:
      0.03958558 = sum of:
        0.011286346 = weight(_text_:of in 3280) [ClassicSimilarity], result of:
          0.011286346 = score(doc=3280,freq=2.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.17277241 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
        0.028299233 = product of:
          0.056598466 = sum of:
            0.056598466 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
              0.056598466 = score(doc=3280,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.38690117 = fieldWeight in 3280, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3280)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  5. Mlodzka-Stybel, A.: Towards continuous improvement of users' access to a library catalogue (2014) 0.02
    0.015664605 = product of:
      0.03916151 = sum of:
        0.01935205 = weight(_text_:of in 1466) [ClassicSimilarity], result of:
          0.01935205 = score(doc=1466,freq=12.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.29624295 = fieldWeight in 1466, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1466)
        0.019809462 = product of:
          0.039618924 = sum of:
            0.039618924 = weight(_text_:22 in 1466) [ClassicSimilarity], result of:
              0.039618924 = score(doc=1466,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.2708308 = fieldWeight in 1466, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1466)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The paper discusses increasing users' access to library records by publishing them in Google. Data from the records, converted into HTML format, have been indexed by Google. The process covered the basic formal description fields of the records and the description of the content, supported by a thesaurus, as well as an abstract, if present in the record. In addition to monitoring end-user statistics, the pilot testing covered the visibility of library records in Google search results.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  6. Blanco, R.; Matthews, M.; Mika, P.: Ranking of daily deals with concept expansion (2015) 0.01
    0.012556955 = product of:
      0.031392388 = sum of:
        0.009978054 = product of:
          0.04989027 = sum of:
            0.04989027 = weight(_text_:problem in 2663) [ClassicSimilarity], result of:
              0.04989027 = score(doc=2663,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.28137225 = fieldWeight in 2663, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2663)
          0.2 = coord(1/5)
        0.021414334 = weight(_text_:of in 2663) [ClassicSimilarity], result of:
          0.021414334 = score(doc=2663,freq=20.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.32781258 = fieldWeight in 2663, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2663)
      0.4 = coord(2/5)
    
    Abstract
    Daily deals have emerged in the last three years as a successful form of online advertising. The downside of this success is that users are increasingly overloaded by the many thousands of deals offered each day by dozens of deal providers and aggregators. The challenge is thus offering the right deals to the right users, i.e., the relevance ranking of deals. This is the problem we address in our paper. Exploiting the characteristics of deals data, we propose a combination of a term- and a concept-based retrieval model that closes the semantic gap between queries and documents by expanding both of them with category information. The method consistently outperforms state-of-the-art methods based on term matching alone and existing approaches for ad classification and ranking.
  7. Vechtomova, O.; Robertson, S.E.: ¬A domain-independent approach to finding related entities (2012) 0.01
    0.01211739 = product of:
      0.030293474 = sum of:
        0.009978054 = product of:
          0.04989027 = sum of:
            0.04989027 = weight(_text_:problem in 2733) [ClassicSimilarity], result of:
              0.04989027 = score(doc=2733,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.28137225 = fieldWeight in 2733, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2733)
          0.2 = coord(1/5)
        0.02031542 = weight(_text_:of in 2733) [ClassicSimilarity], result of:
          0.02031542 = score(doc=2733,freq=18.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.3109903 = fieldWeight in 2733, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2733)
      0.4 = coord(2/5)
    
    Abstract
    We propose an approach to the retrieval of entities that have a specific relationship with the entity given in a query. Our research goal is to investigate whether the related entity finding problem can be addressed by combining a measure of relatedness of candidate answer entities to the query with the likelihood that the candidate answer entity belongs to the target entity category specified in the query. An initial list of candidate entities, extracted from top-ranked documents retrieved for the query, is refined using a number of statistical and linguistic methods. The proposed method extracts the category of the target entity from the query, identifies instances of this category as seed entities, and computes similarity between candidate and seed entities. The evaluation was conducted on the Related Entity Finding task of the Entity Track of TREC 2010, as well as the QA list questions from TREC 2005 and 2006. Evaluation results demonstrate that the proposed methods are effective in finding related entities.
  8. Ferreira, R.S.; Graça Pimentel, M. de; Cristo, M.: ¬A wikification prediction model based on the combination of latent, dyadic, and monadic features (2018) 0.01
    0.011771946 = product of:
      0.029429864 = sum of:
        0.008315044 = product of:
          0.041575223 = sum of:
            0.041575223 = weight(_text_:problem in 4119) [ClassicSimilarity], result of:
              0.041575223 = score(doc=4119,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.23447686 = fieldWeight in 4119, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4119)
          0.2 = coord(1/5)
        0.02111482 = weight(_text_:of in 4119) [ClassicSimilarity], result of:
          0.02111482 = score(doc=4119,freq=28.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.32322758 = fieldWeight in 4119, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4119)
      0.4 = coord(2/5)
    
    Abstract
    Considering repositories of web documents that are semantically linked and created in a collaborative fashion, as in the case of Wikipedia, a key problem faced by content providers is the placement of links in the articles. These links must support user navigation and provide a deeper semantic interpretation of the content. Current wikification methods exploit machine learning techniques to capture characteristics of the concepts and their associations. In previous work, we proposed a preliminary prediction model combining traditional predictors with a latent component which captures the concept graph topology by means of matrix factorization. In this work, we provide a detailed description of our method and a deeper comparison with a state-of-the-art wikification method using a sample of Wikipedia, and report a gain of up to 13% in F1 score. We also provide a comprehensive analysis of the model performance, showing the importance of the latent predictor component and the attributes derived from the associations between the concepts. Moreover, we include an analysis that allows us to conclude that the model is resilient to ambiguity without including a disambiguation phase. We finally report the positive impact of selecting training samples from specific content quality classes.
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.3, S.380-394
  9. Vo, D.-T.; Bagheri, E.: Feature-enriched matrix factorization for relation extraction (2019) 0.01
    0.011577157 = product of:
      0.028942892 = sum of:
        0.0133040715 = product of:
          0.066520356 = sum of:
            0.066520356 = weight(_text_:problem in 5105) [ClassicSimilarity], result of:
              0.066520356 = score(doc=5105,freq=8.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.375163 = fieldWeight in 5105, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5105)
          0.2 = coord(1/5)
        0.01563882 = weight(_text_:of in 5105) [ClassicSimilarity], result of:
          0.01563882 = score(doc=5105,freq=24.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.23940048 = fieldWeight in 5105, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=5105)
      0.4 = coord(2/5)
    
    Abstract
    Relation extraction aims at finding meaningful relationships between two named entities from within unstructured textual content. In this paper, we define the problem of information extraction as a matrix completion problem where we employ the notion of universal schemas formed as a collection of patterns derived from open information extraction systems, as well as additional features derived from grammatical clause patterns and statistical topic models. One of the challenges with earlier work that employs matrix completion methods is that such approaches require a sufficient number of observed relation instances to be able to make predictions. However, in practice there is often an insufficient amount of explicit evidence supporting each relation type that could be used within the matrix model. Hence, existing work suffers from low recall. In our work, we extend the state of the art by proposing novel ways of integrating two sets of features, i.e., topic models and grammatical clause structures, for alleviating the low-recall problem. More specifically, we propose that it is possible to (1) employ grammatical clause information from textual sentences to serve as an implicit indication of relation type and argument similarity. The basis for this is that similar relation types and arguments are likely to be observed within similar grammatical structures, and (2) benefit from statistical topic models to determine similarity between relation types and arguments. We employ statistical topic models to determine relation type and argument similarity based on their co-occurrence within the same topics. We have performed extensive experiments based on both gold standard and silver standard datasets. The experiments show that our approach is able to address the low-recall problem in existing methods, showing an improvement of 21% in recall and 8% in F-measure over the state-of-the-art baseline.
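    The abstract above casts relation extraction as completing a sparse matrix whose rows are entity pairs and whose columns are the relation types and textual patterns of a universal schema; unobserved cells receive scores from the learned low-rank structure. A compact sketch of that idea using a plain truncated SVD as the completion step (the toy entity pairs, patterns, and the choice of SVD rather than the authors' feature-enriched factorization are assumptions):

      import numpy as np

      # Rows: entity pairs; columns: relation types / textual patterns (universal schema).
      # 1.0 = observed for that pair, 0.0 = unobserved and to be scored.
      pairs = ["(Einstein, Ulm)", "(Curie, Warsaw)", "(Turing, London)"]
      patterns = ["born_in", "X was born in Y", "lived_in"]
      M = np.array([
          [1.0, 1.0, 0.0],
          [0.0, 1.0, 1.0],
          [1.0, 0.0, 0.0],
      ])

      # Rank-2 reconstruction: the low-rank structure transfers evidence between
      # correlated columns and yields scores for the empty cells.
      U, s, Vt = np.linalg.svd(M, full_matrices=False)
      k = 2
      M_hat = (U[:, :k] * s[:k]) @ Vt[:k, :]

      for i, pair in enumerate(pairs):
          for j, rel in enumerate(patterns):
              if M[i, j] == 0.0:
                  print(f"{pair:18s} {rel:16s} predicted score {M_hat[i, j]:+.2f}")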
  10. Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.01
    0.011202767 = product of:
      0.028006917 = sum of:
        0.018102186 = weight(_text_:of in 1633) [ClassicSimilarity], result of:
          0.018102186 = score(doc=1633,freq=42.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.2771099 = fieldWeight in 1633, product of:
              6.4807405 = tf(freq=42.0), with freq of:
                42.0 = termFreq=42.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1633)
        0.009904731 = product of:
          0.019809462 = sum of:
            0.019809462 = weight(_text_:22 in 1633) [ClassicSimilarity], result of:
              0.019809462 = score(doc=1633,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.1354154 = fieldWeight in 1633, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1633)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Purpose - The purpose of this paper is to improve concept-based search by incorporating structural ontological information such as concepts and relations. Generally, semantic information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms, and its performance is measured through the standard measures of precision and recall: higher precision means more of the retrieved documents are (meaningfully) relevant, while lower recall means less coverage of the concepts. Design/methodology/approach - In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al. by incorporating sibling information into the index. The index designed by Kohler et al. contains only super- and sub-concepts from the ontology. In addition, in our approach, we focus on two tasks, query expansion and ranking of the expanded queries, to improve the efficiency of the ontology-based search. These tasks make use of ontological concepts and the relations existing between those concepts so as to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of the concepts populated in the index. Here, we introduce a new measure, called the index enhancement measure, to estimate the coverage of ontological concepts being indexed. We have evaluated the ontology-based search for the tourism domain with tourism documents and a tourism-specific ontology. Search results based on the use of the ontology with and without query expansion are compared to estimate the efficiency of the proposed query expansion task, and the ranking is compared with the ORank system to evaluate the performance of our ontology-based search. From these analyses, the ontology-based search shows better recall than the other concept-based search systems: its mean average precision is 0.79 and its recall 0.65, the ORank system has a mean average precision of 0.62 and a recall of 0.51, while the concept-based search has a mean average precision of 0.56 and a recall of 0.42. Practical implications - When a concept is not present in the domain-specific ontology, it cannot be indexed. When the given query term is not available in the ontology, term-based results are retrieved. Originality/value - In addition to super- and sub-concepts, we incorporate the concepts at the same level (siblings) into the ontological index. The structural information from the ontology is used for query expansion. The ranking of the documents depends on the type of the query (single-concept queries, multiple-concept queries and concept-with-relation queries) and the ontological relations that exist in the query and the documents. With this ontological structural information, the search results showed better coverage of concepts with respect to the query.
    Date
    20. 1.2015 18:30:22
    Source
    Aslib journal of information management. 66(2014) no.6, S.678-696
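    The enhancement described above adds sibling concepts (concepts sharing a parent) to the super-/sub-concept index of Kohler et al. and expands queries from that structure. A minimal sketch of the sibling expansion idea under an assumed toy ontology (the tourism concept names and the simple parent map are illustrative, not the authors' index or ranking):

      # Toy ontology: each concept maps to its parent concept; roots are omitted.
      parent = {
          "accommodation": "tourism",
          "hotel": "accommodation",
          "hostel": "accommodation",
          "campsite": "accommodation",
          "transport": "tourism",
      }

      def sub_concepts(c):
          return [x for x, p in parent.items() if p == c]

      def siblings(c):
          p = parent.get(c)
          return [x for x in sub_concepts(p) if x != c] if p else []

      def expand_query(concept):
          """Expand a single-concept query with its super-, sub- and sibling concepts."""
          expansion = []
          if concept in parent:
              expansion.append(parent[concept])     # super-concept
          expansion.extend(sub_concepts(concept))   # sub-concepts
          expansion.extend(siblings(concept))       # siblings: the proposed addition
          return [concept] + expansion

      print(expand_query("hotel"))
      # ['hotel', 'accommodation', 'hostel', 'campsite']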
  11. Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.01
    0.011145428 = product of:
      0.02786357 = sum of:
        0.008315044 = product of:
          0.041575223 = sum of:
            0.041575223 = weight(_text_:problem in 2299) [ClassicSimilarity], result of:
              0.041575223 = score(doc=2299,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.23447686 = fieldWeight in 2299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2299)
          0.2 = coord(1/5)
        0.019548526 = weight(_text_:of in 2299) [ClassicSimilarity], result of:
          0.019548526 = score(doc=2299,freq=24.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.2992506 = fieldWeight in 2299, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
      0.4 = coord(2/5)
    
    Abstract
    Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was already identified by Hugh of Saint Victor (1096?-1141). Even with present-day bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To address these limitations we have developed SciGator, a browsable interface for exploring the collections of all scientific libraries at the University of Pavia. Besides showing the subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in disciplinary classifications such as Dewey (e.g. architecture existentially depends on building), or between phenomena in phenomenon-based classifications such as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some of its details.
    Source
    Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
  12. Ahn, J.-w.; Brusilovsky, P.: Adaptive visualization for exploratory information retrieval (2013) 0.01
    0.011145428 = product of:
      0.02786357 = sum of:
        0.008315044 = product of:
          0.041575223 = sum of:
            0.041575223 = weight(_text_:problem in 2717) [ClassicSimilarity], result of:
              0.041575223 = score(doc=2717,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.23447686 = fieldWeight in 2717, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2717)
          0.2 = coord(1/5)
        0.019548526 = weight(_text_:of in 2717) [ClassicSimilarity], result of:
          0.019548526 = score(doc=2717,freq=24.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.2992506 = fieldWeight in 2717, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2717)
      0.4 = coord(2/5)
    
    Abstract
    As the volume and breadth of online information are rapidly increasing, ad hoc search systems become less and less effective at answering the information needs of modern users. To support the growing complexity of search tasks, researchers in the field of information retrieval developed and explored a range of approaches that extend the traditional ad hoc retrieval paradigm. Among these approaches, personalized search systems and exploratory search systems attracted many followers. Personalized search explored the power of artificial intelligence techniques to provide tailored search results according to different user interests, contexts, and tasks. In contrast, exploratory search capitalized on the power of human intelligence by providing users with more powerful interfaces to support the search process. As these approaches are not contradictory, we believe that they can reinforce each other. We argue that the effectiveness of personalized search systems may be increased by allowing users to interact with the system and learn/investigate the problem in order to reach the final goal. We also suggest that an interactive visualization approach could offer good ground for combining the strong sides of personalized and exploratory search approaches. This paper proposes a specific way to integrate interactive visualization and personalized search and introduces an adaptive visualization-based search system, Adaptive VIBE, that implements it. We tested the effectiveness of Adaptive VIBE and investigated its strengths and weaknesses by conducting a full-scale user study. The results show that Adaptive VIBE can improve the precision and the productivity of the personalized search system while helping users to discover more diverse sets of information.
  13. Salaba, A.; Zeng, M.L.: Extending the "Explore" user task beyond subject authority data into the linked data sphere (2014) 0.01
    0.011083962 = product of:
      0.027709905 = sum of:
        0.007900442 = weight(_text_:of in 1465) [ClassicSimilarity], result of:
          0.007900442 = score(doc=1465,freq=2.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.120940685 = fieldWeight in 1465, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1465)
        0.019809462 = product of:
          0.039618924 = sum of:
            0.039618924 = weight(_text_:22 in 1465) [ClassicSimilarity], result of:
              0.039618924 = score(doc=1465,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.2708308 = fieldWeight in 1465, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1465)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  14. Zeng, M.L.; Gracy, K.F.; Zumer, M.: Using a semantic analysis tool to generate subject access points : a study using Panofsky's theory and two research samples (2014) 0.01
    0.010622528 = product of:
      0.02655632 = sum of:
        0.009576782 = weight(_text_:of in 1464) [ClassicSimilarity], result of:
          0.009576782 = score(doc=1464,freq=4.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.14660224 = fieldWeight in 1464, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1464)
        0.016979538 = product of:
          0.033959076 = sum of:
            0.033959076 = weight(_text_:22 in 1464) [ClassicSimilarity], result of:
              0.033959076 = score(doc=1464,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.23214069 = fieldWeight in 1464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1464)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This paper explores an approach that uses an automatic semantic analysis tool to enhance "subject" access to materials that are not included in the usual library subject cataloging process. Using two research samples, the authors analyzed the access points supplied by OpenCalais, a semantic analysis tool. As an aid in understanding how computerized subject analysis might be approached, this paper suggests using the three-layer framework developed by Erwin Panofsky that has been accepted and applied in image analysis.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  15. Celik, I.; Abel, F.; Siehndel, P.: Adaptive faceted search on Twitter (2011) 0.01
    0.010429245 = product of:
      0.026073113 = sum of:
        0.0133040715 = product of:
          0.066520356 = sum of:
            0.066520356 = weight(_text_:problem in 2221) [ClassicSimilarity], result of:
              0.066520356 = score(doc=2221,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.375163 = fieldWeight in 2221, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2221)
          0.2 = coord(1/5)
        0.0127690425 = weight(_text_:of in 2221) [ClassicSimilarity], result of:
          0.0127690425 = score(doc=2221,freq=4.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.19546966 = fieldWeight in 2221, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=2221)
      0.4 = coord(2/5)
    
    Abstract
    In the last few years, Twitter has become a powerful tool for publishing and discussing information. Yet content exploration on Twitter requires substantial effort, and users often have to scan information streams by hand. In this paper, we approach this problem by means of faceted search. We propose strategies for inferring facets and facet values on Twitter by enriching the semantics of individual Twitter messages and present different methods, including personalized and context-adaptive methods, for making faceted search on Twitter more effective.
  16. Bando, L.L.; Scholer, F.; Turpin, A.: Query-biased summary generation assisted by query expansion : temporality (2015) 0.01
    0.010097824 = product of:
      0.02524456 = sum of:
        0.008315044 = product of:
          0.041575223 = sum of:
            0.041575223 = weight(_text_:problem in 1820) [ClassicSimilarity], result of:
              0.041575223 = score(doc=1820,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.23447686 = fieldWeight in 1820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1820)
          0.2 = coord(1/5)
        0.016929517 = weight(_text_:of in 1820) [ClassicSimilarity], result of:
          0.016929517 = score(doc=1820,freq=18.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.25915858 = fieldWeight in 1820, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1820)
      0.4 = coord(2/5)
    
    Abstract
    Query-biased summaries help users to identify which items returned by a search system should be read in full. In this article, we study the generation of query-biased summaries as a sentence ranking approach, and methods to evaluate their effectiveness. Using sentence-level relevance assessments from the TREC Novelty track, we gauge the benefits of query expansion to minimize the vocabulary mismatch problem between informational requests and sentence ranking methods. Our results from an intrinsic evaluation show that query expansion significantly improves the selection of short relevant sentences (5-13 words) by between 7% and 11%. However, query expansion does not lead to improvements for sentences of medium (14-20 words) and long (21-29 words) lengths. In a separate crowdsourcing study, rather than assessing sentences individually, we analyze whether a summary composed of sentences ranked using query expansion was preferred over summaries not assisted by query expansion. We found that participants chose summaries aided by query expansion around 60% of the time over summaries using an unexpanded query. We conclude that query expansion techniques can benefit the selection of sentences for the construction of query-biased summaries at the summary level rather than at the sentence ranking level.
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.5, S.961-979
  17. Vidinli, I.B.; Ozcan, R.: New query suggestion framework and algorithms : a case study for an educational search engine (2016) 0.01
    0.010048111 = product of:
      0.025120277 = sum of:
        0.009978054 = product of:
          0.04989027 = sum of:
            0.04989027 = weight(_text_:problem in 3185) [ClassicSimilarity], result of:
              0.04989027 = score(doc=3185,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.28137225 = fieldWeight in 3185, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3185)
          0.2 = coord(1/5)
        0.015142222 = weight(_text_:of in 3185) [ClassicSimilarity], result of:
          0.015142222 = score(doc=3185,freq=10.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.23179851 = fieldWeight in 3185, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3185)
      0.4 = coord(2/5)
    
    Abstract
    Query suggestion is generally an integrated part of web search engines. In this study, we first redefine and reduce the query suggestion problem to a "comparison of queries". We then propose a general modular framework for query suggestion algorithm development. We also develop new query suggestion algorithms, used in our proposed framework, that exploit query, session and user features. As a case study, we use the query logs of a real educational search engine that targets K-12 students in Turkey, and we also exploit educational features (course, grade) in our query suggestion algorithms. We test our framework and algorithms over a set of queries in an experiment and demonstrate a statistically significant increase of 66-90% in the relevance of query suggestions compared to a baseline method.
  18. Brunetti, J.M.; Roberto García, R.: User-centered design and evaluation of overview components for semantic data exploration (2014) 0.01
    0.009945323 = product of:
      0.024863306 = sum of:
        0.013543614 = weight(_text_:of in 1626) [ClassicSimilarity], result of:
          0.013543614 = score(doc=1626,freq=18.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.20732687 = fieldWeight in 1626, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=1626)
        0.011319693 = product of:
          0.022639386 = sum of:
            0.022639386 = weight(_text_:22 in 1626) [ClassicSimilarity], result of:
              0.022639386 = score(doc=1626,freq=2.0), product of:
                0.14628662 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04177434 = queryNorm
                0.15476047 = fieldWeight in 1626, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1626)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Purpose - The growing volumes of semantic data available on the web result in the need to handle the information overload phenomenon. The potential of this amount of data is enormous, but in most cases it is very difficult for users to visualize, explore and use this data, especially for lay-users without experience with Semantic Web technologies. The paper aims to discuss these issues. Design/methodology/approach - The Visual Information-Seeking Mantra "Overview first, zoom and filter, then details-on-demand" proposed by Shneiderman describes how data should be presented in different stages to achieve an effective exploration. The overview is the first user task when dealing with a data set; the objective is that the user is capable of getting an idea of the overall structure of the data set. Different information architecture (IA) components supporting the overview task have been developed, such that they are automatically generated from semantic data, and they have been evaluated with end users. Findings - The chosen IA components are well known to web users, as they are present in most web pages: navigation bars, site maps and site indexes. The authors complement them with Treemaps, a visualization technique for displaying hierarchical data. These components have been developed following an iterative User-Centered Design methodology. Evaluations with end users have shown that they easily get used to them despite the fact that they are generated automatically from structured data, without requiring knowledge about the underlying semantic technologies, and that the different overview components complement each other as they focus on different information search needs. Originality/value - Overviews of semantic data sets cannot easily be obtained with current semantic web browsers. Overviews become difficult to achieve with large heterogeneous data sets, which are typical in the Semantic Web, because traditional IA techniques do not easily scale to large data sets. There is little or no support for obtaining overview information quickly and easily at the beginning of the exploration of a new data set. This can be a serious limitation when exploring a data set for the first time, especially for lay-users. The proposal is to reuse and adapt existing IA components to provide this overview to users, and to show that they can be generated automatically from the thesauri and ontologies that structure semantic data while providing a user experience comparable to traditional web sites.
    Date
    20. 1.2015 18:30:22
    Source
    Aslib journal of information management. 66(2014) no.5, S.519-536
  19. Liu, X.; Zheng, W.; Fang, H.: ¬An exploration of ranking models and feedback method for related entity finding (2013) 0.01
    0.009751107 = product of:
      0.024377767 = sum of:
        0.0117592495 = product of:
          0.058796246 = sum of:
            0.058796246 = weight(_text_:problem in 2714) [ClassicSimilarity], result of:
              0.058796246 = score(doc=2714,freq=4.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.33160037 = fieldWeight in 2714, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2714)
          0.2 = coord(1/5)
        0.012618518 = weight(_text_:of in 2714) [ClassicSimilarity], result of:
          0.012618518 = score(doc=2714,freq=10.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.19316542 = fieldWeight in 2714, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2714)
      0.4 = coord(2/5)
    
    Abstract
    Most existing search engines focus on document retrieval. However, information needs are certainly not limited to finding relevant documents. Instead, a user may want to find relevant entities such as persons and organizations. In this paper, we study the problem of related entity finding. Our goal is to rank entities based on their relevance to a structured query, which specifies an input entity, the type of related entities and the relation between the input and related entities. We first discuss a general probabilistic framework, derive six possible retrieval models to rank the related entities, and then compare these models both analytically and empirically. To further improve performance, we study the problem of feedback in the context of related entity finding. Specifically, we propose a mixture model based feedback method that can utilize the pseudo feedback entities to estimate an enriched model for the relation between the input and related entities. Experimental results over two standard TREC collections show that the derived relation generation model combined with a relation feedback method performs better than other models.
  20. Hazrina, S.; Sharef, N.M.; Ibrahim, H.; Murad, M.A.A.; Noah, S.A.M.: Review on the advancements of disambiguation in semantic question answering system (2017) 0.01
    0.009654707 = product of:
      0.024136767 = sum of:
        0.0066520358 = product of:
          0.033260178 = sum of:
            0.033260178 = weight(_text_:problem in 3292) [ClassicSimilarity], result of:
              0.033260178 = score(doc=3292,freq=2.0), product of:
                0.17731056 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.04177434 = queryNorm
                0.1875815 = fieldWeight in 3292, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3292)
          0.2 = coord(1/5)
        0.017484732 = weight(_text_:of in 3292) [ClassicSimilarity], result of:
          0.017484732 = score(doc=3292,freq=30.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.26765788 = fieldWeight in 3292, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=3292)
      0.4 = coord(2/5)
    
    Abstract
    Ambiguity is a potential problem in any semantic question answering (SQA) system due to the idiosyncratic nature of composing natural language (NL) questions and semantic resources. Thus, disambiguation in SQA systems is a field of ongoing research. Ambiguity occurs in SQA because a word or a sentence can have more than one meaning, or multiple words in the same language can share the same meaning. Therefore, an SQA system needs disambiguation solutions to select the correct meaning when the linguistic triples match multiple KB concepts, and to enumerate similar words, especially when the linguistic triples do not match any KB concept. The latest development in this field is a solution for SQA systems that is able to process a complex NL question while accessing open-domain data from linked open data (LOD). The contributions in this paper include (1) formulating an SQA conceptual framework based on an in-depth study of existing SQA processes; (2) identifying the ambiguity types, specifically in English, based on an interdisciplinary literature review; (3) highlighting the ambiguity types that have been resolved by previous SQA studies; and (4) analysing the results of existing SQA disambiguation solutions, the complexity of NL question processing, and the complexity of data retrieval from KB(s) or LOD. The results of this review demonstrate that, out of thirteen types of ambiguity identified in the literature, only six types have been successfully resolved by previous studies. Efforts to improve disambiguation for the remaining unresolved ambiguity types are in progress, in order to improve the accuracy of the answers formulated by the SQA system. The remaining ambiguity types could potentially be resolved in the identified SQA process based on the ambiguity scenarios elaborated in this paper. The results of this review also demonstrate that most existing research on SQA systems has treated the processing of NL question complexity separately from the processing of KB structure complexity.

Types

  • a 71
  • el 9
  • m 8
  • s 1
  • x 1