Search (34 results, page 1 of 2)

  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  • × year_i:[2010 TO 2020}
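
Note on the score breakdowns: the explain tree under each hit can be recomputed by hand. The sketch below is a minimal illustration, not the search engine's own code. It assumes the classic Lucene TF-IDF formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the numbers printed in this listing. The function names are hypothetical, and the input values are copied from the first hit (doc 2733).

```python
import math

# Constants copied from the explain output of this listing.
QUERY_NORM = 0.030822188
MAX_DOCS = 44218


def idf(doc_freq, max_docs):
    # Assumed classic formula; matches idf(docFreq=1723) = 4.244485 in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Square root of the raw term frequency, e.g. tf(2.0) = 1.4142135.
    return math.sqrt(freq)


def term_score(freq, doc_freq, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# First hit (doc 2733): terms "problem" and "29", fieldNorm 0.046875.
score_problem = term_score(2.0, 1723, 0.046875)
score_29 = term_score(2.0, 3565, 0.046875)

# Each term sits in a sub-query where 1 of 3 clauses matched: coord(1/3).
# The outer query matched 2 of 8 clauses: coord(2/8).
total = (score_problem / 3 + score_29 / 3) * (2 / 8)

print(f"problem: {score_problem:.6f}")
print(f"29:      {score_29:.6f}")
print(f"total:   {total:.7f}")
```

Lucene computes these values in 32-bit floats, so a recomputation in double precision agrees with the listing only to about six significant digits. The same arithmetic also shows why hits 6 and 7 receive identical scores: a higher term frequency (8.0 versus 2.0) is offset exactly by a smaller fieldNorm (0.03125 versus 0.0625).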
  1. Vechtomova, O.; Robertson, S.E.: ¬A domain-independent approach to finding related entities (2012) 0.01
    0.0051744715 = product of:
      0.020697886 = sum of:
        0.012270111 = product of:
          0.03681033 = sum of:
            0.03681033 = weight(_text_:problem in 2733) [ClassicSimilarity], result of:
              0.03681033 = score(doc=2733,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.28137225 = fieldWeight in 2733, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2733)
          0.33333334 = coord(1/3)
        0.008427775 = product of:
          0.025283325 = sum of:
            0.025283325 = weight(_text_:29 in 2733) [ClassicSimilarity], result of:
              0.025283325 = score(doc=2733,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23319192 = fieldWeight in 2733, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2733)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    We propose an approach to the retrieval of entities that have a specific relationship with the entity given in a query. Our research goal is to investigate whether related entity finding problem can be addressed by combining a measure of relatedness of candidate answer entities to the query, and likelihood that the candidate answer entity belongs to the target entity category specified in the query. An initial list of candidate entities, extracted from top ranked documents retrieved for the query, is refined using a number of statistical and linguistic methods. The proposed method extracts the category of the target entity from the query, identifies instances of this category as seed entities, and computes similarity between candidate and seed entities. The evaluation was conducted on the Related Entity Finding task of the Entity Track of TREC 2010, as well as the QA list questions from TREC 2005 and 2006. Evaluation results demonstrate that the proposed methods are effective in finding related entities.
    Date
    27. 1.2016 18:44:29
  2. Bando, L.L.; Scholer, F.; Turpin, A.: Query-biased summary generation assisted by query expansion : temporality (2015) 0.00
    0.0043120594 = product of:
      0.017248238 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 1820) [ClassicSimilarity], result of:
              0.030675275 = score(doc=1820,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 1820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1820)
          0.33333334 = coord(1/3)
        0.007023146 = product of:
          0.021069437 = sum of:
            0.021069437 = weight(_text_:29 in 1820) [ClassicSimilarity], result of:
              0.021069437 = score(doc=1820,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.19432661 = fieldWeight in 1820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1820)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    Query-biased summaries help users to identify which items returned by a search system should be read in full. In this article, we study the generation of query-biased summaries as a sentence ranking approach, and methods to evaluate their effectiveness. Using sentence-level relevance assessments from the TREC Novelty track, we gauge the benefits of query expansion to minimize the vocabulary mismatch problem between informational requests and sentence ranking methods. Our results from an intrinsic evaluation show that query expansion significantly improves the selection of short relevant sentences (5-13 words) between 7% and 11%. However, query expansion does not lead to improvements for sentences of medium (14-20 words) and long (21-29 words) lengths. In a separate crowdsourcing study, we analyze whether a summary composed of sentences ranked using query expansion was preferred over summaries not assisted by query expansion, rather than assessing sentences individually. We found that participants chose summaries aided by query expansion around 60% of the time over summaries using an unexpanded query. We conclude that query expansion techniques can benefit the selection of sentences for the construction of query-biased summaries at the summary level rather than at the sentence ranking level.
  3. Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.00
    0.0043120594 = product of:
      0.017248238 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 2299) [ClassicSimilarity], result of:
              0.030675275 = score(doc=2299,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 2299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2299)
          0.33333334 = coord(1/3)
        0.007023146 = product of:
          0.021069437 = sum of:
            0.021069437 = weight(_text_:29 in 2299) [ClassicSimilarity], result of:
              0.021069437 = score(doc=2299,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.19432661 = fieldWeight in 2299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2299)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was identified already by Hugh of Saint Victor (1096?-1141). Still with present-time bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To face these limitations we have developed SciGator, a browsable interface to explore the collections of all scientific libraries at the University of Pavia. Besides showing subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by using a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in such disciplinary classifications as Dewey (e.g. architecture existentially depends on building), or between phenomena in such phenomenon-based classifications as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some details of it.
    Source
    Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. and M.I. Cordeiro
  4. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.00
    0.004296265 = product of:
      0.01718506 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 1343) [ClassicSimilarity], result of:
              0.030675275 = score(doc=1343,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 1343, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1343)
          0.33333334 = coord(1/3)
        0.0069599687 = product of:
          0.020879906 = sum of:
            0.020879906 = weight(_text_:22 in 1343) [ClassicSimilarity], result of:
              0.020879906 = score(doc=1343,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.19345059 = fieldWeight in 1343, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1343)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. Recently, methods that exploit named entities have been shown to be more effective for query expansion than traditional pseudorelevance feedback methods. In this article, we introduce a supervised learning approach that exploits named entities for query expansion using Wikipedia as a repository of high-quality feedback documents. In contrast with existing entity-oriented pseudorelevance feedback approaches, we tackle query expansion as a learning-to-rank problem. As a result, not only do we select effective expansion terms but we also weigh these terms according to their predicted effectiveness. To this end, we exploit the rich structure of Wikipedia articles to devise discriminative term features, including each candidate term's proximity to the original query terms, as well as its frequency across multiple article fields and in category and infobox descriptors. Experiments on three Text REtrieval Conference web test collections attest the effectiveness of our approach, with gains of up to 23.32% in terms of mean average precision, 19.49% in terms of precision at 10, and 7.86% in terms of normalized discounted cumulative gain compared with a state-of-the-art approach for entity-oriented query expansion.
    Date
    22. 8.2014 17:07:50
  5. Layfield, C.; Azzopardi, J.; Staff, C.: Experiments with document retrieval from small text collections using Latent Semantic Analysis or term similarity with query coordination and automatic relevance feedback (2017) 0.00
    0.0034496475 = product of:
      0.01379859 = sum of:
        0.008180073 = product of:
          0.02454022 = sum of:
            0.02454022 = weight(_text_:problem in 3478) [ClassicSimilarity], result of:
              0.02454022 = score(doc=3478,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.1875815 = fieldWeight in 3478, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3478)
          0.33333334 = coord(1/3)
        0.0056185164 = product of:
          0.016855549 = sum of:
            0.016855549 = weight(_text_:29 in 3478) [ClassicSimilarity], result of:
              0.016855549 = score(doc=3478,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.15546128 = fieldWeight in 3478, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3478)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    One of the problems faced by users of databases containing textual documents is the difficulty in retrieving relevant results due to the diverse vocabulary used in queries and contained in relevant documents, especially when there are only a small number of relevant documents. This problem is known as the Vocabulary Gap. The PIKES team have constructed a small test collection of 331 articles extracted from a blog and a Gold Standard for 35 queries selected from the blog's search log so the results of different approaches to semantic search can be compared. So far, prior approaches include recognising Named Entities in documents and queries, and relations including temporal relations, and represent them as `semantic layers' in a retrieval system index. In this work, we take two different approaches that do not involve Named Entity Recognition. In the first approach, we process an unannotated version of the PIKES document collection using Latent Semantic Analysis and use a combination of query coordination and automatic relevance feedback with which we outperform prior work. However, this approach is highly dependent on the underlying collection, and is not necessarily scalable to massive collections. In our second approach, we use an LSA Model generated by SEMILAR from a Wikipedia dump to generate a Term Similarity Matrix (TSM). We automatically expand the queries in the PIKES test collection with related terms from the TSM and submit them to a term-by-document matrix derived by indexing the PIKES collection using the Vector Space Model. Coupled with a combination of query coordination and automatic relevance feedback we also outperform prior work with this approach. The advantage of the second approach is that it is independent of the underlying document collection.
    Date
    10. 3.2017 13:29:57
  6. Celik, I.; Abel, F.; Siehndel, P.: Adaptive faceted search on Twitter (2011) 0.00
    0.0020450184 = product of:
      0.016360147 = sum of:
        0.016360147 = product of:
          0.04908044 = sum of:
            0.04908044 = weight(_text_:problem in 2221) [ClassicSimilarity], result of:
              0.04908044 = score(doc=2221,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.375163 = fieldWeight in 2221, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2221)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    In the last few years, Twitter has become a powerful tool for publishing and discussing information. Yet, content exploration in Twitter requires substantial efforts and users often have to scan information streams by hand. In this paper, we approach this problem by means of faceted search. We propose strategies for inferring facets and facet values on Twitter by enriching the semantics of individual Twitter messages and present different methods, including personalized and context-adaptive methods, for making faceted search on Twitter more effective.
  7. Vo, D.-T.; Bagheri, E.: Feature-enriched matrix factorization for relation extraction (2019) 0.00
    0.0020450184 = product of:
      0.016360147 = sum of:
        0.016360147 = product of:
          0.04908044 = sum of:
            0.04908044 = weight(_text_:problem in 5105) [ClassicSimilarity], result of:
              0.04908044 = score(doc=5105,freq=8.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.375163 = fieldWeight in 5105, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5105)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    Relation extraction aims at finding meaningful relationships between two named entities from within unstructured textual content. In this paper, we define the problem of information extraction as a matrix completion problem where we employ the notion of universal schemas formed as a collection of patterns derived from open information extraction systems as well as additional features derived from grammatical clause patterns and statistical topic models. One of the challenges with earlier work that employ matrix completion methods is that such approaches require a sufficient number of observed relation instances to be able to make predictions. However, in practice there is often insufficient number of explicit evidence supporting each relation type that could be used within the matrix model. Hence, existing work suffer from a low recall. In our work, we extend the work in the state of the art by proposing novel ways of integrating two sets of features, i.e., topic models and grammatical clause structures, for alleviating the low recall problem. More specifically, we propose that it is possible to (1) employ grammatical clause information from textual sentences to serve as an implicit indication of relation type and argument similarity. The basis for this is that it is likely that similar relation types and arguments are observed within similar grammatical structures, and (2) benefit from statistical topic models to determine similarity between relation types and arguments. We employ statistical topic models to determine relation type and argument similarity based on their co-occurrence within the same topics. We have performed extensive experiments based on both gold standard and silver standard datasets. The experiments show that our approach has been able to address the low recall problem in existing methods, by showing an improvement of 21% on recall and 8% on f-measure over the state of the art baseline.
  8. Liu, X.; Zheng, W.; Fang, H.: ¬An exploration of ranking models and feedback method for related entity finding (2013) 0.00
    0.0018075579 = product of:
      0.014460463 = sum of:
        0.014460463 = product of:
          0.04338139 = sum of:
            0.04338139 = weight(_text_:problem in 2714) [ClassicSimilarity], result of:
              0.04338139 = score(doc=2714,freq=4.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.33160037 = fieldWeight in 2714, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2714)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    Most existing search engines focus on document retrieval. However, information needs are certainly not limited to finding relevant documents. Instead, a user may want to find relevant entities such as persons and organizations. In this paper, we study the problem of related entity finding. Our goal is to rank entities based on their relevance to a structured query, which specifies an input entity, the type of related entities and the relation between the input and related entities. We first discuss a general probabilistic framework, derive six possible retrieval models to rank the related entities, and then compare these models both analytically and empirically. To further improve performance, we study the problem of feedback in the context of related entity finding. Specifically, we propose a mixture model based feedback method that can utilize the pseudo feedback entities to estimate an enriched model for the relation between the input and related entities. Experimental results over two standard TREC collections show that the derived relation generation model combined with a relation feedback method performs better than other models.
  9. Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.00
    0.0017399922 = product of:
      0.013919937 = sum of:
        0.013919937 = product of:
          0.04175981 = sum of:
            0.04175981 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
              0.04175981 = score(doc=2751,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.38690117 = fieldWeight in 2751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2751)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    1. 2.2016 18:25:22
  10. Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.00
    0.0017399922 = product of:
      0.013919937 = sum of:
        0.013919937 = product of:
          0.04175981 = sum of:
            0.04175981 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
              0.04175981 = score(doc=2754,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.38690117 = fieldWeight in 2754, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2754)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    1. 2.2016 18:25:22
  11. Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.00
    0.0017399922 = product of:
      0.013919937 = sum of:
        0.013919937 = product of:
          0.04175981 = sum of:
            0.04175981 = weight(_text_:22 in 3279) [ClassicSimilarity], result of:
              0.04175981 = score(doc=3279,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.38690117 = fieldWeight in 3279, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3279)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  12. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.00
    0.0017399922 = product of:
      0.013919937 = sum of:
        0.013919937 = product of:
          0.04175981 = sum of:
            0.04175981 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
              0.04175981 = score(doc=3280,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.38690117 = fieldWeight in 3280, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3280)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  13. Hannech, A.: Système de recherche d'information étendue basé sur une projection multi-espaces (2018) 0.00
    0.0017248237 = product of:
      0.006899295 = sum of:
        0.0040900367 = product of:
          0.01227011 = sum of:
            0.01227011 = weight(_text_:problem in 4472) [ClassicSimilarity], result of:
              0.01227011 = score(doc=4472,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.09379075 = fieldWeight in 4472, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.015625 = fieldNorm(doc=4472)
          0.33333334 = coord(1/3)
        0.0028092582 = product of:
          0.0084277745 = sum of:
            0.0084277745 = weight(_text_:29 in 4472) [ClassicSimilarity], result of:
              0.0084277745 = score(doc=4472,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.07773064 = fieldWeight in 4472, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.015625 = fieldNorm(doc=4472)
          0.33333334 = coord(1/3)
      0.25 = coord(2/8)
    
    Abstract
    However, this assumption does not hold in all cases, the needs of the user evolve over time and can move away from his previous interests stored in his profile. In other cases, the user's profile may be misused to extract or infer new information needs. This problem is much more accentuated with ambiguous queries. When multiple POIs linked to a search query are identified in the user's profile, the system is unable to select the relevant data from that profile to respond to that request. This has a direct impact on the quality of the results provided to this user. In order to overcome some of these limitations, in this research thesis, we have been interested in the development of techniques aimed mainly at improving the relevance of the results of current SRIs and facilitating the exploration of major collections of documents. To do this, we propose a solution based on a new concept and model of indexing and information retrieval called multi-spaces projection. This proposal is based on the exploitation of different categories of semantic and social information that enrich the universe of document representation and search queries in several dimensions of interpretations. The originality of this representation is to be able to distinguish between the different interpretations used for the description and the search for documents. This gives a better visibility on the results returned and helps to provide a greater flexibility of search and exploration, giving the user the ability to navigate one or more views of data that interest him the most. In addition, the proposed multidimensional representation universes for document description and search query interpretation help to improve the relevance of the user's results by providing a diversity of research / exploration that helps meet his diverse needs and those of other different users. 
    This study exploits different aspects that are related to the personalized search and aims to solve the problems caused by the evolution of the information needs of the user. Thus, when the profile of this user is used by our system, a technique is proposed and used to identify the interests most representative of his current needs in his profile. This technique is based on the combination of three influential factors, including the contextual, frequency and temporal factor of the data. The ability of users to interact, exchange ideas and opinions, and form social networks on the Web, has led systems to focus on the types of interactions these users have at the level of interaction between them as well as their social roles in the system. This social information is discussed and integrated into this research work. The impact and how they are integrated into the IR process are studied to improve the relevance of the results.
    Date
    29. 9.2018 18:57:38
  14. Blanco, R.; Matthews, M.; Mika, P.: Ranking of daily deals with concept expansion (2015) 0.00
    0.0015337638 = product of:
      0.012270111 = sum of:
        0.012270111 = product of:
          0.03681033 = sum of:
            0.03681033 = weight(_text_:problem in 2663) [ClassicSimilarity], result of:
              0.03681033 = score(doc=2663,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.28137225 = fieldWeight in 2663, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2663)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    Daily deals have emerged in the last three years as a successful form of online advertising. The downside of this success is that users are increasingly overloaded by the many thousands of deals offered each day by dozens of deal providers and aggregators. The challenge is thus offering the right deals to the right users i.e., the relevance ranking of deals. This is the problem we address in our paper. Exploiting the characteristics of deals data, we propose a combination of a term- and a concept-based retrieval model that closes the semantic gap between queries and documents expanding both of them with category information. The method consistently outperforms state-of-the-art methods based on term-matching alone and existing approaches for ad classification and ranking.
  15. Vidinli, I.B.; Ozcan, R.: New query suggestion framework and algorithms : a case study for an educational search engine (2016) 0.00
    0.0015337638 = product of:
      0.012270111 = sum of:
        0.012270111 = product of:
          0.03681033 = sum of:
            0.03681033 = weight(_text_:problem in 3185) [ClassicSimilarity], result of:
              0.03681033 = score(doc=3185,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.28137225 = fieldWeight in 3185, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3185)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    Query suggestion is generally an integral part of web search engines. In this study, we first redefine the query suggestion problem, reducing it to a "comparison of queries". We then propose a general modular framework for query suggestion algorithm development. We also develop new query suggestion algorithms for use in our proposed framework, exploiting query, session and user features. As a case study, we use query logs of a real educational search engine that targets K-12 students in Turkey. We also exploit educational features (course, grade) in our query suggestion algorithms. We test our framework and algorithms experimentally over a set of queries and demonstrate a statistically significant 66-90% increase in the relevance of query suggestions compared to a baseline method.
  16. Koopman, B.; Zuccon, G.; Bruza, P.; Sitbon, L.; Lawley, M.: Information retrieval as semantic inference : a graph Inference model applied to medical search (2016) 0.00
    0.0014460464 = product of:
      0.011568371 = sum of:
        0.011568371 = product of:
          0.034705114 = sum of:
            0.034705114 = weight(_text_:problem in 3260) [ClassicSimilarity], result of:
              0.034705114 = score(doc=3260,freq=4.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.2652803 = fieldWeight in 3260, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3260)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    This paper presents a Graph Inference retrieval model that integrates structured knowledge resources, statistical information retrieval methods and inference in a unified framework. Key components of the model are a graph-based representation of the corpus and retrieval driven by an inference mechanism achieved as a traversal over the graph. The model is proposed to tackle the semantic gap problem: the mismatch between the raw data and the way a human being interprets it. We break down the semantic gap problem into five core issues, each requiring a specific type of inference in order to be overcome. Our model and evaluation are applied to the medical domain because search within this domain is particularly challenging and, as we show, often requires inference. In addition, this domain features both structured knowledge resources and unstructured text. Our evaluation shows that inference can be effective, retrieving many new relevant documents that are not retrieved by state-of-the-art information retrieval models. We show that many retrieved documents were not pooled by keyword-based search methods, prompting us to perform additional relevance assessment on these new documents. A third of the newly retrieved documents judged were found to be relevant. Our analysis provides a thorough understanding of when and how to apply inference for retrieval, including a categorisation of queries according to the effect of inference. The inference mechanism promoted recall by retrieving new relevant documents not found by previous keyword-based approaches. In addition, it promoted precision by an effective reranking of documents. When inference is used, performance gains can generally be expected on hard queries. However, inference should not be applied universally: for easy, unambiguous queries and queries with few relevant documents, inference did adversely affect effectiveness. These conclusions reflect the fact that for retrieval as inference to be effective, a careful balancing act is involved. Finally, although the Graph Inference model is developed and applied to medical search, it is a general retrieval model applicable to other areas such as web search, where an emerging research trend is to utilise structured knowledge resources for more effective semantic search.
  17. Hoppe, T.: Semantische Filterung : ein Werkzeug zur Steigerung der Effizienz im Wissensmanagement (2013) 0.00
    0.0014046291 = product of:
      0.011237033 = sum of:
        0.011237033 = product of:
          0.033711098 = sum of:
            0.033711098 = weight(_text_:29 in 2245) [ClassicSimilarity], result of:
              0.033711098 = score(doc=2245,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.31092256 = fieldWeight in 2245, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2245)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    29. 9.2015 18:56:44
  18. Renker, L.: Exploration von Textkorpora : Topic Models als Grundlage der Interaktion (2015) 0.00
    0.0012781365 = product of:
      0.010225092 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 2380) [ClassicSimilarity], result of:
              0.030675275 = score(doc=2380,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 2380, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2380)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    The Internet holds seemingly endless information. A central problem today is making it accessible. Fundamental domain knowledge is required to formulate the right queries in a full-text search. Often, however, this knowledge is lacking, so that much time must be spent gaining an overview of the topic at hand. In such situations, users find themselves in an exploratory search process in which they must work their way toward a topic step by step. Machine learning methods are by now used as a matter of course for organizing data. In most cases, however, they remain invisible to the user. Their interactive use in exploratory search processes could couple human judgment more closely with the machine processing of large amounts of data. Topic models are just such methods. They find hidden topics in a text corpus that humans can interpret relatively well, and are therefore promising for use in exploratory search processes. Users can thereby be supported in understanding unfamiliar sources. A review of the relevant research showed that topic models are used predominantly to generate static visualizations. Sensemaking is an essential component of exploratory search, yet it is used only to a very small extent to motivate algorithmic innovations and to place them in a broader context. This leads to the conjecture that using models of sensemaking and a user-centered design of exploratory searches can yield new functions for interacting with topic models and provide a context for related research.
  19. Ahn, J.-w.; Brusilovsky, P.: Adaptive visualization for exploratory information retrieval (2013) 0.00
    0.0012781365 = product of:
      0.010225092 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 2717) [ClassicSimilarity], result of:
              0.030675275 = score(doc=2717,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 2717, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2717)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    As the volume and breadth of online information rapidly increase, ad hoc search systems become less and less efficient at answering the information needs of modern users. To support the growing complexity of search tasks, researchers in the field of information retrieval developed and explored a range of approaches that extend the traditional ad hoc retrieval paradigm. Among these approaches, personalized search systems and exploratory search systems attracted many followers. Personalized search explored the power of artificial intelligence techniques to provide tailored search results according to different user interests, contexts, and tasks. In contrast, exploratory search capitalized on the power of human intelligence by providing users with more powerful interfaces to support the search process. As these approaches are not contradictory, we believe that they can reinforce each other. We argue that the effectiveness of personalized search systems may be increased by allowing users to interact with the system and learn about and investigate the problem in order to reach the final goal. We also suggest that an interactive visualization approach could offer a good basis for combining the strengths of personalized and exploratory search approaches. This paper proposes a specific way to integrate interactive visualization and personalized search and introduces an adaptive visualization-based search system, Adaptive VIBE, that implements it. We tested the effectiveness of Adaptive VIBE and investigated its strengths and weaknesses by conducting a full-scale user study. The results show that Adaptive VIBE can improve the precision and the productivity of the personalized search system while helping users to discover more diverse sets of information.
  20. Ferreira, R.S.; Graça Pimentel, M. de; Cristo, M.: ¬A wikification prediction model based on the combination of latent, dyadic, and monadic features (2018) 0.00
    0.0012781365 = product of:
      0.010225092 = sum of:
        0.010225092 = product of:
          0.030675275 = sum of:
            0.030675275 = weight(_text_:problem in 4119) [ClassicSimilarity], result of:
              0.030675275 = score(doc=4119,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.23447686 = fieldWeight in 4119, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4119)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    Considering repositories of web documents that are semantically linked and created in a collaborative fashion, as in the case of Wikipedia, a key problem faced by content providers is the placement of links in the articles. These links must support user navigation and provide a deeper semantic interpretation of the content. Current wikification methods exploit machine learning techniques to capture characteristics of the concepts and their associations. In previous work, we proposed a preliminary prediction model combining traditional predictors with a latent component that captures the concept graph topology by means of matrix factorization. In this work, we provide a detailed description of our method and a deeper comparison with a state-of-the-art wikification method using a sample of Wikipedia, and report a gain of up to 13% in F1 score. We also provide a comprehensive analysis of the model performance showing the importance of the latent predictor component and the attributes derived from the associations between the concepts. Moreover, we include an analysis that allows us to conclude that the model is resilient to ambiguity without including a disambiguation phase. We finally report the positive impact of selecting training samples from specific content quality classes.

Languages

  • e 29
  • d 4
  • f 1

Types

  • a 31
  • el 4
  • x 2