Search (42 results, page 1 of 3)

  • theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.07
    0.066501744 = product of:
      0.13300349 = sum of:
        0.13300349 = sum of:
          0.09814678 = weight(_text_:learning in 1343) [ClassicSimilarity], result of:
            0.09814678 = score(doc=1343,freq=6.0), product of:
              0.22973695 = queryWeight, product of:
                4.464877 = idf(docFreq=1382, maxDocs=44218)
                0.05145426 = queryNorm
              0.42721373 = fieldWeight in 1343, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.464877 = idf(docFreq=1382, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1343)
          0.03485671 = weight(_text_:22 in 1343) [ClassicSimilarity], result of:
            0.03485671 = score(doc=1343,freq=2.0), product of:
              0.18018405 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05145426 = queryNorm
              0.19345059 = fieldWeight in 1343, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1343)
      0.5 = coord(1/2)
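     The indented tree above is Lucene "explain" output for ClassicSimilarity (TF-IDF) scoring. A minimal sketch that reproduces entry 1's score from the values shown, assuming Lucene's documented ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1))); the function and variable names are ours:

```python
import math

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Reproduce one weight(...) node of a ClassicSimilarity explain tree."""
    tf = math.sqrt(freq)                             # 2.4494898 for freq=6.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 4.464877 for docFreq=1382
    query_weight = idf * query_norm                  # 0.22973695
    field_weight = tf * idf * field_norm             # 0.42721373
    return query_weight * field_weight               # 0.09814678

learning = term_score(6.0, 1382, 44218, 0.05145426, 0.0390625)
term_22 = term_score(2.0, 3622, 44218, 0.05145426, 0.0390625)
print((learning + term_22) * 0.5)  # coord(1/2) -> 0.066501744
```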
    
    Abstract
     A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. Recently, methods that exploit named entities have been shown to be more effective for query expansion than traditional pseudo-relevance feedback methods. In this article, we introduce a supervised learning approach that exploits named entities for query expansion using Wikipedia as a repository of high-quality feedback documents. In contrast with existing entity-oriented pseudo-relevance feedback approaches, we tackle query expansion as a learning-to-rank problem. As a result, not only do we select effective expansion terms, but we also weigh these terms according to their predicted effectiveness. To this end, we exploit the rich structure of Wikipedia articles to devise discriminative term features, including each candidate term's proximity to the original query terms, as well as its frequency across multiple article fields and in category and infobox descriptors. Experiments on three Text REtrieval Conference web test collections attest to the effectiveness of our approach, with gains of up to 23.32% in mean average precision, 19.49% in precision at 10, and 7.86% in normalized discounted cumulative gain compared with a state-of-the-art approach for entity-oriented query expansion.
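     As a hedged illustration of the learning-to-rank formulation described above (not the authors' code; the feature set and all names are assumptions), candidate expansion terms mined from Wikipedia feedback articles would be scored by a learned linear model over term features, and the best terms added with their predicted effectiveness as weights:

```python
def expand(query_terms, candidates, weights, k=5):
    """candidates: {term: [proximity, field_freq, category_freq, infobox_freq]};
    weights: coefficients learned by a learning-to-rank method."""
    scored = {t: sum(w * f for w, f in zip(weights, feats))
              for t, feats in candidates.items()}
    top = sorted(scored, key=scored.get, reverse=True)[:k]
    # expansion terms carry their predicted effectiveness as query weights
    return list(query_terms) + [(t, scored[t]) for t in top]
```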
    Date
    22. 8.2014 17:07:50
  2. Xu, B.; Lin, H.; Lin, Y.: Assessment of learning to rank methods for query expansion (2016) 0.03
    0.034700125 = product of:
      0.06940025 = sum of:
        0.06940025 = product of:
          0.1388005 = sum of:
            0.1388005 = weight(_text_:learning in 2929) [ClassicSimilarity], result of:
              0.1388005 = score(doc=2929,freq=12.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.6041714 = fieldWeight in 2929, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2929)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Pseudo-relevance feedback, as an effective query expansion method, can significantly improve information retrieval performance. However, the method may degrade retrieval performance when irrelevant terms are used in the expanded query. Therefore, it is necessary to refine the expansion terms. Learning-to-rank methods have proven effective in information retrieval at solving ranking problems by placing the most relevant documents at the top of the returned list, but few attempts have been made to employ them for term refinement in pseudo-relevance feedback. This article proposes a novel framework to explore the feasibility of using learning to rank to optimize pseudo-relevance feedback by reranking the candidate expansion terms. We investigate several learning approaches for choosing the candidate terms and introduce state-of-the-art learning-to-rank methods to refine the expansion terms. In addition, we propose two term labeling strategies and examine the usefulness of various term features for optimizing the framework. Experimental results with three TREC collections show that our framework can effectively improve retrieval performance.
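     A minimal sketch of term labeling for such a framework (an assumption for illustration, not necessarily either of the article's two strategies): a candidate term becomes a positive training example if adding it to the query improves a retrieval metric:

```python
def label_candidates(query, candidates, evaluate):
    """evaluate(terms) -> retrieval effectiveness (e.g., MAP) for a query run;
    supplied by the surrounding experimental setup."""
    baseline = evaluate(query)
    return {t: int(evaluate(query + [t]) > baseline) for t in candidates}
```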
  3. Kwok, K.L.: A network approach to probabilistic information retrieval (1995) 0.03
    0.029444033 = product of:
      0.058888067 = sum of:
        0.058888067 = product of:
          0.11777613 = sum of:
            0.11777613 = weight(_text_:learning in 5696) [ClassicSimilarity], result of:
              0.11777613 = score(doc=5696,freq=6.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.51265645 = fieldWeight in 5696, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5696)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Shows how probabilistic information retrieval based on document components may be implemented as a feedforward (feedbackward) artificial neural network. The network supports adaptation of connection weights as well as the growing of new edges between queries and terms based on user relevance feedback data for training, and it reflects query modification and expansion in information retrieval. A learning rule is applied that can also be viewed as supporting sequential learning using a harmonic sequence learning rate. Experimental results with four standard small collections and a large Wall Street Journal collection show that small query expansion levels of about 30 terms can achieve most of the gains at the low-recall high-precision region, while larger expansion levels continue to provide gains at the high-recall low-precision region of a precision-recall curve.
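     A sketch of the harmonic-sequence learning rate mentioned above, applied to a single query-term edge weight; the delta-style update rule itself is our assumption:

```python
def adapt_weight(weight, feedback, eta0=1.0):
    """feedback: iterable of (input_activation, target) pairs from user relevance data."""
    for t, (x, target) in enumerate(feedback, start=1):
        lr = eta0 / t                              # harmonic sequence: 1, 1/2, 1/3, ...
        weight += lr * (target - weight * x) * x   # delta-rule style correction
    return weight
```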
  4. Boyack, K.W.; Wylie, B.N.; Davidson, G.S.: Information Visualization, Human-Computer Interaction, and Cognitive Psychology : Domain Visualizations (2002) 0.02
    0.024647415 = product of:
      0.04929483 = sum of:
        0.04929483 = product of:
          0.09858966 = sum of:
            0.09858966 = weight(_text_:22 in 1352) [ClassicSimilarity], result of:
              0.09858966 = score(doc=1352,freq=4.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.54716086 = fieldWeight in 1352, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1352)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 2.2003 17:25:39
    22. 2.2003 18:17:40
  5. Smeaton, A.F.; Rijsbergen, C.J. van: The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.02
    0.024399696 = product of:
      0.04879939 = sum of:
        0.04879939 = product of:
          0.09759878 = sum of:
            0.09759878 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.09759878 = score(doc=2134,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    30. 3.2001 13:32:22
  6. Huang, L.; Milne, D.; Frank, E.; Witten, I.H.: Learning a concept-based document similarity measure (2012) 0.02
    0.024040952 = product of:
      0.048081905 = sum of:
        0.048081905 = product of:
          0.09616381 = sum of:
            0.09616381 = weight(_text_:learning in 372) [ClassicSimilarity], result of:
              0.09616381 = score(doc=372,freq=4.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.41858223 = fieldWeight in 372, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=372)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Document similarity measures are crucial components of many text-analysis tasks, including information retrieval, document classification, and document clustering. Conventional measures are brittle: They estimate the surface overlap between documents based on the words they mention and ignore deeper semantic connections. We propose a new measure that assesses similarity at both the lexical and semantic levels, and learns from human judgments how to combine them by using machine-learning techniques. Experiments show that the new measure produces values for documents that are more consistent with people's judgments than people are with each other. We also use it to classify and cluster large document sets covering different genres and topics, and find that it improves both classification and clustering performance.
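     A minimal sketch of the combination idea, assuming cosine similarity at each level; the mixing weight alpha stands in for what the article learns from human judgments:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity(lex_a, lex_b, sem_a, sem_b, alpha=0.6):
    """Combine lexical (word-based) and semantic (concept-based) similarity."""
    return alpha * cosine(lex_a, lex_b) + (1 - alpha) * cosine(sem_a, sem_b)
```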
  7. Cai, F.; Rijke, M. de: Learning from homologous queries and semantically related terms for query auto completion (2016) 0.02
    0.020034127 = product of:
      0.040068254 = sum of:
        0.040068254 = product of:
          0.08013651 = sum of:
            0.08013651 = weight(_text_:learning in 2971) [ClassicSimilarity], result of:
              0.08013651 = score(doc=2971,freq=4.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.34881854 = fieldWeight in 2971, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2971)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Query auto completion (QAC) models recommend possible queries to web search users when they start typing a query prefix. Most of today's QAC models rank candidate queries by popularity (i.e., frequency), and in doing so they tend to follow a strict query matching policy when counting the queries. That is, they ignore the contributions from so-called homologous queries, queries with the same terms but ordered differently or queries that expand the original query. Importantly, homologous queries often express a remarkably similar search intent. Moreover, today's QAC approaches often ignore semantically related terms. We argue that users are prone to combine semantically related terms when generating queries. We propose a learning to rank-based QAC approach, where, for the first time, features derived from homologous queries and semantically related terms are introduced. In particular, we consider: (i) the observed and predicted popularity of homologous queries for a query candidate; and (ii) the semantic relatedness of pairs of terms inside a query and pairs of queries inside a session. We quantify the improvement of the proposed new features using two large-scale real-world query logs and show that the mean reciprocal rank and the success rate can be improved by up to 9% over state-of-the-art QAC models.
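     A sketch of the homologous-queries signal described above: a candidate completion is credited with the popularity of every logged query built from the same terms in any order. The term-set key is our simplification (it ignores the expanded-query case):

```python
from collections import Counter

def homologous_popularity(query_log):
    counts = Counter(frozenset(q.split()) for q in query_log)
    return lambda candidate: counts[frozenset(candidate.split())]

pop = homologous_popularity(["cheap flights paris", "paris cheap flights"])
print(pop("cheap flights paris"))  # 2 - both orderings contribute
```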
  8. Surfing versus Drilling for knowledge in science : When should you use your computer? When should you use your brain? (2018) 0.02
    0.019629356 = product of:
      0.03925871 = sum of:
        0.03925871 = product of:
          0.07851742 = sum of:
            0.07851742 = weight(_text_:learning in 4564) [ClassicSimilarity], result of:
              0.07851742 = score(doc=4564,freq=6.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.34177098 = fieldWeight in 4564, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4564)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     For this second Special Issue of Infozine, we have invited students, teachers, researchers, and software developers to share their opinions about one aspect or another of this broad topic: how to balance drilling (for depth) vs. surfing (for breadth) in scientific learning, teaching, research, and software design - and how the modern digital-liberal system affects our ability to strike this balance. This special issue is meant to provide a wide and unbiased spectrum of possible viewpoints on the topic, helping readers lucidly define their own position and information-use behavior.
    Content
     Editorial: Surfing versus Drilling for Knowledge in Science: When should you use your computer? When should you use your brain? Blaise Pascal: Les deux infinis - The two infinities / Philippe Hünenberger and Oliver Renn - "Surfing" vs. "drilling" in the modern scientific world / Antonio Loprieno - Of millimeter paper and machine learning / Philippe Hünenberger - From one to many, from breadth to depth - industrializing research / Janne Soetbeer - "Deep drilling" requires "surfing" / Gerd Folkers and Laura Folkers - Surfing vs. drilling in science: A delicate balance / Alzbeta Kubincová - Digital trends in academia - for the sake of critical thinking or comfort? / Leif-Thore Deck - I diagnose, therefore I am a Doctor? Will drilling computer software replace human doctors in the future? / Yi Zheng - Surfing versus drilling in fundamental research / Wilfred van Gunsteren - Using brain vs. brute force in computational studies of biological systems / Arieh Warshel - Laboratory literature boards in the digital age / Jeffrey Bode - Research strategies in computational chemistry / Sereina Riniker - Surfing on the hype waves or drilling deep for knowledge? A perspective from industry / Nadine Schneider and Nikolaus Stiefl - The use and purpose of articles and scientists / Philip Mark Lund - Can you look at papers like artwork? / Oliver Renn - Dynamite fishing in the data swamp / Frank Perabo - Streetlights, augmented intelligence, and information discovery / Jeffrey Saffer and Vicki Burnett - "Yes Dave. Happy to do that for you." Why AI, machine learning, and blockchain will lead to deeper "drilling" / Michiel Kolman and Sjors de Heuvel - Trends in scientific document search / Stefan Geißler - Power tools for text mining / Jane Reed - Publishing and patenting: Navigating the differences to ensure search success / Paul Peters
  9. Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.02
    0.017428355 = product of:
      0.03485671 = sum of:
        0.03485671 = product of:
          0.06971342 = sum of:
            0.06971342 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
              0.06971342 = score(doc=2751,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.38690117 = fieldWeight in 2751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2751)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  10. Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.02
    0.017428355 = product of:
      0.03485671 = sum of:
        0.03485671 = product of:
          0.06971342 = sum of:
            0.06971342 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
              0.06971342 = score(doc=2754,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.38690117 = fieldWeight in 2754, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2754)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  11. Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.02
    0.017428355 = product of:
      0.03485671 = sum of:
        0.03485671 = product of:
          0.06971342 = sum of:
            0.06971342 = weight(_text_:22 in 3279) [ClassicSimilarity], result of:
              0.06971342 = score(doc=3279,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.38690117 = fieldWeight in 3279, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3279)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  12. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.02
    0.017428355 = product of:
      0.03485671 = sum of:
        0.03485671 = product of:
          0.06971342 = sum of:
            0.06971342 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
              0.06971342 = score(doc=3280,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.38690117 = fieldWeight in 3280, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3280)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  13. Sacco, G.M.: Dynamic taxonomies and guided searches (2006) 0.02
    0.017253192 = product of:
      0.034506384 = sum of:
        0.034506384 = product of:
          0.06901277 = sum of:
            0.06901277 = weight(_text_:22 in 5295) [ClassicSimilarity], result of:
              0.06901277 = score(doc=5295,freq=4.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.38301262 = fieldWeight in 5295, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5295)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 17:56:22
  14. Pahlevi, S.M.; Kitagawa, H.: Conveying taxonomy context for topic-focused Web search (2005) 0.02
    0.01699952 = product of:
      0.03399904 = sum of:
        0.03399904 = product of:
          0.06799808 = sum of:
            0.06799808 = weight(_text_:learning in 3310) [ClassicSimilarity], result of:
              0.06799808 = score(doc=3310,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.29598233 = fieldWeight in 3310, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3310)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Introducing context to a user query is effective in improving search effectiveness. In this article we propose a method employing taxonomy-based search services such as Web directories to facilitate searches in any Web search interface that supports Boolean queries. The proposed method enables one to convey the current search context in the taxonomy of a taxonomy-based search service to the searches conducted with the Web search interfaces. The basic idea is to learn the search context in the form of a Boolean condition that is commonly accepted by many Web search interfaces, and to use the condition to modify the user query before forwarding it to the Web search interfaces. To guarantee that the modified query can always be processed by the Web search interfaces, and to make the method adaptive to different user requirements and search result effectiveness, we have developed new fast classification learning algorithms.
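     Illustratively (the names and the query syntax are assumptions), the learned Boolean condition would be conjoined with the user query before it is forwarded to any interface that accepts Boolean queries:

```python
def contextualize(user_query, include_terms, exclude_terms):
    """AND a learned taxonomy context onto the user query."""
    include = " OR ".join(include_terms)
    exclude = "".join(f" NOT {t}" for t in exclude_terms)
    return f"({user_query}) AND ({include}){exclude}"

print(contextualize("jaguar", ["car", "automobile"], ["animal"]))
# (jaguar) AND (car OR automobile) NOT animal
```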
  15. Drexel, G.: Knowledge engineering for intelligent information retrieval (2001) 0.02
    0.01699952 = product of:
      0.03399904 = sum of:
        0.03399904 = product of:
          0.06799808 = sum of:
            0.06799808 = weight(_text_:learning in 4043) [ClassicSimilarity], result of:
              0.06799808 = score(doc=4043,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.29598233 = fieldWeight in 4043, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4043)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     This paper presents a clustered approach to designing an overall ontological model together with a general rule-based component that serves as a mapping device. By observational criteria, a multi-lingual team of experts excerpts concepts from general communication in the media. The team then finds equivalent expressions in English, German, French, and Spanish. On the basis of a set of ontological and lexical relations, a conceptual network is built up. Concepts are thought to be universal. Objects unique in time and space are identified by names and are explained by the universals as their instances. Our approach relies on multi-relational descriptions of concepts. It provides a powerful tool for documentation and conceptual language learning. First and foremost, our multi-lingual, polyhierarchical ontology fills a gap in semantically based information retrieval by generating enhanced and improved queries for internet search.
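     A toy fragment of such a multi-relational, multilingual concept network (relation names and entries are invented for illustration), showing how it could generate an enhanced retrieval query:

```python
lexicalizations = {  # concept -> language -> equivalent expressions
    "car": {"en": ["car"], "de": ["Auto"], "fr": ["voiture"], "es": ["coche"]},
}
relations = {"car": {"broader": ["vehicle"]}}  # ontological/lexical relations

def enhanced_query(concept):
    """Generate an improved multilingual query from the network."""
    terms = [t for forms in lexicalizations.get(concept, {}).values() for t in forms]
    return " OR ".join(terms)

print(enhanced_query("car"))  # car OR Auto OR voiture OR coche
```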
  16. Lee, Y.-Y.; Ke, H.; Yen, T.-Y.; Huang, H.-H.; Chen, H.-H.: Combining and learning word embedding with WordNet for semantic relatedness and similarity measurement (2020) 0.02
    0.01699952 = product of:
      0.03399904 = sum of:
        0.03399904 = product of:
          0.06799808 = sum of:
            0.06799808 = weight(_text_:learning in 5871) [ClassicSimilarity], result of:
              0.06799808 = score(doc=5871,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.29598233 = fieldWeight in 5871, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5871)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  17. Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.02
    0.016027302 = product of:
      0.032054603 = sum of:
        0.032054603 = product of:
          0.064109206 = sum of:
            0.064109206 = weight(_text_:learning in 535) [ClassicSimilarity], result of:
              0.064109206 = score(doc=535,freq=4.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.27905482 = fieldWeight in 535, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.03125 = fieldNorm(doc=535)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Literature-based discovery (LBD) research aims at finding effective computational methods for predicting previously unknown connections between clusters of research papers from disparate research areas. Existing methods encompass two general approaches. The first approach searches for these unknown connections by examining the textual contents of research papers. In addition to the existing textual features, the second approach incorporates structural features of the scientific literature, such as citation structures. These approaches, however, have not considered research papers' latent bibliographic metadata structures as important features that can be used for predicting previously unknown relationships between them. This thesis investigates a new graph-based LBD method that exploits the latent bibliographic metadata connections between pairs of research papers. The heterogeneous bibliographic information network is proposed as an efficient graph-based data structure for modeling the complex relationships between these metadata. In contrast to previous approaches, this method seamlessly combines textual and citation information in the form of path-based metadata features for predicting future co-citation links between research papers from disparate research fields. The results reported in this thesis provide evidence that the method is effective for reconstructing historical literature-based discovery hypotheses. This thesis also investigates the effects of semantic modeling and topic modeling on the performance of the proposed method. For semantic modeling, a general-purpose word sense disambiguation technique is proposed to reduce the lexical ambiguity in the title and abstract of research papers. The experimental results suggest that the reduced lexical ambiguity did not necessarily lead to a better performance of the method. This thesis discusses some of the possible contributing factors to these results. Finally, topic modeling is used for learning the latent topical relations between research papers. The learned topic model is incorporated into the heterogeneous bibliographic information network graph and allows new predictive features to be learned. The results in this thesis suggest that topic modeling improves the performance of the proposed method by increasing the overall accuracy for predicting the future co-citation links between disparate research papers.
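     A sketch of one path-based metadata feature over such a heterogeneous bibliographic information network: the number of shared neighbours reachable by a given relation (e.g., paper-author-paper). The schema and data are invented for illustration:

```python
def metapath_count(graph, a, b, relation):
    """graph[node][relation] -> set of neighbouring metadata nodes."""
    return len(graph[a].get(relation, set()) & graph[b].get(relation, set()))

g = {"p1": {"author": {"alice"}, "venue": {"jasist"}},
     "p2": {"author": {"alice", "bob"}, "venue": {"jasist"}}}
features = [metapath_count(g, "p1", "p2", r) for r in ("author", "venue")]
print(features)  # [1, 1] -> inputs to a future co-citation link predictor
```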
  18. Chen, H.; Lally, A.M.; Zhu, B.; Chau, M.: HelpfulMed : Intelligent searching for medical information over the Internet (2003) 0.01
    0.014166266 = product of:
      0.028332531 = sum of:
        0.028332531 = product of:
          0.056665063 = sum of:
            0.056665063 = weight(_text_:learning in 1615) [ClassicSimilarity], result of:
              0.056665063 = score(doc=1615,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.24665193 = fieldWeight in 1615, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1615)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
     Part of a special issue: "Web retrieval and mining: A machine learning perspective"
  19. Shiri, A.; Revie, C.: Usability and user perceptions of a thesaurus-enhanced search interface (2005) 0.01
    0.014166266 = product of:
      0.028332531 = sum of:
        0.028332531 = product of:
          0.056665063 = sum of:
            0.056665063 = weight(_text_:learning in 4331) [ClassicSimilarity], result of:
              0.056665063 = score(doc=4331,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.24665193 = fieldWeight in 4331, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4331)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Purpose - This paper seeks to report an investigation into the ways in which end-users perceive a thesaurus-enhanced search interface, in particular thesaurus and search interface usability. Design/methodology/approach - Thirty academic users, split between staff and postgraduate students and carrying out real search requests, were observed during this study. Users were asked to comment on a range of thesaurus and interface characteristics including: ease of use, ease of learning, ease of browsing and navigation, problems and difficulties encountered while interacting with the system, and the effect of browsing on search term selection. Findings - The results suggest that interface usability is a factor affecting thesaurus browsing/navigation and other information-searching behaviours. Academic staff viewed the function of a thesaurus as being useful for narrowing down a search and providing alternative search terms, while postgraduates stressed the role of the thesaurus for broadening searches and providing new terms. Originality/value - The paper provides an insight into the ways in which end-users make use of and interact with a thesaurus-enhanced search interface. This area is new, since previous research has focused particularly on how professional searchers and librarians make use of thesauri and thesaurus-enhanced search interfaces. The research reported here suggests that end-users with varying levels of domain knowledge are able to use thesauri that are integrated into search interfaces. It also provides design implications for search interface developers as well as information professionals who are involved in teaching online searching.
  20. Ferreira, R.S.; Graça Pimentel, M. de; Cristo, M.: A wikification prediction model based on the combination of latent, dyadic, and monadic features (2018) 0.01
    0.014166266 = product of:
      0.028332531 = sum of:
        0.028332531 = product of:
          0.056665063 = sum of:
            0.056665063 = weight(_text_:learning in 4119) [ClassicSimilarity], result of:
              0.056665063 = score(doc=4119,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.24665193 = fieldWeight in 4119, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4119)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Considering repositories of web documents that are semantically linked and created in a collaborative fashion, as in the case of Wikipedia, a key problem faced by content providers is the placement of links in the articles. These links must support user navigation and provide a deeper semantic interpretation of the content. Current wikification methods exploit machine learning techniques to capture characteristics of the concepts and their associations. In previous work, we proposed a preliminary prediction model combining traditional predictors with a latent component which captures the concept graph topology by means of matrix factorization. In this work, we provide a detailed description of our method and a deeper comparison with a state-of-the-art wikification method using a sample of Wikipedia, and report a gain of up to 13% in F1 score. We also provide a comprehensive analysis of the model's performance, showing the importance of the latent predictor component and the attributes derived from the associations between the concepts. Moreover, we include an analysis that allows us to conclude that the model is resilient to ambiguity without including a disambiguation phase. We finally report the positive impact of selecting training samples from specific content quality classes.
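     An illustrative scoring function for one candidate link from concept a to concept b, combining monadic (per-concept) and dyadic (pair) predictors with a latent term from matrix factorization of the concept graph; the linear combiner and all names are assumptions:

```python
def link_score(theta, monadic_a, monadic_b, dyadic_ab, u_a, v_b):
    """theta: learned coefficients; u_a, v_b: latent factors of concepts a and b."""
    latent = sum(x * y for x, y in zip(u_a, v_b))  # factorized graph topology
    features = monadic_a + monadic_b + dyadic_ab + [latent]
    return sum(w * f for w, f in zip(theta, features))
```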

Languages

  • e 38
  • d 4

Types

  • a 37
  • el 4
  • m 1
  • x 1