Search (5 results, page 1 of 1)

  • author_ss:"Wang, S."
  1. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.01
    0.011321658 = product of:
      0.067929946 = sum of:
        0.067929946 = weight(_text_:problem in 5400) [ClassicSimilarity], result of:
          0.067929946 = score(doc=5400,freq=4.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.33160037 = fieldWeight in 5400, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5400)
      0.16666667 = coord(1/6)
    
    Abstract
    Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.
  2. Wang, S.; Isaac, A.; Schopman, B.; Schlobach, S.; Meij, L. van der: Matching multilingual subject vocabularies (2009) 0.01
    0.009606745 = product of:
      0.05764047 = sum of:
        0.05764047 = weight(_text_:problem in 3035) [ClassicSimilarity], result of:
          0.05764047 = score(doc=3035,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.28137225 = fieldWeight in 3035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.046875 = fieldNorm(doc=3035)
      0.16666667 = coord(1/6)
    
    Abstract
    Most libraries and other cultural heritage institutions use controlled knowledge organisation systems, such as thesauri, to describe their collections. Unfortunately, as most of these institutions use different such systems, unified access to heterogeneous collections is difficult. Things are even worse in an international context when concepts have labels in different languages. In order to overcome the multilingual interoperability problem between European Libraries, extensive work has been done to manually map concepts from different knowledge organisation systems, which is a tedious and expensive process. Within the TELplus project, we developed and evaluated methods to automatically discover these mappings, using different ontology matching techniques. In experiments on major French, English and German subject heading lists Rameau, LCSH and SWD, we show that we can automatically produce mappings of surprisingly good quality, even when using relatively naive translation and matching methods.
  3. Wang, S.; Isaac, A.; Schlobach, S.; Meij, L. van der; Schopman, B.: Instance-based semantic interoperability in the cultural heritage (2012) 0.01
    0.008005621 = product of:
      0.04803372 = sum of:
        0.04803372 = weight(_text_:problem in 125) [ClassicSimilarity], result of:
          0.04803372 = score(doc=125,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.23447686 = fieldWeight in 125, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=125)
      0.16666667 = coord(1/6)
    
    Abstract
    This paper gives a comprehensive overview of the problem of Semantic Interoperability in the Cultural Heritage domain, with a particular focus on solutions centered around extensional, i.e., instance-based, ontology matching methods. It presents three typical scenarios requiring interoperability, one with homogeneous collections, one with heterogeneous collections, and one with multilingual collections. It discusses two different ways to evaluate potential alignments, one based on the application of re-indexing, one using a reference alignment. To these scenarios we apply extensional matching with different similarity measures, which gives interesting insights. Finally, we firmly position our work in the Cultural Heritage context through an extensive discussion of the relevance for, and issues related to, this specific field. The findings are as unspectacular as expected but nevertheless important: the provided methods can really improve interoperability in a number of important cases, but they are not universal solutions to all related problems. This paper will provide a solid foundation for any future work on Semantic Interoperability in the Cultural Heritage domain, in particular for anybody intending to apply extensional methods.
  4. Cai, F.; Wang, S.; Rijke, M. de: Behavior-based personalization in web search (2017) 0.01
    0.0064044963 = product of:
      0.038426977 = sum of:
        0.038426977 = weight(_text_:problem in 3527) [ClassicSimilarity], result of:
          0.038426977 = score(doc=3527,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.1875815 = fieldWeight in 3527, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.03125 = fieldNorm(doc=3527)
      0.16666667 = coord(1/6)
    
    Abstract
    Personalized search approaches tailor search results to users' current interests, so as to help improve the likelihood of a user finding relevant documents for their query. Previous work on personalized search focuses on using the content of the user's query and of the documents clicked to model the user's preference. In this paper we focus on a different type of signal: We investigate the use of behavioral information for the purpose of search personalization. That is, we consider clicks and dwell time for reranking an initially retrieved list of documents. In particular, we (i) investigate the impact of distributions of users and queries on document reranking; (ii) estimate the relevance of a document for a query at 2 levels, at the query-level and at the word-level, to alleviate the problem of sparseness; and (iii) perform an experimental evaluation both for users seen during the training period and for users not seen during training. For the latter, we explore the use of information from similar users who have been seen during the training period. We use the dwell time on clicked documents to estimate a document's relevance to a query, and perform Bayesian probabilistic matrix factorization to generate a relevance distribution of a document over queries. Our experiments show that: (i) for personalized ranking, behavioral information helps to improve retrieval effectiveness; and (ii) given a query, merging information inferred from behavior of a particular user and from behaviors of other users with a user-dependent adaptive weight outperforms any combination with a fixed weight.
  5. Wang, S.; Ma, Y.; Mao, J.; Bai, Y.; Liang, Z.; Li, G.: Quantifying scientific breakthroughs by a novel disruption indicator based on knowledge entities : On the rise of scrape-and-report scholarship in online reviews research (2023) 0.01
    0.0054492294 = product of:
      0.032695375 = sum of:
        0.032695375 = weight(_text_:22 in 882) [ClassicSimilarity], result of:
          0.032695375 = score(doc=882,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.19345059 = fieldWeight in 882, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=882)
      0.16666667 = coord(1/6)
    
    Date
    22. 1.2023 18:37:33
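
The score trees in the results above all follow the same ClassicSimilarity (TF-IDF) arithmetic: tf is the square root of the term frequency, queryWeight is idf times queryNorm, fieldWeight is tf times idf times fieldNorm, and the final score is their product scaled by coord. The sketch below retraces that arithmetic in Python. The function name is hypothetical, and the idf formula (1 + ln(maxDocs / (docFreq + 1))) is an assumption inferred from the printed numbers, which it reproduces; queryNorm, fieldNorm and coord are taken as given from the listing rather than derived.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs,
                             field_norm, query_norm, coord):
    """Recompute a single-term score from the values shown in an explain tree.

    Hypothetical helper: it mirrors the arithmetic printed in the listing,
    not any particular library API.
    """
    # tf(freq): square root of the raw term frequency
    tf = math.sqrt(freq)
    # idf(docFreq, maxDocs): assumed formula, inferred from the printed values
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    # queryWeight = idf * queryNorm
    query_weight = idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * idf * field_norm
    # Final score: term weight scaled by coord (matched clauses / total clauses)
    return query_weight * field_weight * coord


if __name__ == "__main__":
    # Values copied from result 1 (doc=5400, term "problem")
    score = classic_similarity_score(
        freq=4.0, doc_freq=1723, max_docs=44218,
        field_norm=0.0390625, query_norm=0.04826377, coord=1 / 6,
    )
    print(f"{score:.9f}")
```

Because idf enters both queryWeight and fieldWeight, it is effectively squared in the final score, which is why the rarer term "problem" (docFreq=1723) outweighs "22" (docFreq=3622) at equal frequency and field length.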