Search (5 results, page 1 of 1)

  • author_ss:"Wu, M."
  1. Zhang, Y.; Wu, M.; Zhang, G.; Lu, J.: Stepping beyond your comfort zone : diffusion-based network analytics for knowledge trajectory recommendation (2023) 0.02
    0.021209672 = product of:
      0.042419344 = sum of:
        0.042419344 = sum of:
          0.011219106 = weight(_text_:a in 994) [ClassicSimilarity], result of:
            0.011219106 = score(doc=994,freq=22.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.21126054 = fieldWeight in 994, product of:
                4.690416 = tf(freq=22.0), with freq of:
                  22.0 = termFreq=22.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=994)
          0.03120024 = weight(_text_:22 in 994) [ClassicSimilarity], result of:
            0.03120024 = score(doc=994,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.19345059 = fieldWeight in 994, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=994)
      0.5 = coord(1/2)
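
The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the standard formulas documented for Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function and variable names are illustrative only, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    w_idf = idf(doc_freq, max_docs)
    query_weight = w_idf * query_norm
    field_weight = tf * w_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 994.
MAX_DOCS = 44218
QUERY_NORM = 0.046056706
FIELD_NORM = 0.0390625

score_a = term_score(22.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "a"
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)    # term "22"
total = (score_a + score_22) * 0.5                                    # coord(1/2)

print(score_a, score_22, total)
```

Small differences in the last digits are expected, since Lucene computes these values in single precision.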
    
    Abstract
    Predicting a researcher's knowledge trajectories beyond their current foci can leverage potential inter-/cross-/multi-disciplinary interactions to achieve exploratory innovation. In this study, we present a method of diffusion-based network analytics for knowledge trajectory recommendation. The method begins by constructing a heterogeneous bibliometric network consisting of a co-topic layer and a co-authorship layer. A novel link prediction approach with a diffusion strategy is then used to capture the interactions between social elements (e.g., collaboration) and knowledge elements (e.g., technological similarity) in the process of exploratory innovation. This diffusion strategy differentiates the interactions occurring among homogeneous and heterogeneous nodes in the heterogeneous bibliometric network and weights the strengths of these interactions. Two sets of experiments, one with a local dataset and the other with a global dataset, demonstrate that the proposed method is superior to 10 selected baselines in link prediction, recommender systems, and upstream graph representation learning. A case study recommending knowledge trajectories of information scientists with topical hierarchy and explainable mediators reveals the proposed method's reliability and potential practical uses in broad scenarios.
    Date
    22. 6.2023 18:07:12
    Type
    a
  2. Wu, M.; Fuller, M.; Wilkinson, R.: Using clustering and classification approaches in interactive retrieval (2001) 0.00
    0.0023678814 = product of:
      0.0047357627 = sum of:
        0.0047357627 = product of:
          0.009471525 = sum of:
            0.009471525 = weight(_text_:a in 2666) [ClassicSimilarity], result of:
              0.009471525 = score(doc=2666,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.17835285 = fieldWeight in 2666, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2666)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
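
The nesting in this explanation (two coord factors wrapped around a single term weight) can be checked by evaluating the tree bottom-up. This is a hedged sketch: the tuple-based tree representation and the `evaluate` helper are hypothetical and not part of any Lucene or Solr API; the leaf numbers are copied from the listing for doc 2666.

```python
import math

def evaluate(node):
    # A node is either a number (leaf) or a pair (op, children),
    # where op is "product" or "sum" as printed in the explain output.
    if isinstance(node, (int, float)):
        return float(node)
    op, children = node
    values = [evaluate(child) for child in children]
    return math.prod(values) if op == "product" else sum(values)

# Explain tree for doc 2666, term "a".
query_weight = ("product", [1.153047, 0.046056706])          # idf * queryNorm
field_weight = ("product", [1.4142135, 1.153047, 0.109375])  # tf * idf * fieldNorm
term_weight = ("product", [query_weight, field_weight])

tree = ("product", [
    ("sum", [
        ("product", [("sum", [term_weight]), 0.5]),  # inner coord(1/2)
    ]),
    0.5,                                             # outer coord(1/2)
])

doc_score = evaluate(tree)
print(doc_score)
```

Each coord(1/2) halves the term weight, so the document score is a quarter of the raw weight for the single matching term.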
    
    Type
    a
  3. Wu, M.; Hawking, D.; Turpin, A.; Scholer, F.: Using anchor text for homepage and topic distillation search tasks (2012) 0.00
    0.0020714647 = product of:
      0.0041429293 = sum of:
        0.0041429293 = product of:
          0.008285859 = sum of:
            0.008285859 = weight(_text_:a in 257) [ClassicSimilarity], result of:
              0.008285859 = score(doc=257,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15602624 = fieldWeight in 257, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=257)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Past work suggests that anchor text is a good source of evidence that can be used to improve web searching. Two approaches for making use of this evidence include fusing search results from an anchor text representation and the original text representation based on a document's relevance score or rank position, and combining term frequency from both representations during the retrieval process. Although these approaches have each been tested and compared against baselines, different evaluations have used different baselines; no consistent work enables rigorous cross-comparison between these methods. The purpose of this work is threefold. First, we survey existing fusion methods of using anchor text in search. Second, we compare these methods with common testbeds and web search tasks, with the aim of identifying the most effective fusion method. Third, we try to correlate search performance with the characteristics of a test collection. Our experimental results show that the best performing method in each category can significantly improve search results over a common baseline. However, there is no single technique that consistently outperforms competing approaches across different collections and search tasks.
    Type
    a
  4. Wu, M.; Turpin, A.; Thom, J.A.; Scholer, F.; Wilkinson, R.: Cost and benefit estimation of experts' mediation in an enterprise search (2014) 0.00
    0.0020714647 = product of:
      0.0041429293 = sum of:
        0.0041429293 = product of:
          0.008285859 = sum of:
            0.008285859 = weight(_text_:a in 1186) [ClassicSimilarity], result of:
              0.008285859 = score(doc=1186,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15602624 = fieldWeight in 1186, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1186)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The success of an enterprise information retrieval system is determined by interactions among three key entities: the search engine employed; the service provider who delivers, modifies, and maintains the engine; and the users of the service within the organization. Evaluations of an enterprise search have predominately focused on the effectiveness and efficiency of the engine, with very little analysis of user involvement in the process, and none on the role of service providers. We propose and evaluate a model of costs and benefits to a service provider when investing in enhancements to the ranking of documents returned by their search engine. We demonstrate the model through a case study to analyze the potential impact of using domain experts to provide enhanced mediated search results. By demonstrating how to quantify the cost and benefit of an improved information retrieval system to the service provider, our case study shows that using the relevance assessments of domain experts to rerank original search results can significantly improve the accuracy of ranked lists. Moreover, the service provider gains substantial return on investment and a higher search success rate by investing in the relevance assessments of domain experts. Our cost and benefit analysis results are contrasted with standard modes of effectiveness analysis, including quantitative (using measures such as precision) and qualitative (through user preference surveys) approaches. Modeling costs and benefits explicitly can provide useful insights that the other approaches do not convey.
    Type
    a
  5. Wu, M.; Liu, Y.-H.; Brownlee, R.; Zhang, X.: Evaluating utility and automatic classification of subject metadata from Research Data Australia (2021) 0.00
    0.001757696 = product of:
      0.003515392 = sum of:
        0.003515392 = product of:
          0.007030784 = sum of:
            0.007030784 = weight(_text_:a in 453) [ClassicSimilarity], result of:
              0.007030784 = score(doc=453,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13239266 = fieldWeight in 453, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=453)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this paper, we present a case study of how well subject metadata (comprising headings from an international classification scheme) has been deployed in a national data catalogue, and how often data seekers use subject metadata when searching for data. Through an analysis of user search behaviour as recorded in search logs, we find evidence that users utilise the subject metadata for data discovery. Since approximately half of the records ingested by the catalogue did not include subject metadata at the time of harvest, we experimented with automatic subject classification approaches in order to enrich these records and to provide additional support for user search and data discovery. Our results show that automatic methods work well for well represented categories of subject metadata, and these categories tend to have features that can distinguish themselves from the other categories. Our findings raise implications for data catalogue providers; they should invest more effort to enhance the quality of data records by providing an adequate description of these records for under-represented subject categories.
    Type
    a