Search (3 results, page 1 of 1)

  • author_ss:"Wang, P."
  • language_ss:"e"
  1. Wang, P.; Li, X.: Assessing the quality of information on Wikipedia : a deep-learning approach (2020) 0.01
    0.010607645 = product of:
      0.031822935 = sum of:
        0.031822935 = product of:
          0.095468804 = sum of:
            0.095468804 = weight(_text_:network in 5505) [ClassicSimilarity], result of:
              0.095468804 = score(doc=5505,freq=8.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.492033 = fieldWeight in 5505, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5505)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
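    How this score arises (an editorial reconstruction, not part of the original result page): the tree above is Lucene's ClassicSimilarity TF-IDF breakdown, and the final 0.010607645 follows from the factors shown. A minimal Python check, assuming the standard ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)):

      import math

      # Constants copied from the explain tree above; the formulas are
      # standard Lucene ClassicSimilarity (our assumption, not shown here).
      freq = 8.0                    # occurrences of "network" in doc 5505
      doc_freq, max_docs = 1398, 44218
      field_norm = 0.0390625        # length normalization of the field
      query_norm = 0.043569047
      coord = 1.0 / 3               # 1 of 3 clauses matched, applied twice

      tf = math.sqrt(freq)                           # 2.828427
      idf = 1 + math.log(max_docs / (doc_freq + 1))  # 4.4533744
      query_weight = idf * query_norm                # 0.19402927
      field_weight = tf * idf * field_norm           # 0.492033
      print(query_weight * field_weight * coord * coord)  # 0.010607645

    The trees under results 2 and 3 follow the same scheme with freq=4.0 (term "network") and freq=2.0 (term "29"), respectively.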
    
    Abstract
    Web document repositories are now collaboratively created and edited. One such repository, Wikipedia, faces an important problem: assessing the quality of its articles. Existing approaches exploit techniques such as statistical models or machine learning algorithms to assess Wikipedia article quality, but they do not provide satisfactory results, and they fail to adopt a comprehensive feature framework. In this article, we conduct an extensive survey of previous studies and summarize a comprehensive feature framework covering text statistics, writing style, readability, article structure, network, and editing history. Selected state-of-the-art deep-learning models, including the convolutional neural network (CNN), deep neural network (DNN), long short-term memory (LSTM) network, CNN-LSTM, bidirectional LSTM, and stacked LSTM, are applied to assess the quality of Wikipedia articles. The deep-learning models are compared in detail with regard to two aspects: classification performance and training performance. We also include an importance analysis of individual features and feature sets to determine which are most effective in distinguishing Wikipedia article quality. This extensive experiment validates the effectiveness of the proposed model.
  2. Wang, P.; Hao, T.; Yan, J.; Jin, L.: Large-scale extraction of drug-disease pairs from the medical literature (2017) 0.01
    0.007500738 = product of:
      0.022502214 = sum of:
        0.022502214 = product of:
          0.06750664 = sum of:
            0.06750664 = weight(_text_:network in 3927) [ClassicSimilarity], result of:
              0.06750664 = score(doc=3927,freq=4.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.34791988 = fieldWeight in 3927, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3927)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Automatic extraction of large-scale, accurate drug-disease pairs from the medical literature plays an important role in drug repurposing. However, most existing extraction methods are supervised, and manually labeling drug-disease pair datasets is costly and time-consuming, while many drug-disease pairs remain buried in free text. In this work, we first leverage a pattern-based method to automatically extract drug-disease pairs with treatment and inducement relationships from free text. Then, to reflect drug-disease relations, we propose a network embedding algorithm that calculates the degree of correlation of a drug-disease pair. In the experiments, we use the method to extract treatment and inducement drug-disease pairs from 27 million medical abstracts and titles available on PubMed, obtaining 138,318 unique treatment pairs and 75,396 unique inducement pairs. Our algorithm achieves a precision of 0.912 and a recall of 0.898 in extracting the frequent treatment drug-disease pairs, and a precision of 0.923 and a recall of 0.833 in extracting the frequent inducement drug-disease pairs. In addition, the proposed information network embedding algorithm efficiently reflects the degree of correlation of drug-disease pairs, achieving a precision of 0.802 and a recall of 0.783 in the fine-grained evaluation of extracting frequent pairs.
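    As an illustration of the pattern-based step (the paper's actual pattern set is not given in the abstract; these regexes are editorial examples only), a Python sketch extracting treatment and inducement pairs from abstract text:

      import re

      # Illustrative patterns only; a real system would use many more,
      # plus drug/disease dictionaries to filter candidate matches.
      PATTERNS = {
          "treatment": [
              r"(?P<drug>\w+) (?:is|was) (?:used|effective) in treating (?P<disease>\w+)",
              r"treatment of (?P<disease>\w+) with (?P<drug>\w+)",
          ],
          "inducement": [
              r"(?P<drug>\w+)[- ]induced (?P<disease>\w+)",
          ],
      }

      def extract_pairs(text):
          """Yield (relation, drug, disease) triples matched in the text."""
          for relation, patterns in PATTERNS.items():
              for pat in patterns:
                  for m in re.finditer(pat, text, flags=re.IGNORECASE):
                      yield relation, m.group("drug").lower(), m.group("disease").lower()

      print(list(extract_pairs("Cisplatin-induced nephrotoxicity is well documented.")))
      # [('inducement', 'cisplatin', 'nephrotoxicity')]

    The network embedding step the abstract describes would then score each candidate pair by the proximity of its drug and disease nodes in an embedding of the extracted co-occurrence network.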
  3. Wang, P.; Ma, Y.; Xie, H.; Wang, H.; Lu, J.; Xu, J.: "There is a gorilla holding a key on the book cover" : young children's known picture book search strategies (2022) 0.00
    0.003309216 = product of:
      0.009927647 = sum of:
        0.009927647 = product of:
          0.029782942 = sum of:
            0.029782942 = weight(_text_:29 in 443) [ClassicSimilarity], result of:
              0.029782942 = score(doc=443,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.19432661 = fieldWeight in 443, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=443)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    No existing information search system can support young children's known picture book searches, because the information is not organized according to children's cognitive abilities and needs. This study therefore explored young children's known picture book search strategies and extracted picture book search elements by simulating a search scenario and playing a picture book search game. The study found 29 elements that children used to search for known picture books and classified them along three dimensions: the concept category of an element, the element's status in the story, and where the element appears in a picture book. The study also revealed young children's general search strategy: children first use auditory elements that they hear from adults during reading; after receiving erroneous results, they add visual elements that they have seen themselves in the picture books. The findings not only help us understand young children's known-item search and reformulation strategies but also provide theoretical support for developing a picture book information organization schema in search systems.
