Search (7 results, page 1 of 1)

  • author_ss:"Wang, J."
  1. Shen, R.; Wang, J.; Fox, E.A.: A Lightweight Protocol between Digital Libraries and Visualization Systems (2002) 0.03
    0.026094392 = product of:
      0.052188784 = sum of:
        0.052188784 = product of:
          0.10437757 = sum of:
            0.10437757 = weight(_text_:22 in 666) [ClassicSimilarity], result of:
              0.10437757 = score(doc=666,freq=4.0), product of:
                0.15896842 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045395818 = queryNorm
                0.6565931 = fieldWeight in 666, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=666)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 2.2003 17:25:39
    22. 2.2003 18:15:14
  2. Wang, J.: An extensive study on automated Dewey Decimal Classification (2009) 0.02
    0.01900376 = product of:
      0.03800752 = sum of:
        0.03800752 = product of:
          0.07601504 = sum of:
            0.07601504 = weight(_text_:bibliographic in 3172) [ClassicSimilarity], result of:
              0.07601504 = score(doc=3172,freq=8.0), product of:
                0.17672792 = queryWeight, product of:
                  3.893044 = idf(docFreq=2449, maxDocs=44218)
                  0.045395818 = queryNorm
                0.43012467 = fieldWeight in 3172, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.893044 = idf(docFreq=2449, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3172)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this paper, we present a theoretical analysis and extensive experiments on the automated assignment of Dewey Decimal Classification (DDC) classes to bibliographic data with a supervised machine-learning approach. Library classification systems, such as the DDC, impose great obstacles on state-of-the-art text categorization (TC) technologies, including deep hierarchy, data sparseness, and skewed distribution. We first analyze statistically the document and category distributions over the DDC, and discuss the obstacles imposed by bibliographic corpora and library classification schemes on TC technology. To overcome these obstacles, we propose an innovative algorithm to reshape the DDC structure into a balanced virtual tree by balancing the category distribution and flattening the hierarchy. To improve the classification effectiveness to a level acceptable to real-world applications, we propose an interactive classification model that is able to predict a class of any depth within a limited number of user interactions. The experiments are conducted on a large bibliographic collection created by the Library of Congress within the science and technology domains over 10 years. With no more than three interactions, a classification accuracy of nearly 90% is achieved, thus providing a practical solution to the automatic bibliographic classification problem.
  3. Wang, J.: Automatic thesaurus development : term extraction from title metadata (2006) 0.01
    0.0134376865 = product of:
      0.026875373 = sum of:
        0.026875373 = product of:
          0.053750746 = sum of:
            0.053750746 = weight(_text_:bibliographic in 5063) [ClassicSimilarity], result of:
              0.053750746 = score(doc=5063,freq=4.0), product of:
                0.17672792 = queryWeight, product of:
                  3.893044 = idf(docFreq=2449, maxDocs=44218)
                  0.045395818 = queryNorm
                0.30414405 = fieldWeight in 5063, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.893044 = idf(docFreq=2449, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5063)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The application of thesauri in networked environments is seriously hampered by the challenges of introducing new concepts and terminology into the formal controlled vocabulary, which is critical for enhancing its retrieval capability. The author describes an automated process of adding new terms to thesauri as entry vocabulary by analyzing the association between words/phrases extracted from bibliographic titles and subject descriptors in the metadata record (subject descriptors are terms assigned from controlled vocabularies of thesauri to describe the subjects of the objects [e.g., books, articles] represented by the metadata records). The investigated approach uses a corpus of metadata for scientific and technical (S&T) publications in which the titles contain substantive words for key topics. The three steps of the method are (a) extracting words and phrases from the title field of the metadata; (b) applying a method to identify and select the specific and meaningful keywords based on the associated controlled vocabulary terms from the thesaurus used to catalog the objects; and (c) inserting selected keywords into the thesaurus as new terms (most of them are in hierarchical relationships with the existing concepts), thereby updating the thesaurus with new terminology that is being used in the literature. The effectiveness of the method was demonstrated by an experiment with the Chinese Classification Thesaurus (CCT) and bibliographic data in China Machine-Readable Cataloging Record (MARC) format (CNMARC) provided by Peking University Library. This approach is equally effective in large-scale collections and in other languages.
  4. Hicks, D.; Wang, J.: Coverage and overlap of the new social sciences and humanities journal lists (2011) 0.01
    0.00922576 = product of:
      0.01845152 = sum of:
        0.01845152 = product of:
          0.03690304 = sum of:
            0.03690304 = weight(_text_:22 in 4192) [ClassicSimilarity], result of:
              0.03690304 = score(doc=4192,freq=2.0), product of:
                0.15896842 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045395818 = queryNorm
                0.23214069 = fieldWeight in 4192, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4192)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2011 13:21:28
  5. He, R.; Wang, J.; Tian, J.; Chu, C.-T.; Mauney, B.; Perisic, I.: Session analysis of people search within a professional social network (2013) 0.01
    0.0076881335 = product of:
      0.015376267 = sum of:
        0.015376267 = product of:
          0.030752534 = sum of:
            0.030752534 = weight(_text_:22 in 743) [ClassicSimilarity], result of:
              0.030752534 = score(doc=743,freq=2.0), product of:
                0.15896842 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045395818 = queryNorm
                0.19345059 = fieldWeight in 743, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=743)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    19. 4.2013 20:31:22
  6. Jiang, Z.; Gu, Q.; Yin, Y.; Wang, J.; Chen, D.: GRAW+ : a two-view graph propagation method with word coupling for readability assessment (2019) 0.01
    0.0076881335 = product of:
      0.015376267 = sum of:
        0.015376267 = product of:
          0.030752534 = sum of:
            0.030752534 = weight(_text_:22 in 5218) [ClassicSimilarity], result of:
              0.030752534 = score(doc=5218,freq=2.0), product of:
                0.15896842 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045395818 = queryNorm
                0.19345059 = fieldWeight in 5218, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5218)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    15. 4.2019 13:46:22
  7. Wang, J.; Halffman, W.; Zhang, Y.H.: Sorting out journals : the proliferation of journal lists in China (2023) 0.01
    0.0076881335 = product of:
      0.015376267 = sum of:
        0.015376267 = product of:
          0.030752534 = sum of:
            0.030752534 = weight(_text_:22 in 1055) [ClassicSimilarity], result of:
              0.030752534 = score(doc=1055,freq=2.0), product of:
                0.15896842 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045395818 = queryNorm
                0.19345059 = fieldWeight in 1055, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1055)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.2023 16:39:23
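
The score explanations listed with each result all follow the same arithmetic, and it can help to see it reproduced once. The sketch below recomputes the final score from the leaf values shown in the trees, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical and chosen only for this illustration; they are not part of any search engine API. The search engine computes in single precision, so the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # Assumed ClassicSimilarity tf: square root of the raw term frequency
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm,
                  coords=(0.5, 0.5)):
    """Recompute a score from the leaf values of an explanation tree.

    Hypothetical helper: mirrors the nesting shown in the listing, where
    score = queryWeight * fieldWeight, then scaled by each coord factor.
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    score = query_weight * field_weight
    for c in coords:                                     # two coord(1/2) factors
        score *= c
    return score


# Result 1 (doc 666): term "22", freq=4.0, docFreq=3622, fieldNorm=0.09375
score_666 = explain_score(freq=4.0, doc_freq=3622, max_docs=44218,
                          query_norm=0.045395818, field_norm=0.09375)

# Result 2 (doc 3172): term "bibliographic", freq=8.0, docFreq=2449
score_3172 = explain_score(freq=8.0, doc_freq=2449, max_docs=44218,
                           query_norm=0.045395818, field_norm=0.0390625)

print(f"doc 666:  {score_666:.6f}")
print(f"doc 3172: {score_3172:.6f}")
```

The two values should agree with the top-level scores listed for results 1 and 2 (0.026094392 and 0.01900376) to within single-precision rounding. The ranking differences among results 4 to 7 come down to fieldNorm alone, since those entries share the same term, frequency, and idf.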