Search (7 results, page 1 of 1)

  • author_ss:"Li, M."
  1. Chen, Z.; Wenyin, L.; Zhang, F.; Li, M.; Zhang, H.: Web mining for Web image retrieval (2001) 0.02
    0.017316712 = product of:
      0.08081132 = sum of:
        0.046129078 = weight(_text_:web in 6521) [ClassicSimilarity], result of:
          0.046129078 = score(doc=6521,freq=14.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.47698978 = fieldWeight in 6521, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6521)
        0.008737902 = weight(_text_:information in 6521) [ClassicSimilarity], result of:
          0.008737902 = score(doc=6521,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16796975 = fieldWeight in 6521, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6521)
        0.025944345 = weight(_text_:retrieval in 6521) [ClassicSimilarity], result of:
          0.025944345 = score(doc=6521,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.28943354 = fieldWeight in 6521, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6521)
      0.21428572 = coord(3/14)
    
    Abstract
    The popularity of digital images is rapidly increasing due to improving digital imaging technologies and convenient availability facilitated by the Internet. However, finding user-intended images on the Internet is nontrivial. The main reason is that Web images are usually not annotated using semantic descriptors. In this article, we present an effective approach to and a prototype system for image retrieval from the Internet using Web mining. The system can also serve as a Web image search engine. One of the key ideas in the approach is to extract the text information on the Web pages to semantically describe the images. The text description is then combined with other low-level image features in the image similarity assessment. Another main contribution of this work is that we apply data mining on the log of users' feedback to improve image retrieval performance in three aspects. First, the accuracy of the document space model of image representation obtained from the Web pages is improved by removing clutter and irrelevant text information. Second, we construct the user space model of users' representation of images, which is then combined with the document space model to eliminate the mismatch between the page author's expression and the user's understanding and expectation. Third, we discover the relationship between low-level and high-level features, which is extremely useful for assigning the low-level features' weights in similarity assessment.
    Source
    Journal of the American Society for Information Science and Technology. 52(2001) no.10, S.831-839
  2. Li, M.; Li, H.; Zhou, Z.-H.: Semi-supervised document retrieval (2009) 0.01
    0.010824421 = product of:
      0.050513964 = sum of:
        0.017435152 = weight(_text_:web in 4218) [ClassicSimilarity], result of:
          0.017435152 = score(doc=4218,freq=2.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.18028519 = fieldWeight in 4218, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4218)
        0.0071344664 = weight(_text_:information in 4218) [ClassicSimilarity], result of:
          0.0071344664 = score(doc=4218,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13714671 = fieldWeight in 4218, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4218)
        0.025944345 = weight(_text_:retrieval in 4218) [ClassicSimilarity], result of:
          0.025944345 = score(doc=4218,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.28943354 = fieldWeight in 4218, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4218)
      0.21428572 = coord(3/14)
    
    Abstract
    This paper proposes a new machine learning method for constructing ranking models in document retrieval. The method, which is referred to as SSRank, aims to use the advantages of both the traditional Information Retrieval (IR) methods and the supervised learning methods for IR proposed recently. The advantages include the use of a limited amount of labeled data and rich model representation. To do so, the method adopts a semi-supervised learning framework in ranking model construction. Specifically, given a small number of labeled documents with respect to some queries, the method effectively labels the unlabeled documents for the queries. It then uses all the labeled data to train a machine learning model (in our case, a Neural Network). In the data labeling, the method also makes use of a traditional IR model (in our case, BM25). A stopping criterion based on machine learning theory is given for the data labeling process. Experimental results on three benchmark datasets and one web search dataset indicate that SSRank consistently and almost always significantly outperforms the baseline methods (unsupervised and supervised learning methods), given the same amount of labeled data. This is because SSRank can effectively leverage unlabeled data in learning.
    Source
    Information processing and management. 45(2009) no.3, S.341-355
  3. Wenyin, L.; Chen, Z.; Li, M.; Zhang, H.: A media agent for automatically building a personalized semantic index of Web media objects (2001) 0.01
    0.008186067 = product of:
      0.05730247 = sum of:
        0.051248677 = weight(_text_:web in 6522) [ClassicSimilarity], result of:
          0.051248677 = score(doc=6522,freq=12.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.5299281 = fieldWeight in 6522, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=6522)
        0.0060537956 = weight(_text_:information in 6522) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=6522,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 6522, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=6522)
      0.14285715 = coord(2/14)
    
    Abstract
    A novel idea, the media agent, is briefly presented; it can automatically build a personalized semantic index of Web media objects for each particular user. Because the Web is a rich source of multimedia data and the text content on Web pages is usually semantically related to the media objects on the same pages, the media agent can automatically collect the URLs and related text, and then build the index of the multimedia data, on behalf of the user whenever and wherever she accesses these multimedia data or their container Web pages. Moreover, the media agent can also use an off-line crawler to build the index for those multimedia objects that are relevant to the user's favorites but have not yet been accessed by the user. When the user wants to find these multimedia data once again, the semantic index facilitates text-based search for her.
    Source
    Journal of the American Society for Information Science and Technology. 52(2001) no.10, S.853-855
    Theme
    Web-Agenten
  4. Gu, D.; Liu, H.; Zhao, H.; Yang, X.; Li, M.; Lian, C.: A deep learning and clustering-based topic consistency modeling framework for matching health information supply and demand (2024) 0.00
    0.0012482718 = product of:
      0.017475804 = sum of:
        0.017475804 = weight(_text_:information in 1209) [ClassicSimilarity], result of:
          0.017475804 = score(doc=1209,freq=24.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3359395 = fieldWeight in 1209, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1209)
      0.071428575 = coord(1/14)
    
    Abstract
    Improving health literacy through health information dissemination is one of the most economical and effective mechanisms for improving population health. This process needs to fully accommodate the thematic suitability of health information supply and demand and reduce the impact of information overload and supply-demand mismatch on the enthusiasm of health information acquisition. We propose a health information topic modeling analysis framework that integrates deep learning methods and clustering techniques to model the supply-side and demand-side topics of health information and to quantify the thematic alignment of supply and demand. To validate the effectiveness of the framework, we have conducted an empirical analysis on a dataset with 90,418 pieces of textual data from two prominent social networking platforms. The results show that the supply of health information in general has not yet met the demand, the demand for health information has not yet been met to a considerable extent, especially for disease-related topics, and there is clear inconsistency between the supply and demand sides for the same health topics. Public health policy-making departments and content producers can adjust their information selection and dissemination strategies according to the distribution of identified health topics, thereby improving the effectiveness of public health information dissemination.
    Source
    Journal of the Association for Information Science and Technology. 75(2023) no.2, S.152-166
  5. Bennett, C.H.; Li, M.; Ma, B.: Die Evolution der Kettenbriefe (2004) 0.00
    8.64828E-4 = product of:
      0.012107591 = sum of:
        0.012107591 = weight(_text_:information in 2418) [ClassicSimilarity], result of:
          0.012107591 = score(doc=2418,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 2418, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=2418)
      0.071428575 = coord(1/14)
    
    Theme
    Information
  6. Bu, Y.; Li, M.; Gu, W.; Huang, W.-b.: Topic diversity : a discipline scheme-free diversity measurement for journals (2021) 0.00
    5.04483E-4 = product of:
      0.0070627616 = sum of:
        0.0070627616 = weight(_text_:information in 209) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=209,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 209, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=209)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 72(2021) no.5, S.523-539
  7. Liu, X.; Bu, Y.; Li, M.; Li, J.: Monodisciplinary collaboration disrupts science more than multidisciplinary collaboration (2024) 0.00
    4.32414E-4 = product of:
      0.0060537956 = sum of:
        0.0060537956 = weight(_text_:information in 1202) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=1202,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 1202, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1202)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 75(2023) no.1, S.59-78
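
The score breakdowns in the listing above can be reproduced by hand. The following is a minimal Python sketch, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the tf and idf values shown for ClassicSimilarity. The function names and constants are illustrative and not part of any search API; queryNorm, fieldNorm, and the coord denominator are copied from the listing rather than derived.

```python
import math

# Values copied from the listing above (assumptions of this sketch, not derived here).
MAX_DOCS = 44218          # maxDocs in every idf(...) line
QUERY_NORM = 0.029633347  # queryNorm in every queryWeight block
QUERY_CLAUSES = 14        # denominator of coord(n/14)


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm):
    """Sum of the term scores, scaled by coord(matched / total clauses).

    `terms` is a list of (termFreq, docFreq) pairs for the matching terms.
    """
    total = sum(term_score(freq, doc_freq, field_norm) for freq, doc_freq in terms)
    coord = len(terms) / QUERY_CLAUSES
    return total * coord


# Result 1 (doc 6521): web, information, retrieval; fieldNorm = 0.0390625
score_6521 = document_score(
    [(14.0, 4597), (6.0, 20772), (6.0, 5836)],
    field_norm=0.0390625,
)
print(f"doc 6521: {score_6521:.9f}")
```

The printed value should agree with the 0.017316712 shown for result 1 up to floating-point rounding, since the listing was computed in single precision and Python uses double precision. The coord factor explains why results 4 to 7, which match only one of the 14 query clauses, score so much lower than results 1 and 2, which match three.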