Search (14 results, page 1 of 1)

  • author_ss:"Sun, A."
  1. Sun, A.; Lim, E.-P.: Web unit-based mining of homepage relationships (2006) 0.01
    0.01416524 = product of:
      0.02833048 = sum of:
        0.02833048 = sum of:
          0.005451325 = weight(_text_:e in 5274) [ClassicSimilarity], result of:
            0.005451325 = score(doc=5274,freq=4.0), product of:
              0.048544887 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.03377341 = queryNorm
              0.112294525 = fieldWeight in 5274, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5274)
          0.022879155 = weight(_text_:22 in 5274) [ClassicSimilarity], result of:
            0.022879155 = score(doc=5274,freq=2.0), product of:
              0.11826873 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03377341 = queryNorm
              0.19345059 = fieldWeight in 5274, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5274)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:18:25
    Language
    e
  2. Sun, A.; Lim, E.-P.; Ng, W.-K.: Performance measurement framework for hierarchical text classification (2003) 0.00
    0.0016353976 = product of:
      0.003270795 = sum of:
        0.003270795 = product of:
          0.00654159 = sum of:
            0.00654159 = weight(_text_:e in 1808) [ClassicSimilarity], result of:
              0.00654159 = score(doc=1808,freq=4.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.13475344 = fieldWeight in 1808, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1808)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  3. Qu, B.; Cong, G.; Li, C.; Sun, A.; Chen, H.: An evaluation of classification models for question topic categorization (2012) 0.00
    0.0013628312 = product of:
      0.0027256624 = sum of:
        0.0027256624 = product of:
          0.005451325 = sum of:
            0.005451325 = weight(_text_:e in 237) [ClassicSimilarity], result of:
              0.005451325 = score(doc=237,freq=4.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.112294525 = fieldWeight in 237, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=237)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We study the problem of question topic classification using a very large real-world Community Question Answering (CQA) dataset from Yahoo! Answers. The dataset comprises 3.9 million questions, organized into more than 1,000 categories in a hierarchy. To the best of our knowledge, this is the first systematic evaluation of the performance of different classification methods on question topic classification as well as short texts. Specifically, we empirically evaluate the following in classifying questions into CQA categories: (a) the usefulness of n-gram features and bag-of-words features; (b) the performance of three standard classification algorithms (naive Bayes, maximum entropy, and support vector machines); (c) the performance of the state-of-the-art hierarchical classification algorithms; (d) the effect of training data size on performance; and (e) the effectiveness of the different components of CQA data, including subject, content, asker, and the best answer. The experimental results show which aspects are important for question topic classification in terms of both effectiveness and efficiency. We believe that the experimental findings from this study will be useful in real-world classification problems.
    Language
    e
  4. Li, J.; Sun, A.; Xing, Z.: To do or not to do : distill crowdsourced negative caveats to augment API documentation (2018) 0.00
    0.0011564007 = product of:
      0.0023128013 = sum of:
        0.0023128013 = product of:
          0.0046256026 = sum of:
            0.0046256026 = weight(_text_:e in 4575) [ClassicSimilarity], result of:
              0.0046256026 = score(doc=4575,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.09528506 = fieldWeight in 4575, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4575)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  5. Zheng, X.; Sun, A.: Collecting event-related tweets from Twitter stream (2019) 0.00
    0.0011564007 = product of:
      0.0023128013 = sum of:
        0.0023128013 = product of:
          0.0046256026 = sum of:
            0.0046256026 = weight(_text_:e in 4672) [ClassicSimilarity], result of:
              0.0046256026 = score(doc=4672,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.09528506 = fieldWeight in 4672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4672)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  6. Li, H.; Bhowmick, S.S.; Sun, A.: AffRank: affinity-driven ranking of products in online social rating networks (2011) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 4483) [ClassicSimilarity], result of:
              0.003854669 = score(doc=4483,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 4483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4483)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  7. Sun, A.; Bhowmick, S.S.; Nguyen, K.T.N.; Bai, G.: Tag-based social image retrieval : an empirical evaluation (2011) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 4938) [ClassicSimilarity], result of:
              0.003854669 = score(doc=4938,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 4938, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4938)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  8. Li, C.; Sun, A.; Datta, A.: TSDW: Two-stage word sense disambiguation using Wikipedia (2013) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 956) [ClassicSimilarity], result of:
              0.003854669 = score(doc=956,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 956, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=956)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  9. Ma, Z.; Sun, A.; Cong, G.: On predicting the popularity of newly emerging hashtags in Twitter (2013) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 967) [ClassicSimilarity], result of:
              0.003854669 = score(doc=967,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 967, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=967)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  10. Sedhai, S.; Sun, A.: An analysis of 14 Million tweets on hashtag-oriented spamming* (2017) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 3683) [ClassicSimilarity], result of:
              0.003854669 = score(doc=3683,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 3683, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3683)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  11. Phan, M.C.; Sun, A.: Collective named entity recognition in user comments via parameterized label propagation (2020) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 5815) [ClassicSimilarity], result of:
              0.003854669 = score(doc=5815,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 5815, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5815)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  12. Lee, G.E.; Sun, A.: Understanding the stability of medical concept embeddings (2021) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 159) [ClassicSimilarity], result of:
              0.003854669 = score(doc=159,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 159, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=159)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  13. Yu, M.; Sun, A.: Dataset versus reality : understanding model performance from the perspective of information need (2023) 0.00
    9.6366723E-4 = product of:
      0.0019273345 = sum of:
        0.0019273345 = product of:
          0.003854669 = sum of:
            0.003854669 = weight(_text_:e in 1073) [ClassicSimilarity], result of:
              0.003854669 = score(doc=1073,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.07940422 = fieldWeight in 1073, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1073)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  14. Li, C.; Sun, A.: Extracting fine-grained location with temporal awareness in tweets : a two-stage approach (2017) 0.00
    7.7093375E-4 = product of:
      0.0015418675 = sum of:
        0.0015418675 = product of:
          0.003083735 = sum of:
            0.003083735 = weight(_text_:e in 3686) [ClassicSimilarity], result of:
              0.003083735 = score(doc=3686,freq=2.0), product of:
                0.048544887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03377341 = queryNorm
                0.063523374 = fieldWeight in 3686, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3686)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
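
  Note on the score explanations
    The explanation trees above all follow the same arithmetic, which can be reproduced by hand. The sketch below is an illustration only: the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are inferred from the numbers shown in the listing (they match the classic Lucene TF-IDF scheme), fieldNorm and queryNorm are taken as given from the listing rather than derived, and the function names `idf` and `term_score` are hypothetical helpers, not part of any library.

```python
import math

# Constants copied from the explanation trees in the listing above.
MAX_DOCS = 44218
QUERY_NORM = 0.03377341


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, as inferred from the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # e.g. 2.0 for freq=4.0
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Result 1 (doc 5274): two matching terms, summed, then one coord(1/2) factor.
e_5274 = term_score(freq=4.0, doc_freq=28552, field_norm=0.0390625)
t22_5274 = term_score(freq=2.0, doc_freq=3622, field_norm=0.0390625)
score_5274 = (e_5274 + t22_5274) * 0.5

# Result 2 (doc 1808): one matching term, with two nested coord(1/2) factors.
e_1808 = term_score(freq=4.0, doc_freq=28552, field_norm=0.046875)
score_1808 = e_1808 * 0.5 * 0.5

print(f"doc 5274: {score_5274:.8f}")
print(f"doc 1808: {score_1808:.8f}")
```

    Under these assumptions the computed values agree with the listed totals for results 1 and 2 to within floating-point rounding. The remaining results differ only in termFreq and fieldNorm, so the same helper applies to them.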