Search (63 results, page 1 of 4)

  • theme_ss:"Automatisches Klassifizieren"
  1. Yu, W.; Gong, Y.: Document clustering by concept factorization (2004) 0.07
    0.07053663 = product of:
      0.14107326 = sum of:
        0.14107326 = product of:
          0.21160989 = sum of:
            0.13650061 = weight(_text_:y in 4084) [ClassicSimilarity], result of:
              0.13650061 = score(doc=4084,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.6380402 = fieldWeight in 4084, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4084)
            0.07510927 = weight(_text_:k in 4084) [ClassicSimilarity], result of:
              0.07510927 = score(doc=4084,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.47329018 = fieldWeight in 4084, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4084)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, et al.
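The nested breakdowns shown with each result are Lucene ClassicSimilarity explain output. As a sanity check, here is a minimal sketch (plain Python, with the idf, freq, queryNorm, fieldNorm, and coord values copied from the breakdown for result 1, and the standard Lucene TF-IDF combination assumed) that reproduces the displayed score:

```python
import math

# Factors printed in the explain tree for doc 4084 (result 1).
query_norm = 0.04445543
field_norm = 0.09375                      # fieldNorm(doc=4084)
terms = {                                 # idf and term frequency per matched query term
    "y": {"idf": 4.8124003, "freq": 2.0},
    "k": {"idf": 3.569778,  "freq": 2.0},
}

def term_score(idf, freq):
    query_weight = idf * query_norm                     # queryWeight = idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

total = sum(term_score(t["idf"], t["freq"]) for t in terms.values())
total *= 2 / 3    # coord(2/3), as printed in the explain tree
total *= 1 / 2    # coord(1/2), outer boolean clause
print(round(total, 8))   # ~0.07053663, matching the displayed score for result 1
```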
  2. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.07
    0.06500145 = sum of:
      0.052955255 = product of:
        0.21182102 = sum of:
          0.21182102 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.21182102 = score(doc=562,freq=2.0), product of:
              0.37689364 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.04445543 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.25 = coord(1/4)
      0.012046195 = product of:
        0.036138583 = sum of:
          0.036138583 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.036138583 = score(doc=562,freq=2.0), product of:
              0.15567535 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04445543 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  3. Shen, D.; Chen, Z.; Yang, Q.; Zeng, H.J.; Zhang, B.; Lu, Y.; Ma, W.Y.: Web page classification through summarization (2004) 0.06
    0.05878052 = product of:
      0.11756104 = sum of:
        0.11756104 = product of:
          0.17634156 = sum of:
            0.1137505 = weight(_text_:y in 4132) [ClassicSimilarity], result of:
              0.1137505 = score(doc=4132,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.53170013 = fieldWeight in 4132, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4132)
            0.06259105 = weight(_text_:k in 4132) [ClassicSimilarity], result of:
              0.06259105 = score(doc=4132,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.39440846 = fieldWeight in 4132, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4132)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, et al.
  4. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.06
    0.057993826 = product of:
      0.11598765 = sum of:
        0.11598765 = product of:
          0.17398147 = sum of:
            0.1137505 = weight(_text_:y in 2748) [ClassicSimilarity], result of:
              0.1137505 = score(doc=2748,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.53170013 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
            0.060230974 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.060230974 = score(doc=2748,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  5. Chung, Y.-M.; Noh, Y.-H.: Developing a specialized directory system by automatically classifying Web documents (2003) 0.04
    0.04469171 = product of:
      0.08938342 = sum of:
        0.08938342 = product of:
          0.13407513 = sum of:
            0.096520506 = weight(_text_:y in 1566) [ClassicSimilarity], result of:
              0.096520506 = score(doc=1566,freq=4.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.45116252 = fieldWeight in 1566, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1566)
            0.037554637 = weight(_text_:k in 1566) [ClassicSimilarity], result of:
              0.037554637 = score(doc=1566,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.23664509 = fieldWeight in 1566, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1566)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    This study developed a specialized directory system using an automatic classification technique. Economics was selected as the subject field for the classification experiments with Web documents. The classification scheme of the directory follows the DDC, and subject terms representing each class number or subject category were selected from the DDC table to construct a representative term dictionary. In collecting and classifying the Web documents, various strategies were tested in order to find the optimal thresholds. In the classification experiments, Web documents in economics were classified into a total of 757 hierarchical subject categories built from the DDC scheme. The first and second experiments using the representative term dictionary resulted in relatively high precision ratios of 77% and 60%, respectively. The third experiment, employing a machine learning-based k-nearest neighbours (kNN) classifier in a closed experimental setting, achieved a precision ratio of 96%. This implies that it is possible to enhance the classification performance by applying a hybrid method combining a dictionary-based technique and a kNN classifier.
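A minimal sketch of the hybrid idea suggested in the abstract's last sentence — try a representative-term dictionary first, fall back to a kNN classifier — with placeholder categories, dictionary terms, and training documents rather than the study's DDC-based setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Placeholder representative-term dictionary: subject category -> characteristic terms.
term_dict = {
    "330 Economics": {"market", "inflation", "trade"},
    "332 Finance":   {"bank", "credit", "interest"},
}

train_docs = ["market trade policy", "bank credit risk", "inflation and trade"]
train_labels = ["330 Economics", "332 Finance", "330 Economics"]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs)
knn = KNeighborsClassifier(n_neighbors=1, metric="cosine").fit(X_train, train_labels)

def classify(doc: str) -> str:
    """Dictionary match first; fall back to the kNN classifier."""
    tokens = set(doc.lower().split())
    scores = {cat: len(tokens & terms) for cat, terms in term_dict.items()}
    best = max(scores, key=scores.get)
    if scores[best] > 0:
        return best
    return knn.predict(vectorizer.transform([doc]))[0]

print(classify("interest rates and credit"))   # dictionary hit
print(classify("risk and policy outlook"))     # no dictionary hit -> kNN fallback
```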
  6. Yang, Y.; Liu, X.: A re-examination of text categorization methods (1999) 0.04
    0.041146368 = product of:
      0.082292736 = sum of:
        0.082292736 = product of:
          0.123439096 = sum of:
            0.07962535 = weight(_text_:y in 3386) [ClassicSimilarity], result of:
              0.07962535 = score(doc=3386,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3721901 = fieldWeight in 3386, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3386)
            0.043813743 = weight(_text_:k in 3386) [ClassicSimilarity], result of:
              0.043813743 = score(doc=3386,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.27608594 = fieldWeight in 3386, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3386)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    This paper reports a controlled study with statistical significance tests on five text categorization methods: the Support Vector Machines (SVM), a k-Nearest Neighbor (kNN) classifier, a neural network (NNet) approach, the Linear Least-Squares Fit (LLSF) mapping and a Naive Bayes (NB) classifier. We focus on the robustness of these methods in dealing with a skewed category distribution, and their performance as a function of the training-set category frequency. Our results show that SVM, kNN and LLSF significantly outperform NNet and NB when the number of positive training instances per category is small (fewer than ten), and that all the methods perform comparably when the categories are sufficiently common (over 300 instances).
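A hedged sketch of this kind of head-to-head comparison using off-the-shelf scikit-learn classifiers on the standard 20 Newsgroups sample (downloaded on first run). The two-category corpus is only a placeholder for the skewed collections used in the paper, and LLSF is omitted since scikit-learn has no stock implementation of it:

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import f1_score

# Placeholder corpus: two newsgroups stand in for the paper's category sets.
cats = ["sci.space", "rec.autos"]
train = fetch_20newsgroups(subset="train", categories=cats)
test = fetch_20newsgroups(subset="test", categories=cats)

vec = TfidfVectorizer(max_features=5000)
X_tr, X_te = vec.fit_transform(train.data), vec.transform(test.data)

classifiers = {
    "SVM":  LinearSVC(),
    "kNN":  KNeighborsClassifier(n_neighbors=15, metric="cosine"),
    "NNet": MLPClassifier(hidden_layer_sizes=(64,), max_iter=200),
    "NB":   MultinomialNB(),
}
for name, clf in classifiers.items():
    clf.fit(X_tr, train.target)
    print(name, f1_score(test.target, clf.predict(X_te), average="micro"))
```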
  7. Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.04
    0.04059568 = product of:
      0.08119136 = sum of:
        0.08119136 = product of:
          0.121787034 = sum of:
            0.07962535 = weight(_text_:y in 5273) [ClassicSimilarity], result of:
              0.07962535 = score(doc=5273,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3721901 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
            0.04216168 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.04216168 = score(doc=5273,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:24:52
  8. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.03
    0.033728372 = product of:
      0.067456745 = sum of:
        0.067456745 = product of:
          0.10118511 = sum of:
            0.065046534 = weight(_text_:k in 690) [ClassicSimilarity], result of:
              0.065046534 = score(doc=690,freq=6.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.40988132 = fieldWeight in 690, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
            0.036138583 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.036138583 = score(doc=690,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    We describe the latent semantic indexing subspace signature model (LSISSM) for semantic content representation of unstructured text. Grounded on singular value decomposition, the model represents terms and documents by the distribution signatures of their statistical contribution across the top-ranking latent concept dimensions. LSISSM matches term signatures with document signatures according to their mapping coherence between the latent semantic indexing (LSI) term subspace and the LSI document subspace. LSISSM performs feature reduction and finds a low-rank approximation of scalable and sparse term-document matrices. Experiments demonstrate that this approach significantly improves the performance of major clustering algorithms such as standard K-means and self-organizing maps compared with the vector space model and the traditional LSI model. The unique contribution ranking mechanism in LSISSM also improves the initialization of standard K-means compared with the random seeding procedure, which sometimes causes low efficiency and effectiveness of clustering. A two-stage initialization strategy based on LSISSM significantly reduces the running time of standard K-means procedures.
    Date
    23. 3.2013 13:22:36
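The abstract above describes an LSI-style subspace representation feeding standard K-means. Below is a minimal sketch of that general pipeline — TF-IDF term-document matrix, truncated SVD for rank reduction, then K-means on the reduced document vectors — with placeholder documents; it is not the LSISSM signature-matching model or its contribution-ranking initialization:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

docs = [
    "latent semantic indexing maps terms into concept space",
    "singular value decomposition yields low rank approximations",
    "k-means assigns documents to the nearest cluster centroid",
    "cluster centroids are recomputed until assignments converge",
]

# TF-IDF term-document matrix, then LSI-style rank reduction via truncated SVD.
X = TfidfVectorizer().fit_transform(docs)
lsi = TruncatedSVD(n_components=2, random_state=0)
X_lsi = lsi.fit_transform(X)

# Standard K-means with random seeding on the reduced document vectors
# (the baseline that LSISSM's ranking-based initialization is said to improve on).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_lsi)
print(labels)
```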
  9. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.03
    0.028658476 = product of:
      0.05731695 = sum of:
        0.05731695 = product of:
          0.08597542 = sum of:
            0.043813743 = weight(_text_:k in 2560) [ClassicSimilarity], result of:
              0.043813743 = score(doc=2560,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.27608594 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
            0.04216168 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.04216168 = score(doc=2560,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Date
    22. 9.2008 18:31:54
  10. Sparck Jones, K.: Automatic classification (1976) 0.02
    0.016690949 = product of:
      0.033381898 = sum of:
        0.033381898 = product of:
          0.10014569 = sum of:
            0.10014569 = weight(_text_:k in 2908) [ClassicSimilarity], result of:
              0.10014569 = score(doc=2908,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.63105357 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.125 = fieldNorm(doc=2908)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  11. Khoo, C.S.G.; Ng, K.; Ou, S.: ¬An exploratory study of human clustering of Web pages (2003) 0.02
    0.016376272 = product of:
      0.032752544 = sum of:
        0.032752544 = product of:
          0.04912881 = sum of:
            0.025036423 = weight(_text_:k in 2741) [ClassicSimilarity], result of:
              0.025036423 = score(doc=2741,freq=2.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.15776339 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
            0.02409239 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
              0.02409239 = score(doc=2741,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.15476047 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Date
    12. 9.2004 9:56:22
  12. Chung, Y.M.; Lee, J.Y.: A corpus-based approach to comparative evaluation of statistical term association measures (2001) 0.01
    0.013405627 = product of:
      0.026811253 = sum of:
        0.026811253 = product of:
          0.080433756 = sum of:
            0.080433756 = weight(_text_:y in 5769) [ClassicSimilarity], result of:
              0.080433756 = score(doc=5769,freq=4.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.37596878 = fieldWeight in 5769, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5769)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Statistical association measures have been widely applied in information retrieval research, usually employing a clustering of documents or terms on the basis of their relationships. Applications of the association measures for term clustering include automatic thesaurus construction and query expansion. This research evaluates the similarity of six association measures by comparing the relationship and behavior they demonstrate in various analyses of a test corpus. Analysis techniques include comparisons of highly ranked term pairs and term clusters, analyses of the correlation among the association measures using Pearson's correlation coefficient and MDS mapping, and an analysis of the impact of term frequency on the association values by means of z-scores. The major findings of the study are as follows: First, the most similar association measures are mutual information and Yule's coefficient of colligation Y, whereas the cosine and Jaccard coefficients, as well as the X**2 statistic and the likelihood ratio, demonstrate quite similar behavior for terms with high frequency. Second, among all the measures, the X**2 statistic is the least affected by the frequency of terms. Third, although the cosine and Jaccard coefficients tend to emphasize high-frequency terms, mutual information and Yule's Y seem to overestimate rare terms.
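For reference, a minimal sketch computing the kinds of measures compared above — cosine, Jaccard, (pointwise) mutual information, Yule's coefficient of colligation Y, the X**2 statistic, and the log-likelihood ratio — from a single 2x2 term co-occurrence table; the counts a, b, c, d are placeholder values, not corpus data from the study:

```python
import math

# 2x2 contingency table for terms X and Y over N documents:
# a = both occur, b = only X, c = only Y, d = neither.
a, b, c, d = 40.0, 10.0, 15.0, 935.0
N = a + b + c + d

cosine  = a / math.sqrt((a + b) * (a + c))
jaccard = a / (a + b + c)
mi      = math.log2((a * N) / ((a + b) * (a + c)))            # pointwise mutual information
yule_y  = (math.sqrt(a * d) - math.sqrt(b * c)) / (math.sqrt(a * d) + math.sqrt(b * c))
chi2    = N * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
llr     = 2 * sum(                                            # G^2 = 2 * sum O * ln(O/E)
    obs * math.log(obs / exp)
    for obs, exp in [
        (a, (a + b) * (a + c) / N), (b, (a + b) * (b + d) / N),
        (c, (c + d) * (a + c) / N), (d, (c + d) * (b + d) / N),
    ]
    if obs > 0
)

for name, val in [("cosine", cosine), ("Jaccard", jaccard), ("MI", mi),
                  ("Yule's Y", yule_y), ("X**2", chi2), ("log-likelihood ratio", llr)]:
    print(f"{name:22s}{val:.4f}")
```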
  13. Huang, Y.-L.: A theoretic and empirical research of cluster indexing for Mandarine Chinese full text document (1998) 0.01
    0.013270892 = product of:
      0.026541784 = sum of:
        0.026541784 = product of:
          0.07962535 = sum of:
            0.07962535 = weight(_text_:y in 513) [ClassicSimilarity], result of:
              0.07962535 = score(doc=513,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3721901 = fieldWeight in 513, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=513)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  14. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01
    0.012046195 = product of:
      0.02409239 = sum of:
        0.02409239 = product of:
          0.072277166 = sum of:
            0.072277166 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.072277166 = score(doc=1046,freq=2.0), product of:
                0.15567535 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04445543 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 14:17:22
  15. Kwon, O.W.; Lee, J.H.: Text categorization based on k-nearest neighbor approach for web site classification (2003) 0.01
    0.011663156 = product of:
      0.023326311 = sum of:
        0.023326311 = product of:
          0.06997893 = sum of:
            0.06997893 = weight(_text_:k in 1070) [ClassicSimilarity], result of:
              0.06997893 = score(doc=1070,freq=10.0), product of:
                0.15869603 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.04445543 = queryNorm
                0.44096208 = fieldWeight in 1070, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1070)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Automatic categorization is a viable method to deal with the scaling problem on the World Wide Web. For Web site classification, this paper proposes the use of Web pages linked with the home page in a different manner from the sole use of home pages in previous research. To implement our proposed method, we derive a scheme for Web site classification based on the k-nearest neighbor (k-NN) approach. It consists of three phases: Web page selection (connectivity analysis), Web page classification, and Web site classification. Given a Web site, the Web page selection chooses several representative Web pages using connectivity analysis. The k-NN classifier next classifies each of the selected Web pages. Finally, the classified Web pages are extended to a classification of the entire Web site. To improve performance, we supplement the k-NN approach with a feature selection method and a term weighting scheme using markup tags, and also reform its document-document similarity measure. In our experiments on a Korean commercial Web directory, the proposed system, using both a home page and its linked pages, improved the performance of micro-averaging breakeven point by 30.02%, compared with an ordinary classification which uses a home page only.
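A minimal sketch of the three-phase scheme described above — page selection, per-page k-NN classification, and aggregation to a site label — with placeholder training pages and a simple length-based selection heuristic standing in for the paper's connectivity analysis and markup-tag term weighting:

```python
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Placeholder labelled pages for training the page-level k-NN classifier.
train_pages = ["used cars and dealers", "car parts catalogue",
               "hotel booking and travel deals", "cheap flight tickets"]
train_labels = ["autos", "autos", "travel", "travel"]

vec = TfidfVectorizer()
knn = KNeighborsClassifier(n_neighbors=1, metric="cosine")
knn.fit(vec.fit_transform(train_pages), train_labels)

def classify_site(home_page: str, linked_pages: list[str], max_pages: int = 3) -> str:
    """Phase 1: pick a few linked pages (placeholder heuristic: longest first).
    Phase 2: classify each selected page with k-NN.
    Phase 3: aggregate page labels into a site label by majority vote."""
    selected = [home_page] + sorted(linked_pages, key=len, reverse=True)[:max_pages]
    page_labels = knn.predict(vec.transform(selected))
    return Counter(page_labels).most_common(1)[0][0]

site = classify_site("welcome to our car dealership",
                     ["new and used cars for sale", "car financing options", "contact us"])
print(site)   # site-level label from the page-level majority vote
```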
  16. Wu, K.J.; Chen, M.-C.; Sun, Y.: Automatic topics discovery from hyperlinked documents (2004) 0.01
    0.011375051 = product of:
      0.022750102 = sum of:
        0.022750102 = product of:
          0.068250306 = sum of:
            0.068250306 = weight(_text_:y in 2563) [ClassicSimilarity], result of:
              0.068250306 = score(doc=2563,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3190201 = fieldWeight in 2563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2563)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  17. Yoon, Y.; Lee, G.G.: Efficient implementation of associative classifiers for document classification (2007) 0.01
    0.011375051 = product of:
      0.022750102 = sum of:
        0.022750102 = product of:
          0.068250306 = sum of:
            0.068250306 = weight(_text_:y in 909) [ClassicSimilarity], result of:
              0.068250306 = score(doc=909,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3190201 = fieldWeight in 909, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.046875 = fieldNorm(doc=909)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  18. Ko, Y.; Seo, J.: Text classification from unlabeled documents with bootstrapping and feature projection techniques (2009) 0.01
    0.011375051 = product of:
      0.022750102 = sum of:
        0.022750102 = product of:
          0.068250306 = sum of:
            0.068250306 = weight(_text_:y in 2452) [ClassicSimilarity], result of:
              0.068250306 = score(doc=2452,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3190201 = fieldWeight in 2452, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2452)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  19. Xu, Y.; Bernard, A.: Knowledge organization through statistical computation : a new approach (2009) 0.01
    0.011375051 = product of:
      0.022750102 = sum of:
        0.022750102 = product of:
          0.068250306 = sum of:
            0.068250306 = weight(_text_:y in 3252) [ClassicSimilarity], result of:
              0.068250306 = score(doc=3252,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3190201 = fieldWeight in 3252, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3252)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  20. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.01
    0.011375051 = product of:
      0.022750102 = sum of:
        0.022750102 = product of:
          0.068250306 = sum of:
            0.068250306 = weight(_text_:y in 3464) [ClassicSimilarity], result of:
              0.068250306 = score(doc=3464,freq=2.0), product of:
                0.21393733 = queryWeight, product of:
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.04445543 = queryNorm
                0.3190201 = fieldWeight in 3464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.8124003 = idf(docFreq=976, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    

Languages

  • e 55
  • d 6
  • a 1
  • chi 1

Types

  • a 57
  • el 9
  • r 1