Search (6 results, page 1 of 1)

  • × theme_ss:"Automatisches Klassifizieren"
  • × theme_ss:"Internet"
  • × year_i:[2000 TO 2010}
  1. Chung, Y.-M.; Noh, Y.-H.: Developing a specialized directory system by automatically classifying Web documents (2003) 0.00
    0.0044022407 = product of:
      0.013206721 = sum of:
        0.013206721 = weight(_text_:a in 1566) [ClassicSimilarity], result of:
          0.013206721 = score(doc=1566,freq=22.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.25351265 = fieldWeight in 1566, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=1566)
      0.33333334 = coord(1/3)
    
    Abstract
    This study developed a specialized directory system using an automatic classification technique. Economics was selected as the subject field for the classification experiments with Web documents. The classification scheme of the directory follows the DDC, and subject terms representing each class number or subject category were selected from the DDC table to construct a representative term dictionary. In collecting and classifying the Web documents, various strategies were tested in order to find the optimal thresholds. In the classification experiments, Web documents in economics were classified into a total of 757 hierarchical subject categories built from the DDC scheme. The first and second experiments using the representative term dictionary resulted in relatively high precision ratios of 77 and 60%, respectively. The third experiment employing a machine learning-based k-nearest neighbours (kNN) classifier in a closed experimental setting achieved a precision ratio of 96%. This implies that it is possible to enhance the classification performance by applying a hybrid method combining a dictionary-based technique and a kNN classifier
    Type
    a
  2. Koch, T.; Ardö, A.: Automatic classification of full-text HTML-documents from one specific subject area : DESIRE II D3.6a, Working Paper 2 (2000) 0.00
    0.003065327 = product of:
      0.009195981 = sum of:
        0.009195981 = weight(_text_:a in 1667) [ClassicSimilarity], result of:
          0.009195981 = score(doc=1667,freq=6.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.17652355 = fieldWeight in 1667, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=1667)
      0.33333334 = coord(1/3)
    
    Content
    1 Introduction / 2 Method overview / 3 Ei thesaurus preprocessing / 4 Automatic classification process: 4.1 Matching -- 4.2 Weighting -- 4.3 Preparation for display / 5 Results of the classification process / 6 Evaluations / 7 Software / 8 Other applications / 9 Experiments with universal classification systems / References / Appendix A: Ei classification service: Software / Appendix B: Use of the classification software as subject filter in a WWW harvester.
  3. Chan, L.M.; Lin, X.; Zeng, M.L.: Structural and multilingual approaches to subject access on the Web (2000) 0.00
    0.002654651 = product of:
      0.007963953 = sum of:
        0.007963953 = weight(_text_:a in 507) [ClassicSimilarity], result of:
          0.007963953 = score(doc=507,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.15287387 = fieldWeight in 507, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=507)
      0.33333334 = coord(1/3)
    
    Type
    a
  4. Shafer, K.E.: Automatic Subject Assignment via the Scorpion System (2001) 0.00
    0.002654651 = product of:
      0.007963953 = sum of:
        0.007963953 = weight(_text_:a in 1043) [ClassicSimilarity], result of:
          0.007963953 = score(doc=1043,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.15287387 = fieldWeight in 1043, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=1043)
      0.33333334 = coord(1/3)
    
    Type
    a
  5. Choi, B.; Peng, X.: Dynamic and hierarchical classification of Web pages (2004) 0.00
    0.0022989952 = product of:
      0.006896985 = sum of:
        0.006896985 = weight(_text_:a in 2555) [ClassicSimilarity], result of:
          0.006896985 = score(doc=2555,freq=6.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.13239266 = fieldWeight in 2555, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2555)
      0.33333334 = coord(1/3)
    
    Abstract
    Automatic classification of Web pages is an effective way to organise the vast amount of information and to assist in retrieving relevant information from the Internet. Although many automatic classification systems have been proposed, most of them ignore the conflict between the fixed number of categories and the growing number of Web pages being added into the systems. They also require searching through all existing categories to make any classification. This article proposes a dynamic and hierarchical classification system that is capable of adding new categories as required, organising the Web pages into a tree structure, and classifying Web pages by searching through only one path of the tree. The proposed single-path search technique reduces the search complexity from (n) to (log(n)). Test results show that the system improves the accuracy of classification by 6 percent in comparison to related systems. The dynamic-category expansion technique also achieves satisfying results for adding new categories into the system as required.
    Type
    a
  6. Shafer, K.E.: Evaluating Scorpion Results (2001) 0.00
    0.002212209 = product of:
      0.0066366266 = sum of:
        0.0066366266 = weight(_text_:a in 4085) [ClassicSimilarity], result of:
          0.0066366266 = score(doc=4085,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.12739488 = fieldWeight in 4085, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=4085)
      0.33333334 = coord(1/3)
    
    Type
    a

Types