Search (89 results, page 5 of 5)

  • Filter: theme_ss:"Automatisches Klassifizieren"
  • Filter: type_ss:"a"
  1. Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.00
    
    Abstract
    Subject terms play a crucial role in resource discovery but require substantial effort to produce. Automatic subject classification and indexing address problems of scale and sustainability: they can be used to enrich existing bibliographic records, establish more connections across and between resources, and enhance the consistency of bibliographic data. The paper puts forward a comprehensive methodological framework for evaluating tools that automatically classify Swedish textual documents according to the Dewey Decimal Classification (DDC), recently introduced to Swedish libraries. Three complementary approaches are suggested: a quality-built gold standard, retrieval effects, and domain analysis. The gold standard is built from the input of at least two catalogue librarians, end users expert in the subject, end users inexperienced in the subject, and automated tools. Retrieval effects are studied through a combination of assigned and free tasks, including factual and comprehensive types. The study also takes into account the different role and character of subject terms in various knowledge domains, such as scientific disciplines. Domain analysis serves as the theoretical framework and is applied to the implementation of DDC in Swedish libraries and to chosen domains of knowledge within the DDC itself. (A minimal evaluation sketch in this spirit appears after the results list.)
  2. Wang, H.; Hong, M.: Supervised Hebb rule based feature selection for text classification (2019) 0.00
    
    Abstract
    Text documents usually contain many high-dimensional, non-discriminative (irrelevant and noisy) terms, which lead to steep computational costs and poor learning performance in text classification. One effective solution to this problem is feature selection, which aims to identify discriminative terms in text data. This paper proposes a method termed Hebb rule based feature selection (HRFS). HRFS builds on the supervised Hebb rule, treats terms and classes as neurons, and selects a term as discriminative if it keeps "exciting" the corresponding classes; following the original Hebb postulate, a term is highly correlated with a class if it is able to keep exciting that class. Six benchmark datasets are used to compare HRFS with seven other feature selection methods. Experimental results indicate that HRFS achieves better performance than the compared methods. HRFS identifies discriminative terms through the synapses between neurons, and it is also efficient because it can be expressed as matrix operations that decrease the complexity of feature selection. (A matrix-operation sketch of this scoring idea appears after the results list.)
  3. Pech, G.; Delgado, C.; Sorella, S.P.: Classifying papers into subfields using Abstracts, Titles, Keywords and KeyWords Plus through pattern detection and optimization procedures : an application in Physics (2022) 0.00
    
    Abstract
    Classifying papers according to fields of knowledge is critical for a clear understanding of the dynamics of scientific (sub)fields, their leading questions, and their trends. Most studies rely on journal categories defined by popular databases such as WoS or Scopus, but some experts find that those categories may neither map the existing subfields correctly nor identify the subfield of a specific article. This study addresses the classification problem using data from each paper (Abstract, Title, Keywords, and KeyWords Plus) and the help of experts to identify the existing subfields and the journals exclusive to each subfield. These "exclusive journals" are critical for obtaining, through a pattern detection procedure that uses machine learning techniques (from the software NVivo), a list of the frequent terms specific to each subfield. With that list of terms and with the help of optimization procedures, we can identify the subfield to which each paper most likely belongs. This study can support scientific policy makers, funding bodies, and research institutions (via more accurate academic performance evaluations), support editors in their task of redefining the scopes of journals, and support popular databases in their processes of refining categories. (A term-matching sketch of this assignment step appears after the results list.)
  4. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information [Data analysis for structuring and ordering information] (1989) 0.00
    
    Pages
    pp. 1-22
  5. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.00
    
    Date
    22.09.2008 18:31:54
  6. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.00
    
    Date
    22.03.2009 19:11:54
  7. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen [Automatic assignment of RVK notations using case-based reasoning] (2009) 0.00
    
    Date
    22.08.2009 19:51:28
  8. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.00
    
    Date
    23.03.2013 13:22:36
  9. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.00
    
    Date
    28.10.2013 19:22:57
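
A minimal sketch of the gold-standard evaluation idea in entry 1 (Golub et al.), in Python. The DDC classes, the agreement threshold of two indexers, and the helper names are illustrative assumptions, not the paper's actual protocol:

    from collections import Counter

    def build_gold(annotations: list[set[str]], min_agreement: int = 2) -> set[str]:
        # Gold standard: DDC classes assigned by at least `min_agreement` indexers
        counts = Counter(c for ann in annotations for c in ann)
        return {c for c, n in counts.items() if n >= min_agreement}

    def precision_recall_f1(assigned: set[str], gold: set[str]) -> tuple[float, float, float]:
        # Per-document quality of automatically assigned classes vs. the gold standard
        if not assigned or not gold:
            return 0.0, 0.0, 0.0
        hits = len(assigned & gold)
        p, r = hits / len(assigned), hits / len(gold)
        return p, r, (2 * p * r / (p + r) if p + r else 0.0)

    annotations = [{"025.04", "028.7"}, {"025.04"}, {"025.04", "028.7"}]  # three indexers
    gold = build_gold(annotations)              # {"025.04", "028.7"}
    auto = {"025.04", "004.678"}                # hypothetical tool output
    print(precision_recall_f1(auto, gold))      # (0.5, 0.5, 0.5)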
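
The Hebbian scoring idea in entry 2 (Wang & Hong) can be sketched as matrix operations; the toy data, the max-minus-mean score, and the cutoff k below are assumptions for illustration, not the paper's exact HRFS rules:

    import numpy as np

    # X: document-term presence matrix (n_docs x n_terms); Y: one-hot class labels
    X = np.array([[1, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 0, 1, 1],
                  [0, 1, 1, 1]], dtype=float)
    Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

    # Hebbian co-activation: synapse weights grow when a term and a class fire together
    W = X.T @ Y
    # A term is discriminative if it keeps "exciting" mainly one class
    score = W.max(axis=1) - W.mean(axis=1)
    top_k = np.argsort(score)[::-1][:2]   # keep the k highest-scoring terms
    print("term-class weights:\n", W)
    print("selected term indices:", top_k)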
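
The assignment step in entry 3 (Pech et al.) reduces to matching each paper's text against per-subfield term lists; the subfields, terms, and sample text below are invented for illustration:

    import re

    # Frequent terms per subfield, as would be derived from "exclusive journals"
    subfield_terms = {
        "particle physics": {"quark", "gluon", "hadron", "collider"},
        "condensed matter": {"lattice", "superconductivity", "phonon", "magnon"},
    }

    def classify(text: str) -> str:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        # Score each subfield by how many of its specific terms the paper contains
        scores = {sf: len(terms & tokens) for sf, terms in subfield_terms.items()}
        return max(scores, key=scores.get)

    paper = "Abstract: We measure hadron production at the collider using quark models."
    print(classify(paper))   # -> particle physics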

Languages

  • e (English): 83
  • d (German): 6