Search (11 results, page 1 of 1)

  • theme_ss:"Automatisches Klassifizieren"
  • type_ss:"el"
  • year_i:[2000 TO 2010}
  1. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02
    0.01560012 = product of:
      0.03120024 = sum of:
        0.03120024 = product of:
          0.06240048 = sum of:
            0.06240048 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.06240048 = score(doc=611,freq=2.0), product of:
                0.16128273 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046056706 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 12:54:24
  2. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.01
    0.01482369 = product of:
      0.02964738 = sum of:
        0.02964738 = sum of:
          0.0046871896 = weight(_text_:a in 3284) [ClassicSimilarity], result of:
            0.0046871896 = score(doc=3284,freq=6.0), product of:
              0.053105544 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.046056706 = queryNorm
              0.088261776 = fieldWeight in 3284, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.03125 = fieldNorm(doc=3284)
          0.02496019 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
            0.02496019 = score(doc=3284,freq=2.0), product of:
              0.16128273 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046056706 = queryNorm
              0.15476047 = fieldWeight in 3284, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=3284)
      0.5 = coord(1/2)
    
    Abstract
    Classifying objects (e.g., fauna, flora, texts) is a process that relies on human intelligence. Computer science, and artificial intelligence (AI) in particular, investigates, among other things, to what extent processes that require human intelligence can be automated. It has turned out that solving everyday problems is a greater challenge than solving specialized problems such as building a chess computer. For instance, "Rybka" has been the reigning computer chess world champion since June 2007. To what extent everyday problems can be solved with AI methods is, in the general case, still an open question. Natural language processing, for example language understanding, plays an essential role in solving everyday problems. Realizing "common sense" as a machine (in the Cyc knowledge base, in the form of facts and rules) has been Lenat's goal since 1984. Regarding the AI showcase project "Cyc", there are Cyc optimists and Cyc pessimists. Understanding natural language (e.g., work title, abstract, preface, table of contents) is also necessary in the intellectual classification of bibliographic title records or online publications in order to classify these text objects correctly. Since 2007, the German National Library (Deutsche Nationalbibliothek) has intellectually classified nearly all publications with the Dewey Decimal Classification (DDC).
    The volume of publications to be classified has been growing, at the latest since the advent of the World Wide Web, faster than it can be intellectually subject-indexed. Methods are therefore being sought to automate the classification of text objects or at least to support intellectual classification. Methods for automatic document classification (Information Retrieval, IR for short) have existed since 1968, and methods for automatic text classification (ATC: Automated Text Categorization) since 1992. As more and more digital objects have become available on the World Wide Web, work on automatic text classification has increased markedly since about 1998. Since 1996, this has also included work on the automatic DDC classification and RVK classification of bibliographic title records and full-text documents. To our knowledge, these developments have so far been experimental systems rather than systems in continuous operation. The VZG project Colibri/DDC has also been concerned, among other things, with automatic DDC classification since 2006. The related investigations and developments serve to answer the research question: "Is it possible to automatically achieve a DDC title classification of all GVK-PLUS title records that is consistent in content?"
    Date
    22. 1.2010 14:41:24
    Type
    a
  3. Automatic classification research at OCLC (2002) 0.01
    0.010920083 = product of:
      0.021840166 = sum of:
        0.021840166 = product of:
          0.043680333 = sum of:
            0.043680333 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.043680333 = score(doc=1563,freq=2.0), product of:
                0.16128273 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046056706 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 9:22:09
  4. Godby, C. J.; Stuler, J.: The Library of Congress Classification as a knowledge base for automatic subject categorization (2001) 0.00
    0.0033143433 = product of:
      0.0066286866 = sum of:
        0.0066286866 = product of:
          0.013257373 = sum of:
            0.013257373 = weight(_text_:a in 1567) [ClassicSimilarity], result of:
              0.013257373 = score(doc=1567,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.24964198 = fieldWeight in 1567, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1567)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper describes a set of experiments in adapting a subset of the Library of Congress Classification for use as a database for automatic classification. A high degree of concept integrity was obtained when subject headings were mapped from OCLC's WorldCat database and filtered using the log-likelihood statistic.
    Footnote
    Paper, IFLA Preconference "Subject Retrieval in a Networked Environment", Dublin, OH, August 2001.
  5. Lindholm, J.; Schönthal, T.; Jansson, K.: Experiences of harvesting Web resources in engineering using automatic classification (2003) 0.00
    0.0033143433 = product of:
      0.0066286866 = sum of:
        0.0066286866 = product of:
          0.013257373 = sum of:
            0.013257373 = weight(_text_:a in 4088) [ClassicSimilarity], result of:
              0.013257373 = score(doc=4088,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.24964198 = fieldWeight in 4088, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4088)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The authors describe the background and the work involved in setting up Engine-e, a Web index that uses automatic classification as a means for the selection of resources in Engineering. Considerations in offering a robot-generated Web index as a successor to a manually indexed quality-controlled subject gateway are also discussed.
    Type
    a
  6. Yi, K.: Challenges in automated classification using library classification schemes (2006) 0.00
    0.00270615 = product of:
      0.0054123 = sum of:
        0.0054123 = product of:
          0.0108246 = sum of:
            0.0108246 = weight(_text_:a in 5810) [ClassicSimilarity], result of:
              0.0108246 = score(doc=5810,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.20383182 = fieldWeight in 5810, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5810)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Major library classification schemes have long been the standard classification framework for information sources in the traditional library environment, and text classification (TC) has become a popular and attractive tool for organizing digital information. This paper gives an overview of previous projects and studies on TC using major library classification schemes and summarizes a discussion of TC research challenges.
    Language
    a
  7. Koch, T.; Ardö, A.: Automatic classification of full-text HTML-documents from one specific subject area : DESIRE II D3.6a, Working Paper 2 (2000) 0.00
    0.0023435948 = product of:
      0.0046871896 = sum of:
        0.0046871896 = product of:
          0.009374379 = sum of:
            0.009374379 = weight(_text_:a in 1667) [ClassicSimilarity], result of:
              0.009374379 = score(doc=1667,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.17652355 = fieldWeight in 1667, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1667)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
    1 Introduction / 2 Method overview / 3 Ei thesaurus preprocessing / 4 Automatic classification process: 4.1 Matching -- 4.2 Weighting -- 4.3 Preparation for display / 5 Results of the classification process / 6 Evaluations / 7 Software / 8 Other applications / 9 Experiments with universal classification systems / References / Appendix A: Ei classification service: Software / Appendix B: Use of the classification software as subject filter in a WWW harvester.
  8. Prabowo, R.; Jackson, M.; Burden, P.; Knoell, H.-D.: Ontology-based automatic classification for the Web pages : design, implementation and evaluation (2002) 0.00
    0.002269176 = product of:
      0.004538352 = sum of:
        0.004538352 = product of:
          0.009076704 = sum of:
            0.009076704 = weight(_text_:a in 3383) [ClassicSimilarity], result of:
              0.009076704 = score(doc=3383,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1709182 = fieldWeight in 3383, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3383)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In recent years, we have witnessed continual growth in the use of ontologies to provide a mechanism that enables machine reasoning. This paper describes an automatic classifier that uses ontologies to classify Web pages with respect to the Dewey Decimal Classification (DDC) and Library of Congress Classification (LCC) schemes. Firstly, we explain how these ontologies can be built in a modular fashion and mapped into DDC and LCC. Secondly, we propose a formal definition of a DDC-LCC mapping and of an ontology-classification-scheme mapping. Thirdly, we explain how the classifier uses these ontologies to assist classification. Finally, we present an experiment that evaluated the accuracy of the classifier. The experiment shows that our approach results in improved classification accuracy. This improvement, however, comes at the cost of a low coverage ratio due to the incompleteness of the ontologies used.
  9. Adams, K.C.: Word wranglers : Automatic classification tools transform enterprise documents from "bags of words" into knowledge resources (2003) 0.00
    0.0018909799 = product of:
      0.0037819599 = sum of:
        0.0037819599 = product of:
          0.0075639198 = sum of:
            0.0075639198 = weight(_text_:a in 1665) [ClassicSimilarity], result of:
              0.0075639198 = score(doc=1665,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14243183 = fieldWeight in 1665, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1665)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Taxonomies are an important part of any knowledge management (KM) system, and automatic classification software is emerging as a "killer app" for consumer and enterprise portals. A number of companies such as Inxight Software, Mohomine, Metacode, and others claim to interpret the semantic content of any textual document and automatically classify text on the fly. The promise that software could automatically produce a Yahoo-style directory is a siren call not many IT managers are able to resist. KM needs have grown more complex due to the increasing amount of digital information, the declining effectiveness of keyword searching, and heterogeneous document formats in corporate databases. This environment requires innovative KM tools, and automatic classification technology is an example of this new kind of software. These products can be divided into three categories according to their underlying technology - rules-based, catalog-by-example, and statistical clustering. Evolving trends in this market include framing classification as a cyborg (computer- and human-based) activity and the increasing use of extensible markup language (XML) and support vector machine (SVM) technology. In this article, we'll survey the rapidly changing automatic classification software market and examine the features and capabilities of leading classification products.
  10. Hagedorn, K.; Chapman, S.; Newman, D.: Enhancing search and browse using automated clustering of subject metadata (2007) 0.00
    0.001757696 = product of:
      0.003515392 = sum of:
        0.003515392 = product of:
          0.007030784 = sum of:
            0.007030784 = weight(_text_:a in 1168) [ClassicSimilarity], result of:
              0.007030784 = score(doc=1168,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13239266 = fieldWeight in 1168, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1168)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The Web puzzle of online information resources often hinders end-users from effective and efficient access to these resources. Clustering resources into appropriate subject-based groupings may help alleviate these difficulties, but will it work with heterogeneous material? The University of Michigan and the University of California Irvine joined forces to test automatically enhancing metadata records using the Topic Modeling algorithm on the varied OAIster corpus. We created labels for the resulting clusters of metadata records, matched the clusters to an in-house classification system, and developed a prototype that would showcase methods for search and retrieval using the enhanced records. Results indicated that while the algorithm was somewhat time-intensive to run and using a local classification scheme had its drawbacks, precise clustering of records was achieved and the prototype interface proved that faceted classification could be powerful in helping end-users find resources.
    Type
    a
  11. Reiner, U.: VZG-Projekt Colibri : Bewertung von automatisch DDC-klassifizierten Titeldatensätzen der Deutschen Nationalbibliothek (DNB) (2009) 0.00
    8.4567186E-4 = product of:
      0.0016913437 = sum of:
        0.0016913437 = product of:
          0.0033826875 = sum of:
            0.0033826875 = weight(_text_:a in 2675) [ClassicSimilarity], result of:
              0.0033826875 = score(doc=2675,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.06369744 = fieldWeight in 2675, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2675)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The VZG project Colibri/DDC has been working on automatic methods for the Dewey Decimal Classification (DDC) since 2003. The project aims at uniform DDC indexing of bibliographic title records and at supporting DDC experts and DDC laypersons, e.g., in the analysis and synthesis of DDC notations and their quality control, and in DDC-based searching. This report focuses on the first larger-scale automatic DDC classification and the first automatic and intellectual evaluation with the classification component vc_dcl1. The basis was the 25,653 title records (12 weekly/monthly deliveries) of series A, B, and H of the Deutsche Nationalbibliografie, made available by the German National Library (Deutsche Nationalbibliothek, DNB) in November 2007. After the automatic DDC classification and automatic evaluation are explained in chapter 2, chapter 3 addresses the DNB report "Colibri_Auswertung_DDC_Endbericht_Sommer_2008". Facts are clarified and questions are raised whose answers will set the course for further classification tests. Chapter 4 offers further considerations and thoughts, going beyond chapter 3, on continuing the automatic DDC classification. The report is intended to deepen understanding of the automatic methods.
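
Note on the score breakdowns: the explanation trees listed with each result follow the TF-IDF breakdown of Lucene's ClassicSimilarity. As a rough sketch, the figure for result 1 can be recomputed as shown below. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are the ClassicSimilarity defaults and match the numbers in the trees. The queryNorm and fieldNorm values are simply copied from the explain output rather than derived, and the helper name classic_score is hypothetical. Python computes in 64-bit floats while Lucene reports 32-bit values, so the trailing digits may differ slightly.

```python
import math


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute one term's contribution in a ClassicSimilarity explain tree.

    query_norm and field_norm are taken from the explain output as given;
    classic_score is an illustrative helper, not part of Lucene.
    """
    tf = math.sqrt(freq)                             # tf(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                  # queryWeight
    field_weight = tf * idf * field_norm             # fieldWeight
    score = query_weight * field_weight
    for coord in coords:                             # coord(1/2) factors
        score *= coord
    return score


# Result 1: term "22" in doc 611, followed by two coord(1/2) factors.
result_1 = classic_score(
    freq=2.0, doc_freq=3622, max_docs=44218,
    query_norm=0.046056706, field_norm=0.078125,
    coords=(0.5, 0.5),
)
print(f"{result_1:.8f}")
```

The printed value should agree with the 0.01560012 listed for result 1 up to floating-point rounding. Results with a single matching term can be checked the same way by substituting their freq, docFreq, and fieldNorm values. For result 2, where two terms match, the two term scores are summed before the single coord(1/2) factor is applied.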