Search (74 results, page 4 of 4)

Fang, H.: Classifying research articles in multidisciplinary sciences journals into subject categories (2015) 0.01

0.00516971 = product of:
  0.01550913 = sum of:
    0.01550913 = weight(_text_:h in 2194) [ClassicSimilarity], result of:
      0.01550913 = score(doc=2194,freq=2.0), product of:
        0.113001004 = queryWeight, product of:
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.045483325 = queryNorm
        0.13724773 = fieldWeight in 2194, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2194)
  0.33333334 = coord(1/3)
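
The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the ClassicSimilarity conventions that the printed numbers are consistent with: tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and not part of Lucene's API; queryNorm and fieldNorm are copied from the tree rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # idf as it appears in the tree: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # tf is the square root of the raw term frequency
    return math.sqrt(freq)

def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Rebuild one branch of the explain tree (hypothetical helper).

    coords is the list of coord factors applied on the way up the tree,
    e.g. [1/3] for a single-level tree.
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                        # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm     # fieldWeight
    score = query_weight * field_weight                    # weight(_text_:term)
    for factor in coords:
        score *= factor
    return score

# Values copied from the tree for doc 2194 (term "h")
total = explain_score(
    freq=2.0, doc_freq=10020, max_docs=44218,
    query_norm=0.045483325, field_norm=0.0390625,
    coords=[1 / 3],
)
print(total)
```

With the values for doc 2194 the result should land close to the 0.00516971 reported in the tree. Differences in the last digits are to be expected, presumably because the engine computes in single precision while Python uses doubles.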

AlQenaei, Z.M.; Monarchi, D.E.: The use of learning techniques to analyze the results of a manual classification system (2016) 0.01
```
0.00516971 = product of:
  0.01550913 = sum of:
    0.01550913 = weight(_text_:h in 2836) [ClassicSimilarity], result of:
      0.01550913 = score(doc=2836,freq=2.0), product of:
        0.113001004 = queryWeight, product of:
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.045483325 = queryNorm
        0.13724773 = fieldWeight in 2836, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2836)
  0.33333334 = coord(1/3)
```
Abstract

Classification is the process of assigning objects to pre-defined classes based on observations or characteristics of those objects, and there are many approaches to performing this task. The overall objective of this study is to demonstrate the use of two learning techniques to analyze the results of a manual classification system. Our sample consisted of 1,026 documents, from the ACM Computing Classification System, classified by their authors as belonging to one of the groups of the classification system: "H.3 Information Storage and Retrieval." A singular value decomposition of the documents' weighted term-frequency matrix was used to represent each document in a 50-dimensional vector space. The analysis of the representation using both supervised (decision tree) and unsupervised (clustering) techniques suggests that two pairs of the ACM classes are closely related to each other in the vector space. Class 1 (Content Analysis and Indexing) is closely related to Class 3 (Information Search and Retrieval), and Class 4 (Systems and Software) is closely related to Class 5 (Online Information Services). Further analysis was performed to test the diffusion of the words in the two classes using both cosine and Euclidean distance.
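
The abstract compares classes in the reduced vector space using cosine and Euclidean distance. As a minimal sketch of those two measures, assuming documents are already represented as equal-length numeric vectors (the three-dimensional vectors below are invented for illustration and merely stand in for the study's 50-dimensional ones):

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity; depends only on the angle between the vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    # Straight-line distance; sensitive to vector magnitude
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical document vectors: doc_a and doc_b point in similar
# directions, doc_c points elsewhere.
doc_a = [0.9, 0.1, 0.0]
doc_b = [0.8, 0.2, 0.1]
doc_c = [0.0, 0.1, 0.9]

print(cosine_distance(doc_a, doc_b), cosine_distance(doc_a, doc_c))
print(euclidean_distance(doc_a, doc_b), euclidean_distance(doc_a, doc_c))
```

Cosine distance compares direction only, while Euclidean distance also reflects vector length, which is one reason for reporting both.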

Suominen, A.; Toivanen, H.: Map of science with topic modeling : comparison of unsupervised learning and human-assigned subject classification (2016) 0.01

0.00516971 = product of:
  0.01550913 = sum of:
    0.01550913 = weight(_text_:h in 3121) [ClassicSimilarity], result of:
      0.01550913 = score(doc=3121,freq=2.0), product of:
        0.113001004 = queryWeight, product of:
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.045483325 = queryNorm
        0.13724773 = fieldWeight in 3121, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3121)
  0.33333334 = coord(1/3)

Wang, H.; Hong, M.: Supervised Hebb rule based feature selection for text classification (2019) 0.01

0.00516971 = product of:
  0.01550913 = sum of:
    0.01550913 = weight(_text_:h in 5036) [ClassicSimilarity], result of:
      0.01550913 = score(doc=5036,freq=2.0), product of:
        0.113001004 = queryWeight, product of:
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.045483325 = queryNorm
        0.13724773 = fieldWeight in 5036, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5036)
  0.33333334 = coord(1/3)

Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01

0.0051353024 = product of:
  0.015405906 = sum of:
    0.015405906 = product of:
      0.030811813 = sum of:
        0.030811813 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
          0.030811813 = score(doc=2765,freq=2.0), product of:
            0.15927485 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045483325 = queryNorm
            0.19345059 = fieldWeight in 2765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2765)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 22. 3.2009 19:14:43

Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.01

0.0051353024 = product of:
  0.015405906 = sum of:
    0.015405906 = product of:
      0.030811813 = sum of:
        0.030811813 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
          0.030811813 = score(doc=1107,freq=2.0), product of:
            0.15927485 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045483325 = queryNorm
            0.19345059 = fieldWeight in 1107, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1107)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 28.10.2013 19:22:57

Koch, T.; Vizine-Goetz, D.: DDC and knowledge organization in the digital library : Research and development. Demonstration pages (1999) 0.01
```
0.0050585633 = product of:
  0.01517569 = sum of:
    0.01517569 = product of:
      0.03035138 = sum of:
        0.03035138 = weight(_text_:von in 942) [ClassicSimilarity], result of:
          0.03035138 = score(doc=942,freq=4.0), product of:
            0.12134718 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.045483325 = queryNorm
            0.2501202 = fieldWeight in 942, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.046875 = fieldNorm(doc=942)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

The workshop offers insight into current research and development on knowledge organization in digital libraries. Diane Vizine-Goetz of the OCLC Office of Research in Dublin, Ohio, presents OCLC's research projects on adapting and further developing the Dewey Decimal Classification as a knowledge organization tool for large digital document collections. Traugott Koch of NetLab, Lund University, Sweden, demonstrates the approaches and solutions of the EU project DESIRE for applying intellectual and, above all, automatic classification in subject information services on the Internet.

Panyr, J.: Automatische Indexierung und Klassifikation (1983) 0.00

0.004769259 = product of:
  0.014307777 = sum of:
    0.014307777 = product of:
      0.028615555 = sum of:
        0.028615555 = weight(_text_:von in 7692) [ClassicSimilarity], result of:
          0.028615555 = score(doc=7692,freq=2.0), product of:
            0.12134718 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.045483325 = queryNorm
            0.23581557 = fieldWeight in 7692, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0625 = fieldNorm(doc=7692)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: The paper first clarifies and organizes the terminology for three indexing methods and for further concepts relating to consistency problems in intellectual indexing. For automatic indexing, extraction methods are explained, and two applications are presented for automatic classification (clustering) and indexing. Close cooperation between the proponents of intellectual indexing and the developers of automatic indexing methods is recommended.

Schek, M.: Automatische Klassifizierung in Erschließung und Recherche eines Pressearchivs (2006) 0.00
```
0.004769259 = product of:
  0.014307777 = sum of:
    0.014307777 = product of:
      0.028615555 = sum of:
        0.028615555 = weight(_text_:von in 6043) [ClassicSimilarity], result of:
          0.028615555 = score(doc=6043,freq=8.0), product of:
            0.12134718 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.045483325 = queryNorm
            0.23581557 = fieldWeight in 6043, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.03125 = fieldNorm(doc=6043)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Since its founding in 1945, the Süddeutsche Zeitung (SZ) has maintained a press archive that documents the texts of its own editors and of numerous national and international publications and makes them available for research. The DIZ press database (www.medienport.de) enables browser-based searching for editors and external customers on the intranet and the Internet, as well as customer-specific content feeds for publishers, broadcasters and portals. The DIZ press database currently contains 7.8 million articles, each retrievable as HTML or PDF. About 3,500 articles are added daily, of which about 1,000 are subject-indexed by documentalists. At DIZ, indexing is done not by assigning subject headings to the document but by linking the articles to "virtual folders", the dossiers. In total, the DIZ press database contains about 90,000 dossiers, which are linked to one another to form the "DIZ-Wissensnetz" (DIZ knowledge network). DIZ regards the knowledge network as its unique selling point and devotes considerable staff resources to updating and quality-assuring the dossiers. In the wake of the media crisis, DIZ had to face the challenge of maintaining the quality of indexing on the input side despite shrinking editorial capacity. On the output side, the task is to supply a demanding target group, including the editors of the Süddeutsche Zeitung, precisely and promptly with the information they need for their daily work.

Given this initial situation in the documentation department of the Süddeutsche Zeitung, DIZ identified three starting points for optimizing the effort on the input side (editorial indexing) while marketing the knowledge network better on the output side (research):
- (Semi-)automatic classification of press texts (suggestion system)
- Visualization of the knowledge network
- New retrieval options (similarity search, clustering)

For visualization, DIZ relies on the Net-Navigator from intelligent views, an interactive visualization of general graphs based on a physical model. For automatic classification, similarity search and clustering, DIZ has chosen the product nextBot from the company Brainbot.

Wille, J.: Automatisches Klassifizieren bibliographischer Beschreibungsdaten : Vorgehensweise und Ergebnisse (2006) 0.00

0.004173102 = product of:
  0.012519306 = sum of:
    0.012519306 = product of:
      0.025038611 = sum of:
        0.025038611 = weight(_text_:von in 6090) [ClassicSimilarity], result of:
          0.025038611 = score(doc=6090,freq=2.0), product of:
            0.12134718 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.045483325 = queryNorm
            0.20633863 = fieldWeight in 6090, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6090)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: This work deals with the practical aspects of automatically classifying bibliographic reference data. The focus is on the concrete procedure using the open-source program COBRA ("Classification Of Bibliographic Records, Automatic"), which was developed specifically for this purpose. The framework conditions and parameters for use in a library environment are clarified. Finally, classification results are evaluated using social science data from the SOLIS database as an example.

Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.00

0.0041357684 = product of:
  0.012407305 = sum of:
    0.012407305 = weight(_text_:h in 4095) [ClassicSimilarity], result of:
      0.012407305 = score(doc=4095,freq=2.0), product of:
        0.113001004 = queryWeight, product of:
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.045483325 = queryNorm
        0.10979818 = fieldWeight in 4095, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.03125 = fieldNorm(doc=4095)
  0.33333334 = coord(1/3)

Khoo, C.S.G.; Ng, K.; Ou, S.: An exploratory study of human clustering of Web pages (2003) 0.00

0.0041082418 = product of:
  0.012324724 = sum of:
    0.012324724 = product of:
      0.024649449 = sum of:
        0.024649449 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
          0.024649449 = score(doc=2741,freq=2.0), product of:
            0.15927485 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045483325 = queryNorm
            0.15476047 = fieldWeight in 2741, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 12. 9.2004 9:56:22

Borko, H.: Research in computer based classification systems (1985) 0.00

0.003618797 = product of:
  0.010856391 = sum of:
    0.010856391 = weight(_text_:h in 3647) [ClassicSimilarity], result of:
      0.010856391 = score(doc=3647,freq=2.0), product of:
        0.113001004 = queryWeight, product of:
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.045483325 = queryNorm
        0.096073404 = fieldWeight in 3647, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.4844491 = idf(docFreq=10020, maxDocs=44218)
          0.02734375 = fieldNorm(doc=3647)
  0.33333334 = coord(1/3)

Helmbrecht-Schaar, A.: Entwicklung eines Verfahrens der automatischen Klassifizierung für Textdokumente aus dem Fachbereich Informatik mithilfe eines fachspezifischen Klassifikationssystems (2007) 0.00
```
0.0035769443 = product of:
  0.010730833 = sum of:
    0.010730833 = product of:
      0.021461666 = sum of:
        0.021461666 = weight(_text_:von in 1410) [ClassicSimilarity], result of:
          0.021461666 = score(doc=1410,freq=2.0), product of:
            0.12134718 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.045483325 = queryNorm
            0.17686167 = fieldWeight in 1410, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.046875 = fieldNorm(doc=1410)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

This work evaluates the possibilities for automating the classification of online documents and implements one possible method as a prototype. Terminology extraction methods are applied to identify the meaning-bearing terms of the texts. Classification schemes, which generally contain little additional information, are to be applied via a mapping mechanism to the terms that describe the document. The approach already shows that purely automatic classification is not possible, since there will always be a gap between the intellectually created classifications and the information generated from the texts. A semi-automatic method is presented that learns from user actions and can lead to successive automation. The results of the semi-automatic classification are compared with those of a manual one. Finally, an outlook on the possibilities and limits of automatic classification is given.
