Search (24 results, page 1 of 2)

Jersek, T.: Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse (2012) 0.01

0.010768173 = product of:
  0.082555994 = sum of:
    0.020465806 = weight(_text_:und in 122) [ClassicSimilarity], result of:
      0.020465806 = score(doc=122,freq=8.0), product of:
        0.052235067 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.023567878 = queryNorm
        0.39180204 = fieldWeight in 122, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0625 = fieldNorm(doc=122)
    0.02069673 = product of:
      0.04139346 = sum of:
        0.04139346 = weight(_text_:bibliothekswesen in 122) [ClassicSimilarity], result of:
          0.04139346 = score(doc=122,freq=2.0), product of:
            0.10505787 = queryWeight, product of:
              4.457672 = idf(docFreq=1392, maxDocs=44218)
              0.023567878 = queryNorm
            0.39400625 = fieldWeight in 122, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.457672 = idf(docFreq=1392, maxDocs=44218)
              0.0625 = fieldNorm(doc=122)
      0.5 = coord(1/2)
    0.04139346 = weight(_text_:bibliothekswesen in 122) [ClassicSimilarity], result of:
      0.04139346 = score(doc=122,freq=2.0), product of:
        0.10505787 = queryWeight, product of:
          4.457672 = idf(docFreq=1392, maxDocs=44218)
          0.023567878 = queryNorm
        0.39400625 = fieldWeight in 122, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.457672 = idf(docFreq=1392, maxDocs=44218)
          0.0625 = fieldNorm(doc=122)
  0.13043478 = coord(3/23)
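
The explain tree above follows the TF-IDF arithmetic of Lucene's ClassicSimilarity. As a rough sketch, the per-term scores for doc 122 can be recomputed in Python. The function names are illustrative and not part of Lucene; `query_norm` and `field_norm` are copied from the listing rather than derived; and the idf formula shown is the variant that reproduces the listed values. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # idf = 1 + ln(maxDocs / (docFreq + 1)), which matches the idf lines in the listing
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    # tf = sqrt(termFreq)
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # "queryWeight, product of"
    field_weight = classic_tf(freq) * idf * field_norm   # "fieldWeight in 122, product of"
    return query_weight * field_weight                   # "score(doc=122, ...)"


# Values taken from the explain tree for doc 122.
QUERY_NORM = 0.023567878
und_score = term_score(8.0, 13101, 44218, QUERY_NORM, 0.0625)
bib_score = term_score(2.0, 1392, 44218, QUERY_NORM, 0.0625)

# Compare with 0.020465806 and 0.04139346 in the listing.
print(f"und: {und_score:.9f}")
print(f"bibliothekswesen: {bib_score:.9f}")
```

Because idf enters both `query_weight` and `field_weight`, a rare term such as `bibliothekswesen` outscores the frequent `und` even with a quarter of the occurrences.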

Abstract: This thesis deals with the implementation and execution of automatic DDC classification using the indexing system Lingo. It does so by incorporating relations from the DFG project CrissCross, on the basis of which Lingo automatically classifies bibliographic title records. The approach used is compared with the usual methodological procedure of automatic classification systems. The classification method is then tested on a test collection of bibliographic title records from the German National Library (Deutsche Nationalbibliothek, DNB). A discussion of the results and an assessment of the classification system follow.
Content: Diploma thesis, Library Science (Bibliothekswesen) program, Faculty of Information and Communication Sciences, Fachhochschule Köln.

Sommer, M.: Automatische Generierung von DDC-Notationen für Hochschulveröffentlichungen (2012) 0.01

0.0063950857 = product of:
  0.049028993 = sum of:
    0.021707265 = weight(_text_:und in 587) [ClassicSimilarity], result of:
      0.021707265 = score(doc=587,freq=16.0), product of:
        0.052235067 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.023567878 = queryNorm
        0.41556883 = fieldWeight in 587, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.046875 = fieldNorm(doc=587)
    0.017655406 = weight(_text_:im in 587) [ClassicSimilarity], result of:
      0.017655406 = score(doc=587,freq=4.0), product of:
        0.066621356 = queryWeight, product of:
          2.8267863 = idf(docFreq=7115, maxDocs=44218)
          0.023567878 = queryNorm
        0.26501122 = fieldWeight in 587, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.8267863 = idf(docFreq=7115, maxDocs=44218)
          0.046875 = fieldNorm(doc=587)
    0.009666322 = product of:
      0.019332644 = sum of:
        0.019332644 = weight(_text_:29 in 587) [ClassicSimilarity], result of:
          0.019332644 = score(doc=587,freq=2.0), product of:
            0.08290443 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.023567878 = queryNorm
            0.23319192 = fieldWeight in 587, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=587)
      0.5 = coord(1/2)
  0.13043478 = coord(3/23)

Abstract: The topic of this bachelor's thesis is the automatic generation of Dewey Decimal Classification notations for metadata. The metadata are in Dublin Core format and come from the server for scholarly publications of Hochschule Hannover. The thesis opens with a general introduction to the methods and main application areas of automatic classification. The Dewey Decimal Classification and the process of metadata harvesting are then described. The theoretical part ends with a description of two projects. The first project likewise attempted to enrich metadata with Dewey Decimal Classification notations. The result of the second project is a concordance between the Schlagwortnormdatei (the German subject heading authority file) and the Dewey Decimal Classification. In the practical part of this thesis, this concordance was used to assign Dewey Decimal Classification notations automatically.
Content: See: http://opus.bsz-bw.de/fhhv/volltexte/2012/397/pdf/Bachelorarbeit_final_Korrektur01.pdf. Bachelor's thesis, Hochschule Hannover, Fakultät III - Medien, Information und Design, Abteilung Information und Kommunikation, Information Management program
Date: 29. 1.2013 15:44:43
Imprint: Hannover : Hochschule Hannover, Fakultät III - Medien, Information und Design, Abteilung Information und Kommunikation

HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01

0.006204248 = product of:
  0.0475659 = sum of:
    0.011110791 = product of:
      0.022221582 = sum of:
        0.022221582 = weight(_text_:1 in 2748) [ClassicSimilarity], result of:
          0.022221582 = score(doc=2748,freq=4.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.38382855 = fieldWeight in 2748, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
    0.020489499 = product of:
      0.040978998 = sum of:
        0.040978998 = weight(_text_:international in 2748) [ClassicSimilarity], result of:
          0.040978998 = score(doc=2748,freq=4.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.52123123 = fieldWeight in 2748, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
    0.01596561 = product of:
      0.03193122 = sum of:
        0.03193122 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
          0.03193122 = score(doc=2748,freq=2.0), product of:
            0.08253069 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.023567878 = queryNorm
            0.38690117 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
  0.13043478 = coord(3/23)
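
The outer lines of the tree for doc 2748 show how the clause scores are combined: each nested single-term clause is scaled by coord(1/2), the results are summed, and the sum is scaled by coord(3/23). A minimal sketch of that arithmetic, using a hypothetical helper `combine` and the three term weights copied from the listing:

```python
def combine(clause_scores, matched: int, total: int) -> float:
    # Sum the matching clause scores, then scale by coord(matched/total).
    return sum(clause_scores) * (matched / total)


# Term weights for "1", "international", and "22" in doc 2748, from the listing.
term_weights = [0.022221582, 0.040978998, 0.03193122]

# Each term sits in its own nested clause where 1 of 2 sub-clauses matched.
nested = [combine([w], 1, 2) for w in term_weights]

# 3 of the 23 top-level query clauses matched this document.
doc_score = combine(nested, 3, 23)

# Compare with 0.006204248 at the top of the tree.
print(f"{doc_score:.9f}")
```

The coord factor explains why documents matching only one of the 23 clauses end up with scores on the order of 1e-4 even when their single term weight is comparable.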

Date: 1. 2.2016 18:25:22
1. 2.2016 19:07:41
Imprint: Basel : Springer International Publishing
Source: Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al

Groß, T.; Faden, M.: Automatische Indexierung elektronischer Dokumente an der Deutschen Zentralbibliothek für Wirtschaftswissenschaften : Bericht über die Jahrestagung der Internationalen Buchwissenschaftlichen Gesellschaft (2010) 0.00
```
0.0037753168 = product of:
  0.043416142 = sum of:
    0.018447628 = weight(_text_:und in 4051) [ClassicSimilarity], result of:
      0.018447628 = score(doc=4051,freq=26.0), product of:
        0.052235067 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.023567878 = queryNorm
        0.3531656 = fieldWeight in 4051, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.03125 = fieldNorm(doc=4051)
    0.024968514 = weight(_text_:im in 4051) [ClassicSimilarity], result of:
      0.024968514 = score(doc=4051,freq=18.0), product of:
        0.066621356 = queryWeight, product of:
          2.8267863 = idf(docFreq=7115, maxDocs=44218)
          0.023567878 = queryNorm
        0.37478244 = fieldWeight in 4051, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          2.8267863 = idf(docFreq=7115, maxDocs=44218)
          0.03125 = fieldNorm(doc=4051)
  0.08695652 = coord(2/23)
```
Abstract

The increasing availability of digital information in recent years, together with the prospect of a further rise in the so-called data deluge, culminates in a fundamental and steadily intensifying problem of structuring information. The constant growth of digital information resources on the World Wide Web does ensure access to a wide range of information at any time and from any place; what remains unresolved is structured access, particularly to scholarly resources. Given the growing number of electronic documents and against the background of stagnating or shrinking staff resources in subject indexing, no library or library network can manage, now or in the future, to record all digital data, structure it, and relate the items to one another. In the information society of the 21st century, however, it is becoming increasingly important to structure the scholarly information that has disappeared in the flood promptly, appropriately, and completely, and thus make it usable again as a basis for generating knowledge. Standardized subject indexing of digital information resources is therefore, for the German National Library of Economics (Deutsche Zentralbibliothek für Wirtschaftswissenschaften, ZBW) as an important information infrastructure institution in this field, a decisive and success-critical factor in competition with other information service providers. Because traditional intellectual subject indexing does not scale arbitrarily (as the number of online documents grows, the need for subject specialists grows proportionally if a certain quality standard is to be maintained), other subject indexing methods will be needed in the future. Automated keyword assignment methods are regarded as the only way to make library subject indexing future-proof in the digital age.
In addition, machine-based approaches can help to level out the heterogeneity (indexing inconsistencies) between individual subject indexers and thus contribute to more homogeneous indexing of the library's holdings.
With the implementation and evaluation of the automatic indexing method "Decisiv Categorization" from the company Recommind, begun in early 2010, the information structuring problem outlined here is to be solved in two steps. In the short to medium term, intellectual indexing is to be supported by a semi-automatic procedure. In the medium to long term, the machine procedure, building on appropriate training, is to be enabled both to index documents held in-house fully automatically and to assign keywords or classes to digital information resources from outside the ZBW, so that they can be found in a shared search space. Following this introduction, the first approaches to machine subject indexing at the ZBW (2001-2004) and their results and problems are presented. The framework conditions (project mandate and goal) for resuming the undertaking in 2009 are then set out, followed by a description of how the Recommind technology works and how it is used for the subject indexing of online documents with a thesaurus. The focus of the paper is then on the options for evaluating automatic indexing approaches, together with the current results and key findings from its use at the ZBW. The conclusion describes the inferences drawn from the results obtained and gives an outlook on the next steps.

Kasprzik, A.: Automatisierte und semiautomatisierte Klassifizierung : eine Analyse aktueller Projekte (2014) 0.00
```
0.0022974934 = product of:
  0.026421174 = sum of:
    0.021707265 = weight(_text_:und in 2470) [ClassicSimilarity], result of:
      0.021707265 = score(doc=2470,freq=16.0), product of:
        0.052235067 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.023567878 = queryNorm
        0.41556883 = fieldWeight in 2470, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.046875 = fieldNorm(doc=2470)
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 2470) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=2470,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 2470, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=2470)
      0.5 = coord(1/2)
  0.08695652 = coord(2/23)
```
Abstract

The rapid growth in the volume of digitally available documents, coupled with the shortage of time and staff at academic libraries, suggests the use of semi-automatic or fully automatic methods for verbal and classificatory subject indexing. After a brief general introduction to the common methodology, this article examines a number of automated classification projects from the period 2007-2012 and from the German-speaking region. Most of the projects presented use machine learning methods from artificial intelligence, mostly work with adapted versions of commercial software, and generally relate to the Dewey Decimal Classification (DDC). Metadata records, abstracts, tables of contents, and full texts in various data formats serve as the data basis. The concluding analysis arranges the projects according to a range of criteria and summarizes the current situation and the greatest challenges for automated classification methods.

Source

Perspektive Bibliothek. 3(2014) H.1, S.85-110

Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.00

0.0013303827 = product of:
  0.0152994 = sum of:
    0.008055268 = product of:
      0.016110536 = sum of:
        0.016110536 = weight(_text_:29 in 2300) [ClassicSimilarity], result of:
          0.016110536 = score(doc=2300,freq=2.0), product of:
            0.08290443 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.023567878 = queryNorm
            0.19432661 = fieldWeight in 2300, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
      0.5 = coord(1/2)
    0.0072441325 = product of:
      0.014488265 = sum of:
        0.014488265 = weight(_text_:international in 2300) [ClassicSimilarity], result of:
          0.014488265 = score(doc=2300,freq=2.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.18428308 = fieldWeight in 2300, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
      0.5 = coord(1/2)
  0.08695652 = coord(2/23)

Source: Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro

Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.00

0.0012504549 = product of:
  0.028760463 = sum of:
    0.028760463 = sum of:
      0.0094278185 = weight(_text_:1 in 3464) [ClassicSimilarity], result of:
        0.0094278185 = score(doc=3464,freq=2.0), product of:
          0.057894554 = queryWeight, product of:
            2.4565027 = idf(docFreq=10304, maxDocs=44218)
            0.023567878 = queryNorm
          0.16284466 = fieldWeight in 3464, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.4565027 = idf(docFreq=10304, maxDocs=44218)
            0.046875 = fieldNorm(doc=3464)
      0.019332644 = weight(_text_:29 in 3464) [ClassicSimilarity], result of:
        0.019332644 = score(doc=3464,freq=2.0), product of:
          0.08290443 = queryWeight, product of:
            3.5176873 = idf(docFreq=3565, maxDocs=44218)
            0.023567878 = queryNorm
          0.23319192 = fieldWeight in 3464, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5176873 = idf(docFreq=3565, maxDocs=44218)
            0.046875 = fieldNorm(doc=3464)
  0.04347826 = coord(1/23)

Date: 1. 6.2010 9:29:57

Piros, A.: Automatic interpretation of complex UDC numbers : towards support for library systems (2015) 0.00

0.001064306 = product of:
  0.0122395195 = sum of:
    0.0064442144 = product of:
      0.012888429 = sum of:
        0.012888429 = weight(_text_:29 in 2301) [ClassicSimilarity], result of:
          0.012888429 = score(doc=2301,freq=2.0), product of:
            0.08290443 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.023567878 = queryNorm
            0.15546128 = fieldWeight in 2301, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03125 = fieldNorm(doc=2301)
      0.5 = coord(1/2)
    0.0057953056 = product of:
      0.011590611 = sum of:
        0.011590611 = weight(_text_:international in 2301) [ClassicSimilarity], result of:
          0.011590611 = score(doc=2301,freq=2.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.14742646 = fieldWeight in 2301, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.03125 = fieldNorm(doc=2301)
      0.5 = coord(1/2)
  0.08695652 = coord(2/23)

Source: Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro

Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.00

4.1649418E-4 = product of:
  0.009579366 = sum of:
    0.009579366 = product of:
      0.019158732 = sum of:
        0.019158732 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
          0.019158732 = score(doc=690,freq=2.0), product of:
            0.08253069 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.023567878 = queryNorm
            0.23214069 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 23. 3.2013 13:22:36

Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.00

4.1649418E-4 = product of:
  0.009579366 = sum of:
    0.009579366 = product of:
      0.019158732 = sum of:
        0.019158732 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
          0.019158732 = score(doc=2158,freq=2.0), product of:
            0.08253069 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.023567878 = queryNorm
            0.23214069 = fieldWeight in 2158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2158)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 4. 8.2015 19:22:04

Ma, Z.; Sun, A.; Cong, G.: On predicting the popularity of newly emerging hashtags in Twitter (2013) 0.00

3.5022903E-4 = product of:
  0.008055268 = sum of:
    0.008055268 = product of:
      0.016110536 = sum of:
        0.016110536 = weight(_text_:29 in 967) [ClassicSimilarity], result of:
          0.016110536 = score(doc=967,freq=2.0), product of:
            0.08290443 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.023567878 = queryNorm
            0.19432661 = fieldWeight in 967, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=967)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 25. 6.2013 19:05:29

Liu, R.-L.: ¬A passage extractor for classification of disease aspect information (2013) 0.00

3.470785E-4 = product of:
  0.007982805 = sum of:
    0.007982805 = product of:
      0.01596561 = sum of:
        0.01596561 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
          0.01596561 = score(doc=1107,freq=2.0), product of:
            0.08253069 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.023567878 = queryNorm
            0.19345059 = fieldWeight in 1107, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1107)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 28.10.2013 19:22:57

Wartena, C.; Sommer, M.: Automatic classification of scientific records using the German Subject Heading Authority File (SWD) (2012) 0.00

3.149623E-4 = product of:
  0.0072441325 = sum of:
    0.0072441325 = product of:
      0.014488265 = sum of:
        0.014488265 = weight(_text_:international in 472) [ClassicSimilarity], result of:
          0.014488265 = score(doc=472,freq=2.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.18428308 = fieldWeight in 472, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.0390625 = fieldNorm(doc=472)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Proceedings of the 2nd International Workshop on Semantic Digital Archives held in conjunction with the 16th Int. Conference on Theory and Practice of Digital Libraries (TPDL) on September 27, 2012 in Paphos, Cyprus [http://ceur-ws.org/Vol-912/proceedings.pdf]. Eds.: A. Mitschik et al

Barthel, S.; Tönnies, S.; Balke, W.-T.: Large-scale experiments for mathematical document classification (2013) 0.00

3.149623E-4 = product of:
  0.0072441325 = sum of:
    0.0072441325 = product of:
      0.014488265 = sum of:
        0.014488265 = weight(_text_:international in 1056) [ClassicSimilarity], result of:
          0.014488265 = score(doc=1056,freq=2.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.18428308 = fieldWeight in 1056, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1056)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: 15th International Conference on Asia-Pacific Digital Libraries ICADL 2013. Bangalore, India. [to appear, 2013]

Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.00

2.5196982E-4 = product of:
  0.0057953056 = sum of:
    0.0057953056 = product of:
      0.011590611 = sum of:
        0.011590611 = weight(_text_:international in 4095) [ClassicSimilarity], result of:
          0.011590611 = score(doc=4095,freq=2.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.14742646 = fieldWeight in 4095, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.03125 = fieldNorm(doc=4095)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: IEEE International Conference on Big Data (Big Data) (2017)

AlQenaei, Z.M.; Monarchi, D.E.: ¬The use of learning techniques to analyze the results of a manual classification system (2016) 0.00
```
2.4153895E-4 = product of:
  0.0055553955 = sum of:
    0.0055553955 = product of:
      0.011110791 = sum of:
        0.011110791 = weight(_text_:1 in 2836) [ClassicSimilarity], result of:
          0.011110791 = score(doc=2836,freq=4.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.19191428 = fieldWeight in 2836, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2836)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)
```
Abstract

Classification is the process of assigning objects to pre-defined classes based on observations or characteristics of those objects, and there are many approaches to performing this task. The overall objective of this study is to demonstrate the use of two learning techniques to analyze the results of a manual classification system. Our sample consisted of 1,026 documents, from the ACM Computing Classification System, classified by their authors as belonging to one of the groups of the classification system: "H.3 Information Storage and Retrieval." A singular value decomposition of the documents' weighted term-frequency matrix was used to represent each document in a 50-dimensional vector space. The analysis of the representation using both supervised (decision tree) and unsupervised (clustering) techniques suggests that two pairs of the ACM classes are closely related to each other in the vector space. Class 1 (Content Analysis and Indexing) is closely related to Class 3 (Information Search and Retrieval), and Class 4 (Systems and Software) is closely related to Class 5 (Online Information Services). Further analysis was performed to test the diffusion of the words in the two classes using both cosine and Euclidean distance.

Source

Knowledge organization. 43(2016) no.1, S.56-63

Golub, K.: Automated subject classification of textual documents in the context of Web-based hierarchical browsing (2011) 0.00
```
2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 4558) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=4558,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 4558, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=4558)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)
```
Abstract

While automated methods for information organization have been around for several decades now, exponential growth of the World Wide Web has put them into the forefront of research in different communities, within which several approaches can be identified: 1) machine learning (algorithms that allow computers to improve their performance based on learning from pre-existing data); 2) document clustering (algorithms for unsupervised document organization and automated topic extraction); and 3) string matching (algorithms that match given strings within larger text). Here the aim was to automatically organize textual documents into hierarchical structures for subject browsing. The string-matching approach was tested using a controlled vocabulary (containing pre-selected and pre-defined authorized terms, each corresponding to only one concept). The results imply that an appropriate controlled vocabulary, with a sufficient number of entry terms designating classes, could in itself be a solution for automated classification. Then, if the same controlled vocabulary had an appropriate hierarchical structure, it would at the same time provide a good browsing structure for the collection of automatically classified documents.

Desale, S.K.; Kumbhar, R.: Research on automatic classification of documents in library environment : a literature review (2013) 0.00
```
2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 1071) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=1071,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 1071, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=1071)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)
```
Abstract

This paper aims to provide an overview of automatic classification research, which focuses on issues related to the automatic classification of documents in a library environment. The review covers literature published in mainstream library and information science studies. The review was done on literature published in both academic and professional LIS journals and other documents. This review reveals that basically three types of research are being done on automatic classification: 1) hierarchical classification using different library classification schemes, 2) text categorization and document categorization using different types of classifiers with or without using training documents, and 3) automatic bibliographic classification. Predominantly this research is directed towards solving problems of organization of digital documents in an online environment. However, very little research is devoted towards solving the problems of arrangement of physical documents.

Kishida, K.: High-speed rough clustering for very large document collections (2010) 0.00

1.707938E-4 = product of:
  0.0039282576 = sum of:
    0.0039282576 = product of:
      0.007856515 = sum of:
        0.007856515 = weight(_text_:1 in 3463) [ClassicSimilarity], result of:
          0.007856515 = score(doc=3463,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.13570388 = fieldWeight in 3463, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3463)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 1. 6.2010 9:27:35

Mu, T.; Goulermas, J.Y.; Korkontzelos, I.; Ananiadou, S.: Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities (2016) 0.00

1.707938E-4 = product of:
  0.0039282576 = sum of:
    0.0039282576 = product of:
      0.007856515 = sum of:
        0.007856515 = weight(_text_:1 in 2496) [ClassicSimilarity], result of:
          0.007856515 = score(doc=2496,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.13570388 = fieldWeight in 2496, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2496)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Journal of the Association for Information Science and Technology. 67(2016) no.1, S.106-133
