Search (117 results, page 1 of 6)

Wätjen, H.-J.; Diekmann, B.; Möller, G.; Carstensen, K.-U.: Bericht zum DFG-Projekt: GERHARD : German Harvest Automated Retrieval and Directory (1998) 0.05

0.053350244 = product of:
  0.10670049 = sum of:
    0.050220765 = weight(_text_:k in 3065) [ClassicSimilarity], result of:
      0.050220765 = score(doc=3065,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.39440846 = fieldWeight in 3065, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.078125 = fieldNorm(doc=3065)
    0.04225481 = weight(_text_:u in 3065) [ClassicSimilarity], result of:
      0.04225481 = score(doc=3065,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.3617784 = fieldWeight in 3065, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.078125 = fieldNorm(doc=3065)
    0.014224912 = weight(_text_:d in 3065) [ClassicSimilarity], result of:
      0.014224912 = score(doc=3065,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.20990817 = fieldWeight in 3065, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.078125 = fieldNorm(doc=3065)
  0.5 = coord(3/6)

Language: d
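
Note on reading the score trees: the numbers above can be recomputed by hand. The sketch below assumes the usual Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm, fieldNorm and the coord factor directly from the listing for doc 3065. The function and variable names are illustrative only and are not part of any search engine API.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity-style inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 3065.
MAX_DOCS = 44218
QUERY_NORM = 0.03566941
FIELD_NORM = 0.078125
DOC_FREQS = {"k": 3384, "u": 4547, "d": 17979}  # term -> docFreq

partial = {
    term: term_score(2.0, df, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for term, df in DOC_FREQS.items()
}
# Three of the six query clauses matched, hence coord(3/6).
total = sum(partial.values()) * (3 / 6)
print(partial, total)
```

The result should agree with the listed 0.053350244 up to single-precision rounding, since the engine reports 32-bit floats while Python computes in 64-bit.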

Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.04

0.042949304 = product of:
  0.064423956 = sum of:
    0.025110383 = weight(_text_:k in 2300) [ClassicSimilarity], result of:
      0.025110383 = score(doc=2300,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.19720423 = fieldWeight in 2300, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2300)
    0.021127405 = weight(_text_:u in 2300) [ClassicSimilarity], result of:
      0.021127405 = score(doc=2300,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.1808892 = fieldWeight in 2300, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2300)
    0.0100585325 = weight(_text_:d in 2300) [ClassicSimilarity], result of:
      0.0100585325 = score(doc=2300,freq=4.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.1484275 = fieldWeight in 2300, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2300)
    0.008127635 = product of:
      0.024382904 = sum of:
        0.024382904 = weight(_text_:29 in 2300) [ClassicSimilarity], result of:
          0.024382904 = score(doc=2300,freq=2.0), product of:
            0.12547383 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03566941 = queryNorm
            0.19432661 = fieldWeight in 2300, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
      0.33333334 = coord(1/3)
  0.6666667 = coord(4/6)

Source: Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
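
Note on the nested `coord` factors: in the tree for doc 2300 the match on `29` sits inside a sub-query of its own, so it is scaled by coord(1/3) before it joins the outer sum, which is then scaled by coord(4/6). A minimal sketch of that composition follows, using the leaf scores exactly as listed; `boolean_score` is a hypothetical helper name, not an engine function.

```python
def boolean_score(clause_scores, max_clauses):
    # Sum of matching clause scores times coord(matching / total clauses).
    return sum(clause_scores) * (len(clause_scores) / max_clauses)

# Inner sub-query: one of three clauses matched (term "29").
inner = boolean_score([0.024382904], 3)

# Outer query: terms k, u, d plus the inner sub-query, four of six clauses.
outer = boolean_score([0.025110383, 0.021127405, 0.0100585325, inner], 6)
print(inner, outer)
```

This is why a fairly strong match on a rare term can still contribute little: it is discounted once by the inner coord and again by the outer one.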

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.04

0.036294382 = product of:
  0.072588764 = sum of:
    0.04225481 = weight(_text_:u in 611) [ClassicSimilarity], result of:
      0.04225481 = score(doc=611,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.3617784 = fieldWeight in 611, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.078125 = fieldNorm(doc=611)
    0.014224912 = weight(_text_:d in 611) [ClassicSimilarity], result of:
      0.014224912 = score(doc=611,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.20990817 = fieldWeight in 611, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.078125 = fieldNorm(doc=611)
    0.016109042 = product of:
      0.048327126 = sum of:
        0.048327126 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.048327126 = score(doc=611,freq=2.0), product of:
            0.124908194 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03566941 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)

Date: 22. 8.2009 12:54:24
Language: d

Reiner, U.: DDC-based search in the data of the German National Bibliography (2008) 0.03

0.032010145 = product of:
  0.06402029 = sum of:
    0.030132461 = weight(_text_:k in 2166) [ClassicSimilarity], result of:
      0.030132461 = score(doc=2166,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.23664509 = fieldWeight in 2166, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.046875 = fieldNorm(doc=2166)
    0.025352886 = weight(_text_:u in 2166) [ClassicSimilarity], result of:
      0.025352886 = score(doc=2166,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.21706703 = fieldWeight in 2166, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.046875 = fieldNorm(doc=2166)
    0.0085349465 = weight(_text_:d in 2166) [ClassicSimilarity], result of:
      0.0085349465 = score(doc=2166,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.1259449 = fieldWeight in 2166, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.046875 = fieldNorm(doc=2166)
  0.5 = coord(3/6)

Location: D
Source: New perspectives on subject indexing and classification: essays in honour of Magda Heiner-Freiling. Red.: K. Knull-Schlomann, u.a
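
Note on checking a tree mechanically: the indentation of the explain output encodes which values feed into which. The sketch below is one possible way to parse such a listing and confirm that every "product of" or "sum of" line equals the product or sum of its children. It uses the `d` term subtree for doc 2166 as sample input; the parser, its names and the tolerance are assumptions made for illustration.

```python
import math
import re

LINE = re.compile(r"^(\s*)(\S+) = (.*)$")

SAMPLE = """\
0.0085349465 = weight(_text_:d in 2166) [ClassicSimilarity], result of:
  0.0085349465 = score(doc=2166,freq=2.0), product of:
    0.06776731 = queryWeight, product of:
      1.899872 = idf(docFreq=17979, maxDocs=44218)
      0.03566941 = queryNorm
    0.1259449 = fieldWeight in 2166, product of:
      1.4142135 = tf(freq=2.0), with freq of:
        2.0 = termFreq=2.0
      1.899872 = idf(docFreq=17979, maxDocs=44218)
      0.046875 = fieldNorm(doc=2166)
"""

def parse(text):
    """Return root nodes; each node is (value, label, children)."""
    roots, stack = [], []  # stack holds (indent, node) for open ancestors
    for raw in text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue
        indent = len(m.group(1))
        node = (float(m.group(2)), m.group(3), [])
        # Close every ancestor that is not shallower than this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        (stack[-1][1][2] if stack else roots).append(node)
        stack.append((indent, node))
    return roots

def check(node, rel_tol=1e-5):
    """True if all product/sum nodes agree with their children."""
    value, label, children = node
    ok = all(check(child, rel_tol) for child in children)
    if label.endswith("product of:"):
        expected = math.prod(child[0] for child in children)
        ok = ok and math.isclose(value, expected, rel_tol=rel_tol)
    elif label.endswith("sum of:"):
        expected = sum(child[0] for child in children)
        ok = ok and math.isclose(value, expected, rel_tol=rel_tol)
    return ok

roots = parse(SAMPLE)
print(check(roots[0]))
```

The relative tolerance allows for the single-precision rounding in the listed values; lines such as "result of:" and "with freq of:" are passed through without a check.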

Panyr, J.: Automatische Klassifikation und Information Retrieval : Anwendung und Entwicklung komplexer Verfahren in Information-Retrieval-Systemen und ihre Evaluierung (1986) 0.02

0.024948752 = product of:
  0.07484625 = sum of:
    0.050705772 = weight(_text_:u in 32) [ClassicSimilarity], result of:
      0.050705772 = score(doc=32,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.43413407 = fieldWeight in 32, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.09375 = fieldNorm(doc=32)
    0.024140477 = weight(_text_:d in 32) [ClassicSimilarity], result of:
      0.024140477 = score(doc=32,freq=4.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.356226 = fieldWeight in 32, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.09375 = fieldNorm(doc=32)
  0.33333334 = coord(2/6)

Footnote: Also published as a dissertation, U Saarbrücken 1085
Language: d
Type: d

Shen, D.; Chen, Z.; Yang, Q.; Zeng, H.J.; Zhang, B.; Lu, Y.; Ma, W.Y.: Web page classification through summarization (2004) 0.02

0.021481892 = product of:
  0.064445674 = sum of:
    0.050220765 = weight(_text_:k in 4132) [ClassicSimilarity], result of:
      0.050220765 = score(doc=4132,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.39440846 = fieldWeight in 4132, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.078125 = fieldNorm(doc=4132)
    0.014224912 = weight(_text_:d in 4132) [ClassicSimilarity], result of:
      0.014224912 = score(doc=4132,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.20990817 = fieldWeight in 4132, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.078125 = fieldNorm(doc=4132)
  0.33333334 = coord(2/6)

Source: SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, u.a

Kwon, O.W.; Lee, J.H.: Text categorization based on k-nearest neighbor approach for web site classification (2003) 0.02
```
0.021425387 = product of:
  0.06427616 = sum of:
    0.056148525 = weight(_text_:k in 1070) [ClassicSimilarity], result of:
      0.056148525 = score(doc=1070,freq=10.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.44096208 = fieldWeight in 1070, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1070)
    0.008127635 = product of:
      0.024382904 = sum of:
        0.024382904 = weight(_text_:29 in 1070) [ClassicSimilarity], result of:
          0.024382904 = score(doc=1070,freq=2.0), product of:
            0.12547383 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03566941 = queryNorm
            0.19432661 = fieldWeight in 1070, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1070)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)
```
Abstract: Automatic categorization is a viable method to deal with the scaling problem on the World Wide Web. For Web site classification, this paper proposes the use of Web pages linked with the home page in a different manner from the sole use of home pages in previous research. To implement our proposed method, we derive a scheme for Web site classification based on the k-nearest neighbor (k-NN) approach. It consists of three phases: Web page selection (connectivity analysis), Web page classification, and Web site classification. Given a Web site, the Web page selection chooses several representative Web pages using connectivity analysis. The k-NN classifier next classifies each of the selected Web pages. Finally, the classified Web pages are extended to a classification of the entire Web site. To improve performance, we supplement the k-NN approach with a feature selection method and a term weighting scheme using markup tags, and also reform its document-document similarity measure. In our experiments on a Korean commercial Web directory, the proposed system, using both a home page and its linked pages, improved the performance of micro-averaging breakeven point by 30.02%, compared with an ordinary classification which uses a home page only.
Date: 27.12.2007 17:32:29

Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.02
```
0.020704288 = product of:
  0.041408576 = sum of:
    0.029274995 = weight(_text_:u in 3284) [ClassicSimilarity], result of:
      0.029274995 = score(doc=3284,freq=6.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.25064746 = fieldWeight in 3284, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03125 = fieldNorm(doc=3284)
    0.0056899646 = weight(_text_:d in 3284) [ClassicSimilarity], result of:
      0.0056899646 = score(doc=3284,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.08396327 = fieldWeight in 3284, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03125 = fieldNorm(doc=3284)
    0.006443617 = product of:
      0.01933085 = sum of:
        0.01933085 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
          0.01933085 = score(doc=3284,freq=2.0), product of:
            0.124908194 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03566941 = queryNorm
            0.15476047 = fieldWeight in 3284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=3284)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)
```
Abstract: Classifying objects (e.g. fauna, flora, texts) is a process that relies on human intelligence. Computer science, and in particular the field of artificial intelligence (AI), investigates among other things to what extent processes that require human intelligence can be automated. It has turned out that solving everyday problems is a greater challenge than solving specialised problems such as building a chess computer. "Rybka", for example, has been the reigning computer chess world champion since June 2007. To what extent everyday problems can be solved with AI methods remains, in the general case, an open question. In solving everyday problems, the processing of natural language, such as understanding it, plays an essential role. Realising "common sense" as a machine (in the Cyc knowledge base, in the form of facts and rules) has been Lenat's goal since 1984. With regard to the AI flagship project "Cyc" there are Cyc optimists and Cyc pessimists. Understanding natural language (e.g. work title, abstract, preface, contents) is also necessary in the intellectual classification of bibliographic title records or online publications in order to classify these text objects correctly. Since 2007, the Deutsche Nationalbibliothek has intellectually classified nearly all publications using the Dewey Decimal Classification (DDC).
The volume of publications to be classified has, at the latest since the advent of the World Wide Web, been growing faster than it can be intellectually subject-indexed. Methods are therefore sought to automate the classification of text objects, or at least to support intellectual classification. Methods for automatic document classification (information retrieval, IR) have existed since 1968, and methods for automatic text classification (ATC: Automated Text Categorization) since 1992. As more and more digital objects have become available on the World Wide Web, work on automatic text classification has increased markedly since about 1998. Since 1996 this has also included work on automatic DDC classification and RVK classification of bibliographic title records and full-text documents. To our knowledge, these developments have so far been experimental systems rather than systems in continuous operation. The VZG project Colibri/DDC has likewise been concerned, among other things, with automatic DDC classification since 2006. The related investigations and developments serve to answer the research question: "Is it possible to automatically achieve a DDC title classification of all GVK-PLUS title records that is coherent in content?"
Date: 22. 1.2010 14:41:24
Language: d

Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.02
```
0.020618794 = product of:
  0.06185638 = sum of:
    0.052190956 = weight(_text_:k in 690) [ClassicSimilarity], result of:
      0.052190956 = score(doc=690,freq=6.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.40988132 = fieldWeight in 690, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.046875 = fieldNorm(doc=690)
    0.009665425 = product of:
      0.028996274 = sum of:
        0.028996274 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
          0.028996274 = score(doc=690,freq=2.0), product of:
            0.124908194 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03566941 = queryNorm
            0.23214069 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)
```
Abstract: We describe the latent semantic indexing subspace signature model (LSISSM) for semantic content representation of unstructured text. Grounded on singular value decomposition, the model represents terms and documents by the distribution signatures of their statistical contribution across the top-ranking latent concept dimensions. LSISSM matches term signatures with document signatures according to their mapping coherence between latent semantic indexing (LSI) term subspace and LSI document subspace. LSISSM does feature reduction and finds a low-rank approximation of scalable and sparse term-document matrices. Experiments demonstrate that this approach significantly improves the performance of major clustering algorithms such as standard K-means and self-organizing maps compared with the vector space model and the traditional LSI model. The unique contribution ranking mechanism in LSISSM also improves the initialization of standard K-means compared with random seeding procedure, which sometimes causes low efficiency and effectiveness of clustering. A two-stage initialization strategy based on LSISSM significantly reduces the running time of standard K-means procedures.
Date: 23. 3.2013 13:22:36

Schek, M.: Automatische Klassifizierung und Visualisierung im Archiv der Süddeutschen Zeitung (2005) 0.02

0.018672587 = product of:
  0.037345175 = sum of:
    0.01757727 = weight(_text_:k in 4884) [ClassicSimilarity], result of:
      0.01757727 = score(doc=4884,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.13804297 = fieldWeight in 4884, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4884)
    0.014789184 = weight(_text_:u in 4884) [ClassicSimilarity], result of:
      0.014789184 = score(doc=4884,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.12662244 = fieldWeight in 4884, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4884)
    0.0049787187 = weight(_text_:d in 4884) [ClassicSimilarity], result of:
      0.0049787187 = score(doc=4884,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.07346786 = fieldWeight in 4884, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4884)
  0.5 = coord(3/6)

Language: d
Object: K-Infinity
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.02

0.017384928 = product of:
  0.052154783 = sum of:
    0.042489357 = product of:
      0.16995743 = sum of:
        0.16995743 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.16995743 = score(doc=562,freq=2.0), product of:
            0.30240566 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03566941 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.25 = coord(1/4)
    0.009665425 = product of:
      0.028996274 = sum of:
        0.028996274 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.028996274 = score(doc=562,freq=2.0), product of:
            0.124908194 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03566941 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)

Content: Vgl.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32

Alberts, I.; Forest, D.: Email pragmatics and automatic classification : a study in the organizational context (2012) 0.02
```
0.016868304 = product of:
  0.050604913 = sum of:
    0.04349246 = weight(_text_:k in 238) [ClassicSimilarity], result of:
      0.04349246 = score(doc=238,freq=6.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.34156775 = fieldWeight in 238, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.0390625 = fieldNorm(doc=238)
    0.007112456 = weight(_text_:d in 238) [ClassicSimilarity], result of:
      0.007112456 = score(doc=238,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.104954086 = fieldWeight in 238, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.0390625 = fieldNorm(doc=238)
  0.33333334 = coord(2/6)
```
Abstract: This paper presents a two-phased research project aiming to improve email triage for public administration managers. The first phase developed a typology of email classification patterns through a qualitative study involving 34 participants. Inspired by the fields of pragmatics and speech act theory, this typology comprising four top level categories and 13 subcategories represents the typical email triage behaviors of managers in an organizational context. The second study phase was conducted on a corpus of 1,703 messages using email samples of two managers. Using the k-NN (k-nearest neighbor) algorithm, statistical treatments automatically classified the email according to lexical and nonlexical features representative of managers' triage patterns. The automatic classification of email according to the lexicon of the messages was found to be substantially more efficient when k = 2 and n = 2,000. For four categories, the average recall rate was 94.32%, the average precision rate was 94.50%, and the accuracy rate was 94.54%. For 13 categories, the average recall rate was 91.09%, the average precision rate was 84.18%, and the accuracy rate was 88.70%. It appears that a message's nonlexical features are also deeply influenced by email pragmatics. Features related to the recipient and the sender were the most relevant for characterizing email.

Panyr, J.: STEINADLER: ein Verfahren zur automatischen Deskribierung und zur automatischen thematischen Klassifikation (1978) 0.02

0.016256098 = product of:
  0.04876829 = sum of:
    0.022759859 = weight(_text_:d in 5169) [ClassicSimilarity], result of:
      0.022759859 = score(doc=5169,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.33585307 = fieldWeight in 5169, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.125 = fieldNorm(doc=5169)
    0.02600843 = product of:
      0.07802529 = sum of:
        0.07802529 = weight(_text_:29 in 5169) [ClassicSimilarity], result of:
          0.07802529 = score(doc=5169,freq=2.0), product of:
            0.12547383 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03566941 = queryNorm
            0.6218451 = fieldWeight in 5169, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.125 = fieldNorm(doc=5169)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)

Language: d
Source: Nachrichten für Dokumentation. 29(1978), S.92-96

Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.02

0.015476957 = product of:
  0.04643087 = sum of:
    0.03515454 = weight(_text_:k in 2560) [ClassicSimilarity], result of:
      0.03515454 = score(doc=2560,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.27608594 = fieldWeight in 2560, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2560)
    0.01127633 = product of:
      0.03382899 = sum of:
        0.03382899 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
          0.03382899 = score(doc=2560,freq=2.0), product of:
            0.124908194 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03566941 = queryNorm
            0.2708308 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)

Date: 22. 9.2008 18:31:54

Schulze, U.: Erfahrungen bei der Anwendung automatischer Klassifizierungsverfahren zur Inhaltsanalyse einer Dokumentenmenge (1978) 0.02

0.015061259 = product of:
  0.045183778 = sum of:
    0.033803847 = weight(_text_:u in 83) [ClassicSimilarity], result of:
      0.033803847 = score(doc=83,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.28942272 = fieldWeight in 83, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.0625 = fieldNorm(doc=83)
    0.011379929 = weight(_text_:d in 83) [ClassicSimilarity], result of:
      0.011379929 = score(doc=83,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.16792654 = fieldWeight in 83, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.0625 = fieldNorm(doc=83)
  0.33333334 = coord(2/6)

Language: d

Pfister, J.: Clustering von Patent-Dokumenten am Beispiel der Datenbanken des Fachinformationszentrums Karlsruhe (2006) 0.02

0.015061259 = product of:
  0.045183778 = sum of:
    0.033803847 = weight(_text_:u in 5976) [ClassicSimilarity], result of:
      0.033803847 = score(doc=5976,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.28942272 = fieldWeight in 5976, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.0625 = fieldNorm(doc=5976)
    0.011379929 = weight(_text_:d in 5976) [ClassicSimilarity], result of:
      0.011379929 = score(doc=5976,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.16792654 = fieldWeight in 5976, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.0625 = fieldNorm(doc=5976)
  0.33333334 = coord(2/6)

Language: d
Source: Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker

Han, K.; Rezapour, R.; Nakamura, K.; Devkota, D.; Miller, D.C.; Diesner, J.: An expert-in-the-loop method for domain-specific document categorization based on small training data (2023) 0.01

0.014207967 = product of:
  0.0426239 = sum of:
    0.035511445 = weight(_text_:k in 967) [ClassicSimilarity], result of:
      0.035511445 = score(doc=967,freq=4.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.2788889 = fieldWeight in 967, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.0390625 = fieldNorm(doc=967)
    0.007112456 = weight(_text_:d in 967) [ClassicSimilarity], result of:
      0.007112456 = score(doc=967,freq=2.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.104954086 = fieldWeight in 967, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.0390625 = fieldNorm(doc=967)
  0.33333334 = coord(2/6)

Hagedorn, K.; Chapman, S.; Newman, D.: Enhancing search and browse using automated clustering of subject metadata (2007) 0.01

0.014067567 = product of:
  0.0422027 = sum of:
    0.030132461 = weight(_text_:k in 1168) [ClassicSimilarity], result of:
      0.030132461 = score(doc=1168,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.23664509 = fieldWeight in 1168, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.046875 = fieldNorm(doc=1168)
    0.012070239 = weight(_text_:d in 1168) [ClassicSimilarity], result of:
      0.012070239 = score(doc=1168,freq=4.0), product of:
        0.06776731 = queryWeight, product of:
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.03566941 = queryNorm
        0.178113 = fieldWeight in 1168, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.899872 = idf(docFreq=17979, maxDocs=44218)
          0.046875 = fieldNorm(doc=1168)
  0.33333334 = coord(2/6)

Source: D-Lib magazine. 13(2007) nos.7/8, x S

Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.01

0.013652353 = product of:
  0.040957056 = sum of:
    0.029578367 = weight(_text_:u in 1595) [ClassicSimilarity], result of:
      0.029578367 = score(doc=1595,freq=2.0), product of:
        0.11679749 = queryWeight, product of:
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.03566941 = queryNorm
        0.25324488 = fieldWeight in 1595, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2744443 = idf(docFreq=4547, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1595)
    0.011378689 = product of:
      0.034136064 = sum of:
        0.034136064 = weight(_text_:29 in 1595) [ClassicSimilarity], result of:
          0.034136064 = score(doc=1595,freq=2.0), product of:
            0.12547383 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03566941 = queryNorm
            0.27205724 = fieldWeight in 1595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1595)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)

Date: 11. 5.2003 18:29:44
Source: Advances in classification research, vol.10: proceedings of the 10th ASIS SIG/CR Classification Research Workshop. Ed.: Albrechtsen, H. u. J.E. Mai

Sparck Jones, K.: Automatic classification (1976) 0.01

0.013392205 = product of:
  0.08035323 = sum of:
    0.08035323 = weight(_text_:k in 2908) [ClassicSimilarity], result of:
      0.08035323 = score(doc=2908,freq=2.0), product of:
        0.12733187 = queryWeight, product of:
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.03566941 = queryNorm
        0.63105357 = fieldWeight in 2908, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.569778 = idf(docFreq=3384, maxDocs=44218)
          0.125 = fieldNorm(doc=2908)
  0.16666667 = coord(1/6)
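
The score trees in this listing follow the arithmetic of Lucene's ClassicSimilarity (TF-IDF). As a minimal sketch, the single-term tree for the Sparck Jones record (doc=2908) can be recomputed in Python from the values printed above. The function names are hypothetical, and the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are assumed from the documented ClassicSimilarity behaviour rather than taken from this page. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming ClassicSimilarity's formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       field_norm: float, query_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight, as in the trees above."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                   # tf(freq=2.0) -> 1.4142135
    query_weight = idf * query_norm        # idf * queryNorm
    field_weight = tf * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explanation tree for doc=2908 (term "k").
term_score = classic_term_score(
    freq=2.0, doc_freq=3384, max_docs=44218,
    field_norm=0.125, query_norm=0.03566941,
)

# Only one of the six query terms matched, hence coord(1/6).
total_score = term_score * (1 / 6)
print(f"term score:  {term_score:.8f}")
print(f"total score: {total_score:.8f}")
```

For records with several matching terms, the per-term scores are summed before the coord factor is applied, which is why the other trees show a "sum of" node above the individual weights.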
