Search (94 results, page 1 of 5)

Dubin, D.: Dimensions and discriminability (1998) 0.11

0.10835254 = product of:
  0.16252881 = sum of:
    0.03873757 = weight(_text_:science in 2338) [ClassicSimilarity], result of:
      0.03873757 = score(doc=2338,freq=4.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.2881068 = fieldWeight in 2338, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2338)
    0.12379125 = sum of:
      0.075381085 = weight(_text_:index in 2338) [ClassicSimilarity], result of:
        0.075381085 = score(doc=2338,freq=2.0), product of:
          0.22304957 = queryWeight, product of:
            4.369764 = idf(docFreq=1520, maxDocs=44218)
            0.05104385 = queryNorm
          0.33795667 = fieldWeight in 2338, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            4.369764 = idf(docFreq=1520, maxDocs=44218)
            0.0546875 = fieldNorm(doc=2338)
      0.04841016 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
        0.04841016 = score(doc=2338,freq=2.0), product of:
          0.17874686 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05104385 = queryNorm
          0.2708308 = fieldWeight in 2338, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0546875 = fieldNorm(doc=2338)
  0.6666667 = coord(2/3)

Abstract: Visualization interfaces can improve subject access by highlighting the inclusion of document representation components in similarity and discrimination relationships. Within a set of retrieved documents, what kinds of groupings can index terms and subject headings make explicit? The role of controlled vocabulary in classifying search output is examined
Date: 22. 9.1997 19:16:05
Imprint: Urbana-Champaign, IL : Illinois University at Urbana-Champaign, Graduate School of Library and Information Science
Source: Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
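
The score arithmetic in the explanation above can be reproduced by hand. The following Python sketch assumes the relationships visible in the tree: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)) (this form matches the idf values shown), and each term score is queryWeight times fieldWeight. The function and variable names are illustrative and not part of any library; the constants are copied from the listing.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed form; it reproduces e.g. idf(docFreq=8627, maxDocs=44218) = 2.6341193
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)                       # tf(freq=4.0) = 2.0
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # queryWeight = idf * queryNorm
    field_weight = tf * term_idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explanation for doc 2338
MAX_DOCS = 44218
QUERY_NORM = 0.05104385
FIELD_NORM = 0.0546875

science = term_score(4.0, 8627, MAX_DOCS, QUERY_NORM, FIELD_NORM)
index = term_score(2.0, 1520, MAX_DOCS, QUERY_NORM, FIELD_NORM)
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Two of the three query clauses matched, hence coord(2/3)
total = (science + (index + twenty_two)) * (2.0 / 3.0)
print(round(total, 6))
```

Small differences in the last digits are expected, because the listing was computed with 32-bit floats.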

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.07

0.06787892 = product of:
  0.10181837 = sum of:
    0.08107116 = product of:
      0.24321347 = sum of:
        0.24321347 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.24321347 = score(doc=562,freq=2.0), product of:
            0.4327503 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.05104385 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.02074721 = product of:
      0.04149442 = sum of:
        0.04149442 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.04149442 = score(doc=562,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
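
The explanation for this record nests coord factors inside sub-queries, so the final score is a product of sums of products. As a rough sketch, the tree can be modelled as nested tuples and evaluated recursively; the tuple layout and the `evaluate` helper are assumptions made for illustration, and the leaf values are copied from the listing.

```python
from functools import reduce

def evaluate(node):
    # A leaf is a plain number; an inner node is ("sum" | "product", [children]).
    if isinstance(node, (int, float)):
        return float(node)
    op, children = node
    values = [evaluate(child) for child in children]
    if op == "sum":
        return sum(values)
    if op == "product":
        return reduce(lambda a, b: a * b, values, 1.0)
    raise ValueError(f"unknown node type: {op}")

# Tree for doc 562: each matched term weight is scaled by its inner coord,
# and the summed clauses are scaled by the outer coord(2/3).
tree_562 = ("product", [
    ("sum", [
        ("product", [("sum", [0.24321347]), 1.0 / 3.0]),  # _text_:3a, coord(1/3)
        ("product", [("sum", [0.04149442]), 0.5]),        # _text_:22, coord(1/2)
    ]),
    2.0 / 3.0,                                            # coord(2/3)
])

score_562 = evaluate(tree_562)
print(round(score_562, 6))
```

The result agrees with the top-level value in the listing up to 32-bit float rounding.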

Huang, Y.-L.: ¬A theoretic and empirical research of cluster indexing for Mandarine Chinese full text document (1998) 0.05

0.053796053 = product of:
  0.08069408 = sum of:
    0.027391598 = weight(_text_:science in 513) [ClassicSimilarity], result of:
      0.027391598 = score(doc=513,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.20372227 = fieldWeight in 513, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0546875 = fieldNorm(doc=513)
    0.053302478 = product of:
      0.106604956 = sum of:
        0.106604956 = weight(_text_:index in 513) [ClassicSimilarity], result of:
          0.106604956 = score(doc=513,freq=4.0), product of:
            0.22304957 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.05104385 = queryNorm
            0.4779429 = fieldWeight in 513, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0546875 = fieldNorm(doc=513)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: Since most popular commercial systems for full-text retrieval are designed around full-text scanning and Boolean query logic, they assume an oversimplified relationship between the indexing form and the content of a document. Reports the use of Singular Value Decomposition (SVD) to develop a Cluster Indexing Model (CIM) based on a Vector Space Model (VSM) in order to explore the index theory of cluster indexing for Chinese full-text documents. From a series of experiments, it was found that the indexing performance of CIM is better than that of the traditional VSM and is almost as effective as authority control of index terms
Source: Bulletin of library and information science. 1998, no.24, S.44-68

HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.05

0.049139693 = product of:
  0.07370954 = sum of:
    0.039130855 = weight(_text_:science in 2748) [ClassicSimilarity], result of:
      0.039130855 = score(doc=2748,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.2910318 = fieldWeight in 2748, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.078125 = fieldNorm(doc=2748)
    0.034578685 = product of:
      0.06915737 = sum of:
        0.06915737 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
          0.06915737 = score(doc=2748,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.38690117 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Date: 1. 2.2016 18:25:22
Series: Lecture notes in computer science ; 9398

Classification, automation, and new media : Proceedings of the 24th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Passau, March 15 - 17, 2000 (2002) 0.04

0.0438286 = product of:
  0.065742895 = sum of:
    0.027669692 = weight(_text_:science in 5997) [ClassicSimilarity], result of:
      0.027669692 = score(doc=5997,freq=4.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.20579056 = fieldWeight in 5997, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5997)
    0.0380732 = product of:
      0.0761464 = sum of:
        0.0761464 = weight(_text_:index in 5997) [ClassicSimilarity], result of:
          0.0761464 = score(doc=5997,freq=4.0), product of:
            0.22304957 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.05104385 = queryNorm
            0.3413878 = fieldWeight in 5997, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5997)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: Given the huge amount of information on the Internet and in practically every domain of knowledge that we face today, knowledge discovery calls for automation. The book deals with methods from classification and data analysis that respond effectively to this rapidly growing challenge. The interested reader will find new methodological insights as well as applications in economics, management science, finance, and marketing, and in pattern recognition, biology, health, and archaeology.
Content: Data Analysis, Statistics, and Classification.- Pattern Recognition and Automation.- Data Mining, Information Processing, and Automation.- New Media, Web Mining, and Automation.- Applications in Management Science, Finance, and Marketing.- Applications in Medicine, Biology, Archaeology, and Others.- Author Index.- Subject Index.

Yoon, Y.; Lee, C.; Lee, G.G.: ¬An effective procedure for constructing a hierarchical text classification system (2006) 0.03

0.03439779 = product of:
  0.05159668 = sum of:
    0.027391598 = weight(_text_:science in 5273) [ClassicSimilarity], result of:
      0.027391598 = score(doc=5273,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.20372227 = fieldWeight in 5273, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5273)
    0.02420508 = product of:
      0.04841016 = sum of:
        0.04841016 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
          0.04841016 = score(doc=5273,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.2708308 = fieldWeight in 5273, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5273)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Date: 22. 7.2006 16:24:52
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.3, S.431-442

Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.03

0.029483816 = product of:
  0.044225723 = sum of:
    0.023478512 = weight(_text_:science in 2760) [ClassicSimilarity], result of:
      0.023478512 = score(doc=2760,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.17461908 = fieldWeight in 2760, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.046875 = fieldNorm(doc=2760)
    0.02074721 = product of:
      0.04149442 = sum of:
        0.04149442 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
          0.04149442 = score(doc=2760,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.23214069 = fieldWeight in 2760, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2760)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Date: 22. 3.2009 19:11:54
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.4, S.803-813

Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.03

0.029483816 = product of:
  0.044225723 = sum of:
    0.023478512 = weight(_text_:science in 690) [ClassicSimilarity], result of:
      0.023478512 = score(doc=690,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.17461908 = fieldWeight in 690, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.046875 = fieldNorm(doc=690)
    0.02074721 = product of:
      0.04149442 = sum of:
        0.04149442 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
          0.04149442 = score(doc=690,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.23214069 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Date: 23. 3.2013 13:22:36
Source: Journal of the American Society for Information Science and Technology. 64(2013) no.4, S.844-860

Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.03

0.029483816 = product of:
  0.044225723 = sum of:
    0.023478512 = weight(_text_:science in 2158) [ClassicSimilarity], result of:
      0.023478512 = score(doc=2158,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.17461908 = fieldWeight in 2158, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.046875 = fieldNorm(doc=2158)
    0.02074721 = product of:
      0.04149442 = sum of:
        0.04149442 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
          0.04149442 = score(doc=2158,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.23214069 = fieldWeight in 2158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2158)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Date: 4. 8.2015 19:22:04
Source: Journal of the Association for Information Science and Technology. 66(2015) no.9, S.1817-1831

Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.02

0.024569847 = product of:
  0.03685477 = sum of:
    0.019565428 = weight(_text_:science in 2765) [ClassicSimilarity], result of:
      0.019565428 = score(doc=2765,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.1455159 = fieldWeight in 2765, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2765)
    0.017289342 = product of:
      0.034578685 = sum of:
        0.034578685 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
          0.034578685 = score(doc=2765,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.19345059 = fieldWeight in 2765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2765)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Date: 22. 3.2009 19:14:43
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.4, S.814-825

Liu, R.-L.: ¬A passage extractor for classification of disease aspect information (2013) 0.02

0.024569847 = product of:
  0.03685477 = sum of:
    0.019565428 = weight(_text_:science in 1107) [ClassicSimilarity], result of:
      0.019565428 = score(doc=1107,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.1455159 = fieldWeight in 1107, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1107)
    0.017289342 = product of:
      0.034578685 = sum of:
        0.034578685 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
          0.034578685 = score(doc=1107,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.19345059 = fieldWeight in 1107, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1107)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Date: 28.10.2013 19:22:57
Source: Journal of the American Society for Information Science and Technology. 64(2013) no.11, S.2265-2277

Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.02

0.023579285 = product of:
  0.07073785 = sum of:
    0.07073785 = sum of:
      0.043074906 = weight(_text_:index in 3284) [ClassicSimilarity], result of:
        0.043074906 = score(doc=3284,freq=2.0), product of:
          0.22304957 = queryWeight, product of:
            4.369764 = idf(docFreq=1520, maxDocs=44218)
            0.05104385 = queryNorm
          0.1931181 = fieldWeight in 3284, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            4.369764 = idf(docFreq=1520, maxDocs=44218)
            0.03125 = fieldNorm(doc=3284)
      0.027662948 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
        0.027662948 = score(doc=3284,freq=2.0), product of:
          0.17874686 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05104385 = queryNorm
          0.15476047 = fieldWeight in 3284, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.03125 = fieldNorm(doc=3284)
  0.33333334 = coord(1/3)

Date: 22. 1.2010 14:41:24
Footnote: Paper presented on 03.06.2009 at the 98th Bibliothekartag 2009 in Erfurt; to appear in: Dialog mit Bibliotheken. See also: http://www.gbv.de/vgm/info/biblio/01VZG/06Publikationen/2009/index.

Borko, H.: Research in computer based classification systems (1985) 0.02
```
0.021694047 = product of:
  0.03254107 = sum of:
    0.013695799 = weight(_text_:science in 3647) [ClassicSimilarity], result of:
      0.013695799 = score(doc=3647,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.101861134 = fieldWeight in 3647, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.02734375 = fieldNorm(doc=3647)
    0.018845271 = product of:
      0.037690543 = sum of:
        0.037690543 = weight(_text_:index in 3647) [ClassicSimilarity], result of:
          0.037690543 = score(doc=3647,freq=2.0), product of:
            0.22304957 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.05104385 = queryNorm
            0.16897833 = fieldWeight in 3647, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3647)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract: The selection in this reader by R. M. Needham and K. Sparck Jones reports an early approach to automatic classification that was taken in England. The following selection reviews various approaches that were being pursued in the United States at about the same time. It then discusses a particular approach initiated in the early 1960s by Harold Borko, at that time Head of the Language Processing and Retrieval Research Staff at the System Development Corporation, Santa Monica, California and, since 1966, a member of the faculty at the Graduate School of Library and Information Science, University of California, Los Angeles. As was described earlier, there are two steps in automatic classification, the first being to identify pairs of terms that are similar by virtue of co-occurring as index terms in the same documents, and the second being to form equivalence classes of intersubstitutable terms. To compute similarities, Borko and his associates used a standard correlation formula; to derive classification categories, where Needham and Sparck Jones used clumping, the Borko team used the statistical technique of factor analysis. The fact that documents can be classified automatically, and in any number of ways, is worthy of passing notice. Worthy of serious attention would be a demonstration that a computer-based classification system was effective in the organization and retrieval of documents. One reason for the inclusion of the following selection in the reader is that it addresses the question of evaluation. To evaluate the effectiveness of their automatically derived classification, Borko and his team asked three questions. The first was: Is the classification reliable? In other words, could the categories derived from one sample of texts be used to classify other texts? Reliability was assessed by a case-study comparison of the classes derived from three different samples of abstracts. The not-so-surprising conclusion reached was that automatically derived classes were reliable only to the extent that the sample from which they were derived was representative of the total document collection. The second evaluation question asked whether the classification was reasonable, in the sense of adequately describing the content of the document collection. The answer was sought by comparing the automatically derived categories with categories in a related classification system that was manually constructed. Here the conclusion was that the automatic method yielded categories that fairly accurately reflected the major area of interest in the sample collection of texts; however, since there were only eleven such categories and they were quite broad, they could not be regarded as suitable for use in a university or any large general library. The third evaluation question asked whether automatic classification was accurate, in the sense of producing results similar to those obtainable by human classifiers. When using human classification as a criterion, automatic classification was found to be 50 percent accurate.
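
The first of the two steps described in this abstract, finding pairs of terms that co-occur as index terms in the same documents, can be illustrated with a small sketch. It assumes a Pearson correlation over binary term-document incidence vectors as a stand-in for the "standard correlation formula" (the abstract does not say which formula was used), and the terms, documents, and threshold are invented for illustration. The second step, factor analysis, is not shown.

```python
import math

def pearson(x, y):
    # Pearson correlation of two equal-length numeric vectors.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: 1 means the term was assigned to that document.
incidence = {
    "retrieval": [1, 1, 0, 0, 1],
    "indexing":  [1, 1, 0, 0, 0],
    "chemistry": [0, 0, 1, 1, 0],
}

def similar_pairs(incidence, threshold=0.5):
    # Return (term_a, term_b, correlation) for pairs at or above the threshold.
    terms = sorted(incidence)
    pairs = []
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            r = pearson(incidence[a], incidence[b])
            if r >= threshold:
                pairs.append((a, b, r))
    return pairs

pairs = similar_pairs(incidence)
print(pairs)
```

With this toy data only "indexing" and "retrieval" are positively correlated strongly enough to be paired.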

Ardö, A.; Koch, T.: Automatic classification applied to full-text Internet documents in a robot-generated subject index (1999) 0.02

0.021537453 = product of:
  0.06461236 = sum of:
    0.06461236 = product of:
      0.12922472 = sum of:
        0.12922472 = weight(_text_:index in 382) [ClassicSimilarity], result of:
          0.12922472 = score(doc=382,freq=2.0), product of:
            0.22304957 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.05104385 = queryNorm
            0.5793543 = fieldWeight in 382, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.09375 = fieldNorm(doc=382)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Lindholm, J.; Schönthal, T.; Jansson, K.: Experiences of harvesting Web resources in engineering using automatic classification (2003) 0.02

0.020305708 = product of:
  0.06091712 = sum of:
    0.06091712 = product of:
      0.12183424 = sum of:
        0.12183424 = weight(_text_:index in 4088) [ClassicSimilarity], result of:
          0.12183424 = score(doc=4088,freq=4.0), product of:
            0.22304957 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.05104385 = queryNorm
            0.5462205 = fieldWeight in 4088, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0625 = fieldNorm(doc=4088)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: The authors describe the background and the work involved in setting up Engine-e, a Web index that uses automatic classification as a means of selecting resources in engineering. Considerations in offering a robot-generated Web index as a successor to a manually indexed quality-controlled subject gateway are also discussed

Losee, R.M.; Haas, S.W.: Sublanguage terms : dictionaries, usage, and automatic classification (1995) 0.01
```
0.014757169 = product of:
  0.044271506 = sum of:
    0.044271506 = weight(_text_:science in 2650) [ClassicSimilarity], result of:
      0.044271506 = score(doc=2650,freq=4.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.3292649 = fieldWeight in 2650, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0625 = fieldNorm(doc=2650)
  0.33333334 = coord(1/3)
```
Abstract: The use of terms from natural and social science titles and abstracts is studied from the perspective of sublanguages and their specialized dictionaries. Explores different notions of sublanguage distinctiveness. Objective methods for separating hard and soft sciences are suggested based on measures of sublanguage use, dictionary characteristics, and sublanguage distinctiveness. Abstracts were automatically classified with a high degree of accuracy by using a formula that considers the degree of uniqueness of terms in each sublanguage. This may prove useful for text filtering of information retrieval systems
Source: Journal of the American Society for Information Science. 46(1995) no.7, S.519-529

Zhang, X.: Rough set theory based automatic text categorization (2005) 0.01
```
0.014757169 = product of:
  0.044271506 = sum of:
    0.044271506 = weight(_text_:science in 2822) [ClassicSimilarity], result of:
      0.044271506 = score(doc=2822,freq=4.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.3292649 = fieldWeight in 2822, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.0625 = fieldNorm(doc=2822)
  0.33333334 = coord(1/3)
```
Abstract: The research report "Rough Set Theory Based Automatic Text Categorization and the Handling of Semantic Heterogeneity" by Xueying Zhang has been published in book form in English. In her work, Zhang developed a method based on rough set theory that establishes relationships between subject headings from different vocabularies. She was a staff member of the IZ from 2003 to 2005 and has been Associate Professor at the Nanjing University of Science and Technology since October 2005.
Footnote: Nanjing University of Science and Technology, Diss.

Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01

0.013831474 = product of:
  0.04149442 = sum of:
    0.04149442 = product of:
      0.08298884 = sum of:
        0.08298884 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
          0.08298884 = score(doc=1046,freq=2.0), product of:
            0.17874686 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05104385 = queryNorm
            0.46428138 = fieldWeight in 1046, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 5. 5.2003 14:17:22

Subramanian, S.; Shafer, K.E.: Clustering (1998) 0.01

0.013043619 = product of:
  0.039130855 = sum of:
    0.039130855 = weight(_text_:science in 1103) [ClassicSimilarity], result of:
      0.039130855 = score(doc=1103,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.2910318 = fieldWeight in 1103, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.078125 = fieldNorm(doc=1103)
  0.33333334 = coord(1/3)

Abstract: This article presents our exploration of computer science clustering algorithms as they relate to the Scorpion system. Scorpion is a research project at OCLC that explores the indexing and cataloging of electronic resources. For a more complete description of Scorpion, please visit the Scorpion Web site at <http://purl.oclc.org/scorpion>

Shafer, K.E.: Evaluating Scorpion results (1998) 0.01

0.013043619 = product of:
  0.039130855 = sum of:
    0.039130855 = weight(_text_:science in 1569) [ClassicSimilarity], result of:
      0.039130855 = score(doc=1569,freq=2.0), product of:
        0.13445559 = queryWeight, product of:
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.05104385 = queryNorm
        0.2910318 = fieldWeight in 1569, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6341193 = idf(docFreq=8627, maxDocs=44218)
          0.078125 = fieldNorm(doc=1569)
  0.33333334 = coord(1/3)

Abstract: Scorpion is a research project at OCLC that builds tools for automatic subject assignment by combining library science and information retrieval techniques. A thesis of Scorpion is that the Dewey Decimal Classification (Dewey) can be used to perform automatic subject assignment for electronic items.
