Search (61 results, page 2 of 4)

Krellenstein, M.: Document classification at Northern Light (1999) 0.01

0.0128368875 = product of:
  0.03851066 = sum of:
    0.03851066 = product of:
      0.07702132 = sum of:
        0.07702132 = weight(_text_:management in 4435) [ClassicSimilarity], result of:
          0.07702132 = score(doc=4435,freq=2.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.44688427 = fieldWeight in 4435, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.09375 = fieldNorm(doc=4435)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Footnote: Paper presented at: Search engines and beyond: developing efficient knowledge management systems; 1999 Search Engine Meeting, Boston, MA, April 19-20, 1999
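The score tree above can be recomputed by hand. The sketch below is a minimal illustration, not the search engine's own code. It assumes the usual ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the numbers printed in the tree. The function names are hypothetical. All input values are copied from the explanation for doc 4435.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency (assumed form, consistent with the tree above).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq: float, doc_freq: int, max_docs: int,
                query_norm: float, field_norm: float) -> float:
    # weight = queryWeight * fieldWeight, as in the explain output.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight


# Values taken from the explanation for doc 4435, term "management".
weight = term_weight(freq=2.0, doc_freq=4130, max_docs=44218,
                     query_norm=0.051133685, field_norm=0.09375)

# Two coordination factors: 1 of 2 inner clauses and 1 of 3 outer clauses matched.
score = weight * (1 / 2) * (1 / 3)

print(f"idf    = {classic_idf(4130, 44218):.7f}")
print(f"weight = {weight:.8f}")
print(f"score  = {score:.10f}")
```

The printed values should agree with the tree up to single-precision rounding, since the engine computes in 32-bit floats while Python uses 64-bit floats. The same function covers the flatter trees further down (for example the "resources" matches), which apply only the outer coord(1/3) factor.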

Golub, K.: Automated subject classification of textual web documents (2006) 0.01
```
0.012546628 = product of:
  0.037639882 = sum of:
    0.037639882 = weight(_text_:resources in 5600) [ClassicSimilarity], result of:
      0.037639882 = score(doc=5600,freq=2.0), product of:
        0.18665522 = queryWeight, product of:
          3.650338 = idf(docFreq=3122, maxDocs=44218)
          0.051133685 = queryNorm
        0.20165458 = fieldWeight in 5600, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.650338 = idf(docFreq=3122, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5600)
  0.33333334 = coord(1/3)
```
Abstract

Purpose - To provide an integrated perspective to similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and point to problems with the approaches and automated classification as such. Design/methodology/approach - A range of works dealing with automated classification of full-text web documents are discussed. Explorations of individual approaches are given in the following sections: special features (description, differences, evaluation), application and characteristics of web pages. Findings - Provides major similarities and differences between the three approaches: document pre-processing and utilization of web-specific document characteristics is common to all the approaches; major differences are in applied algorithms, employment or not of the vector space model and of controlled vocabularies. Problems of automated classification are recognized. Research limitations/implications - The paper does not attempt to provide an exhaustive bibliography of related resources. Practical implications - As an integrated overview of approaches from different research communities with application examples, it is very useful for students in library and information science and computer science, as well as for practitioners. Researchers from one community have the information on how similar tasks are conducted in different communities. Originality/value - To the author's knowledge, no review paper on automated text classification attempted to discuss more than one community's approach from an integrated perspective.
Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.01
```
0.012546628 = product of:
  0.037639882 = sum of:
    0.037639882 = weight(_text_:resources in 2300) [ClassicSimilarity], result of:
      0.037639882 = score(doc=2300,freq=2.0), product of:
        0.18665522 = queryWeight, product of:
          3.650338 = idf(docFreq=3122, maxDocs=44218)
          0.051133685 = queryNorm
        0.20165458 = fieldWeight in 2300, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.650338 = idf(docFreq=3122, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2300)
  0.33333334 = coord(1/3)
```
Abstract

Subject terms play a crucial role in resource discovery but require substantial effort to produce. Automatic subject classification and indexing address problems of scale and sustainability and can be used to enrich existing bibliographic records, establish more connections across and between resources and enhance consistency of bibliographic data. The paper aims to put forward a complex methodological framework to evaluate automatic classification tools of Swedish textual documents based on the Dewey Decimal Classification (DDC) recently introduced to Swedish libraries. Three major complementary approaches are suggested: a quality-built gold standard, retrieval effects, domain analysis. The gold standard is built based on input from at least two catalogue librarians, end-users expert in the subject, end users inexperienced in the subject and automated tools. Retrieval effects are studied through a combination of assigned and free tasks, including factual and comprehensive types. The study also takes into consideration the different role and character of subject terms in various knowledge domains, such as scientific disciplines. As a theoretical framework, domain analysis is used and applied in relation to the implementation of DDC in Swedish libraries and chosen domains of knowledge within the DDC itself.
Golub, K.; Soergel, D.; Buchanan, G.; Tudhope, D.; Lykke, M.; Hiom, D.: A framework for evaluating automatic indexing or classification in the context of retrieval (2016) 0.01
```
0.012546628 = product of:
  0.037639882 = sum of:
    0.037639882 = weight(_text_:resources in 3311) [ClassicSimilarity], result of:
      0.037639882 = score(doc=3311,freq=2.0), product of:
        0.18665522 = queryWeight, product of:
          3.650338 = idf(docFreq=3122, maxDocs=44218)
          0.051133685 = queryNorm
        0.20165458 = fieldWeight in 3311, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.650338 = idf(docFreq=3122, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3311)
  0.33333334 = coord(1/3)
```
Abstract

Tools for automatic subject assignment help deal with scale and sustainability in creating and enriching metadata, establishing more connections across and between resources and enhancing consistency. Although some software vendors and experimental researchers claim the tools can replace manual subject indexing, hard scientific evidence of their performance in operating information environments is scarce. A major reason for this is that research is usually conducted in laboratory conditions, excluding the complexities of real-life systems and situations. The article reviews and discusses issues with existing evaluation approaches such as problems of aboutness and relevance assessments, implying the need to use more than a single "gold standard" method when evaluating indexing and retrieval, and proposes a comprehensive evaluation framework. The framework is informed by a systematic review of the literature on evaluation approaches: evaluating indexing quality directly through assessment by an evaluator or through comparison with a gold standard, evaluating the quality of computer-assisted indexing directly in the context of an indexing workflow, and evaluating indexing quality indirectly through analyzing retrieval performance.

Savic, D.: Designing an expert system for classifying office documents (1994) 0.01

0.012102734 = product of:
  0.036308203 = sum of:
    0.036308203 = product of:
      0.072616406 = sum of:
        0.072616406 = weight(_text_:management in 2655) [ClassicSimilarity], result of:
          0.072616406 = score(doc=2655,freq=4.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.42132655 = fieldWeight in 2655, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0625 = fieldNorm(doc=2655)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: Can records management benefit from artificial intelligence technology, in particular from expert systems? Gives an answer to this question by showing an example of a small-scale prototype project in automatic classification of office documents. Project methodology and basic elements of an expert system's approach are elaborated to give guidelines to potential users of this promising technology.
Source: Records management quarterly. 28(1994) no.3, pp.20-29

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.01

0.011546515 = product of:
  0.034639545 = sum of:
    0.034639545 = product of:
      0.06927909 = sum of:
        0.06927909 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.06927909 = score(doc=611,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 22. 8.2009 12:54:24

HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01

0.011546515 = product of:
  0.034639545 = sum of:
    0.034639545 = product of:
      0.06927909 = sum of:
        0.06927909 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
          0.06927909 = score(doc=2748,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.38690117 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 1. 2.2016 18:25:22

Guerrero-Bote, V.P.; Moya Anegón, F. de; Herrero Solana, V.: Document organization using Kohonen's algorithm (2002) 0.01

0.008557925 = product of:
  0.025673775 = sum of:
    0.025673775 = product of:
      0.05134755 = sum of:
        0.05134755 = weight(_text_:management in 2564) [ClassicSimilarity], result of:
          0.05134755 = score(doc=2564,freq=2.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.29792285 = fieldWeight in 2564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0625 = fieldNorm(doc=2564)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information processing and management. 38(2002) no.1, pp.79-89

Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.01

0.00808256 = product of:
  0.02424768 = sum of:
    0.02424768 = product of:
      0.04849536 = sum of:
        0.04849536 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
          0.04849536 = score(doc=141,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.2708308 = fieldWeight in 141, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=141)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Pages: pp.1-22

Automatic classification research at OCLC (2002) 0.01

0.00808256 = product of:
  0.02424768 = sum of:
    0.02424768 = product of:
      0.04849536 = sum of:
        0.04849536 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
          0.04849536 = score(doc=1563,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.2708308 = fieldWeight in 1563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1563)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 5. 5.2003 9:22:09

Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.01

0.00808256 = product of:
  0.02424768 = sum of:
    0.02424768 = product of:
      0.04849536 = sum of:
        0.04849536 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
          0.04849536 = score(doc=5273,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.2708308 = fieldWeight in 5273, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5273)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 22. 7.2006 16:24:52

Classification, automation, and new media : Proceedings of the 24th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Passau, March 15 - 17, 2000 (2002) 0.01
```
0.007564209 = product of:
  0.022692626 = sum of:
    0.022692626 = product of:
      0.045385253 = sum of:
        0.045385253 = weight(_text_:management in 5997) [ClassicSimilarity], result of:
          0.045385253 = score(doc=5997,freq=4.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.2633291 = fieldWeight in 5997, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5997)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Given the huge amount of information in the internet and in practically every domain of knowledge that we are facing today, knowledge discovery calls for automation. The book deals with methods from classification and data analysis that respond effectively to this rapidly growing challenge. The interested reader will find new methodological insights as well as applications in economics, management science, finance, and marketing, and in pattern recognition, biology, health, and archaeology.

Content

Data Analysis, Statistics, and Classification.- Pattern Recognition and Automation.- Data Mining, Information Processing, and Automation.- New Media, Web Mining, and Automation.- Applications in Management Science, Finance, and Marketing.- Applications in Medicine, Biology, Archaeology, and Others.- Author Index.- Subject Index.

Savic, D.: Automatic classification of office documents : review of available methods and techniques (1995) 0.01

0.0074881846 = product of:
  0.022464553 = sum of:
    0.022464553 = product of:
      0.044929106 = sum of:
        0.044929106 = weight(_text_:management in 2219) [ClassicSimilarity], result of:
          0.044929106 = score(doc=2219,freq=2.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.2606825 = fieldWeight in 2219, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2219)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Records management quarterly. 29(1995) no.4, pp.3-18

Losee, R.M.: Text windows and phrases differing by discipline, location in document, and syntactic structure (1996) 0.01

0.0074881846 = product of:
  0.022464553 = sum of:
    0.022464553 = product of:
      0.044929106 = sum of:
        0.044929106 = weight(_text_:management in 6962) [ClassicSimilarity], result of:
          0.044929106 = score(doc=6962,freq=2.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.2606825 = fieldWeight in 6962, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6962)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information processing and management. 32(1996) no.6, pp.747-767

Miyamoto, S.: Information clustering based on fuzzy multisets (2003) 0.01

0.0074881846 = product of:
  0.022464553 = sum of:
    0.022464553 = product of:
      0.044929106 = sum of:
        0.044929106 = weight(_text_:management in 1071) [ClassicSimilarity], result of:
          0.044929106 = score(doc=1071,freq=2.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.2606825 = fieldWeight in 1071, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1071)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information processing and management. 39(2003) no.2, pp.195-213

Hu, G.; Zhou, S.; Guan, J.; Hu, X.: Towards effective document clustering : a constrained K-means based approach (2008) 0.01

0.0074881846 = product of:
  0.022464553 = sum of:
    0.022464553 = product of:
      0.044929106 = sum of:
        0.044929106 = weight(_text_:management in 2113) [ClassicSimilarity], result of:
          0.044929106 = score(doc=2113,freq=2.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.2606825 = fieldWeight in 2113, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2113)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information processing and management. 44(2008) no.4, pp.1397-1409

Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01

0.0069279084 = product of:
  0.020783724 = sum of:
    0.020783724 = product of:
      0.04156745 = sum of:
        0.04156745 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
          0.04156745 = score(doc=3051,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.23214069 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 22. 8.2009 19:51:28

Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01

0.0069279084 = product of:
  0.020783724 = sum of:
    0.020783724 = product of:
      0.04156745 = sum of:
        0.04156745 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
          0.04156745 = score(doc=690,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.23214069 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 23. 3.2013 13:22:36

Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.01

0.0069279084 = product of:
  0.020783724 = sum of:
    0.020783724 = product of:
      0.04156745 = sum of:
        0.04156745 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
          0.04156745 = score(doc=2158,freq=2.0), product of:
            0.17906146 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051133685 = queryNorm
            0.23214069 = fieldWeight in 2158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2158)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 4. 8.2015 19:22:04

Wu, K.J.; Chen, M.-C.; Sun, Y.: Automatic topics discovery from hyperlinked documents (2004) 0.01

0.0064184438 = product of:
  0.01925533 = sum of:
    0.01925533 = product of:
      0.03851066 = sum of:
        0.03851066 = weight(_text_:management in 2563) [ClassicSimilarity], result of:
          0.03851066 = score(doc=2563,freq=2.0), product of:
            0.17235184 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051133685 = queryNorm
            0.22344214 = fieldWeight in 2563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.046875 = fieldNorm(doc=2563)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Information processing and management. 40(2004) no.2, pp.239-255
