Finn, A.; Kushmerick, N.: Learning to classify documents according to genre (2006)
- Abstract
- Current document-retrieval tools succeed in locating large numbers of documents relevant to a given query. While search results may be relevant according to the topic of the documents, it is more difficult to identify which of the relevant documents are most suitable for a particular user. Automatic genre analysis (i.e., the ability to distinguish documents according to style) would be a useful tool for identifying documents that are most suitable for a particular user. We investigate the use of machine learning for automatic genre classification. We introduce the idea of domain transfer (genre classifiers should be reusable across multiple topics), which does not arise in standard text classification. We investigate different features for building genre classifiers and their ability to transfer across multiple-topic domains. We also show how different feature-sets can be used in conjunction with each other to improve performance and reduce the number of documents that need to be labeled.
- Footnote
- Contribution to a special topic section on "Computational analysis of style"