Search (22 results, page 1 of 2)

  • theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.09
    0.0883389 = product of:
      0.26501667 = sum of:
        0.22639212 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.22639212 = score(doc=562,freq=2.0), product of:
            0.40282002 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.047513504 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.03862454 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.03862454 = score(doc=562,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(2/6)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
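
    Note on the score trees: the breakdown shown for each hit appears to follow the classic TF-IDF formula, with tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the sum of term scores scaled by coord. The Python sketch below recomputes the figures listed for doc 562 under that assumption. The function names are illustrative only and are not part of any Lucene API; small differences in the last digits are expected because the engine works in single precision.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed classic idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Assumed classic tf: square root of the raw term frequency
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the tree for each term
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

def document_score(term_scores, matched, total):
    # coord(matched/total) scales the sum of the matching term scores
    return sum(term_scores) * (matched / total)

# Values copied from the tree for doc 562
QUERY_NORM = 0.047513504
MAX_DOCS = 44218
FIELD_NORM = 0.046875

score_3a = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)
doc_562 = document_score([score_3a, score_22], matched=2, total=6)

print(f"weight(3a) = {score_3a:.6f}")  # listing reports 0.22639212
print(f"weight(22) = {score_22:.6f}")  # listing reports 0.03862454
print(f"doc 562    = {doc_562:.6f}")   # listing reports 0.0883389
```

    The same functions apply to the single-term hits that follow: only freq, docFreq, fieldNorm, and the coord fraction change from one record to the next.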
  2. Yang, P.; Gao, W.; Tan, Q.; Wong, K.-F.: ¬A link-bridged topic model for cross-domain document classification (2013) 0.02
    0.01763814 = product of:
      0.10582883 = sum of:
        0.10582883 = weight(_text_:relationship in 2706) [ClassicSimilarity], result of:
          0.10582883 = score(doc=2706,freq=6.0), product of:
            0.2292412 = queryWeight, product of:
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.047513504 = queryNorm
            0.46164837 = fieldWeight in 2706, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2706)
      0.16666667 = coord(1/6)
    
    Abstract
    Transfer learning utilizes labeled data available from some related domain (source domain) for achieving effective knowledge transformation to the target domain. However, most state-of-the-art cross-domain classification methods treat documents as plain text and ignore the hyperlink (or citation) relationship existing among the documents. In this paper, we propose a novel cross-domain document classification approach called Link-Bridged Topic model (LBT). LBT consists of two key steps. Firstly, LBT utilizes an auxiliary link network to discover the direct or indirect co-citation relationship among documents by embedding the background knowledge into a graph kernel. The mined co-citation relationship is leveraged to bridge the gap across different domains. Secondly, LBT simultaneously combines the content information and link structures into a unified latent topic model. The model is based on an assumption that the documents of source and target domains share some common topics from the point of view of both content information and link structure. By mapping both domains' data into the latent topic spaces, LBT encodes the knowledge about domain commonality and difference as the shared topics with associated differential probabilities. The learned latent topics must be consistent with the source and target data, as well as content and link statistics. Then the shared topics act as the bridge to facilitate knowledge transfer from the source to the target domains. Experiments on different types of datasets show that our algorithm significantly improves the generalization performance of cross-domain document classification.
  3. Schaalje, G.B.; Blades, N.J.; Funai, T.: ¬An open-set size-adjusted Bayesian classifier for authorship attribution (2013) 0.02
    0.017281776 = product of:
      0.103690654 = sum of:
        0.103690654 = weight(_text_:relationship in 1041) [ClassicSimilarity], result of:
          0.103690654 = score(doc=1041,freq=4.0), product of:
            0.2292412 = queryWeight, product of:
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.047513504 = queryNorm
            0.45232117 = fieldWeight in 1041, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.046875 = fieldNorm(doc=1041)
      0.16666667 = coord(1/6)
    
    Abstract
    Recent studies of authorship attribution have used machine-learning methods including regularized multinomial logistic regression, neural nets, support vector machines, and the nearest shrunken centroid classifier to identify likely authors of disputed texts. These methods are all limited by an inability to perform open-set classification and account for text and corpus size. We propose a customized Bayesian logit-normal-beta-binomial classification model for supervised authorship attribution. The model is based on the beta-binomial distribution with an explicit inverse relationship between extra-binomial variation and text size. The model internally estimates the relationship of extra-binomial variation to text size, and uses Markov Chain Monte Carlo (MCMC) to produce distributions of posterior authorship probabilities instead of point estimates. We illustrate the method by training the machine-learning methods as well as the open-set Bayesian classifier on undisputed papers of The Federalist, and testing the method on documents historically attributed to Alexander Hamilton, John Jay, and James Madison. The Bayesian classifier was the best classifier of these texts.
  4. Huang, Y.-L.: ¬A theoretic and empirical research of cluster indexing for Mandarine Chinese full text document (1998) 0.01
    0.014256737 = product of:
      0.08554042 = sum of:
        0.08554042 = weight(_text_:relationship in 513) [ClassicSimilarity], result of:
          0.08554042 = score(doc=513,freq=2.0), product of:
            0.2292412 = queryWeight, product of:
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.047513504 = queryNorm
            0.3731459 = fieldWeight in 513, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.0546875 = fieldNorm(doc=513)
      0.16666667 = coord(1/6)
    
    Abstract
    Since most popular commercial systems for full text retrieval are designed around full text scanning and a Boolean logic query mode, these systems use an oversimplified relationship between the indexing form and the content of a document. Reports the use of Singular Value Decomposition (SVD) to develop a Cluster Indexing Model (CIM) based on a Vector Space Model (VSM) in order to explore the index theory of cluster indexing for Chinese full text documents. From a series of experiments, it was found that the indexing performance of CIM is better than that of the traditional VSM, and is almost as effective as authority control of index terms.
  5. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01
    0.012874847 = product of:
      0.07724908 = sum of:
        0.07724908 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
          0.07724908 = score(doc=1046,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.46428138 = fieldWeight in 1046, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
      0.16666667 = coord(1/6)
    
    Date
    5. 5.2003 14:17:22
  6. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.01
    0.01072904 = product of:
      0.06437424 = sum of:
        0.06437424 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.06437424 = score(doc=611,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.16666667 = coord(1/6)
    
    Date
    22. 8.2009 12:54:24
  7. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01
    0.01072904 = product of:
      0.06437424 = sum of:
        0.06437424 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
          0.06437424 = score(doc=2748,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.38690117 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.16666667 = coord(1/6)
    
    Date
    1. 2.2016 18:25:22
  8. Chung, Y.M.; Lee, J.Y.: ¬A corpus-based approach to comparative evaluation of statistical term association measures (2001) 0.01
    0.010183383 = product of:
      0.061100297 = sum of:
        0.061100297 = weight(_text_:relationship in 5769) [ClassicSimilarity], result of:
          0.061100297 = score(doc=5769,freq=2.0), product of:
            0.2292412 = queryWeight, product of:
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.047513504 = queryNorm
            0.26653278 = fieldWeight in 5769, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5769)
      0.16666667 = coord(1/6)
    
    Abstract
    Statistical association measures have been widely applied in information retrieval research, usually employing a clustering of documents or terms on the basis of their relationships. Applications of the association measures for term clustering include automatic thesaurus construction and query expansion. This research evaluates the similarity of six association measures by comparing the relationship and behavior they demonstrate in various analyses of a test corpus. Analysis techniques include comparisons of highly ranked term pairs and term clusters, analyses of the correlation among the association measures using Pearson's correlation coefficient and MDS mapping, and an analysis of the impact of a term frequency on the association values by means of z-score. The major findings of the study are as follows: First, the most similar association measures are mutual information and Yule's coefficient of colligation Y, whereas cosine and Jaccard coefficients, as well as X**2 statistic and likelihood ratio, demonstrate quite similar behavior for terms with high frequency. Second, among all the measures, the X**2 statistic is the least affected by the frequency of terms. Third, although cosine and Jaccard coefficients tend to emphasize high frequency terms, mutual information and Yule's Y seem to overestimate rare terms
  9. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.01
    0.0075103277 = product of:
      0.045061965 = sum of:
        0.045061965 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
          0.045061965 = score(doc=141,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.2708308 = fieldWeight in 141, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=141)
      0.16666667 = coord(1/6)
    
    Pages
    S.1-22
  10. Dubin, D.: Dimensions and discriminability (1998) 0.01
    0.0075103277 = product of:
      0.045061965 = sum of:
        0.045061965 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
          0.045061965 = score(doc=2338,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.2708308 = fieldWeight in 2338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2338)
      0.16666667 = coord(1/6)
    
    Date
    22. 9.1997 19:16:05
  11. Automatic classification research at OCLC (2002) 0.01
    0.0075103277 = product of:
      0.045061965 = sum of:
        0.045061965 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
          0.045061965 = score(doc=1563,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.2708308 = fieldWeight in 1563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1563)
      0.16666667 = coord(1/6)
    
    Date
    5. 5.2003 9:22:09
  12. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.01
    0.0075103277 = product of:
      0.045061965 = sum of:
        0.045061965 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
          0.045061965 = score(doc=1673,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.2708308 = fieldWeight in 1673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1673)
      0.16666667 = coord(1/6)
    
    Date
    1. 8.1996 22:08:06
  13. Yoon, Y.; Lee, C.; Lee, G.G.: ¬An effective procedure for constructing a hierarchical text classification system (2006) 0.01
    0.0075103277 = product of:
      0.045061965 = sum of:
        0.045061965 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
          0.045061965 = score(doc=5273,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.2708308 = fieldWeight in 5273, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5273)
      0.16666667 = coord(1/6)
    
    Date
    22. 7.2006 16:24:52
  14. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.01
    0.0075103277 = product of:
      0.045061965 = sum of:
        0.045061965 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
          0.045061965 = score(doc=2560,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.2708308 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
      0.16666667 = coord(1/6)
    
    Date
    22. 9.2008 18:31:54
  15. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.01
    0.0064374236 = product of:
      0.03862454 = sum of:
        0.03862454 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
          0.03862454 = score(doc=2760,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.23214069 = fieldWeight in 2760, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2760)
      0.16666667 = coord(1/6)
    
    Date
    22. 3.2009 19:11:54
  16. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01
    0.0064374236 = product of:
      0.03862454 = sum of:
        0.03862454 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
          0.03862454 = score(doc=3051,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.23214069 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
      0.16666667 = coord(1/6)
    
    Date
    22. 8.2009 19:51:28
  17. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01
    0.0064374236 = product of:
      0.03862454 = sum of:
        0.03862454 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
          0.03862454 = score(doc=690,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.23214069 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
      0.16666667 = coord(1/6)
    
    Date
    23. 3.2013 13:22:36
  18. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.01
    0.0064374236 = product of:
      0.03862454 = sum of:
        0.03862454 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
          0.03862454 = score(doc=2158,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.23214069 = fieldWeight in 2158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2158)
      0.16666667 = coord(1/6)
    
    Date
    4. 8.2015 19:22:04
  19. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01
    0.00536452 = product of:
      0.03218712 = sum of:
        0.03218712 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
          0.03218712 = score(doc=2765,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.19345059 = fieldWeight in 2765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2765)
      0.16666667 = coord(1/6)
    
    Date
    22. 3.2009 19:14:43
  20. Liu, R.-L.: ¬A passage extractor for classification of disease aspect information (2013) 0.01
    0.00536452 = product of:
      0.03218712 = sum of:
        0.03218712 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
          0.03218712 = score(doc=1107,freq=2.0), product of:
            0.16638419 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047513504 = queryNorm
            0.19345059 = fieldWeight in 1107, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1107)
      0.16666667 = coord(1/6)
    
    Date
    28.10.2013 19:22:57