Search (205 results, page 1 of 11)

  • theme_ss:"Automatisches Klassifizieren"
  1. Wätjen, H.-J.; Diekmann, B.; Möller, G.; Carstensen, K.-U.: Bericht zum DFG-Projekt: GERHARD : German Harvest Automated Retrieval and Directory (1998) 0.04
    0.044547867 = product of:
      0.07795876 = sum of:
        0.025528383 = product of:
          0.051056765 = sum of:
            0.051056765 = weight(_text_:k in 3065) [ClassicSimilarity], result of:
              0.051056765 = score(doc=3065,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.39440846 = fieldWeight in 3065, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3065)
          0.5 = coord(1/2)
        0.0047360887 = weight(_text_:s in 3065) [ClassicSimilarity], result of:
          0.0047360887 = score(doc=3065,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.120123915 = fieldWeight in 3065, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.078125 = fieldNorm(doc=3065)
        0.042958204 = weight(_text_:u in 3065) [ClassicSimilarity], result of:
          0.042958204 = score(doc=3065,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.3617784 = fieldWeight in 3065, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.078125 = fieldNorm(doc=3065)
        0.0047360887 = weight(_text_:s in 3065) [ClassicSimilarity], result of:
          0.0047360887 = score(doc=3065,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.120123915 = fieldWeight in 3065, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.078125 = fieldNorm(doc=3065)
      0.5714286 = coord(4/7)
    
    Pages
    34 S
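The leaf entries of the explain tree above can be recomputed by hand. The sketch below assumes the standard Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm and fieldNorm directly from the listing. The function names are illustrative and not part of any library API.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Assumed ClassicSimilarity tf: square root of the term frequency
    return math.sqrt(freq)

def term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
    # weight = queryWeight * fieldWeight, as laid out in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight

# Values copied from the entry weight(_text_:k in 3065) of result 1
w_k = term_weight(freq=2.0, doc_freq=3384, max_docs=44218,
                  query_norm=0.03626318, field_norm=0.078125)
print(round(w_k, 6))
```

The listing's figures are single-precision floats, so a recomputation in double precision should agree only to about six significant digits.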
  2. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.04
    0.038731378 = product of:
      0.067779906 = sum of:
        0.043196652 = product of:
          0.17278661 = sum of:
            0.17278661 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.17278661 = score(doc=562,freq=2.0), product of:
                0.30743963 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03626318 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.25 = coord(1/4)
        0.004921888 = weight(_text_:s in 562) [ClassicSimilarity], result of:
          0.004921888 = score(doc=562,freq=6.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.124836445 = fieldWeight in 562, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.004921888 = weight(_text_:s in 562) [ClassicSimilarity], result of:
          0.004921888 = score(doc=562,freq=6.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.124836445 = fieldWeight in 562, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.01473948 = product of:
          0.02947896 = sum of:
            0.02947896 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.02947896 = score(doc=562,freq=2.0), product of:
                0.12698747 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03626318 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
    Pages
    S.331-334
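The way the clause scores of result 2 roll up into its final score can be checked with a few lines of arithmetic. This sketch assumes the structure shown in the explain tree: each nested group is its inner sum times its own coord factor, and the top level is the sum of all clauses times coord(4/7). The helper names are hypothetical; the numbers are copied from the listing.

```python
def coord(matched, total):
    # Coordination factor: fraction of query clauses that matched
    return matched / total

def combine(clause_scores, matched, total):
    # Top-level score = sum of clause scores * coord(matched/total)
    return sum(clause_scores) * coord(matched, total)

# Clause scores for doc 562 (result 2), taken from the explain tree
clauses = [
    0.17278661 * coord(1, 4),   # nested group for term "3a"
    0.004921888,                # term "s"
    0.004921888,                # term "s" (second matching clause)
    0.02947896 * coord(1, 2),   # nested group for term "22"
]
score = combine(clauses, matched=4, total=7)
print(round(score, 6))
```

The result should match the 0.038731378 reported at the top of the tree, up to single-precision rounding in the listed values.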
  3. Panyr, J.: Automatische Klassifikation und Information Retrieval : Anwendung und Entwicklung komplexer Verfahren in Information-Retrieval-Systemen und ihre Evaluierung (1986) 0.03
    0.026964193 = product of:
      0.06291645 = sum of:
        0.005683306 = weight(_text_:s in 32) [ClassicSimilarity], result of:
          0.005683306 = score(doc=32,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.14414869 = fieldWeight in 32, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=32)
        0.05154984 = weight(_text_:u in 32) [ClassicSimilarity], result of:
          0.05154984 = score(doc=32,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.43413407 = fieldWeight in 32, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.09375 = fieldNorm(doc=32)
        0.005683306 = weight(_text_:s in 32) [ClassicSimilarity], result of:
          0.005683306 = score(doc=32,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.14414869 = fieldWeight in 32, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=32)
      0.42857143 = coord(3/7)
    
    Footnote
    Also: dissertation, Univ. Saarbrücken, 1985
    Pages
    XII,416 S
  4. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.03
    0.026830094 = product of:
      0.04695266 = sum of:
        0.026529877 = product of:
          0.053059753 = sum of:
            0.053059753 = weight(_text_:k in 690) [ClassicSimilarity], result of:
              0.053059753 = score(doc=690,freq=6.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.40988132 = fieldWeight in 690, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.5 = coord(1/2)
        0.002841653 = weight(_text_:s in 690) [ClassicSimilarity], result of:
          0.002841653 = score(doc=690,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.072074346 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
        0.002841653 = weight(_text_:s in 690) [ClassicSimilarity], result of:
          0.002841653 = score(doc=690,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.072074346 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
        0.01473948 = product of:
          0.02947896 = sum of:
            0.02947896 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.02947896 = score(doc=690,freq=2.0), product of:
                0.12698747 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03626318 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Abstract
    We describe the latent semantic indexing subspace signature model (LSISSM) for semantic content representation of unstructured text. Grounded on singular value decomposition, the model represents terms and documents by the distribution signatures of their statistical contribution across the top-ranking latent concept dimensions. LSISSM matches term signatures with document signatures according to their mapping coherence between latent semantic indexing (LSI) term subspace and LSI document subspace. LSISSM does feature reduction and finds a low-rank approximation of scalable and sparse term-document matrices. Experiments demonstrate that this approach significantly improves the performance of major clustering algorithms such as standard K-means and self-organizing maps compared with the vector space model and the traditional LSI model. The unique contribution ranking mechanism in LSISSM also improves the initialization of standard K-means compared with random seeding procedure, which sometimes causes low efficiency and effectiveness of clustering. A two-stage initialization strategy based on LSISSM significantly reduces the running time of standard K-means procedures.
    Date
    23. 3.2013 13:22:36
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.4, S.844-860
  5. Reiner, U.: DDC-based search in the data of the German National Bibliography (2008) 0.03
    0.026728718 = product of:
      0.046775255 = sum of:
        0.01531703 = product of:
          0.03063406 = sum of:
            0.03063406 = weight(_text_:k in 2166) [ClassicSimilarity], result of:
              0.03063406 = score(doc=2166,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.23664509 = fieldWeight in 2166, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2166)
          0.5 = coord(1/2)
        0.002841653 = weight(_text_:s in 2166) [ClassicSimilarity], result of:
          0.002841653 = score(doc=2166,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.072074346 = fieldWeight in 2166, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=2166)
        0.02577492 = weight(_text_:u in 2166) [ClassicSimilarity], result of:
          0.02577492 = score(doc=2166,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.21706703 = fieldWeight in 2166, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.046875 = fieldNorm(doc=2166)
        0.002841653 = weight(_text_:s in 2166) [ClassicSimilarity], result of:
          0.002841653 = score(doc=2166,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.072074346 = fieldWeight in 2166, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=2166)
      0.5714286 = coord(4/7)
    
    Pages
    S.121-129
    Source
    New perspectives on subject indexing and classification: essays in honour of Magda Heiner-Freiling. Red.: K. Knull-Schlomann, u.a
  6. Sparck Jones, K.: Automatic classification (1976) 0.02
    0.024000384 = product of:
      0.056000896 = sum of:
        0.040845416 = product of:
          0.08169083 = sum of:
            0.08169083 = weight(_text_:k in 2908) [ClassicSimilarity], result of:
              0.08169083 = score(doc=2908,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.63105357 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.125 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
        0.0075777415 = weight(_text_:s in 2908) [ClassicSimilarity], result of:
          0.0075777415 = score(doc=2908,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.19219826 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.125 = fieldNorm(doc=2908)
        0.0075777415 = weight(_text_:s in 2908) [ClassicSimilarity], result of:
          0.0075777415 = score(doc=2908,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.19219826 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.125 = fieldNorm(doc=2908)
      0.42857143 = coord(3/7)
    
    Pages
    S.209-225
  7. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.02
    0.023826545 = product of:
      0.04169645 = sum of:
        0.01786987 = product of:
          0.03573974 = sum of:
            0.03573974 = weight(_text_:k in 2560) [ClassicSimilarity], result of:
              0.03573974 = score(doc=2560,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.27608594 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.5 = coord(1/2)
        0.003315262 = weight(_text_:s in 2560) [ClassicSimilarity], result of:
          0.003315262 = score(doc=2560,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.08408674 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
        0.003315262 = weight(_text_:s in 2560) [ClassicSimilarity], result of:
          0.003315262 = score(doc=2560,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.08408674 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
        0.01719606 = product of:
          0.03439212 = sum of:
            0.03439212 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.03439212 = score(doc=2560,freq=2.0), product of:
                0.12698747 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03626318 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Date
    22. 9.2008 18:31:54
    Source
    International cataloguing and bibliographic control. 36(2007) no.4, S.78-82
  8. Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.02
    0.023394935 = product of:
      0.040941134 = sum of:
        0.012764191 = product of:
          0.025528383 = sum of:
            0.025528383 = weight(_text_:k in 2300) [ClassicSimilarity], result of:
              0.025528383 = score(doc=2300,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.19720423 = fieldWeight in 2300, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2300)
          0.5 = coord(1/2)
        0.0033489203 = weight(_text_:s in 2300) [ClassicSimilarity], result of:
          0.0033489203 = score(doc=2300,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.08494043 = fieldWeight in 2300, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
        0.021479102 = weight(_text_:u in 2300) [ClassicSimilarity], result of:
          0.021479102 = score(doc=2300,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.1808892 = fieldWeight in 2300, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
        0.0033489203 = weight(_text_:s in 2300) [ClassicSimilarity], result of:
          0.0033489203 = score(doc=2300,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.08494043 = fieldWeight in 2300, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2300)
      0.5714286 = coord(4/7)
    
    Location
    S
    Pages
    S.163-175
    Source
    Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
  9. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.02
    0.019523047 = product of:
      0.045553777 = sum of:
        0.008037409 = weight(_text_:s in 1046) [ClassicSimilarity], result of:
          0.008037409 = score(doc=1046,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.20385705 = fieldWeight in 1046, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
        0.008037409 = weight(_text_:s in 1046) [ClassicSimilarity], result of:
          0.008037409 = score(doc=1046,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.20385705 = fieldWeight in 1046, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
        0.02947896 = product of:
          0.05895792 = sum of:
            0.05895792 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.05895792 = score(doc=1046,freq=2.0), product of:
                0.12698747 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03626318 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Date
    5. 5.2003 14:17:22
    Source
    Journal of library administration. 34(2001) nos.3/4, S.221-228
  10. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02
    0.019292573 = product of:
      0.067524 = sum of:
        0.042958204 = weight(_text_:u in 611) [ClassicSimilarity], result of:
          0.042958204 = score(doc=611,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.3617784 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
        0.024565801 = product of:
          0.049131602 = sum of:
            0.049131602 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.049131602 = score(doc=611,freq=2.0), product of:
                0.12698747 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03626318 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Date
    22. 8.2009 12:54:24
  11. Yu, W.; Gong, Y.: Document clustering by concept factorization (2004) 0.02
    0.01800029 = product of:
      0.042000674 = sum of:
        0.03063406 = product of:
          0.06126812 = sum of:
            0.06126812 = weight(_text_:k in 4084) [ClassicSimilarity], result of:
              0.06126812 = score(doc=4084,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.47329018 = fieldWeight in 4084, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4084)
          0.5 = coord(1/2)
        0.005683306 = weight(_text_:s in 4084) [ClassicSimilarity], result of:
          0.005683306 = score(doc=4084,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.14414869 = fieldWeight in 4084, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=4084)
        0.005683306 = weight(_text_:s in 4084) [ClassicSimilarity], result of:
          0.005683306 = score(doc=4084,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.14414869 = fieldWeight in 4084, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=4084)
      0.42857143 = coord(3/7)
    
    Pages
    S.202-209
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, u.a
  12. Schulze, U.: Erfahrungen bei der Anwendung automatischer Klassifizierungsverfahren zur Inhaltsanalyse einer Dokumentenmenge (1978) 0.02
    0.01797613 = product of:
      0.041944303 = sum of:
        0.0037888708 = weight(_text_:s in 83) [ClassicSimilarity], result of:
          0.0037888708 = score(doc=83,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.09609913 = fieldWeight in 83, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=83)
        0.034366563 = weight(_text_:u in 83) [ClassicSimilarity], result of:
          0.034366563 = score(doc=83,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.28942272 = fieldWeight in 83, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0625 = fieldNorm(doc=83)
        0.0037888708 = weight(_text_:s in 83) [ClassicSimilarity], result of:
          0.0037888708 = score(doc=83,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.09609913 = fieldWeight in 83, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=83)
      0.42857143 = coord(3/7)
    
    Pages
    S.166-185
  13. Pfister, J.: Clustering von Patent-Dokumenten am Beispiel der Datenbanken des Fachinformationszentrums Karlsruhe (2006) 0.02
    0.01797613 = product of:
      0.041944303 = sum of:
        0.0037888708 = weight(_text_:s in 5976) [ClassicSimilarity], result of:
          0.0037888708 = score(doc=5976,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.09609913 = fieldWeight in 5976, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=5976)
        0.034366563 = weight(_text_:u in 5976) [ClassicSimilarity], result of:
          0.034366563 = score(doc=5976,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.28942272 = fieldWeight in 5976, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0625 = fieldNorm(doc=5976)
        0.0037888708 = weight(_text_:s in 5976) [ClassicSimilarity], result of:
          0.0037888708 = score(doc=5976,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.09609913 = fieldWeight in 5976, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=5976)
      0.42857143 = coord(3/7)
    
    Pages
    S.129-146
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
  14. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.02
    0.015729114 = product of:
      0.036701266 = sum of:
        0.003315262 = weight(_text_:s in 1595) [ClassicSimilarity], result of:
          0.003315262 = score(doc=1595,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.08408674 = fieldWeight in 1595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1595)
        0.030070743 = weight(_text_:u in 1595) [ClassicSimilarity], result of:
          0.030070743 = score(doc=1595,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.25324488 = fieldWeight in 1595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1595)
        0.003315262 = weight(_text_:s in 1595) [ClassicSimilarity], result of:
          0.003315262 = score(doc=1595,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.08408674 = fieldWeight in 1595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1595)
      0.42857143 = coord(3/7)
    
    Pages
    S.107-124
    Source
    Advances in classification research, vol.10: proceedings of the 10th ASIS SIG/CR Classification Research Workshop. Ed.: Albrechtsen, H. u. J.E. Mai
  15. Schek, M.: Automatische Klassifizierung und Visualisierung im Archiv der Süddeutschen Zeitung (2005) 0.02
    0.015591754 = product of:
      0.027285568 = sum of:
        0.008934935 = product of:
          0.01786987 = sum of:
            0.01786987 = weight(_text_:k in 4884) [ClassicSimilarity], result of:
              0.01786987 = score(doc=4884,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.13804297 = fieldWeight in 4884, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4884)
          0.5 = coord(1/2)
        0.001657631 = weight(_text_:s in 4884) [ClassicSimilarity], result of:
          0.001657631 = score(doc=4884,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.04204337 = fieldWeight in 4884, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4884)
        0.015035371 = weight(_text_:u in 4884) [ClassicSimilarity], result of:
          0.015035371 = score(doc=4884,freq=2.0), product of:
            0.11874176 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03626318 = queryNorm
            0.12662244 = fieldWeight in 4884, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4884)
        0.001657631 = weight(_text_:s in 4884) [ClassicSimilarity], result of:
          0.001657631 = score(doc=4884,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.04204337 = fieldWeight in 4884, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4884)
      0.5714286 = coord(4/7)
    
    Object
    K-Infinity
    Source
    Medienwirtschaft. 2(2005) H.1, S.20-24
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  16. Shen, D.; Chen, Z.; Yang, Q.; Zeng, H.J.; Zhang, B.; Lu, Y.; Ma, W.Y.: Web page classification through summarization (2004) 0.02
    0.01500024 = product of:
      0.03500056 = sum of:
        0.025528383 = product of:
          0.051056765 = sum of:
            0.051056765 = weight(_text_:k in 4132) [ClassicSimilarity], result of:
              0.051056765 = score(doc=4132,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.39440846 = fieldWeight in 4132, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4132)
          0.5 = coord(1/2)
        0.0047360887 = weight(_text_:s in 4132) [ClassicSimilarity], result of:
          0.0047360887 = score(doc=4132,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.120123915 = fieldWeight in 4132, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.078125 = fieldNorm(doc=4132)
        0.0047360887 = weight(_text_:s in 4132) [ClassicSimilarity], result of:
          0.0047360887 = score(doc=4132,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.120123915 = fieldWeight in 4132, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.078125 = fieldNorm(doc=4132)
      0.42857143 = coord(3/7)
    
    Pages
    S.242-249
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, u.a
  17. Hu, G.; Zhou, S.; Guan, J.; Hu, X.: Towards effective document clustering : a constrained K-means based approach (2008) 0.01
    0.01484948 = product of:
      0.034648787 = sum of:
        0.02527181 = product of:
          0.05054362 = sum of:
            0.05054362 = weight(_text_:k in 2113) [ClassicSimilarity], result of:
              0.05054362 = score(doc=2113,freq=4.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.39044446 = fieldWeight in 2113, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2113)
          0.5 = coord(1/2)
        0.004688489 = weight(_text_:s in 2113) [ClassicSimilarity], result of:
          0.004688489 = score(doc=2113,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.118916616 = fieldWeight in 2113, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2113)
        0.004688489 = weight(_text_:s in 2113) [ClassicSimilarity], result of:
          0.004688489 = score(doc=2113,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.118916616 = fieldWeight in 2113, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2113)
      0.42857143 = coord(3/7)
    
    Abstract
    Document clustering is an important tool for document collection organization and browsing. In real applications, some limited knowledge about cluster membership of a small number of documents is often available, such as some pairs of documents belonging to the same cluster. This kind of prior knowledge can serve as constraints for the clustering process. We integrate the constraints into the trace formulation of the sum of squared Euclidean distances function of K-means. Then, the combined criterion function is transformed into trace maximization, which is further optimized by eigen-decomposition. Our experimental evaluation shows that the proposed semi-supervised clustering method can achieve better performance, compared to three existing methods.
    Source
    Information processing and management. 44(2008) no.4, S.1397-1409
  18. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01
    0.014587705 = product of:
      0.034037977 = sum of:
        0.0047360887 = weight(_text_:s in 2748) [ClassicSimilarity], result of:
          0.0047360887 = score(doc=2748,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.120123915 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
        0.0047360887 = weight(_text_:s in 2748) [ClassicSimilarity], result of:
          0.0047360887 = score(doc=2748,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.120123915 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
        0.024565801 = product of:
          0.049131602 = sum of:
            0.049131602 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.049131602 = score(doc=2748,freq=2.0), product of:
                0.12698747 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03626318 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Date
    1. 2.2016 18:25:22
    Pages
    S.64-75
  19. Khoo, C.S.G.; Ng, K.; Ou, S.: An exploratory study of human clustering of Web pages (2003) 0.01
    0.014511971 = product of:
      0.025395948 = sum of:
        0.010211354 = product of:
          0.020422708 = sum of:
            0.020422708 = weight(_text_:k in 2741) [ClassicSimilarity], result of:
              0.020422708 = score(doc=2741,freq=2.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.15776339 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.5 = coord(1/2)
        0.0026791363 = weight(_text_:s in 2741) [ClassicSimilarity], result of:
          0.0026791363 = score(doc=2741,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.06795235 = fieldWeight in 2741, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
        0.0026791363 = weight(_text_:s in 2741) [ClassicSimilarity], result of:
          0.0026791363 = score(doc=2741,freq=4.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.06795235 = fieldWeight in 2741, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
        0.00982632 = product of:
          0.01965264 = sum of:
            0.01965264 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
              0.01965264 = score(doc=2741,freq=2.0), product of:
                0.12698747 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03626318 = queryNorm
                0.15476047 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Date
    12. 9.2004 9:56:22
    Pages
    S.351-357
  20. Kwon, O.W.; Lee, J.H.: Text categorization based on k-nearest neighbor approach for web site classification (2003) 0.01
    0.014261868 = product of:
      0.03327769 = sum of:
        0.0285416 = product of:
          0.0570832 = sum of:
            0.0570832 = weight(_text_:k in 1070) [ClassicSimilarity], result of:
              0.0570832 = score(doc=1070,freq=10.0), product of:
                0.1294515 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03626318 = queryNorm
                0.44096208 = fieldWeight in 1070, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1070)
          0.5 = coord(1/2)
        0.0023680443 = weight(_text_:s in 1070) [ClassicSimilarity], result of:
          0.0023680443 = score(doc=1070,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.060061958 = fieldWeight in 1070, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1070)
        0.0023680443 = weight(_text_:s in 1070) [ClassicSimilarity], result of:
          0.0023680443 = score(doc=1070,freq=2.0), product of:
            0.03942669 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03626318 = queryNorm
            0.060061958 = fieldWeight in 1070, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1070)
      0.42857143 = coord(3/7)
    
    Abstract
    Automatic categorization is a viable method to deal with the scaling problem on the World Wide Web. For Web site classification, this paper proposes the use of Web pages linked with the home page in a different manner from the sole use of home pages in previous research. To implement our proposed method, we derive a scheme for Web site classification based on the k-nearest neighbor (k-NN) approach. It consists of three phases: Web page selection (connectivity analysis), Web page classification, and Web site classification. Given a Web site, the Web page selection chooses several representative Web pages using connectivity analysis. The k-NN classifier next classifies each of the selected Web pages. Finally, the classified Web pages are extended to a classification of the entire Web site. To improve performance, we supplement the k-NN approach with a feature selection method and a term weighting scheme using markup tags, and also reform its document-document similarity measure. In our experiments on a Korean commercial Web directory, the proposed system, using both a home page and its linked pages, improved the performance of micro-averaging breakeven point by 30.02%, compared with an ordinary classification which uses a home page only.
    Source
    Information processing and management. 39(2003) no.1, S.25-44
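
The score trees in this listing can be checked by hand. The following is a minimal Python sketch, not the search engine's own code, that recomputes the figures shown for entry 20 (doc=1070). It assumes the classic Lucene TF-IDF definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the values printed above; the function names `idf` and `term_score` are hypothetical helpers introduced only for this illustration. queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed classic formula; it reproduces the idf values in the listing,
    # e.g. docFreq=3384, maxDocs=44218 gives about 3.569778.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in each weight(...) subtree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Constants copied from the explain tree for doc=1070.
MAX_DOCS = 44218
QUERY_NORM = 0.03626318
FIELD_NORM = 0.0390625

k = term_score(10.0, 3384, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # _text_:k
s = term_score(2.0, 40523, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # _text_:s

# The k clause sits in a nested query with coord(1/2); the s clause is
# listed twice; the outer sum is scaled by coord(3/7).
inner_sum = k * (1 / 2) + s + s
total = inner_sum * (3 / 7)

print(f"k term: {k:.7f}")
print(f"s term: {s:.7f}")
print(f"total:  {total:.7f}")
```

Up to floating-point rounding (Lucene computes in 32-bit floats, this sketch in 64-bit), the total should agree with the 0.014261868 shown for entry 20. The same helper applies to the other entries once their freq, docFreq, fieldNorm, and coord values are substituted.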

Languages

  • e 161
  • d 41
  • a 1
  • chi 1

Types

  • a 178
  • el 19
  • x 9
  • m 4
  • r 4
  • s 2
  • d 1