Search (208 results, page 1 of 11)

  • theme_ss:"Computerlinguistik"
  • type_ss:"a"
  • year_i:[2000 TO 2010}
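The three filters above are standard Solr filter queries; the half-open range year_i:[2000 TO 2010} includes 2000 and excludes 2010. As a minimal sketch (not the catalogue's actual code), the request below shows how such a filtered search could be issued against a Solr endpoint; the host, core name ("biblio"), and the q value are assumptions, since the query terms behind the per-result scores are not shown on this page. The indented breakdown under each hit is Lucene's debugQuery/explain output; a sketch that recomputes one of those scores follows the results list.

    import requests

    # Hypothetical request reproducing this result page; the endpoint and core
    # name are assumptions, the fq values are copied from the filter list above.
    params = {
        "q": "*:*",                   # placeholder: the real query terms are not shown here
        "fq": [
            'theme_ss:"Computerlinguistik"',
            'type_ss:"a"',
            "year_i:[2000 TO 2010}",  # includes 2000, excludes 2010
        ],
        "rows": 20,                   # 20 hits per page
        "start": 0,                   # page 1 of 11
        "debugQuery": "true",         # ask Solr for per-document score explanations
        "wt": "json",
    }
    response = requests.get("http://localhost:8983/solr/biblio/select", params=params)
    print(response.json()["response"]["numFound"])   # this search reports 208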
  1. Rahmstorf, G.: Wortmodell und Begriffssprache als Basis des semantischen Retrievals (2000) 0.13
    0.1336039 = product of:
      0.23380682 = sum of:
        0.057160407 = weight(_text_:g in 5484) [ClassicSimilarity], result of:
          0.057160407 = score(doc=5484,freq=4.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.41080675 = fieldWeight in 5484, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5484)
        0.14211753 = weight(_text_:rahmstorf in 5484) [ClassicSimilarity], result of:
          0.14211753 = score(doc=5484,freq=2.0), product of:
            0.26091042 = queryWeight, product of:
              7.042927 = idf(docFreq=104, maxDocs=44218)
              0.03704574 = queryNorm
            0.54469854 = fieldWeight in 5484, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.042927 = idf(docFreq=104, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5484)
        0.030719671 = weight(_text_:u in 5484) [ClassicSimilarity], result of:
          0.030719671 = score(doc=5484,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.25324488 = fieldWeight in 5484, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5484)
        0.0038092136 = weight(_text_:a in 5484) [ClassicSimilarity], result of:
          0.0038092136 = score(doc=5484,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.089176424 = fieldWeight in 5484, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5484)
      0.5714286 = coord(4/7)
    
    Source
    Informationskompetenz - Basiskompetenz in der Informationsgesellschaft: Proceedings des 7. Internationalen Symposiums für Informationswissenschaft (ISI 2000), Hrsg.: G. Knorz u. R. Kuhlen
    Type
    a
  2. Rahmstorf, G.: Rückkehr von Ordnung in die Informationstechnik? (2000) 0.11
    0.11451764 = product of:
      0.20040585 = sum of:
        0.048994634 = weight(_text_:g in 5504) [ClassicSimilarity], result of:
          0.048994634 = score(doc=5504,freq=4.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.35212007 = fieldWeight in 5504, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=5504)
        0.12181503 = weight(_text_:rahmstorf in 5504) [ClassicSimilarity], result of:
          0.12181503 = score(doc=5504,freq=2.0), product of:
            0.26091042 = queryWeight, product of:
              7.042927 = idf(docFreq=104, maxDocs=44218)
              0.03704574 = queryNorm
            0.4668845 = fieldWeight in 5504, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.042927 = idf(docFreq=104, maxDocs=44218)
              0.046875 = fieldNorm(doc=5504)
        0.026331145 = weight(_text_:u in 5504) [ClassicSimilarity], result of:
          0.026331145 = score(doc=5504,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.21706703 = fieldWeight in 5504, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.046875 = fieldNorm(doc=5504)
        0.0032650405 = weight(_text_:a in 5504) [ClassicSimilarity], result of:
          0.0032650405 = score(doc=5504,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.07643694 = fieldWeight in 5504, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=5504)
      0.5714286 = coord(4/7)
    
    Source
    Information und Öffentlichkeit: 1. Gemeinsamer Kongress der Bundesvereinigung Deutscher Bibliotheksverbände e.V. (BDB) und der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI), Leipzig, 20.-23.3.2000. Zugleich 90. Deutscher Bibliothekartag, 52. Jahrestagung der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI). Hrsg.: G. Ruppelt u. H. Neißer
    Type
    a
  3. Kuhlmann, U.; Monnerjahn, P.: Sprache auf Knopfdruck : Sieben automatische Übersetzungsprogramme im Test (2000) 0.06
    0.05764547 = product of:
      0.100879565 = sum of:
        0.026456656 = product of:
          0.052913312 = sum of:
            0.052913312 = weight(_text_:p in 5428) [ClassicSimilarity], result of:
              0.052913312 = score(doc=5428,freq=2.0), product of:
                0.13319843 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.03704574 = queryNorm
                0.39725178 = fieldWeight in 5428, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5428)
          0.5 = coord(1/2)
        0.043885246 = weight(_text_:u in 5428) [ClassicSimilarity], result of:
          0.043885246 = score(doc=5428,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.3617784 = fieldWeight in 5428, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.078125 = fieldNorm(doc=5428)
        0.0054417336 = weight(_text_:a in 5428) [ClassicSimilarity], result of:
          0.0054417336 = score(doc=5428,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.12739488 = fieldWeight in 5428, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=5428)
        0.025095932 = product of:
          0.050191864 = sum of:
            0.050191864 = weight(_text_:22 in 5428) [ClassicSimilarity], result of:
              0.050191864 = score(doc=5428,freq=2.0), product of:
                0.12972787 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03704574 = queryNorm
                0.38690117 = fieldWeight in 5428, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5428)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Source
    c't. 2000, H.22, S.220-229
    Type
    a
  4. Hammwöhner, R.: TransRouter revisited : Decision support in the routing of translation projects (2000) 0.05
    0.05445891 = product of:
      0.09530309 = sum of:
        0.040418513 = weight(_text_:g in 5483) [ClassicSimilarity], result of:
          0.040418513 = score(doc=5483,freq=2.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.29048425 = fieldWeight in 5483, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5483)
        0.030719671 = weight(_text_:u in 5483) [ClassicSimilarity], result of:
          0.030719671 = score(doc=5483,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.25324488 = fieldWeight in 5483, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5483)
        0.0065977518 = weight(_text_:a in 5483) [ClassicSimilarity], result of:
          0.0065977518 = score(doc=5483,freq=6.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.1544581 = fieldWeight in 5483, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5483)
        0.017567152 = product of:
          0.035134304 = sum of:
            0.035134304 = weight(_text_:22 in 5483) [ClassicSimilarity], result of:
              0.035134304 = score(doc=5483,freq=2.0), product of:
                0.12972787 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03704574 = queryNorm
                0.2708308 = fieldWeight in 5483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5483)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Abstract
    This paper gives an outline of the final results of the TransRouter project. Within the scope of this project, a decision support system for translation managers has been developed, which will support the selection of appropriate routes for translation projects. In this paper emphasis is put on the decision model, which is based on a stepwise refined assessment of translation routes. The workflow of using this system is considered as well.
    Date
    10.12.2000 18:22:35
    Source
    Informationskompetenz - Basiskompetenz in der Informationsgesellschaft: Proceedings des 7. Internationalen Symposiums für Informationswissenschaft (ISI 2000), Hrsg.: G. Knorz u. R. Kuhlen
    Type
    a
  5. Hull, D.; Ait-Mokhtar, S.; Chuat, M.; Eisele, A.; Gaussier, E.; Grefenstette, G.; Isabelle, P.; Samulesson, C.; Segand, F.: Language technologies and patent search and classification (2001) 0.05
    0.04725934 = product of:
      0.11027179 = sum of:
        0.031747986 = product of:
          0.06349597 = sum of:
            0.06349597 = weight(_text_:p in 6318) [ClassicSimilarity], result of:
              0.06349597 = score(doc=6318,freq=2.0), product of:
                0.13319843 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.03704574 = queryNorm
                0.47670212 = fieldWeight in 6318, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6318)
          0.5 = coord(1/2)
        0.06928888 = weight(_text_:g in 6318) [ClassicSimilarity], result of:
          0.06928888 = score(doc=6318,freq=2.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.49797297 = fieldWeight in 6318, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.09375 = fieldNorm(doc=6318)
        0.0092349285 = weight(_text_:a in 6318) [ClassicSimilarity], result of:
          0.0092349285 = score(doc=6318,freq=4.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.2161963 = fieldWeight in 6318, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=6318)
      0.42857143 = coord(3/7)
    
    Type
    a
  6. Schneider, J.W.; Borlund, P.: A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.04
    0.041945275 = product of:
      0.07340423 = sum of:
        0.018519659 = product of:
          0.037039317 = sum of:
            0.037039317 = weight(_text_:p in 156) [ClassicSimilarity], result of:
              0.037039317 = score(doc=156,freq=2.0), product of:
                0.13319843 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.03704574 = queryNorm
                0.27807623 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.5 = coord(1/2)
        0.030719671 = weight(_text_:u in 156) [ClassicSimilarity], result of:
          0.030719671 = score(doc=156,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.25324488 = fieldWeight in 156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.0065977518 = weight(_text_:a in 156) [ClassicSimilarity], result of:
          0.0065977518 = score(doc=156,freq=6.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.1544581 = fieldWeight in 156, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.017567152 = product of:
          0.035134304 = sum of:
            0.035134304 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
              0.035134304 = score(doc=156,freq=2.0), product of:
                0.12972787 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03704574 = queryNorm
                0.2708308 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
    
    Abstract
    The present study investigates the ability of a bibliometric-based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
    Date
    8. 3.2007 19:55:22
    Source
    Context: nature, impact and role. 5th International Conference on Conceptions of Library and Information Sciences, CoLIS 2005 Glasgow, UK, June 2005. Ed. by F. Crestani u. I. Ruthven
    Type
    a
  7. Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität (2000) 0.03
    0.029311366 = product of:
      0.068393186 = sum of:
        0.031747986 = product of:
          0.06349597 = sum of:
            0.06349597 = weight(_text_:p in 5429) [ClassicSimilarity], result of:
              0.06349597 = score(doc=5429,freq=2.0), product of:
                0.13319843 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.03704574 = queryNorm
                0.47670212 = fieldWeight in 5429, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5429)
          0.5 = coord(1/2)
        0.006530081 = weight(_text_:a in 5429) [ClassicSimilarity], result of:
          0.006530081 = score(doc=5429,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.15287387 = fieldWeight in 5429, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=5429)
        0.030115116 = product of:
          0.060230233 = sum of:
            0.060230233 = weight(_text_:22 in 5429) [ClassicSimilarity], result of:
              0.060230233 = score(doc=5429,freq=2.0), product of:
                0.12972787 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03704574 = queryNorm
                0.46428138 = fieldWeight in 5429, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5429)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Source
    c't. 2000, H.22, S.230-231
    Type
    a
  8. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.03
    0.02926133 = product of:
      0.068276435 = sum of:
        0.03464444 = weight(_text_:g in 5480) [ClassicSimilarity], result of:
          0.03464444 = score(doc=5480,freq=2.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.24898648 = fieldWeight in 5480, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
        0.026331145 = weight(_text_:u in 5480) [ClassicSimilarity], result of:
          0.026331145 = score(doc=5480,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.21706703 = fieldWeight in 5480, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
        0.007300853 = weight(_text_:a in 5480) [ClassicSimilarity], result of:
          0.007300853 = score(doc=5480,freq=10.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.1709182 = fieldWeight in 5480, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=5480)
      0.42857143 = coord(3/7)
    
    Abstract
    (Automatic) document classification is generally defined as content-based assignment of one or more predefined categories to documents. Usually, machine learning, statistical pattern recognition, or neural network approaches are used to construct classifiers automatically. In this paper we thoroughly evaluate a wide variety of these methods on a document classification task for German text. We evaluate different feature construction and selection methods and various classifiers. Our main results are: (1) feature selection is necessary not only to reduce learning and classification time, but also to avoid overfitting (even for Support Vector Machines); (2) surprisingly, our morphological analysis does not improve classification quality compared to a letter 5-gram approach; (3) Support Vector Machines are significantly better than all other classification methods
    Source
    Informationskompetenz - Basiskompetenz in der Informationsgesellschaft: Proceedings des 7. Internationalen Symposiums für Informationswissenschaft (ISI 2000), Hrsg.: G. Knorz u. R. Kuhlen
    Type
    a
  9. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.03
    0.028164204 = product of:
      0.065716475 = sum of:
        0.04412884 = product of:
          0.17651536 = sum of:
            0.17651536 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.17651536 = score(doc=562,freq=2.0), product of:
                0.3140742 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03704574 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.25 = coord(1/4)
        0.006530081 = weight(_text_:a in 562) [ClassicSimilarity], result of:
          0.006530081 = score(doc=562,freq=8.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.15287387 = fieldWeight in 562, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.015057558 = product of:
          0.030115116 = sum of:
            0.030115116 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.030115116 = score(doc=562,freq=2.0), product of:
                0.12972787 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03704574 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    Document representations for text classification are typically based on the classical Bag-Of-Words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for actual classification. Experimental evaluations on two well-known text corpora support our approach through consistent improvement of the results.
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf
    Date
    8. 1.2013 10:22:32
    Type
    a
  10. Heyer, G.; Läuter, M.; Quasthoff, U.; Wolff, C.: Texttechnologische Anwendungen am Beispiel Text Mining (2000) 0.03
    0.027531698 = product of:
      0.06424063 = sum of:
        0.03464444 = weight(_text_:g in 5565) [ClassicSimilarity], result of:
          0.03464444 = score(doc=5565,freq=2.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.24898648 = fieldWeight in 5565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=5565)
        0.026331145 = weight(_text_:u in 5565) [ClassicSimilarity], result of:
          0.026331145 = score(doc=5565,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.21706703 = fieldWeight in 5565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.046875 = fieldNorm(doc=5565)
        0.0032650405 = weight(_text_:a in 5565) [ClassicSimilarity], result of:
          0.0032650405 = score(doc=5565,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.07643694 = fieldWeight in 5565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=5565)
      0.42857143 = coord(3/7)
    
    Type
    a
  11. Humphreys, K.; Demetriou, G.; Gaizauskas, R.: Bioinformatics applications of information extraction from scientific journal articles (2000) 0.03
    0.025272988 = product of:
      0.08845545 = sum of:
        0.080837026 = weight(_text_:g in 4545) [ClassicSimilarity], result of:
          0.080837026 = score(doc=4545,freq=2.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.5809685 = fieldWeight in 4545, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.109375 = fieldNorm(doc=4545)
        0.0076184273 = weight(_text_:a in 4545) [ClassicSimilarity], result of:
          0.0076184273 = score(doc=4545,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.17835285 = fieldWeight in 4545, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=4545)
      0.2857143 = coord(2/7)
    
    Type
    a
  12. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.02
    0.02436761 = product of:
      0.056857757 = sum of:
        0.018519659 = product of:
          0.037039317 = sum of:
            0.037039317 = weight(_text_:p in 1595) [ClassicSimilarity], result of:
              0.037039317 = score(doc=1595,freq=2.0), product of:
                0.13319843 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.03704574 = queryNorm
                0.27807623 = fieldWeight in 1595, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1595)
          0.5 = coord(1/2)
        0.030719671 = weight(_text_:u in 1595) [ClassicSimilarity], result of:
          0.030719671 = score(doc=1595,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.25324488 = fieldWeight in 1595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1595)
        0.0076184273 = weight(_text_:a in 1595) [ClassicSimilarity], result of:
          0.0076184273 = score(doc=1595,freq=8.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.17835285 = fieldWeight in 1595, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1595)
      0.42857143 = coord(3/7)
    
    Abstract
    This paper presents a method that exploits the hierarchical structure of an indexing vocabulary to guide the development and training of machine learning methods for automatic text categorization. We present the design of a hierarchical classifier based on the divide-and-conquer principle. The method is evaluated using backpropagation neural networks, as the machine learning algorithm, that learn to assign MeSH categories to a subset of MEDLINE records. Comparisons with the traditional Rocchio algorithm adapted for text categorization, as well as with flat neural network classifiers, are provided. The results indicate that the use of hierarchical structures improves performance significantly.
    Source
    Advances in classification research, vol.10: proceedings of the 10th ASIS SIG/CR Classification Research Workshop. Ed.: Albrechtsen, H. u. J.E. Mai
    Type
    a
  13. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.02
    0.023724519 = product of:
      0.05535721 = sum of:
        0.03464444 = weight(_text_:g in 4436) [ClassicSimilarity], result of:
          0.03464444 = score(doc=4436,freq=2.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.24898648 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.005655216 = weight(_text_:a in 4436) [ClassicSimilarity], result of:
          0.005655216 = score(doc=4436,freq=6.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.13239266 = fieldWeight in 4436, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.015057558 = product of:
          0.030115116 = sum of:
            0.030115116 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
              0.030115116 = score(doc=4436,freq=2.0), product of:
                0.12972787 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03704574 = queryNorm
                0.23214069 = fieldWeight in 4436, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4436)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between speed performance and translation performance, and in what form the translated result is presented. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
    Date
    16. 2.2000 14:22:39
    Type
    a
  14. Rapke, K.: Automatische Indexierung von Volltexten für die Gruner+Jahr Pressedatenbank (2001) 0.02
    0.02072969 = product of:
      0.07255392 = sum of:
        0.06928888 = weight(_text_:g in 6386) [ClassicSimilarity], result of:
          0.06928888 = score(doc=6386,freq=8.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.49797297 = fieldWeight in 6386, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.046875 = fieldNorm(doc=6386)
        0.0032650405 = weight(_text_:a in 6386) [ClassicSimilarity], result of:
          0.0032650405 = score(doc=6386,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.07643694 = fieldWeight in 6386, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=6386)
      0.2857143 = coord(2/7)
    
    Abstract
    Retrieval tests are the most widely recognized method for justifying new subject indexing procedures against traditional ones. As part of a diploma thesis, two fundamentally different systems for automatic subject indexing were tested and evaluated on the press database of the publishing house Gruner + Jahr (G+J), comparing natural-language retrieval with Boolean retrieval. The two systems are Autonomy, from Autonomy Inc., and DocCat, which IBM adapted to the database structure of the G+J press database. The former is a probabilistic system based on natural-language retrieval; DocCat, by contrast, is based on Boolean retrieval and is a learning system that indexes on the basis of an intellectually created training template. Methodologically, the evaluation starts from the real application context of G+J's text documentation. The tests are assessed from both statistical and qualitative points of view. One result of the tests is that DocCat shows some shortcomings compared with intellectual subject indexing that still need to be remedied, while Autonomy's natural-language retrieval is not usable in this setting and for the specific requirements of G+J's text documentation.
    Type
    a
  15. Rapke, K.: Automatische Indexierung von Volltexten für die Gruner+Jahr Pressedatenbank (2001) 0.02
    0.017274743 = product of:
      0.0604616 = sum of:
        0.057740733 = weight(_text_:g in 5863) [ClassicSimilarity], result of:
          0.057740733 = score(doc=5863,freq=8.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.4149775 = fieldWeight in 5863, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5863)
        0.0027208668 = weight(_text_:a in 5863) [ClassicSimilarity], result of:
          0.0027208668 = score(doc=5863,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.06369744 = fieldWeight in 5863, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5863)
      0.2857143 = coord(2/7)
    
    Abstract
    Retrieval tests are the most widely recognized method for justifying new subject indexing procedures against traditional ones. As part of a diploma thesis, two fundamentally different systems for automatic subject indexing were tested and evaluated on the press database of the publishing house Gruner + Jahr (G+J), comparing natural-language retrieval with Boolean retrieval. The two systems are Autonomy, from Autonomy Inc., and DocCat, which IBM adapted to the database structure of the G+J press database. The former is a probabilistic system based on natural-language retrieval; DocCat, by contrast, is based on Boolean retrieval and is a learning system that indexes on the basis of an intellectually created training template. Methodologically, the evaluation starts from the real application context of G+J's text documentation. The tests are assessed from both statistical and qualitative points of view. One result of the tests is that DocCat shows some shortcomings compared with intellectual subject indexing that still need to be remedied, while Autonomy's natural-language retrieval is not usable in this setting and for the specific requirements of G+J's text documentation.
    Type
    a
  16. Benoit, G.: Data discretization for novel relationship discovery in information retrieval (2002) 0.02
    0.015685532 = product of:
      0.05489936 = sum of:
        0.046192586 = weight(_text_:g in 5197) [ClassicSimilarity], result of:
          0.046192586 = score(doc=5197,freq=2.0), product of:
            0.13914184 = queryWeight, product of:
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.03704574 = queryNorm
            0.331982 = fieldWeight in 5197, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.7559474 = idf(docFreq=2809, maxDocs=44218)
              0.0625 = fieldNorm(doc=5197)
        0.008706774 = weight(_text_:a in 5197) [ClassicSimilarity], result of:
          0.008706774 = score(doc=5197,freq=8.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.20383182 = fieldWeight in 5197, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=5197)
      0.2857143 = coord(2/7)
    
    Abstract
    A sample of 600 Dialog and Swiss-Prot full-text records in genetics and molecular biology was parsed and term frequencies calculated to provide data for a test of Benoit's visualization model for retrieval. A retrieved set is displayed graphically, allowing for manipulation of document and concept relationships in real time, which hopefully will reveal unanticipated relationships.
    Type
    a
  17. Drouin, P.: Term extraction using non-technical corpora as a point of leverage (2003) 0.02
    0.01561254 = product of:
      0.054643888 = sum of:
        0.04233065 = product of:
          0.0846613 = sum of:
            0.0846613 = weight(_text_:p in 8797) [ClassicSimilarity], result of:
              0.0846613 = score(doc=8797,freq=2.0), product of:
                0.13319843 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.03704574 = queryNorm
                0.63560283 = fieldWeight in 8797, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.125 = fieldNorm(doc=8797)
          0.5 = coord(1/2)
        0.012313238 = weight(_text_:a in 8797) [ClassicSimilarity], result of:
          0.012313238 = score(doc=8797,freq=4.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.28826174 = fieldWeight in 8797, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.125 = fieldNorm(doc=8797)
      0.2857143 = coord(2/7)
    
    Type
    a
  18. Klein, A.; Weis, U.; Stede, M.: Der Einsatz von Sprachverarbeitungstools beim Sprachenlernen im Intranet (2000) 0.01
    0.014737436 = product of:
      0.05158102 = sum of:
        0.043885246 = weight(_text_:u in 5542) [ClassicSimilarity], result of:
          0.043885246 = score(doc=5542,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.3617784 = fieldWeight in 5542, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.078125 = fieldNorm(doc=5542)
        0.007695774 = weight(_text_:a in 5542) [ClassicSimilarity], result of:
          0.007695774 = score(doc=5542,freq=4.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.18016359 = fieldWeight in 5542, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=5542)
      0.2857143 = coord(2/7)
    
    Type
    a
  19. Sidhom, S.; Hassoun, M.: Morpho-syntactic parsing to text mining environment : NP recognition model to knowledge visualization and information (2003) 0.01
    0.014093423 = product of:
      0.04932698 = sum of:
        0.043885246 = weight(_text_:u in 3546) [ClassicSimilarity], result of:
          0.043885246 = score(doc=3546,freq=2.0), product of:
            0.121304214 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.03704574 = queryNorm
            0.3617784 = fieldWeight in 3546, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.078125 = fieldNorm(doc=3546)
        0.0054417336 = weight(_text_:a in 3546) [ClassicSimilarity], result of:
          0.0054417336 = score(doc=3546,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.12739488 = fieldWeight in 3546, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=3546)
      0.2857143 = coord(2/7)
    
    Source
    Tendencias de investigación en organización del conocimiento: IV Coloquio Internacional de Ciencias de la Documentación, VI Congreso del Capítulo Español de ISKO = Trends in knowledge organization research. Eds.: J.A. Frias u. C. Travieso
    Type
    a
  20. Ekmekcioglu, F.C.; Willett, P.: Effectiveness of stemming for Turkish text retrieval (2000) 0.01
    0.012759356 = product of:
      0.044657744 = sum of:
        0.037039317 = product of:
          0.074078634 = sum of:
            0.074078634 = weight(_text_:p in 5423) [ClassicSimilarity], result of:
              0.074078634 = score(doc=5423,freq=2.0), product of:
                0.13319843 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.03704574 = queryNorm
                0.55615246 = fieldWeight in 5423, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.109375 = fieldNorm(doc=5423)
          0.5 = coord(1/2)
        0.0076184273 = weight(_text_:a in 5423) [ClassicSimilarity], result of:
          0.0076184273 = score(doc=5423,freq=2.0), product of:
            0.04271548 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03704574 = queryNorm
            0.17835285 = fieldWeight in 5423, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=5423)
      0.2857143 = coord(2/7)
    
    Type
    a
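Each indented breakdown above is a Lucene ClassicSimilarity (TF-IDF) explanation: for every matching query term, weight = queryWeight * fieldWeight, with queryWeight = idf * queryNorm and fieldWeight = sqrt(termFreq) * idf * fieldNorm, and the document score is coord (matched terms / query terms) times the sum of the term weights. The sketch below is a plain reimplementation, not the system's own code; it recomputes the score of result 1 (doc 5484) from the numbers printed in its explanation.

    import math

    # Minimal sketch of Lucene ClassicSimilarity scoring, fed with the values
    # reported in the explanation of result 1 (doc 5484).
    def term_weight(term_freq, idf, query_norm, field_norm):
        """weight = queryWeight * fieldWeight for one matching query term."""
        tf = math.sqrt(term_freq)             # tf(freq) = sqrt(termFreq)
        query_weight = idf * query_norm       # e.g. 3.7559474 * 0.03704574 = 0.13914184
        field_weight = tf * idf * field_norm  # e.g. 2.0 * 3.7559474 * 0.0546875 = 0.41080675
        return query_weight * field_weight

    QUERY_NORM = 0.03704574                   # queryNorm shown in every explanation
    FIELD_NORM = 0.0546875                    # fieldNorm(doc=5484)
    terms = {                                 # term: (termFreq, idf) for doc 5484
        "g":         (4.0, 3.7559474),
        "rahmstorf": (2.0, 7.042927),
        "u":         (2.0, 3.2744443),
        "a":         (2.0, 1.153047),
    }

    weights = [term_weight(freq, idf, QUERY_NORM, FIELD_NORM)
               for freq, idf in terms.values()]
    score = (4 / 7) * sum(weights)            # coord(4/7): 4 of 7 query terms matched
    print(round(score, 7))                    # ~0.133604, matching the reported 0.1336039
                                              # up to Lucene's float32 rounding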

Authors

Languages

  • e 148
  • d 53
  • ru 5
  • slv 1