Search (60 results, page 1 of 3)

  • theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.39
    0.38502043 = product of:
      0.6417007 = sum of:
        0.048722174 = product of:
          0.14616652 = sum of:
            0.14616652 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.14616652 = score(doc=562,freq=2.0), product of:
                0.26007444 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03067635 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.14616652 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.14616652 = score(doc=562,freq=2.0), product of:
            0.26007444 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03067635 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.14616652 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.14616652 = score(doc=562,freq=2.0), product of:
            0.26007444 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03067635 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.14616652 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.14616652 = score(doc=562,freq=2.0), product of:
            0.26007444 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03067635 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.14616652 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.14616652 = score(doc=562,freq=2.0), product of:
            0.26007444 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03067635 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.008312443 = product of:
          0.02493733 = sum of:
            0.02493733 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.02493733 = score(doc=562,freq=2.0), product of:
                0.10742335 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03067635 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.6 = coord(6/10)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
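    The score tree above is Lucene's explain() output under ClassicSimilarity, i.e. plain TF-IDF: tf(freq) = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, and each clause scores queryWeight × fieldWeight; coord(m/n) scales a sum by the fraction of query clauses that match. (The "_text_:3a" and "_text_:2f" terms are apparently tokens from the percent-encoded URL in the Content field.) A minimal sketch that reproduces the figures of entry 1 from these standard formulas; the helper names are illustrative, not Lucene API:

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def clause_score(freq: float, doc_freq: int, max_docs: int,
                 query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                                      # tf(freq) = sqrt(freq)
    query_weight = idf(doc_freq, max_docs) * query_norm       # queryWeight
    field_weight = tf * idf(doc_freq, max_docs) * field_norm  # fieldWeight
    return query_weight * field_weight                        # score per clause

# Entry 1, doc 562: the "_text_:3a"/"_text_:2f" clauses and the "_text_:22" clause.
s_2f = clause_score(2.0, 24, 44218, 0.03067635, 0.046875)    # ~0.1461665
s_22 = clause_score(2.0, 3622, 44218, 0.03067635, 0.046875)  # ~0.0249376

# "3a" and "22" each carry a coord(1/3); the outer sum of six matching
# clauses is scaled by coord(6/10) = 0.6 to give the document score.
total = (s_2f / 3 + 4 * s_2f + s_22 / 3) * 0.6
print(f"{total:.8f}")  # ~0.38502, matching the tree up to float32 rounding
```

    The same arithmetic reproduces every other score tree on this page, e.g. idf(docFreq=1325, maxDocs=44218) = 1 + ln(44218/1326) ≈ 4.506965 for the "1990" clauses in entries 2, 5, and 9.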
  2. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01
    0.008832636 = product of:
      0.088326365 = sum of:
        0.088326365 = product of:
          0.13248955 = sum of:
            0.08261489 = weight(_text_:1990 in 1046) [ClassicSimilarity], result of:
              0.08261489 = score(doc=1046,freq=2.0), product of:
                0.13825724 = queryWeight, product of:
                  4.506965 = idf(docFreq=1325, maxDocs=44218)
                  0.03067635 = queryNorm
                0.5975448 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.506965 = idf(docFreq=1325, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
            0.04987466 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.04987466 = score(doc=1046,freq=2.0), product of:
                0.10742335 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03067635 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.6666667 = coord(2/3)
      0.1 = coord(1/10)
    
    Date
    5. 5.2003 14:17:22
    Footnote
    Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part II
  3. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.00
    0.004910724 = product of:
      0.02455362 = sum of:
        0.017626584 = product of:
          0.052879747 = sum of:
            0.052879747 = weight(_text_:problem in 1107) [ClassicSimilarity], result of:
              0.052879747 = score(doc=1107,freq=6.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.4061259 = fieldWeight in 1107, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1107)
          0.33333334 = coord(1/3)
        0.0069270367 = product of:
          0.02078111 = sum of:
            0.02078111 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
              0.02078111 = score(doc=1107,freq=2.0), product of:
                0.10742335 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03067635 = queryNorm
                0.19345059 = fieldWeight in 1107, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1107)
          0.33333334 = coord(1/3)
      0.2 = coord(2/10)
    
    Abstract
    Retrieval of disease information is often based on several key aspects such as etiology, diagnosis, treatment, prevention, and symptoms of diseases. Automatic identification of disease aspect information is thus essential. In this article, I model the aspect identification problem as a text classification (TC) problem in which a disease aspect corresponds to a category. The disease aspect classification problem poses two challenges to classifiers: (a) a medical text often contains information about multiple aspects of a disease and hence produces noise for the classifiers and (b) text classifiers often cannot extract the textual parts (i.e., passages) about the categories of interest. I thus develop a technique, PETC (Passage Extractor for Text Classification), that extracts passages (from medical texts) for the underlying text classifiers to classify. Case studies on thousands of Chinese and English medical texts show that PETC enhances a support vector machine (SVM) classifier in classifying disease aspect information. PETC also performs better than three state-of-the-art classifier enhancement techniques, including two passage extraction techniques for text classifiers and a technique that employs term proximity information to enhance text classifiers. The contribution is of significance to evidence-based medicine, health education, and healthcare decision support. PETC can be used in those application domains in which a text to be classified may have several parts about different categories.
    Date
    28.10.2013 19:22:57
  4. Orwig, R.E.; Chen, H.; Nunamaker, J.F.: A graphical, self-organizing approach to classifying electronic meeting output (1997) 0.00
    0.002849479 = product of:
      0.028494788 = sum of:
        0.028494788 = product of:
          0.08548436 = sum of:
            0.08548436 = weight(_text_:problem in 6928) [ClassicSimilarity], result of:
              0.08548436 = score(doc=6928,freq=8.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.6565352 = fieldWeight in 6928, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6928)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Describes research in the application of a Kohonen Self-Organizing Map (SOM) to the problem of classification of electronic brainstorming output and an evaluation of the results. Describes an electronic meeting system and describes the classification problem that exists in the group problem solving process. Surveys the literature concerning classification. Describes the application of the Kohonen SOM to the meeting output classification problem. Describes an experiment that evaluated the classification performed by the Kohonen SOM by comparing it with those of a human expert and a Hopfield neural network. Discusses conclusions and directions for future research
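    The Kohonen SOM named here is compact enough to sketch. Below is a generic SOM trainer for document row-vectors (e.g., tf-idf vectors), not a reconstruction of Orwig et al.'s system; the grid size, decay schedules, and Gaussian neighborhood are illustrative assumptions:

```python
import numpy as np

def train_som(docs: np.ndarray, grid: int = 10, epochs: int = 50,
              lr0: float = 0.5, sigma0: float = 3.0, seed: int = 0) -> np.ndarray:
    """Fit a grid x grid Kohonen SOM to document vectors (rows of `docs`)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((grid, grid, docs.shape[1]))       # random codebook
    coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                                  indexing="ij"), axis=-1)  # unit grid positions
    for t in range(epochs):
        lr = lr0 * (1.0 - t / epochs)                # decaying learning rate
        sigma = sigma0 * (1.0 - t / epochs) + 1e-3   # shrinking neighborhood
        for x in docs[rng.permutation(len(docs))]:
            # best-matching unit: the codebook vector closest to the document
            bmu = np.unravel_index(
                np.argmin(np.linalg.norm(weights - x, axis=-1)), (grid, grid))
            # Gaussian neighborhood pulls the BMU and its grid neighbors toward x
            h = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                       / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
    return weights
```

    After training, each document is assigned to its best-matching unit; nearby units on the grid hold similar documents, which is what makes the map usable as a graphical classification of meeting output.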
  5. Shafer, K.E.: Automatic Subject Assignment via the Scorpion System (2001) 0.00
    0.0027538298 = product of:
      0.027538298 = sum of:
        0.027538298 = product of:
          0.08261489 = sum of:
            0.08261489 = weight(_text_:1990 in 1043) [ClassicSimilarity], result of:
              0.08261489 = score(doc=1043,freq=2.0), product of:
                0.13825724 = queryWeight, product of:
                  4.506965 = idf(docFreq=1325, maxDocs=44218)
                  0.03067635 = queryNorm
                0.5975448 = fieldWeight in 1043, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.506965 = idf(docFreq=1325, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1043)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Footnote
    Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part I
  6. Liu, R.-L.: Context-based term frequency assessment for text classification (2010) 0.00
    0.0024520915 = product of:
      0.024520915 = sum of:
        0.024520915 = product of:
          0.07356274 = sum of:
            0.07356274 = weight(_text_:2010 in 3331) [ClassicSimilarity], result of:
              0.07356274 = score(doc=3331,freq=5.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.5013491 = fieldWeight in 3331, product of:
                  2.236068 = tf(freq=5.0), with freq of:
                    5.0 = termFreq=5.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3331)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.2, S.300-309
    Year
    2010
  7. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.00
    0.0024520915 = product of:
      0.024520915 = sum of:
        0.024520915 = product of:
          0.07356274 = sum of:
            0.07356274 = weight(_text_:2010 in 3464) [ClassicSimilarity], result of:
              0.07356274 = score(doc=3464,freq=5.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.5013491 = fieldWeight in 3464, product of:
                  2.236068 = tf(freq=5.0), with freq of:
                    5.0 = termFreq=5.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.6, S.1105-1119
    Year
    2010
  8. Major, R.L.; Ragsdale, C.T.: An aggregation approach to the classification problem using multiple prediction experts (2000) 0.00
    0.0024424107 = product of:
      0.024424106 = sum of:
        0.024424106 = product of:
          0.07327232 = sum of:
            0.07327232 = weight(_text_:problem in 3789) [ClassicSimilarity], result of:
              0.07327232 = score(doc=3789,freq=2.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.5627445 = fieldWeight in 3789, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3789)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
  9. Shafer, K.E.: Evaluating Scorpion Results (2001) 0.00
    0.0022948582 = product of:
      0.022948582 = sum of:
        0.022948582 = product of:
          0.06884574 = sum of:
            0.06884574 = weight(_text_:1990 in 4085) [ClassicSimilarity], result of:
              0.06884574 = score(doc=4085,freq=2.0), product of:
                0.13825724 = queryWeight, product of:
                  4.506965 = idf(docFreq=1325, maxDocs=44218)
                  0.03067635 = queryNorm
                0.497954 = fieldWeight in 4085, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.506965 = idf(docFreq=1325, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4085)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Footnote
    Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part II
  10. Kishida, K.: High-speed rough clustering for very large document collections (2010) 0.00
    0.0020434097 = product of:
      0.020434096 = sum of:
        0.020434096 = product of:
          0.06130229 = sum of:
            0.06130229 = weight(_text_:2010 in 3463) [ClassicSimilarity], result of:
              0.06130229 = score(doc=3463,freq=5.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.41779095 = fieldWeight in 3463, product of:
                  2.236068 = tf(freq=5.0), with freq of:
                    5.0 = termFreq=5.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3463)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.6, S.1092-1104
    Year
    2010
  11. HaCohen-Kerner, Y.; Beck, H.; Yehudai, E.; Rosenstein, M.; Mughaz, D.: Cuisine : classification using stylistic feature sets and/or name-based feature sets (2010) 0.00
    0.0020434097 = product of:
      0.020434096 = sum of:
        0.020434096 = product of:
          0.06130229 = sum of:
            0.06130229 = weight(_text_:2010 in 3706) [ClassicSimilarity], result of:
              0.06130229 = score(doc=3706,freq=5.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.41779095 = fieldWeight in 3706, product of:
                  2.236068 = tf(freq=5.0), with freq of:
                    5.0 = termFreq=5.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3706)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.8, S.1644-1657
    Year
    2010
  12. Fagni, T.; Sebastiani, F.: Selecting negative examples for hierarchical text classification: An experimental comparison (2010) 0.00
    0.0020434097 = product of:
      0.020434096 = sum of:
        0.020434096 = product of:
          0.06130229 = sum of:
            0.06130229 = weight(_text_:2010 in 4101) [ClassicSimilarity], result of:
              0.06130229 = score(doc=4101,freq=5.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.41779095 = fieldWeight in 4101, product of:
                  2.236068 = tf(freq=5.0), with freq of:
                    5.0 = termFreq=5.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4101)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.11, S.2256-2265
    Year
    2010
  13. Groß, T.; Faden, M.: Automatische Indexierung elektronischer Dokumente an der Deutschen Zentralbibliothek für Wirtschaftswissenschaften : Bericht über die Jahrestagung der Internationalen Buchwissenschaftlichen Gesellschaft (2010) 0.00
    0.0019342359 = product of:
      0.01934236 = sum of:
        0.01934236 = product of:
          0.058027074 = sum of:
            0.058027074 = weight(_text_:2010 in 4051) [ClassicSimilarity], result of:
              0.058027074 = score(doc=4051,freq=7.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.39546952 = fieldWeight in 4051, product of:
                  2.6457512 = tf(freq=7.0), with freq of:
                    7.0 = termFreq=7.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4051)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    With the implementation and results evaluation, begun in early 2010, of the automatic indexing procedure "Decisiv Categorization" by the company Recommind, the information-structuring problem sketched here is to be solved in two steps. In the short to medium term, intellectual indexing is to be supported by a semi-automatic procedure. In the medium to long term, the machine procedure, building on appropriate training, is to be enabled both to index in-house documents fully automatically and to assign subject headings to, or classify, digital information resources from outside the ZBW, so that they can be made findable in a shared search space. Following this introduction, the first approaches to machine-based subject indexing at the ZBW (2001-2004) and their results and problems are presented. The framework conditions (project mandate and goal) for resuming the undertaking in 2009 are then described, followed by an account of how the Recommind technology works and of its use in the subject indexing of online documents with a thesaurus. The focus of this paper is then on ways of evaluating automatic indexing approaches, as well as the current results and central findings of the deployment in the ZBW context. The conclusion presents the inferences drawn from the results obtained and the outlook for further work.
    Source
    Bibliotheksdienst. 44(2010) H.12, S.1120-1135
    Year
    2010
  14. Li, T.; Zhu, S.; Ogihara, M.: Text categorization via generalized discriminant analysis (2008) 0.00
    0.0017626584 = product of:
      0.017626584 = sum of:
        0.017626584 = product of:
          0.052879747 = sum of:
            0.052879747 = weight(_text_:problem in 2119) [ClassicSimilarity], result of:
              0.052879747 = score(doc=2119,freq=6.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.4061259 = fieldWeight in 2119, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2119)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Text categorization is an important research area and has been receiving much attention due to the growth of on-line information and of the Internet. Automated text categorization is generally cast as a multi-class classification problem. Much previous work focused on binary document classification problems. Support vector machines (SVMs) excel in binary classification, but the elegant theory behind the large-margin hyperplane cannot be easily extended to multi-class text classification. In addition, training time and scaling are also important concerns. On the other hand, other techniques naturally extensible to handle multi-class classification are generally not as accurate as SVMs. This paper presents a simple and efficient solution to multi-class text categorization. Classification problems are first formulated as optimization via discriminant analysis. Text categorization is then cast as the problem of finding coordinate transformations that reflect the inherent similarity from the data. While most previous approaches decompose a multi-class classification problem into multiple independent binary classification tasks, the proposed approach enables direct multi-class classification. By using generalized singular value decomposition (GSVD), a coordinate transformation that reflects the inherent class structure indicated by the generalized singular values is identified. Extensive experiments demonstrate the efficiency and effectiveness of the proposed approach.
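    The binary decomposition that the abstract contrasts with direct multi-class classification is the standard one-vs-rest construction, which is easy to see in code. A short scikit-learn sketch of that baseline (the 20 Newsgroups dataset is an assumed stand-in, and this is the baseline, not the paper's GSVD method):

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# A k-class problem decomposed into k independent binary SVMs:
# each LinearSVC learns "class i vs. the rest".
train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")
clf = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LinearSVC()))
clf.fit(train.data, train.target)
print(clf.score(test.data, test.target))  # accuracy of the decomposed baseline
```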
  15. Krauth, J.: Evaluation von Verfahren der automatischen Klassifikation (1983) 0.00
    0.0016282737 = product of:
      0.016282737 = sum of:
        0.016282737 = product of:
          0.04884821 = sum of:
            0.04884821 = weight(_text_:problem in 111) [ClassicSimilarity], result of:
              0.04884821 = score(doc=111,freq=2.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.375163 = fieldWeight in 111, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0625 = fieldNorm(doc=111)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    An important problem in automatic classification is how to evaluate the results of classification procedures. This covers assessing the quality of classifications, comparing classifications, the validity of classifications, and the stability of classification procedures. An overview of the various approaches is given.
  16. Rose, J.R.; Gasteiger, J.: HORACE: an automatic system for the hierarchical classification of chemical reactions (1994) 0.00
    0.0014247395 = product of:
      0.014247394 = sum of:
        0.014247394 = product of:
          0.04274218 = sum of:
            0.04274218 = weight(_text_:problem in 7696) [ClassicSimilarity], result of:
              0.04274218 = score(doc=7696,freq=2.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.3282676 = fieldWeight in 7696, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7696)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Describes an automatic classification system for classifying chemical reactions. A detailed study of the classification of chemical reactions, based on topological and physicochemical features, is followed by an analysis of the hierarchical classification produced by the HORACE algorithm (Hierarchical Organization of Reactions through Attribute and Condition Eduction), which combines both approaches in a synergistic manner. The searching and updating of reaction hierarchies is demonstrated with the hierarchies produced for 2 data sets by the HORACE algorithm. Shows that reaction hierarchies provide an efficient access to reaction information and indicate the main reaction types for a given reaction scheme, define the scope of a reaction type, enable searchers to find unusual reactions, and can help in locating the reactions most relevant for a given problem
  17. Dang, E.K.F.; Luk, R.W.P.; Ho, K.S.; Chan, S.C.F.; Lee, D.L.: A new measure of clustering effectiveness : algorithms and experimental studies (2008) 0.00
    0.0014247395 = product of:
      0.014247394 = sum of:
        0.014247394 = product of:
          0.04274218 = sum of:
            0.04274218 = weight(_text_:problem in 1367) [ClassicSimilarity], result of:
              0.04274218 = score(doc=1367,freq=2.0), product of:
                0.1302053 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.03067635 = queryNorm
                0.3282676 = fieldWeight in 1367, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1367)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    We propose a new optimal clustering effectiveness measure, called CS1, based on a combination of clusters rather than selecting a single optimal cluster as in the traditional MK1 measure. For hierarchical clustering, we present an algorithm to compute CS1, defined by seeking the optimal combinations of disjoint clusters obtained by cutting the hierarchical structure at a certain similarity level. By reformulating the optimization to a 0-1 linear fractional programming problem, we demonstrate that an exact solution can be obtained by a linear time algorithm. We further discuss how our approach can be generalized to more general problems involving overlapping clusters, and we show how optimal estimates can be obtained by greedy algorithms.
  18. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.00
    0.0013854074 = product of:
      0.013854073 = sum of:
        0.013854073 = product of:
          0.04156222 = sum of:
            0.04156222 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.04156222 = score(doc=611,freq=2.0), product of:
                0.10742335 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03067635 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Date
    22. 8.2009 12:54:24
  19. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.00
    0.0013854074 = product of:
      0.013854073 = sum of:
        0.013854073 = product of:
          0.04156222 = sum of:
            0.04156222 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.04156222 = score(doc=2748,freq=2.0), product of:
                0.10742335 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03067635 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Date
    1. 2.2016 18:25:22
  20. Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.00
    0.0012923657 = product of:
      0.012923657 = sum of:
        0.012923657 = product of:
          0.03877097 = sum of:
            0.03877097 = weight(_text_:2010 in 5055) [ClassicSimilarity], result of:
              0.03877097 = score(doc=5055,freq=2.0), product of:
                0.14672957 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.03067635 = queryNorm
                0.2642342 = fieldWeight in 5055, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5055)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Abstract
    Distant supervision (DS) has the advantage of automatically generating large amounts of labelled training data and has been widely used for relation extraction. However, there are usually many wrong labels in the automatically labelled data in distant supervision (Riedel, Yao, & McCallum, 2010). This paper presents a novel method to reduce the wrong labels. The proposed method uses the semantic Jaccard with word embedding to measure the semantic similarity between the relation phrase in the knowledge base and the dependency phrases between two entities in a sentence to filter the wrong labels. In the process of reducing wrong labels, the semantic Jaccard algorithm selects a core dependency phrase to represent the candidate relation in a sentence, which can capture features for relation classification and avoid the negative impact from irrelevant term sequences that previous neural network models of relation extraction often suffer. In the process of relation classification, the core dependency phrases are also used as the input of a convolutional neural network (CNN) for relation classification. The experimental results show that compared with the methods using original DS data, the methods using filtered DS data performed much better in relation extraction. It indicates that the semantic similarity based method is effective in reducing wrong labels. The relation extraction performance of the CNN model using the core dependency phrases as input is the best of all, which indicates that using the core dependency phrases as input of CNN is enough to capture the features for relation classification and could avoid negative impact from irrelevant terms.
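    The abstract does not spell out the semantic Jaccard computation. One plausible reading, sketched under the assumption that an exact word match is relaxed to embedding cosine similarity above a threshold; the embedding table `emb` and threshold `tau` are hypothetical, not the paper's exact definition:

```python
import numpy as np

def cos(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_jaccard(a_words: list[str], b_words: list[str],
                     emb: dict[str, np.ndarray], tau: float = 0.8) -> float:
    """Soft Jaccard: a word "matches" the other phrase if it is identical
    to, or embedding-similar (cosine >= tau) to, some word there."""
    hits = sum(any(a == b or (a in emb and b in emb and
                              cos(emb[a], emb[b]) >= tau) for b in b_words)
               for a in a_words)
    union = len(a_words) + len(b_words) - hits
    return hits / union if union else 0.0

# Hypothetical use, per the abstract: keep a DS-labelled sentence only if the
# KB relation phrase and the sentence's core dependency phrase are similar.
# keep = semantic_jaccard(relation_phrase, dependency_phrase, emb) >= 0.5
```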

Languages

  • e 53
  • d 7

Types

  • a 55
  • el 5
  • r 1
  • x 1