Search (4439 results, page 1 of 222)

  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.18
    0.17518473 = product of:
      0.2627771 = sum of:
        0.07749085 = product of:
          0.23247255 = sum of:
            0.23247255 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.23247255 = score(doc=562,freq=2.0), product of:
                0.41363895 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04878962 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.18528622 = sum of:
          0.1456243 = weight(_text_:mining in 562) [ClassicSimilarity], result of:
            0.1456243 = score(doc=562,freq=4.0), product of:
              0.2752929 = queryWeight, product of:
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.04878962 = queryNorm
              0.5289795 = fieldWeight in 562, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
          0.03966192 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.03966192 = score(doc=562,freq=2.0), product of:
              0.17085294 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04878962 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
      0.6666667 = coord(2/3)
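The nested breakdown above is Lucene's ClassicSimilarity "explain" output: each leaf score is fieldWeight (tf × idf × fieldNorm) multiplied by queryWeight (idf × queryNorm), with tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)). A minimal sketch, assuming these standard ClassicSimilarity formulas, that reproduces the "mining" leaf for document 562:

```python
import math

def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one leaf of a Lucene ClassicSimilarity explain tree."""
    tf = math.sqrt(freq)                                # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))   # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                     # queryWeight
    field_weight = tf * idf * field_norm                # fieldWeight
    return query_weight * field_weight                  # score = queryWeight * fieldWeight

# Constants copied from the weight(_text_:mining in 562) leaf above:
score = classic_similarity_score(freq=4.0, doc_freq=425, max_docs=44218,
                                 query_norm=0.04878962, field_norm=0.046875)
print(round(score, 7))  # 0.1456243, matching the explain output
```

The coord() factors then scale each sum by the fraction of query clauses matched, e.g. coord(2/3) above.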
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
    Source
    Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK
  2. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.17
    
    Abstract
    Web mining aims to discover useful information and knowledge from the Web hyperlink structure, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining due to the semistructured and unstructured nature of the Web data and its heterogeneity. It has also developed many of its own algorithms and techniques. Liu has written a comprehensive text on Web data mining. Key topics of structure mining, content mining, and usage mining are covered both in breadth and in depth. His book brings together all the essential concepts and algorithms from related areas such as data mining, machine learning, and text processing to form an authoritative and coherent text. The book offers a rich blend of theory and practice, addressing seminal research ideas, as well as examining the technology from a practical point of view. It is suitable for students, researchers and practitioners interested in Web mining both as a learning text and a reference book. Lecturers can readily use it for classes on data mining, Web mining, and Web search. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
    RSWK
    World Wide Web / Data Mining
    Subject
    World Wide Web / Data Mining
    Theme
    Data Mining
  3. Arbelaitz, O.; Martínez-Otzeta, J.M.; Muguerza, J.: User modeling in a social network for cognitively disabled people (2016) 0.16
    
    Abstract
    Online communities are becoming an important tool in the communication and participation processes in our society. However, the most widespread applications are difficult to use for people with disabilities, or may involve some risks if no previous training has been undertaken. This work describes a novel social network for cognitively disabled people along with a clustering-based method for modeling the activity and socialization processes of its users in a noninvasive way. This closed social network, called Guremintza, is designed specifically for people with cognitive disabilities and provides the network administrators (e.g., social workers) with two types of reports: summary statistics of network usage and behavior patterns discovered by a data mining process. Experiments in an initial stage of the network show that the discovered patterns are meaningful to the social workers, who find them useful in monitoring the progress of the users.
    Date
    22. 1.2016 12:02:26
  4. Maniez, J.: ¬Des classifications aux thesaurus : du bon usage des facettes (1999) 0.16
    
    Date
    1. 8.1996 22:01:00
  5. Maniez, J.: ¬Du bon usage des facettes : des classifications aux thésaurus (1999) 0.16
    
    Date
    1. 8.1996 22:01:00
  6. Perugini, S.; Ramakrishnan, N.: Mining Web functional dependencies for flexible information access (2007) 0.16
    
    Abstract
    We present an approach to enhancing information access through Web structure mining in contrast to traditional approaches involving usage mining. Specifically, we mine the hardwired hierarchical hyperlink structure of Web sites to identify patterns of term-term co-occurrences we call Web functional dependencies (FDs). Intuitively, a Web FD x -> y declares that all paths through a site involving a hyperlink labeled x also contain a hyperlink labeled y. The complete set of FDs satisfied by a site help characterize (flexible and expressive) interaction paradigms supported by a site, where a paradigm is the set of explorable sequences therein. We describe algorithms for mining FDs and results from mining several hierarchical Web sites and present several interface designs that can exploit such FDs to provide compelling user experiences.
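The Web FD definition above (a site satisfies x -> y when every path containing a hyperlink labeled x also contains one labeled y) admits a naive check by scanning all paths for each label pair. A hedged sketch on invented path data, not the authors' mining algorithm:

```python
def mine_web_fds(paths):
    """Return all pairs (x, y), x != y, such that every path containing
    a hyperlink labeled x also contains one labeled y (a Web FD x -> y)."""
    labels = set().union(*map(set, paths)) if paths else set()
    path_sets = [set(p) for p in paths]
    fds = set()
    for x in labels:
        containing_x = [s for s in path_sets if x in s]
        for y in labels - {x}:
            if all(y in s for s in containing_x):
                fds.add((x, y))
    return fds

# Hypothetical root-to-leaf hyperlink-label paths through a site:
paths = [["products", "laptops", "specs"],
         ["products", "phones", "specs"],
         ["support", "contact"]]
fds = mine_web_fds(paths)
print(("laptops", "products") in fds)  # True: every path with "laptops" has "products"
```

The pairwise scan is quadratic in the number of labels; the paper's algorithms are presumably more refined, but the semantics checked here follow the definition quoted in the abstract.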
    Footnote
    Contribution to a special topic section "Mining Web resources for enhancing information retrieval"
    Theme
    Data Mining
  7. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.14
    
    Date
    2. 4.2000 18:01:22
    Theme
    Data Mining
  8. Nguyen, T.T.; Quan, T.T.; Phan, T.T.: Sentiment search : an emerging trend on social media monitoring systems (2014) 0.13
    
    Abstract
    Purpose - The purpose of this paper is to discuss sentiment search, which not only retrieves data related to submitted keywords but also identifies the sentiment opinion implied in the retrieved data and the subject targeted by this opinion. Design/methodology/approach - The authors propose a retrieval framework known as Cross-Domain Sentiment Search (CSS), which combines the usage of domain ontologies with specific linguistic rules to handle sentiment terms in textual data. The CSS framework also supports incrementally enriching domain ontologies when applied in new domains. Findings - The authors found that domain ontologies are extremely helpful when CSS is applied in specific domains. At the same time, the embedded linguistic rules allow CSS to achieve better performance than data mining techniques. Research limitations/implications - The approach has been applied initially in a real social monitoring system of a professional IT company and has thus proved able to handle real data acquired from social media channels such as electronic newspapers or social networks. Originality/value - The authors place aspect-based sentiment analysis in the context of semantic search and introduce the CSS framework for the whole sentiment search process. Formal definitions of the Sentiment Ontology and of aspect-based sentiment analysis are also presented, which distinguishes this work from related works.
    Date
    20. 1.2015 18:30:22
  9. Eckert, K.: ¬The ICE-map visualization (2011) 0.13
    
    Abstract
    In this paper, we describe in detail the Information Content Evaluation Map (ICE-Map Visualization, formerly referred to as IC Difference Analysis). The ICE-Map Visualization is a visual data mining approach for all kinds of concept hierarchies that uses statistics about concept usage to help a user in the evaluation and maintenance of the hierarchy. It consists of a statistical framework that employs the notion of information content from information theory, together with a visualization of the hierarchy and of the results of the statistical analysis by means of a treemap.
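The statistical core described above, information content derived from concept-usage statistics, can be sketched as IC(c) = -log2 p(c), with p(c) estimated from the usage of a concept and its descendants. A minimal illustration on a hypothetical two-level hierarchy; the exact estimator is an assumption, not necessarily the paper's:

```python
import math

def information_content(usage, children, root):
    """IC(c) = -log2 p(c), where p(c) is the usage of concept c plus all of
    its descendants, relative to the total usage under the root."""
    def subtree_usage(c):
        return usage.get(c, 0) + sum(subtree_usage(ch) for ch in children.get(c, ()))
    total = subtree_usage(root)
    return {c: -math.log2(subtree_usage(c) / total)
            for c in usage if subtree_usage(c) > 0}

# Hypothetical concept hierarchy with per-concept usage counts:
children = {"science": ["physics", "biology"]}
usage = {"science": 2, "physics": 5, "biology": 3}
ic = information_content(usage, children, "science")
print(ic["physics"])  # 1.0: "physics" covers half of all usage
```

Low IC marks heavily used (unsurprising) concepts, high IC rarely used ones; the treemap then colors each node by such a statistic.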
  10. Klein, H.: Web Content Mining (2004) 0.13
    
    Abstract
    Web Mining - a buzzword read and heard ever more often as the Internet spreads. Current research, however, is concerned mainly with the usage behavior of Internet users, and a glance at the programmes of relevant conferences (e.g. GOR - German Online Research) shows that the analysis of content is hardly a topic: two talks on the subject were given at GOR 1999, and not a single one at the follow-up conference in 2001. Web mining is the umbrella term for two types of mining: Web usage mining and Web content mining. Web usage mining means analyzing the data that accrues when the WWW is used and is logged by the servers; one can determine which pages were requested how often, how long visitors stayed on them, and much more. Web content mining examines the content of Web pages, which may comprise not only text but also images, video and audio. Software for analyzing Web pages exists in its essentials, but most Web pages must first be prepared for the respective analysis software. First, the relevant Web sites containing the sought content must be identified. This is usually done with search engines, of which there are now hundreds. One cannot assume, however, that search engines cover every existing Web page; that is impossible, since the rapid growth of the Internet adds thousands of pages daily while existing ones change or are deleted. Often it is also unknown how the search engines work, since this is among the operators' trade secrets. One must therefore assume that search engines cannot find all relevant Web sites. The next step is downloading the sites; software for this is available under names such as offline reader or Web spider.
    The goal of these programs is to download a Web site in a form that allows it to be viewed offline, generally preserving the site's structure. Anyone who wants to analyze the contents of a Web site must therefore be able to process all its files with the analysis software. Content-analysis software assumes that only textual information in a single file is processed; QDA software (qualitative data analysis), by contrast, also handles audio and video content as well as Internet-specific communication such as chats.
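The Web usage mining step described above (determining from server logs which pages were requested and how often) can be sketched as a small log parser. A hedged example on hypothetical Common-Log-Format lines:

```python
import re
from collections import Counter

LOG_RE = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "GET (\S+) HTTP/[\d.]+" (\d{3})')

def page_hits(log_lines):
    """Count successful (HTTP 200) page requests per URL path."""
    hits = Counter()
    for line in log_lines:
        m = LOG_RE.match(line)
        if m and m.group(2) == "200":
            hits[m.group(1)] += 1
    return hits

# Hypothetical server-log excerpt:
logs = [
    '127.0.0.1 - - [08/Jan/2013:10:22:32 +0100] "GET /index.html HTTP/1.1" 200 512',
    '127.0.0.1 - - [08/Jan/2013:10:22:40 +0100] "GET /about.html HTTP/1.1" 404 128',
    '127.0.0.1 - - [08/Jan/2013:10:23:01 +0100] "GET /index.html HTTP/1.1" 200 512',
]
print(page_hits(logs).most_common(1))  # [('/index.html', 2)]
```

Dwell-time estimation, also mentioned above, would additionally require grouping requests into sessions per visitor and differencing their timestamps.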
    Theme
    Data Mining
  11. Semantic applications (2018) 0.13
    
    Content
    Introduction.- Ontology Development.- Compliance using Metadata.- Variety Management for Big Data.- Text Mining in Economics.- Generation of Natural Language Texts.- Sentiment Analysis.- Building Concise Text Corpora from Web Contents.- Ontology-Based Modelling of Web Content.- Personalized Clinical Decision Support for Cancer Care.- Applications of Temporal Conceptual Semantic Systems.- Context-Aware Documentation in the Smart Factory.- Knowledge-Based Production Planning for Industry 4.0.- Information Exchange in Jurisdiction.- Supporting Automated License Clearing.- Managing cultural assets: Implementing typical cultural heritage archive's usage scenarios via Semantic Web technologies.- Semantic Applications for Process Management.- Domain-Specific Semantic Search Applications.
    LCSH
    Data mining
    Data Mining and Knowledge Discovery
    RSWK
    Data Mining
    Subject
    Data Mining
    Data mining
    Data Mining and Knowledge Discovery
  12. Jonkers, K.; Moya Anegon, F. de; Aguillo, I.F.: Measuring the usage of e-research infrastructure as an indicator of research activity (2012) 0.13
    
    Abstract
    This study combines Web usage mining, Web link analysis, and bibliometric methods for analyzing research activities in research organizations. It uses visits to the Expert Protein Analysis System (ExPASy) server, a virtual research infrastructure for bioinformatics, as a proxy for measuring bioinformatic research activity. The study finds that in the United Kingdom (UK), Germany, and Spain the number of visits to the ExPASy Web server made by research organizations is significantly positively correlated with research output in the field of biochemistry, molecular biology, and genetics. Only in the UK do we find a significant positive correlation between ExPASy visits per publication and the normalized impact of an organization's publications. The type of indicator developed in this study can be used to measure research activity in fields in which e-research has become important. In addition, it can be used for the evaluation of e-research infrastructures.
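The correlations reported above between server visits and publication output are plain bivariate correlations; a minimal Pearson-r sketch on invented per-organization counts (variable names and numbers are purely illustrative):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-organization data: ExPASy visits vs. publication counts
visits = [120, 300, 45, 80, 500]
papers = [10, 25, 5, 9, 40]
print(round(pearson(visits, papers), 3))
```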
  13. Guenther, R.S.: Using the Metadata Object Description Schema (MODS) for resource description : guidelines and applications (2004) 0.12
    
    Abstract
    This paper describes the Metadata Object Description Schema (MODS), its accompanying documentation, and some of its applications. It reviews the MODS user guidelines provided by the Library of Congress and how they enable a user of the schema to apply MODS consistently as a metadata scheme. Because the schema itself could not fully document appropriate usage, the guidelines provide element definitions, history, relationships with other elements, usage conventions, and examples. Short descriptions of some MODS applications are given, along with a more detailed discussion of its use in the Library of Congress's Minerva project for Web archiving.
    Source
    Library hi tech. 22(2004) no.1, S.89-98
  14. KDD : techniques and applications (1998) 0.12
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997
    Theme
    Data Mining
  15. Chau, M.; Lu, Y.; Fang, X.; Yang, C.C.: Characteristics of character usage in Chinese Web searching (2009) 0.12
    Abstract
    The use of non-English Web search engines has become prevalent. Given the popularity of Chinese Web searching and the unique characteristics of the Chinese language, it is imperative to conduct studies focusing on the analysis of Chinese Web search queries. In this paper, we report our research on the character usage of Chinese search logs from a Web search engine in Hong Kong. By examining the distribution of search query terms, we found that users tended to use more diversified terms and that the usage of characters in search queries was quite different from the character usage of general online information in Chinese. After studying the Zipf distribution of n-grams with different values of n, we found that the unigram curve is the most curved of all, while the bigram curve follows the Zipf distribution best, and that the curves of n-grams with larger n (n = 3-6) had similar structures, with α-values in the range of 0.66-0.86. The distribution of combined n-grams was also studied. All the analyses were performed on the data both before and after the removal of function terms and incomplete terms, and similar findings were revealed. We believe the findings from this study provide insights into further research in non-English Web searching and will assist in the design of more effective Chinese Web search engines.
    Date
    22.11.2008 17:57:22
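The n-gram analysis described in the abstract above can be sketched in a few lines: count character n-grams, rank them by frequency, and estimate the Zipf exponent as the negated slope of an ordinary least-squares fit on the log-log rank/frequency curve. This is a minimal sketch under stated assumptions: the function name is invented, and a real query-log study would need tokenization, function-term removal, and far more data.

```python
from collections import Counter
import math


def ngram_zipf_exponent(text, n):
    """Estimate the Zipf exponent of the character n-gram frequency
    distribution of `text` via an OLS fit in log-log space.
    (Assumes the text contains at least two distinct n-grams.)"""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    # Slope of the fitted log-log line; the Zipf exponent is its negation.
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope
```

Running this for n = 1 through 6 over a query log would reproduce the kind of per-n exponent comparison the study reports.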
  16. Lusti, M.: Data Warehousing and Data Mining : Eine Einführung in entscheidungsunterstützende Systeme (1999) 0.11
    Date
    17. 7.2002 19:22:06
    RSWK
    Data mining / Lehrbuch
    Subject
    Data mining / Lehrbuch
    Theme
    Data Mining
  17. Srinivasan, R.; Pepe, A.; Rodriguez, M.A.: ¬A clustering-based semi-automated technique to build cultural ontologies (2009) 0.11
    Abstract
    This article presents and validates a clustering-based method for creating cultural ontologies for community-oriented information systems. The introduced semiautomated approach merges distributed annotation techniques, or subjective assessments of similarities between cultural categories, with established clustering methods to produce cognate ontologies. This approach is validated against a locally authentic ethnographic method, involving direct work with communities for the design of fluid ontologies. The evaluation is conducted with a set of Native American communities located in San Diego County (CA, US). The principal aim of this research is to discover whether distributing the annotation process among isolated respondents would enable ontology hierarchies to be created that are similar to those that are crafted according to collaborative ethnographic processes, found to be effective in generating continuous usage across several studies. Our findings suggest that the proposed semiautomated solution best optimizes among issues of interoperability and scalability, deemphasized in the fluid ontology approach, and sustainable usage.
    Date
    22. 3.2009 18:02:06
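The clustering step behind a semi-automated approach like the one described above can be sketched with a naive average-linkage agglomerative clusterer: pairwise similarity judgments aggregated from respondents are merged bottom-up into a nested hierarchy. This is a minimal, pure-Python sketch; the labels and similarity values below are invented placeholders, not the article's actual categories or matrices.

```python
def agglomerate(labels, sim):
    """Average-linkage agglomerative clustering over a symmetric
    similarity matrix `sim`; returns a nested-tuple hierarchy."""
    # Each cluster is (subtree, member indices into the matrix).
    clusters = [(lab, [i]) for i, lab in enumerate(labels)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise similarity between the two clusters.
                s = sum(sim[i][j] for i in clusters[a][1]
                        for j in clusters[b][1])
                s /= len(clusters[a][1]) * len(clusters[b][1])
                if best is None or s > best[0]:
                    best = (s, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]
```

For example, with invented categories `["dance", "song", "tool"]` and a matrix in which "dance" and "song" are rated most similar, those two merge first and "tool" joins at the top, yielding a two-level hierarchy analogous to the ontology trees the article evaluates.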
  18. Barat, A.H.: Hungarians in the history of the UDC (2014) 0.11
    Abstract
    I outline a major segment of the history of the Universal Decimal Classification (UDC) in Hungary, together with the related important events and activities. Significant and committed specialists who played a prominent role at the national and international level are also mentioned. It is not an overstatement that the usage and publications of the UDC in Hungary are significant milestones in the international history of the UDC. The usage of the UDC has been very widespread, and it is found in different types of libraries. The people responsible for developing information retrieval systems and for the quality of these methods were very engaged and participated in international activities. There were several large libraries, such as special, academic, municipal and national libraries, where the UDC was employed from quite early on, and the leaders of these pioneering libraries travelled widely and were active in international research and practice.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  19. Compier, H.; Campbell, R.: ADONIS gathers momentum and faces some new problems (1995) 0.11
    Abstract
    Describes the change in the perception of the ADONIS project 14 years after its introduction. Outlines the original mission to use new technology to provide copies of copyright articles more efficiently, and to take the net efficiency gain as a usage/copyright fee. Details the present ADONIS service, whose mission is the same although the manner of achieving it has changed, providing a history of the last 10 years; lists recent developments and planned developments, and highlights pricing as the main problem facing ADONIS.
    Source
    Interlending and document supply. 23(1995) no.3, S.22-25
  20. Brooks, C.; Schickler, M.A.; Mazer, M.S.: Pan-browser support for annotations and other meta-information on the World Wide Web (1996) 0.11
    Abstract
    Describes an innovative approach for groups to create and share commentary about the content of documents accessible via the WWW. The system supports the creation, presentation, and control of user-created meta-information, which is displayed with the corresponding documents but stored separately from them. Describes design considerations, the system architecture, usage scenarios, initial implementations, and future work.
    Date
    1. 8.1996 22:08:06
