Search (654 results, page 1 of 33)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.27

0.27286285 = sum of:
  0.08046506 = product of:
    0.24139518 = sum of:
      0.24139518 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
        0.24139518 = score(doc=562,freq=2.0), product of:
          0.429515 = queryWeight, product of:
            8.478011 = idf(docFreq=24, maxDocs=44218)
            0.05066224 = queryNorm
          0.56201804 = fieldWeight in 562, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            8.478011 = idf(docFreq=24, maxDocs=44218)
            0.046875 = fieldNorm(doc=562)
    0.33333334 = coord(1/3)
  0.19239777 = sum of:
    0.15121357 = weight(_text_:mining in 562) [ClassicSimilarity], result of:
      0.15121357 = score(doc=562,freq=4.0), product of:
        0.28585905 = queryWeight, product of:
          5.642448 = idf(docFreq=425, maxDocs=44218)
          0.05066224 = queryNorm
        0.5289795 = fieldWeight in 562, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          5.642448 = idf(docFreq=425, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.0411842 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
      0.0411842 = score(doc=562,freq=2.0), product of:
        0.17741053 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.05066224 = queryNorm
        0.23214069 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
Source: Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK
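
The score tree for this first hit can be reproduced by hand. The sketch below is a minimal illustration, assuming the formulas implied by the numbers in the listing (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names `classic_idf` and `term_score` are hypothetical helpers, not part of any library API.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as implied by the listing: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Per-term score = queryWeight * fieldWeight, as in the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explanation for doc 562.
MAX_DOCS = 44218
QUERY_NORM = 0.05066224
FIELD_NORM = 0.046875

score_3a = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_mining = term_score(4.0, 425, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "3a" clause is scaled by coord(1/3); the other two terms are summed as is.
total = score_3a * (1.0 / 3.0) + (score_mining + score_22)
print(round(total, 4))
```

The result should agree with the listed 0.27286285 up to rounding, since the listing appears to use single-precision arithmetic. The same two functions cover the other hits; only the frequencies, the fieldNorm, and the coord factors change.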

Sun, A.; Lim, E.-P.: Web unit-based mining of homepage relationships (2006) 0.09
```
0.09432595 = product of:
  0.1886519 = sum of:
    0.1886519 = sum of:
      0.15433173 = weight(_text_:mining in 5274) [ClassicSimilarity], result of:
        0.15433173 = score(doc=5274,freq=6.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.5398875 = fieldWeight in 5274, product of:
            2.4494898 = tf(freq=6.0), with freq of:
              6.0 = termFreq=6.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5274)
      0.034320172 = weight(_text_:22 in 5274) [ClassicSimilarity], result of:
        0.034320172 = score(doc=5274,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 5274, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5274)
  0.5 = coord(1/2)
```
Abstract

Homepages usually describe important semantic information about conceptual or physical entities; hence, they are the main targets for searching and browsing. To facilitate semantic-based information retrieval (IR) at a Web site, homepages can be identified and classified under some predefined concepts and these concepts are then used in query or browsing criteria, e.g., finding professor homepages containing information retrieval. In some Web sites, relationships may also exist among homepages. These relationship instances (also known as homepage relationships) enrich our knowledge about these Web sites and allow more expressive semantic-based IR. In this article, we investigate the features to be used in mining homepage relationships. We systematically develop different classes of inter-homepage features, namely, navigation, relative-location, and common-item features. We also propose deriving for each homepage a set of support pages to obtain richer and more complete content about the entity described by the homepage. The homepage together with its support pages are known to be a Web unit. By extracting inter-homepage features from Web units, our experiments on the WebKB dataset show that better homepage relationship mining accuracies can be achieved.

Date

22. 7.2006 16:18:25

Fong, A.C.M.: Mining a Web citation database for document clustering (2002) 0.09

0.088207915 = product of:
  0.17641583 = sum of:
    0.17641583 = product of:
      0.35283166 = sum of:
        0.35283166 = weight(_text_:mining in 3940) [ClassicSimilarity], result of:
          0.35283166 = score(doc=3940,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.2342855 = fieldWeight in 3940, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.109375 = fieldNorm(doc=3940)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Theme: Data Mining

Kulathuramaiyer, N.; Maurer, H.: Implications of emerging data mining (2009) 0.08
```
0.084530964 = product of:
  0.16906193 = sum of:
    0.16906193 = product of:
      0.33812386 = sum of:
        0.33812386 = weight(_text_:mining in 3144) [ClassicSimilarity], result of:
          0.33812386 = score(doc=3144,freq=20.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.1828341 = fieldWeight in 3144, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=3144)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Data Mining describes a technology that discovers non-trivial hidden patterns in a large collection of data. Although this technology has a tremendous impact on our lives, the invaluable contributions of this invisible technology often go unnoticed. This paper discusses advances in data mining while focusing on the emerging data mining capability. Such data mining applications perform multidimensional mining on a wide variety of heterogeneous data sources, providing solutions to many unresolved problems. This paper also highlights the advantages and disadvantages arising from the ever-expanding scope of data mining. Data Mining augments human intelligence by equipping us with a wealth of knowledge and by empowering us to perform our daily tasks better. As the mining scope and capacity increase, users and organizations become more willing to compromise privacy. The huge data stores of the 'master miners' allow them to gain deep insights into individual lifestyles and their social and behavioural patterns. Data integration and analysis capability of combining business and financial trends together with the ability to deterministically track market changes will drastically affect our lives.

Theme

Data Mining
Zhou, L.; Chaovalit, P.: Ontology-supported polarity mining (2008) 0.08
```
0.082510956 = product of:
  0.16502191 = sum of:
    0.16502191 = product of:
      0.33004382 = sum of:
        0.33004382 = weight(_text_:mining in 1343) [ClassicSimilarity], result of:
          0.33004382 = score(doc=1343,freq=14.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.1545684 = fieldWeight in 1343, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1343)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Polarity mining provides an in-depth analysis of semantic orientations of text information. Motivated by its success in the area of topic mining, we propose an ontology-supported polarity mining (OSPM) approach. The approach aims to enhance polarity mining with ontology by providing detailed topic-specific information. OSPM was evaluated in the movie review domain using both supervised and unsupervised techniques. Results revealed that OSPM outperformed the baseline method without ontology support. The findings of this study not only advance the state of polarity mining research but also shed light on future research directions.

Theme

Data Mining
Ku, L.-W.; Ho, H.-W.; Chen, H.-H.: Opinion mining and relationship discovery using CopeOpi opinion analysis system (2009) 0.08
```
0.080165744 = product of:
  0.16033149 = sum of:
    0.16033149 = sum of:
      0.12601131 = weight(_text_:mining in 2938) [ClassicSimilarity], result of:
        0.12601131 = score(doc=2938,freq=4.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.44081625 = fieldWeight in 2938, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2938)
      0.034320172 = weight(_text_:22 in 2938) [ClassicSimilarity], result of:
        0.034320172 = score(doc=2938,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 2938, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2938)
  0.5 = coord(1/2)
```
Abstract

We present CopeOpi, an opinion-analysis system, which extracts from the Web opinions about specific targets, summarizes the polarity and strength of these opinions, and tracks opinion variations over time. Objects that yield similar opinion tendencies over a certain time period may be correlated due to the latent causal events. CopeOpi discovers relationships among objects based on their opinion-tracking plots and collocations. Event bursts are detected from the tracking plots, and the strength of opinion relationships is determined by the coverage of these plots. To evaluate opinion mining, we use the NTCIR corpus annotated with opinion information at sentence and document levels. CopeOpi achieves sentence- and document-level f-measures of 62% and 74%. For relationship discovery, we collected 1.3M economics-related documents from 93 Web sources over 22 months, and analyzed collocation-based, opinion-based, and hybrid models. We consider as correlated company pairs that demonstrate similar stock-price variations, and selected these as the gold standard for evaluation. Results show that opinion-based and collocation-based models complement each other, and that integrated models perform the best. The top 25, 50, and 100 pairs discovered achieve precision rates of 1, 0.92, and 0.79, respectively.

Chen, S.Y.; Liu, X.: The contribution of data mining to information science : making sense of it all (2005) 0.08

0.075606786 = product of:
  0.15121357 = sum of:
    0.15121357 = product of:
      0.30242714 = sum of:
        0.30242714 = weight(_text_:mining in 4655) [ClassicSimilarity], result of:
          0.30242714 = score(doc=4655,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.057959 = fieldWeight in 4655, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.09375 = fieldNorm(doc=4655)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Theme: Data Mining

Toldo, L.; Rippmann, F.: Integrated bioinformatics application for automated target discovery (2005) 0.07

0.074054174 = product of:
  0.14810835 = sum of:
    0.14810835 = sum of:
      0.10692415 = weight(_text_:mining in 5260) [ClassicSimilarity], result of:
        0.10692415 = score(doc=5260,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.37404498 = fieldWeight in 5260, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.046875 = fieldNorm(doc=5260)
      0.0411842 = weight(_text_:22 in 5260) [ClassicSimilarity], result of:
        0.0411842 = score(doc=5260,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.23214069 = fieldWeight in 5260, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=5260)
  0.5 = coord(1/2)

Abstract: In this article we present an in silico method that automatically assigns putative functions to DNA sequences. The annotations are at an increasingly conceptual level, up to identifying general biomedical fields to which the sequences could contribute. This bioinformatics data-mining system makes substantial use of several resources: a locally stored MEDLINE® database; a manually built classification system; the MeSH® taxonomy; relational technology; and bioinformatics methods. Knowledge is generated from various data sources by using well-defined semantics, and by exploiting direct links between them. A two-dimensional Concept Map(TM) displays the knowledge graph, which allows causal connections to be followed. The use of this method has been valuable and has saved considerable time in our in-house projects, and can be generally exploited for any sequence-annotation or knowledge-condensation task.
Date: 22. 7.2006 14:31:06

Malaise, V.; Zweigenbaum, P.; Bachimont, B.: Mining defining contexts to help structuring differential ontologies (2005) 0.07

0.07128276 = product of:
  0.14256552 = sum of:
    0.14256552 = product of:
      0.28513104 = sum of:
        0.28513104 = weight(_text_:mining in 6598) [ClassicSimilarity], result of:
          0.28513104 = score(doc=6598,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.9974533 = fieldWeight in 6598, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.125 = fieldNorm(doc=6598)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Benoit, G.: Data mining (2002) 0.07
```
0.07072368 = product of:
  0.14144737 = sum of:
    0.14144737 = product of:
      0.28289473 = sum of:
        0.28289473 = weight(_text_:mining in 4296) [ClassicSimilarity], result of:
          0.28289473 = score(doc=4296,freq=14.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.9896301 = fieldWeight in 4296, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=4296)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Data mining (DM) is a multistaged process of extracting previously unanticipated knowledge from large databases, and applying the results to decision making. Data mining tools detect patterns from the data and infer associations and rules from them. The extracted information may then be applied to prediction or classification models by identifying relations within the data records or between databases. Those patterns and rules can then guide decision making and forecast the effects of those decisions. However, this definition may be applied equally to "knowledge discovery in databases" (KDD). Indeed, in the recent literature of DM and KDD, a source of confusion has emerged, making it difficult to determine the exact parameters of both. KDD is sometimes viewed as the broader discipline, of which data mining is merely a component, specifically pattern extraction, evaluation, and cleansing methods (Raghavan, Deogun, & Sever, 1998, p. 397). Thurasingham (1999, p. 2) remarked that "knowledge discovery," "pattern discovery," "data dredging," "information extraction," and "knowledge mining" are all employed as synonyms for DM. Trybula, in his ARIST chapter on text mining, observed that the "existing work [in KDD] is confusing because the terminology is inconsistent and poorly defined."

Theme

Data Mining
Perugini, S.; Ramakrishnan, N.: Mining Web functional dependencies for flexible information access (2007) 0.07
```
0.07072368 = product of:
  0.14144737 = sum of:
    0.14144737 = product of:
      0.28289473 = sum of:
        0.28289473 = weight(_text_:mining in 602) [ClassicSimilarity], result of:
          0.28289473 = score(doc=602,freq=14.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.9896301 = fieldWeight in 602, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=602)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

We present an approach to enhancing information access through Web structure mining in contrast to traditional approaches involving usage mining. Specifically, we mine the hardwired hierarchical hyperlink structure of Web sites to identify patterns of term-term co-occurrences we call Web functional dependencies (FDs). Intuitively, a Web FD x -> y declares that all paths through a site involving a hyperlink labeled x also contain a hyperlink labeled y. The complete set of FDs satisfied by a site help characterize (flexible and expressive) interaction paradigms supported by a site, where a paradigm is the set of explorable sequences therein. We describe algorithms for mining FDs and results from mining several hierarchical Web sites and present several interface designs that can exploit such FDs to provide compelling user experiences.

Footnote

Contribution to the special topic section "Mining Web resources for enhancing information retrieval"

Theme

Data Mining
Srinivasan, P.: Text mining in biomedicine : challenges and opportunities (2006) 0.07
```
0.07072368 = product of:
  0.14144737 = sum of:
    0.14144737 = product of:
      0.28289473 = sum of:
        0.28289473 = weight(_text_:mining in 1497) [ClassicSimilarity], result of:
          0.28289473 = score(doc=1497,freq=14.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.9896301 = fieldWeight in 1497, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=1497)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Text mining is about making serendipity more likely. Serendipity, the chance discovery of interesting ideas, has been responsible for many discoveries in science. Text mining systems strive to explore large text collections and to separate the potentially meaningful connections from a vast and mostly noisy background of random associations. In this paper we provide a summary of our text mining approach and also illustrate briefly some of the experiments we have conducted with this approach. In particular we use a profile-based text mining method. We have used these profiles to explore the global distribution of disease research, replicate discoveries made by others and propose new hypotheses. Text mining holds much potential that has yet to be tapped.

Theme

Data Mining
Haravu, L.J.; Neelameghan, A.: Text mining and data mining in knowledge organization and discovery : the making of knowledge-based products (2003) 0.07
```
0.06682759 = product of:
  0.13365518 = sum of:
    0.13365518 = product of:
      0.26731035 = sum of:
        0.26731035 = weight(_text_:mining in 5653) [ClassicSimilarity], result of:
          0.26731035 = score(doc=5653,freq=18.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.9351125 = fieldWeight in 5653, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5653)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Discusses the importance of knowledge organization in the context of the information overload caused by the vast quantities of data and information accessible on internal and external networks of an organization. Defines the characteristics of a knowledge-based product. Elaborates on the techniques and applications of text mining in developing knowledge products. Presents two approaches, as case studies, to the making of knowledge products: (1) steps and processes in the planning, designing and development of a composite multilingual multimedia CD product, with the potential international, inter-cultural end users in view, and (2) application of natural language processing software in text mining. Using text mining software, it is possible to link concept terms from a processed text to a related thesaurus, glossary, schedules of a classification scheme, and facet structured subject representations. Concludes that the products of text mining and data mining could be made more useful if the features of a faceted scheme for subject classification are incorporated into text mining techniques and products.

Theme

Data Mining

Hegna, K.; Murtomaa, E.: Data mining MARC to find : FRBR? (2003) 0.06

0.062372416 = product of:
  0.12474483 = sum of:
    0.12474483 = product of:
      0.24948967 = sum of:
        0.24948967 = weight(_text_:mining in 69) [ClassicSimilarity], result of:
          0.24948967 = score(doc=69,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.8727716 = fieldWeight in 69, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.109375 = fieldNorm(doc=69)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Budzik, J.; Hammond, K.J.; Birnbaum, L.: Information access in context (2001) 0.06

0.062372416 = product of:
  0.12474483 = sum of:
    0.12474483 = product of:
      0.24948967 = sum of:
        0.24948967 = weight(_text_:mining in 3835) [ClassicSimilarity], result of:
          0.24948967 = score(doc=3835,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.8727716 = fieldWeight in 3835, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.109375 = fieldNorm(doc=3835)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Theme: Data Mining

He, Y.; Hui, S.C.: Mining a web database for author cocitation analysis (2002) 0.06

0.062372416 = product of:
  0.12474483 = sum of:
    0.12474483 = product of:
      0.24948967 = sum of:
        0.24948967 = weight(_text_:mining in 2584) [ClassicSimilarity], result of:
          0.24948967 = score(doc=2584,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.8727716 = fieldWeight in 2584, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.109375 = fieldNorm(doc=2584)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Lauw, H.W.; Lim, E.-P.: Web social mining (2009) 0.06

0.062372416 = product of:
  0.12474483 = sum of:
    0.12474483 = product of:
      0.24948967 = sum of:
        0.24948967 = weight(_text_:mining in 3905) [ClassicSimilarity], result of:
          0.24948967 = score(doc=3905,freq=8.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.8727716 = fieldWeight in 3905, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3905)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: With increasing user presence in the Web and Web 2.0, Web social mining becomes an important and challenging task that finds a wide range of new applications relevant to e-commerce and social software. In this entry, we describe three Web social mining topics, namely, social network discovery, social network analysis, and social network applications. The essential concepts, models, and techniques of these Web social mining topics will be surveyed so as to establish the basic foundation for developing novel applications and for conducting research.

Bath, P.A.: Data mining in health and medical information (2003) 0.06

0.061732687 = product of:
  0.123465374 = sum of:
    0.123465374 = product of:
      0.24693075 = sum of:
        0.24693075 = weight(_text_:mining in 4263) [ClassicSimilarity], result of:
          0.24693075 = score(doc=4263,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.86381996 = fieldWeight in 4263, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0625 = fieldNorm(doc=4263)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Data mining (DM) is part of a process by which information can be extracted from data or databases and used to inform decision making in a variety of contexts (Benoit, 2002; Michalski, Bratka & Kubat, 1997). DM includes a range of tools and methods for extracting information; their use in the commercial sector for knowledge extraction and discovery has been one of the main driving forces in their development (Adriaans & Zantinge, 1996; Benoit, 2002). DM has been developed and applied in numerous areas. This review describes its use in analyzing health and medical information.
Theme: Data Mining

Rodríguez, A.; Carazo, J.M.; Trelles-Salazar, O.: Mining association rules from biological databases (2005) 0.06

0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 5261) [ClassicSimilarity], result of:
        0.08910345 = score(doc=5261,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 5261, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5261)
      0.034320172 = weight(_text_:22 in 5261) [ClassicSimilarity], result of:
        0.034320172 = score(doc=5261,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 5261, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5261)
  0.5 = coord(1/2)

Date: 22. 7.2006 14:34:29

Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.06
```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 5290) [ClassicSimilarity], result of:
        0.08910345 = score(doc=5290,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 5290, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5290)
      0.034320172 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
        0.034320172 = score(doc=5290,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 5290, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5290)
  0.5 = coord(1/2)
```
Abstract

Document keyphrases provide a concise summary of a document's content, offering semantic metadata summarizing a document. They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have keyphrases assigned by authors, and it is time-consuming and costly to manually assign keyphrases to documents, it is necessary to develop an algorithm to automatically generate keyphrases for documents. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human identified phrases to assign weights to the candidate keyphrases. The logic of our algorithm is: The more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding new identified keyphrases to the database. KIP's personalization feature will let the user build a glossary database specifically suitable for the area of his/her interest. The evaluation results show that KIP's performance is better than the systems we compared to and that the learning function is effective.

Date

22. 7.2006 17:25:48
