Search (15 results, page 1 of 1)

  • × language_ss:"e"
  • × theme_ss:"Data Mining"
  1. O'Brien, H.L.; Lebow, M.: Mixed-methods approach to measuring user experience in online news interactions (2013) 0.04
    0.03866488 = product of:
      0.07732976 = sum of:
        0.07732976 = product of:
          0.15465952 = sum of:
            0.15465952 = weight(_text_:news in 1001) [ClassicSimilarity], result of:
              0.15465952 = score(doc=1001,freq=8.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.57913023 = fieldWeight in 1001, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1001)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    When it comes to evaluating online information experiences, what metrics matter? We conducted a study in which 30 people browsed and selected content within an online news website. Data collected included psychometric scales (User Engagement, Cognitive Absorption, System Usability Scales), self-reported interest in news content, and performance metrics (i.e., reading time, browsing time, total time, number of pages visited, and use of recommended links); a subset of the participants had their physiological responses recorded during the interaction (i.e., heart rate, electrodermal activity, electrocmytogram). Findings demonstrated the concurrent validity of the psychometric scales and interest ratings and revealed that increased time on tasks, number of pages visited, and use of recommended links were not necessarily indicative of greater self-reported engagement, cognitive absorption, or perceived usability. Positive ratings of news content were associated with lower physiological activity. The implications of this research are twofold. First, we propose that user experience is a useful framework for studying online information interactions and will result in a broader conceptualization of information interaction and its evaluation. Second, we advocate a mixed-methods approach to measurement that employs a suite of metrics capable of capturing the pragmatic (e.g., usability) and hedonic (e.g., fun, engagement) aspects of information interactions. We underscore the importance of using multiple measures in information research, because our results emphasize that performance and physiological data must be interpreted in the context of users' subjective experiences.
  2. Wei, C.-P.; Lee, Y.-H.; Chiang, Y.-S.; Chen, C.-T.; Yang, C.C.C.: Exploiting temporal characteristics of features for effectively discovering event episodes from news corpora (2014) 0.04
    0.03866488 = product of:
      0.07732976 = sum of:
        0.07732976 = product of:
          0.15465952 = sum of:
            0.15465952 = weight(_text_:news in 1225) [ClassicSimilarity], result of:
              0.15465952 = score(doc=1225,freq=8.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.57913023 = fieldWeight in 1225, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1225)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    An organization performing environmental scanning generally monitors or tracks various events concerning its external environment. One of the major resources for environmental scanning is online news documents, which are readily accessible on news websites or infomediaries. However, the proliferation of the World Wide Web, which increases information sources and improves information circulation, has vastly expanded the amount of information to be scanned. Thus, it is essential to develop an effective event episode discovery mechanism to organize news documents pertaining to an event of interest. In this study, we propose two new metrics, Term Frequency × Inverse Document FrequencyTempo (TF×IDFTempo) and TF×Enhanced-IDFTempo, and develop a temporal-based event episode discovery (TEED) technique that uses the proposed metrics for feature selection and document representation. Using a traditional TF×IDF-based hierarchical agglomerative clustering technique as a performance benchmark, our empirical evaluation reveals that the proposed TEED technique outperforms its benchmark, as measured by cluster recall and cluster precision. In addition, the use of TF×Enhanced-IDFTempo significantly improves the effectiveness of event episode discovery when compared with the use of TF×IDFTempo.
  3. Lam, W.; Yang, C.C.; Menczer, F.: Introduction to the special topic section on mining Web resources for enhancing information retrieval (2007) 0.03
    0.027065417 = product of:
      0.054130834 = sum of:
        0.054130834 = product of:
          0.10826167 = sum of:
            0.10826167 = weight(_text_:news in 600) [ClassicSimilarity], result of:
              0.10826167 = score(doc=600,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.40539116 = fieldWeight in 600, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=600)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The amount of information on the Web has been expanding at an enormous pace. There are a variety of Web documents in different genres, such as news, reports, reviews. Traditionally, the information displayed on Web sites has been static. Recently, there are many Web sites offering content that is dynamically generated and frequently updated. It is also common for Web sites to contain information in different languages since many countries adopt more than one language. Moreover, content may exist in multimedia formats including text, images, video, and audio.
  4. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.02
    0.02415974 = product of:
      0.04831948 = sum of:
        0.04831948 = product of:
          0.09663896 = sum of:
            0.09663896 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.09663896 = score(doc=4577,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    2. 4.2000 18:01:22
  5. Pons-Porrata, A.; Berlanga-Llavori, R.; Ruiz-Shulcloper, J.: Topic discovery based on text mining techniques (2007) 0.02
    0.023198929 = product of:
      0.046397857 = sum of:
        0.046397857 = product of:
          0.092795715 = sum of:
            0.092795715 = weight(_text_:news in 916) [ClassicSimilarity], result of:
              0.092795715 = score(doc=916,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.34747815 = fieldWeight in 916, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.046875 = fieldNorm(doc=916)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this paper, we present a topic discovery system aimed to reveal the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topic/subtopics, where each topic contains the set of documents that are related to it and a summary extracted from these documents. Summaries so built are useful to browse and select topics of interest from the generated hierarchies. Our proposal consists of a new incremental hierarchical clustering algorithm, which combines both partitional and agglomerative approaches, taking the main benefits from them. Finally, a new summarization method based on Testor Theory has been proposed to build the topic summaries. Experimental results in the TDT2 collection demonstrate its usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool.
  6. KDD : techniques and applications (1998) 0.02
    0.020708349 = product of:
      0.041416697 = sum of:
        0.041416697 = product of:
          0.082833394 = sum of:
            0.082833394 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
              0.082833394 = score(doc=6783,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.46428138 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held Singapore, 22-23 Feb 1997
  7. Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.02
    0.01933244 = product of:
      0.03866488 = sum of:
        0.03866488 = product of:
          0.07732976 = sum of:
            0.07732976 = weight(_text_:news in 605) [ClassicSimilarity], result of:
              0.07732976 = score(doc=605,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.28956512 = fieldWeight in 605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=605)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Documents discussing public affairs, common themes, interesting products, and so on, are reported and distributed on the Web. Positive and negative opinions embedded in documents are useful references and feedbacks for governments to improve their services, for companies to market their products, and for customers to purchase their objects. Web opinion mining aims to extract, summarize, and track various aspects of subjective information on the Web. Mining subjective information enables traditional information retrieval (IR) systems to retrieve more data from human viewpoints and provide information with finer granularity. Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarities. Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and the nonsupportive evidence. Opinion tracking captures subjective information from various genres and monitors the developments of opinions from spatial and temporal dimensions. To demonstrate and evaluate the proposed opinion mining algorithms, news and bloggers' articles are adopted. Documents in the evaluation corpora are tagged in different granularities from words, sentences to documents. In the experiments, positive and negative sentiment words and their weights are mined on the basis of Chinese word structures. The f-measure is 73.18% and 63.75% for verbs and nouns, respectively. Utilizing the sentiment words mined together with topical words, we achieve f-measure 62.16% at the sentence level and 74.37% at the document level.
  8. Wang, W.M.; Cheung, C.F.; Lee, W.B.; Kwok, S.K.: Mining knowledge from natural language texts using fuzzy associated concept mapping (2008) 0.02
    0.015465952 = product of:
      0.030931905 = sum of:
        0.030931905 = product of:
          0.06186381 = sum of:
            0.06186381 = weight(_text_:news in 2121) [ClassicSimilarity], result of:
              0.06186381 = score(doc=2121,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2316521 = fieldWeight in 2121, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2121)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Natural Language Processing (NLP) techniques have been successfully used to automatically extract information from unstructured text through a detailed analysis of their content, often to satisfy particular information needs. In this paper, an automatic concept map construction technique, Fuzzy Association Concept Mapping (FACM), is proposed for the conversion of abstracted short texts into concept maps. The approach consists of a linguistic module and a recommendation module. The linguistic module is a text mining method that does not require the use to have any prior knowledge about using NLP techniques. It incorporates rule-based reasoning (RBR) and case based reasoning (CBR) for anaphoric resolution. It aims at extracting the propositions in text so as to construct a concept map automatically. The recommendation module is arrived at by adopting fuzzy set theories. It is an interactive process which provides suggestions of propositions for further human refinement of the automatically generated concept maps. The suggested propositions are relationships among the concepts which are not explicitly found in the paragraphs. This technique helps to stimulate individual reflection and generate new knowledge. Evaluation was carried out by using the Science Citation Index (SCI) abstract database and CNET News as test data, which are well known databases and the quality of the text is assured. Experimental results show that the automatically generated concept maps conform to the outputs generated manually by domain experts, since the degree of difference between them is proportionally small. The method provides users with the ability to convert scientific and short texts into a structured format which can be easily processed by computer. Moreover, it provides knowledge workers with extra time to re-think their written text and to view their knowledge from another angle.
  9. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.01
    0.013805566 = product of:
      0.027611133 = sum of:
        0.027611133 = product of:
          0.055222265 = sum of:
            0.055222265 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.055222265 = score(doc=1737,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22.11.1998 18:57:22
  10. Amir, A.; Feldman, R.; Kashi, R.: ¬A new and versatile method for association generation (1997) 0.01
    0.013805566 = product of:
      0.027611133 = sum of:
        0.027611133 = product of:
          0.055222265 = sum of:
            0.055222265 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.055222265 = score(doc=1270,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  11. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.01
    0.01207987 = product of:
      0.02415974 = sum of:
        0.02415974 = product of:
          0.04831948 = sum of:
            0.04831948 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.04831948 = score(doc=2908,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  12. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01
    0.008628479 = product of:
      0.017256958 = sum of:
        0.017256958 = product of:
          0.034513917 = sum of:
            0.034513917 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.034513917 = score(doc=668,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2013 19:43:01
  13. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01
    0.008628479 = product of:
      0.017256958 = sum of:
        0.017256958 = product of:
          0.034513917 = sum of:
            0.034513917 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.034513917 = score(doc=1605,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  14. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01
    0.008628479 = product of:
      0.017256958 = sum of:
        0.017256958 = product of:
          0.034513917 = sum of:
            0.034513917 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.034513917 = score(doc=5011,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    7. 3.2019 16:32:22
  15. Information visualization in data mining and knowledge discovery (2002) 0.00
    0.0034513916 = product of:
      0.006902783 = sum of:
        0.006902783 = product of:
          0.013805566 = sum of:
            0.013805566 = weight(_text_:22 in 1789) [ClassicSimilarity], result of:
              0.013805566 = score(doc=1789,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.07738023 = fieldWeight in 1789, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.015625 = fieldNorm(doc=1789)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    23. 3.2008 19:10:22