Search (28 results, page 1 of 2)

  • × theme_ss:"Data Mining"
  1. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.17
    0.1663232 = product of:
      0.22176427 = sum of:
        0.120629124 = weight(_text_:digital in 4577) [ClassicSimilarity], result of:
          0.120629124 = score(doc=4577,freq=2.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.61014175 = fieldWeight in 4577, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.109375 = fieldNorm(doc=4577)
        0.053599782 = weight(_text_:library in 4577) [ClassicSimilarity], result of:
          0.053599782 = score(doc=4577,freq=2.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.40671125 = fieldWeight in 4577, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.109375 = fieldNorm(doc=4577)
        0.047535364 = product of:
          0.09507073 = sum of:
            0.09507073 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.09507073 = score(doc=4577,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Date
    2. 4.2000 18:01:22
    Source
    Library trends. 48(1999) no.1, S.182-208
  2. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.15
    0.14674002 = product of:
      0.19565335 = sum of:
        0.13786185 = weight(_text_:digital in 1737) [ClassicSimilarity], result of:
          0.13786185 = score(doc=1737,freq=8.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.69730484 = fieldWeight in 1737, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0625 = fieldNorm(doc=1737)
        0.030628446 = weight(_text_:library in 1737) [ClassicSimilarity], result of:
          0.030628446 = score(doc=1737,freq=2.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.23240642 = fieldWeight in 1737, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0625 = fieldNorm(doc=1737)
        0.027163066 = product of:
          0.054326132 = sum of:
            0.054326132 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.054326132 = score(doc=1737,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Defines digital libraries and discusses the effects of new technology on librarians. Examines the different viewpoints of librarians and information technologists on digital libraries. Describes the development of a digital library at the National Drug Intelligence Center, USA, which was carried out in collaboration with information technology experts. The system is based on Web enabled search technology to find information, data visualization and data mining to visualize it and use of SGML as an information standard to store it
    Date
    22.11.1998 18:57:22
  3. Organisciak, P.; Schmidt, B.M.; Downie, J.S.: Giving shape to large digital libraries through exploratory data analysis (2022) 0.07
    0.069286 = product of:
      0.138572 = sum of:
        0.11560067 = weight(_text_:digital in 473) [ClassicSimilarity], result of:
          0.11560067 = score(doc=473,freq=10.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.58470786 = fieldWeight in 473, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=473)
        0.022971334 = weight(_text_:library in 473) [ClassicSimilarity], result of:
          0.022971334 = score(doc=473,freq=2.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.17430481 = fieldWeight in 473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.046875 = fieldNorm(doc=473)
      0.5 = coord(2/4)
    
    Abstract
    The emergence of large multi-institutional digital libraries has opened the door to aggregate-level examinations of the published word. Such large-scale analysis offers a new way to pursue traditional problems in the humanities and social sciences, using digital methods to ask routine questions of large corpora. However, inquiry into multiple centuries of books is constrained by the burdens of scale, where statistical inference is technically complex and limited by hurdles to access and flexibility. This work examines the role that exploratory data analysis and visualization tools may play in understanding large bibliographic datasets. We present one such tool, HathiTrust+Bookworm, which allows multifaceted exploration of the multimillion work HathiTrust Digital Library, and center it in the broader space of scholarly tools for exploratory data analysis.
    Series
    JASIST special issue on digital humanities (DH): C. Methodological innovations, challenges, and new interest in DH
  4. Chen, C.-C.; Chen, A.-P.: Using data mining technology to provide a recommendation service in the digital library (2007) 0.07
    0.068405345 = product of:
      0.13681069 = sum of:
        0.086163655 = weight(_text_:digital in 2533) [ClassicSimilarity], result of:
          0.086163655 = score(doc=2533,freq=8.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.4358155 = fieldWeight in 2533, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2533)
        0.050647035 = weight(_text_:library in 2533) [ClassicSimilarity], result of:
          0.050647035 = score(doc=2533,freq=14.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.384306 = fieldWeight in 2533, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2533)
      0.5 = coord(2/4)
    
    Abstract
    Purpose - Since library storage has been increasing day by day, it is difficult for readers to find the books which interest them as well as representative booklists. How to utilize meaningful information effectively to improve the service quality of the digital library appears to be very important. The purpose of this paper is to provide a recommendation system architecture to promote digital library services in electronic libraries. Design/methodology/approach - In the proposed architecture, a two-phase data mining process used by association rule and clustering methods is designed to generate a recommendation system. The process considers not only the relationship of a cluster of users but also the associations among the information accessed. Findings - The process considered not only the relationship of a cluster of users but also the associations among the information accessed. With the advanced filter, the recommendation supported by the proposed system architecture would be closely served to meet users' needs. Originality/value - This paper not only constructs a recommendation service for readers to search books from the web but takes the initiative in finding the most suitable books for readers as well. Furthermore, library managers are expected to purchase core and hot books from a limited budget to maintain and satisfy the requirements of readers along with promoting digital library services.
    Source
    Electronic library. 25(2007) no.6, S.711-724
  5. Nicholson, S.: Bibliomining for automated collection development in a digital library setting : using data mining to discover Web-based scholarly research works (2003) 0.06
    0.064484105 = product of:
      0.12896821 = sum of:
        0.086163655 = weight(_text_:digital in 1867) [ClassicSimilarity], result of:
          0.086163655 = score(doc=1867,freq=8.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.4358155 = fieldWeight in 1867, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1867)
        0.042804558 = weight(_text_:library in 1867) [ClassicSimilarity], result of:
          0.042804558 = score(doc=1867,freq=10.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.32479787 = fieldWeight in 1867, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1867)
      0.5 = coord(2/4)
    
    Abstract
    This research creates an intelligent agent for automated collection development in a digital library setting. It uses a predictive model based an facets of each Web page to select scholarly works. The criteria came from the academic library selection literature, and a Delphi study was used to refine the list to 41 criteria. A Perl program was designed to analyze a Web page for each criterion and applied to a large collection of scholarly and nonscholarly Web pages. Bibliomining, or data mining for libraries, was then used to create different classification models. Four techniques were used: logistic regression, nonparametric discriminant analysis, classification trees, and neural networks. Accuracy and return were used to judge the effectiveness of each model an test datasets. In addition, a set of problematic pages that were difficult to classify because of their similarity to scholarly research was gathered and classified using the models. The resulting models could be used in the selection process to automatically create a digital library of Webbased scholarly research works. In addition, the technique can be extended to create a digital library of any type of structured electronic information.
  6. Cohen, D.J.: From Babel to knowledge : data mining large digital collections (2006) 0.05
    0.049362347 = product of:
      0.09872469 = sum of:
        0.077067114 = weight(_text_:digital in 1178) [ClassicSimilarity], result of:
          0.077067114 = score(doc=1178,freq=10.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.38980526 = fieldWeight in 1178, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.03125 = fieldNorm(doc=1178)
        0.021657582 = weight(_text_:library in 1178) [ClassicSimilarity], result of:
          0.021657582 = score(doc=1178,freq=4.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.16433616 = fieldWeight in 1178, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.03125 = fieldNorm(doc=1178)
      0.5 = coord(2/4)
    
    Abstract
    In Jorge Luis Borges's curious short story The Library of Babel, the narrator describes an endless collection of books stored from floor to ceiling in a labyrinth of countless hexagonal rooms. The pages of the library's books seem to contain random sequences of letters and spaces; occasionally a few intelligible words emerge in the sea of paper and ink. Nevertheless, readers diligently, and exasperatingly, scan the shelves for coherent passages. The narrator himself has wandered numerous rooms in search of enlightenment, but with resignation he simply awaits his death and burial - which Borges explains (with signature dark humor) consists of being tossed unceremoniously over the library's banister. Borges's nightmare, of course, is a cursed vision of the research methods of disciplines such as literature, history, and philosophy, where the careful reading of books, one after the other, is supposed to lead inexorably to knowledge and understanding. Computer scientists would approach Borges's library far differently. Employing the information theory that forms the basis for search engines and other computerized techniques for assessing in one fell swoop large masses of documents, they would quickly realize the collection's incoherence though sampling and statistical methods - and wisely start looking for the library's exit. These computational methods, which allow us to find patterns, determine relationships, categorize documents, and extract information from massive corpuses, will form the basis for new tools for research in the humanities and other disciplines in the coming decade. For the past three years I have been experimenting with how to provide such end-user tools - that is, tools that harness the power of vast electronic collections while hiding much of their complicated technical plumbing. In particular, I have made extensive use of the application programming interfaces (APIs) the leading search engines provide for programmers to query their databases directly (from server to server without using their web interfaces). In addition, I have explored how one might extract information from large digital collections, from the well-curated lexicographic database WordNet to the democratic (and poorly curated) online reference work Wikipedia. While processing these digital corpuses is currently an imperfect science, even now useful tools can be created by combining various collections and methods for searching and analyzing them. And more importantly, these nascent services suggest a future in which information can be gleaned from, and sense can be made out of, even imperfect digital libraries of enormous scale. A brief examination of two approaches to data mining large digital collections hints at this future, while also providing some lessons about how to get there.
  7. Suakkaphong, N.; Zhang, Z.; Chen, H.: Disease named entity recognition using semisupervised learning and conditional random fields (2011) 0.03
    0.033873826 = product of:
      0.06774765 = sum of:
        0.043081827 = weight(_text_:digital in 4367) [ClassicSimilarity], result of:
          0.043081827 = score(doc=4367,freq=2.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.21790776 = fieldWeight in 4367, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4367)
        0.024665821 = product of:
          0.049331643 = sum of:
            0.049331643 = weight(_text_:project in 4367) [ClassicSimilarity], result of:
              0.049331643 = score(doc=4367,freq=2.0), product of:
                0.21156175 = queryWeight, product of:
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.050121464 = queryNorm
                0.23317845 = fieldWeight in 4367, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4367)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Information extraction is an important text-mining task that aims at extracting prespecified types of information from large text collections and making them available in structured representations such as databases. In the biomedical domain, information extraction can be applied to help biologists make the most use of their digital-literature archives. Currently, there are large amounts of biomedical literature that contain rich information about biomedical substances. Extracting such knowledge requires a good named entity recognition technique. In this article, we combine conditional random fields (CRFs), a state-of-the-art sequence-labeling algorithm, with two semisupervised learning techniques, bootstrapping and feature sampling, to recognize disease names from biomedical literature. Two data-processing strategies for each technique also were analyzed: one sequentially processing unlabeled data partitions and another one processing unlabeled data partitions in a round-robin fashion. The experimental results showed the advantage of semisupervised learning techniques given limited labeled training data. Specifically, CRFs with bootstrapping implemented in sequential fashion outperformed strictly supervised CRFs for disease name recognition. The project was supported by NIH/NLM Grant R33 LM07299-01, 2002-2005.
  8. Fayyad, U.M.; Djorgovski, S.G.; Weir, N.: From digitized images to online catalogs : data ming a sky server (1996) 0.02
    0.017232731 = product of:
      0.068930924 = sum of:
        0.068930924 = weight(_text_:digital in 6625) [ClassicSimilarity], result of:
          0.068930924 = score(doc=6625,freq=2.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.34865242 = fieldWeight in 6625, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0625 = fieldNorm(doc=6625)
      0.25 = coord(1/4)
    
    Abstract
    Offers a data mining approach based on machine learning classification methods to the problem of automated cataloguing of online databases of digital images resulting from sky surveys. The SKICAT system automates the reduction and analysis of 3 terabytes of images expected to contain about 2 billion sky objects. It offers a solution to problems associated with the analysis of large data sets in science
  9. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: ¬The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.01
    0.012924549 = product of:
      0.051698197 = sum of:
        0.051698197 = weight(_text_:digital in 3015) [ClassicSimilarity], result of:
          0.051698197 = score(doc=3015,freq=2.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.26148933 = fieldWeight in 3015, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=3015)
      0.25 = coord(1/4)
    
    Abstract
    We analyze the linguistic evolution of selected scientific disciplines over a 30-year time span (1970s to 2000s). Our focus is on four highly specialized disciplines at the boundaries of computer science that emerged during that time: computational linguistics, bioinformatics, digital construction, and microelectronics. Our analysis is driven by the question whether these disciplines develop a distinctive language use-both individually and collectively-over the given time period. The data set is the English Scientific Text Corpus (scitex), which includes texts from the 1970s/1980s and early 2000s. Our theoretical basis is register theory. In terms of methods, we combine corpus-based methods of feature extraction (various aggregated features [part-of-speech based], n-grams, lexico-grammatical patterns) and automatic text classification. The results of our research are directly relevant to the study of linguistic variation and languages for specific purposes (LSP) and have implications for various natural language processing (NLP) tasks, for example, authorship attribution, text mining, or training NLP tools.
  10. Tonkin, E.L.; Tourte, G.J.L.: Working with text. tools, techniques and approaches for text mining (2016) 0.01
    0.010770457 = product of:
      0.043081827 = sum of:
        0.043081827 = weight(_text_:digital in 4019) [ClassicSimilarity], result of:
          0.043081827 = score(doc=4019,freq=2.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.21790776 = fieldWeight in 4019, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4019)
      0.25 = coord(1/4)
    
    Abstract
    What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
  11. Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.01
    0.010464822 = product of:
      0.041859288 = sum of:
        0.041859288 = product of:
          0.083718576 = sum of:
            0.083718576 = weight(_text_:project in 720) [ClassicSimilarity], result of:
              0.083718576 = score(doc=720,freq=4.0), product of:
                0.21156175 = queryWeight, product of:
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.050121464 = queryNorm
                0.39571697 = fieldWeight in 720, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.046875 = fieldNorm(doc=720)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining operation task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are fundamental especially for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately, we achieved good results with the toolkit BERT (Bidirectional Encoder Representations from Transformers). The project and its workflows represent a preview of what may lie ahead in the future of crafting metadata using text mining techniques to enhance discoverability.
  12. KDD : techniques and applications (1998) 0.01
    0.01018615 = product of:
      0.0407446 = sum of:
        0.0407446 = product of:
          0.0814892 = sum of:
            0.0814892 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
              0.0814892 = score(doc=6783,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.46428138 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held Singapore, 22-23 Feb 1997
  13. Whittle, M.; Eaglestone, B.; Ford, N.; Gillet, V.J.; Madden, A.: Data mining of search engine logs (2007) 0.01
    0.0073997467 = product of:
      0.029598987 = sum of:
        0.029598987 = product of:
          0.059197973 = sum of:
            0.059197973 = weight(_text_:project in 1330) [ClassicSimilarity], result of:
              0.059197973 = score(doc=1330,freq=2.0), product of:
                0.21156175 = queryWeight, product of:
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.050121464 = queryNorm
                0.27981415 = fieldWeight in 1330, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1330)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    This article reports on the development of a novel method for the analysis of Web logs. The method uses techniques that look for similarities between queries and identify sequences of query transformation. It allows sequences of query transformations to be represented as graphical networks, thereby giving a richer view of search behavior than is possible with the usual sequential descriptions. We also perform a basic analysis to study the correlations between observed transformation codes, with results that appear to show evidence of behavior habits. The method was developed using transaction logs from the Excite search engine to provide a tool for an ongoing research project that is endeavoring to develop a greater understanding of Web-based searching by the general public.
  14. Lusti, M.: Data Warehousing and Data Mining : Eine Einführung in entscheidungsunterstützende Systeme (1999) 0.01
    0.0067907665 = product of:
      0.027163066 = sum of:
        0.027163066 = product of:
          0.054326132 = sum of:
            0.054326132 = weight(_text_:22 in 4261) [ClassicSimilarity], result of:
              0.054326132 = score(doc=4261,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.30952093 = fieldWeight in 4261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4261)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    17. 7.2002 19:22:06
  15. Amir, A.; Feldman, R.; Kashi, R.: ¬A new and versatile method for association generation (1997) 0.01
    0.0067907665 = product of:
      0.027163066 = sum of:
        0.027163066 = product of:
          0.054326132 = sum of:
            0.054326132 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.054326132 = score(doc=1270,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  16. Borgman, C.L.; Wofford, M.F.; Golshan, M.S.; Darch, P.T.: Collaborative qualitative research at scale : reflections on 20 years of acquiring global data and making data global (2021) 0.01
    0.0061664553 = product of:
      0.024665821 = sum of:
        0.024665821 = product of:
          0.049331643 = sum of:
            0.049331643 = weight(_text_:project in 239) [ClassicSimilarity], result of:
              0.049331643 = score(doc=239,freq=2.0), product of:
                0.21156175 = queryWeight, product of:
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.050121464 = queryNorm
                0.23317845 = fieldWeight in 239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=239)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    A 5-year project to study scientific data uses in geography, starting in 1999, evolved into 20 years of research on data practices in sensor networks, environmental sciences, biology, seismology, undersea science, biomedicine, astronomy, and other fields. By emulating the "team science" approaches of the scientists studied, the UCLA Center for Knowledge Infrastructures accumulated a comprehensive collection of qualitative data about how scientists generate, manage, use, and reuse data across domains. Building upon Paul N. Edwards's model of "making global data"-collecting signals via consistent methods, technologies, and policies-to "make data global"-comparing and integrating those data, the research team has managed and exploited these data as a collaborative resource. This article reflects on the social, technical, organizational, economic, and policy challenges the team has encountered in creating new knowledge from data old and new. We reflect on continuity over generations of students and staff, transitions between grants, transfer of legacy data between software tools, research methods, and the role of professional data managers in the social sciences.
  17. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.01
    0.0059419204 = product of:
      0.023767682 = sum of:
        0.023767682 = product of:
          0.047535364 = sum of:
            0.047535364 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.047535364 = score(doc=2908,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  18. Dang, X.H.; Ong. K.-L.: Knowledge discovery in data streams (2009) 0.01
    0.0057428335 = product of:
      0.022971334 = sum of:
        0.022971334 = weight(_text_:library in 3829) [ClassicSimilarity], result of:
          0.022971334 = score(doc=3829,freq=2.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.17430481 = fieldWeight in 3829, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.046875 = fieldNorm(doc=3829)
      0.25 = coord(1/4)
    
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  19. Lackes, R.; Tillmanns, C.: Data Mining für die Unternehmenspraxis : Entscheidungshilfen und Fallstudien mit führenden Softwarelösungen (2006) 0.01
    0.005093075 = product of:
      0.0203723 = sum of:
        0.0203723 = product of:
          0.0407446 = sum of:
            0.0407446 = weight(_text_:22 in 1383) [ClassicSimilarity], result of:
              0.0407446 = score(doc=1383,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.23214069 = fieldWeight in 1383, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1383)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 3.2008 14:46:06
  20. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.00
    0.0042442293 = product of:
      0.016976917 = sum of:
        0.016976917 = product of:
          0.033953834 = sum of:
            0.033953834 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.033953834 = score(doc=668,freq=2.0), product of:
                0.17551683 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050121464 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 3.2013 19:43:01

Years

Languages

  • e 21
  • d 7

Types