Search (2493 results, page 1 of 125)

  • year_i:[2000 TO 2010}
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.17
    0.17408767 = sum of:
      0.082819656 = product of:
        0.24845897 = sum of:
          0.24845897 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24845897 = score(doc=562,freq=2.0), product of:
              0.44208363 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.052144732 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.09126801 = sum of:
        0.048878662 = weight(_text_:data in 562) [ClassicSimilarity], result of:
          0.048878662 = score(doc=562,freq=4.0), product of:
            0.16488427 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.052144732 = queryNorm
            0.29644224 = fieldWeight in 562, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.04238935 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.04238935 = score(doc=562,freq=2.0), product of:
            0.18260197 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052144732 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
    Source
    Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK
  2. Between data science and applied data analysis : Proceedings of the 26th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Mannheim, July 22-24, 2002 (2003) 0.10
    0.10225324 = product of:
      0.20450649 = sum of:
        0.20450649 = sum of:
          0.11972778 = weight(_text_:data in 4606) [ClassicSimilarity], result of:
            0.11972778 = score(doc=4606,freq=6.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.7261322 = fieldWeight in 4606, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.09375 = fieldNorm(doc=4606)
          0.0847787 = weight(_text_:22 in 4606) [ClassicSimilarity], result of:
            0.0847787 = score(doc=4606,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.46428138 = fieldWeight in 4606, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.09375 = fieldNorm(doc=4606)
      0.5 = coord(1/2)
    
    Series
    Studies in classification, data analysis, and knowledge organization
  3. Moore, R.W.: Management of very large distributed shared collections (2009) 0.09
    0.088483125 = product of:
      0.17696625 = sum of:
        0.17696625 = sum of:
          0.12751201 = weight(_text_:data in 3845) [ClassicSimilarity], result of:
            0.12751201 = score(doc=3845,freq=20.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.7733425 = fieldWeight in 3845, product of:
                4.472136 = tf(freq=20.0), with freq of:
                  20.0 = termFreq=20.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3845)
          0.049454242 = weight(_text_:22 in 3845) [ClassicSimilarity], result of:
            0.049454242 = score(doc=3845,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.2708308 = fieldWeight in 3845, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3845)
      0.5 = coord(1/2)
    
    Abstract
    Large scientific collections may be managed as data grids for sharing data, digital libraries for publishing data, persistent archives for preserving data, or as real-time data repositories for sensor data. Despite the multiple types of data management objectives, it is possible to build each system from generic software infrastructure. This entry examines the requirements driving the management of large data collections, the concepts on which current data management systems are based, and the current research initiatives for managing distributed data collections.
    Date
    27. 8.2011 14:22:57
  4. Houston, R.D.; Harmon, E.G.: Re-envisioning the information concept : systematic definitions (2002) 0.07
    0.07198863 = product of:
      0.14397725 = sum of:
        0.14397725 = sum of:
          0.04608324 = weight(_text_:data in 136) [ClassicSimilarity], result of:
            0.04608324 = score(doc=136,freq=2.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.2794884 = fieldWeight in 136, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0625 = fieldNorm(doc=136)
          0.09789401 = weight(_text_:22 in 136) [ClassicSimilarity], result of:
            0.09789401 = score(doc=136,freq=6.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.536106 = fieldWeight in 136, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=136)
      0.5 = coord(1/2)
    
    Abstract
    This paper suggests a framework and systematic definitions for 6 words commonly used in the field of information science: data, information, knowledge, wisdom, inspiration, and intelligence. We intend these definitions to lead to a quantification of information science, a quantification that will enable their measurement, manipulation, and prediction.
    Date
    22. 2.2007 18:56:23
    22. 2.2007 19:22:13
  5. Malmsten, M.: Making a library catalogue part of the Semantic Web (2008) 0.07
    0.06504996 = product of:
      0.13009992 = sum of:
        0.13009992 = sum of:
          0.08064567 = weight(_text_:data in 2640) [ClassicSimilarity], result of:
            0.08064567 = score(doc=2640,freq=8.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.48910472 = fieldWeight in 2640, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2640)
          0.049454242 = weight(_text_:22 in 2640) [ClassicSimilarity], result of:
            0.049454242 = score(doc=2640,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.2708308 = fieldWeight in 2640, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2640)
      0.5 = coord(1/2)
    
    Abstract
    Library catalogues contain an enormous amount of structured, high-quality data; however, this data is generally not made available to Semantic Web applications. In this paper we describe the tools and techniques used to make the Swedish Union Catalogue (LIBRIS) part of the Semantic Web and Linked Data. The focus is on links to and between resources and the mechanisms used to make data available, rather than perfect description of the individual resources. We also present a method of creating links between records of the same work.
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  6. Lackes, R.; Tillmanns, C.: Data Mining für die Unternehmenspraxis : Entscheidungshilfen und Fallstudien mit führenden Softwarelösungen (2006) 0.06
    0.063524835 = product of:
      0.12704967 = sum of:
        0.12704967 = sum of:
          0.08466032 = weight(_text_:data in 1383) [ClassicSimilarity], result of:
            0.08466032 = score(doc=1383,freq=12.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.513453 = fieldWeight in 1383, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.046875 = fieldNorm(doc=1383)
          0.04238935 = weight(_text_:22 in 1383) [ClassicSimilarity], result of:
            0.04238935 = score(doc=1383,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.23214069 = fieldWeight in 1383, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1383)
      0.5 = coord(1/2)
    
    Abstract
    The book is aimed at practitioners in companies who work on the analysis of large data sets. After a short theoretical part, four case studies from the customer relationship management of a mail-order retailer are worked through. Eight leading software solutions were used: the Intelligent Miner from IBM, the Enterprise Miner from SAS, Clementine from SPSS, Knowledge Studio from Angoss, the Delta Miner from Bissantz, the Business Miner from Business Object, and the Data Engine from MIT. The case studies make the strengths and weaknesses of the individual solutions apparent and demonstrate a methodologically correct approach to data mining. Both provide valuable decision support for selecting standard data mining software and for practical data analysis.
    Content
    Models, methods and tools: Aims and structure of the study.- Fundamentals.- Planning and decision-making with data mining support.- Methods.- Functionality and handling of the software solutions. Case studies: Initial situation and data in the mail-order business.- Customer segmentation.- Explaining regional marketing successes for new customer acquisition.- Forecasting customer lifetime value.- Selecting customers for a direct marketing campaign.- Which software solution for which decision?- Conclusion and market developments.
    Date
    22. 3.2008 14:46:06
    Theme
    Data Mining
  7. Taniguchi, S.: Recording evidence in bibliographic records and descriptive metadata (2005) 0.06
    0.063524835 = product of:
      0.12704967 = sum of:
        0.12704967 = sum of:
          0.08466032 = weight(_text_:data in 3565) [ClassicSimilarity], result of:
            0.08466032 = score(doc=3565,freq=12.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.513453 = fieldWeight in 3565, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.046875 = fieldNorm(doc=3565)
          0.04238935 = weight(_text_:22 in 3565) [ClassicSimilarity], result of:
            0.04238935 = score(doc=3565,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.23214069 = fieldWeight in 3565, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=3565)
      0.5 = coord(1/2)
    
    Abstract
    In this article recording evidence for data values in addition to the values themselves in bibliographic records and descriptive metadata is proposed, with the aim of improving the expressiveness and reliability of those records and metadata. Recorded evidence indicates why and how data values are recorded for elements. Recording the history of changes in data values is also proposed, with the aim of reinforcing recorded evidence. First, evidence that can be recorded is categorized into classes: identifiers of rules or tasks, action descriptions of them, and input and output data of them. Dates of recording values and evidence are an additional class. Then, the relative usefulness of evidence classes and also levels (i.e., the record, data element, or data value level) to which an individual evidence class is applied, is examined. Second, examples that can be viewed as recorded evidence in existing bibliographic records and current cataloging rules are shown. Third, some examples of bibliographic records and descriptive metadata with notes of evidence are demonstrated. Fourth, ways of using recorded evidence are addressed.
    Date
    18. 6.2005 13:16:22
  8. Fenstermacher, D.A.: Introduction to bioinformatics (2005) 0.06
    0.063524835 = product of:
      0.12704967 = sum of:
        0.12704967 = sum of:
          0.08466032 = weight(_text_:data in 5257) [ClassicSimilarity], result of:
            0.08466032 = score(doc=5257,freq=12.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.513453 = fieldWeight in 5257, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.046875 = fieldNorm(doc=5257)
          0.04238935 = weight(_text_:22 in 5257) [ClassicSimilarity], result of:
            0.04238935 = score(doc=5257,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.23214069 = fieldWeight in 5257, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=5257)
      0.5 = coord(1/2)
    
    Abstract
    Bioinformatics is a multifaceted discipline combining many scientific fields including computational biology, statistics, mathematics, molecular biology, and genetics. Bioinformatics enables biomedical investigators to exploit existing and emerging computational technologies to seamlessly store, mine, retrieve, and analyze data from genomics and proteomics technologies. This is achieved by creating unified data models, standardizing data interfaces, developing structured vocabularies, generating new data visualization methods, and capturing detailed metadata that describes various aspects of the experimental design and analysis methods. Already there are a number of related undertakings that are dividing the field into more specialized groups. Clinical Bioinformatics and Biomedical Informatics are emerging as transitional fields to promote the utilization of genomics and proteomics data combined with medical history and demographic data towards personalized medicine, molecular diagnostics, pharmacogenomics, and predicting outcomes of therapeutic interventions. The field of bioinformatics will continue to evolve through the incorporation of diverse technologies and methodologies that draw experts from disparate fields to create the latest computational and informational tools specifically designed for the biomedical research enterprise.
    Date
    22. 7.2006 14:21:27
  9. Näppilä, T.; Järvelin, K.; Niemi, T.: ¬A tool for data cube construction from structurally heterogeneous XML documents (2008) 0.06
    0.06320223 = product of:
      0.12640446 = sum of:
        0.12640446 = sum of:
          0.09108 = weight(_text_:data in 1369) [ClassicSimilarity], result of:
            0.09108 = score(doc=1369,freq=20.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.5523875 = fieldWeight in 1369, product of:
                4.472136 = tf(freq=20.0), with freq of:
                  20.0 = termFreq=20.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1369)
          0.035324458 = weight(_text_:22 in 1369) [ClassicSimilarity], result of:
            0.035324458 = score(doc=1369,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.19345059 = fieldWeight in 1369, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1369)
      0.5 = coord(1/2)
    
    Abstract
    Data cubes for OLAP (On-Line Analytical Processing) often need to be constructed from data located in several distributed and autonomous information sources. Such a data integration process is challenging due to semantic, syntactic, and structural heterogeneity among the data. While XML (extensible markup language) is the de facto standard for data exchange, the three types of heterogeneity remain. Moreover, popular path-oriented XML query languages, such as XQuery, require the user to know the structure of the documents to be processed in considerable detail and are, thus, effectively impractical in many real-world data integration tasks. Several Lowest Common Ancestor (LCA)-based XML query evaluation strategies have recently been introduced to provide a more structure-independent way to access XML documents. We shall, however, show that, for certain - not uncommon - types of XML documents, this approach leads to undesirable results. This article introduces a novel high-level data extraction primitive that utilizes the purpose-built Smallest Possible Context (SPC) query evaluation strategy. We demonstrate, through a system prototype for OLAP data cube construction and a sample application in informetrics, that our approach has real advantages in data integration.
    Date
    9. 2.2008 17:22:42
  10. Hearn, S.: Comparing catalogs : currency and consistency of controlled headings (2009) 0.06
    0.059647724 = product of:
      0.11929545 = sum of:
        0.11929545 = sum of:
          0.069841206 = weight(_text_:data in 3600) [ClassicSimilarity], result of:
            0.069841206 = score(doc=3600,freq=6.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.42357713 = fieldWeight in 3600, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3600)
          0.049454242 = weight(_text_:22 in 3600) [ClassicSimilarity], result of:
            0.049454242 = score(doc=3600,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.2708308 = fieldWeight in 3600, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3600)
      0.5 = coord(1/2)
    
    Abstract
    Evaluative and comparative studies of catalog data have tended to focus on methods that are labor intensive, demand expertise, and can examine only a limited number of records. This study explores an alternative approach to gathering and analyzing catalog data, focusing on the currency and consistency of controlled headings. The resulting data provide insight into libraries' use of changed headings and their success in maintaining currency and consistency, and the systems needed to support the current pace of heading changes.
    Date
    10. 9.2000 17:38:22
  11. Wackerow, J.: ¬The Data Documentation Initiative (DDI) (2008) 0.06
    0.056304354 = product of:
      0.11260871 = sum of:
        0.11260871 = sum of:
          0.08788159 = weight(_text_:data in 2662) [ClassicSimilarity], result of:
            0.08788159 = score(doc=2662,freq=38.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.5329895 = fieldWeight in 2662, product of:
                6.164414 = tf(freq=38.0), with freq of:
                  38.0 = termFreq=38.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2662)
          0.024727121 = weight(_text_:22 in 2662) [ClassicSimilarity], result of:
            0.024727121 = score(doc=2662,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.1354154 = fieldWeight in 2662, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2662)
      0.5 = coord(1/2)
    
    Abstract
    The Data Documentation Initiative (DDI) is an international effort to establish an XML-based standard for the compilation, presentation, and exchange of documentation for datasets in the social and behavioral sciences. The most recent version 3.0 of the DDI supports a rich and structured set of metadata elements that not only fully informs a potential data analyst about a given dataset but also facilitates computer processing of the data. Moreover, data producers will find that by adopting the DDI standard they can produce better and more complete documentation as a natural step in designing and fielding computer-assisted interviewing. DDI 3.0 embraces the full life cycle of the data from conception, through development of the data collection instrument, collection and cleaning of data, production of data products, distribution, preservation, and reuse or analysis of the data. DDI 3.0 is designed to facilitate sharing schemes for concepts, questions, coding, and variables within organizations or throughout the social science research community. Comparison through direct inheritance, as in the case of comparison-by-design, or through the mapping of items like variables or categories allows capture of the harmonization processes used in creating integrated files in a uniform and machine-actionable way. DDI 3.0 provides the structural support needed to facilitate comparative survey work in a way that was previously unavailable in an open, non-proprietary system. A specific DDI module allows for the capture and expression of native Dublin Core elements (DCMES), used either as references or as descriptions of a particular set of metadata. This module uses the simple Dublin Core namespace represented as XML Schema following the guidelines for implementing Dublin Core in XML. In DDI, the Dublin Core is not used as the primary citation mechanism - this module is included to support applications which understand the Dublin Core XML, but which do not understand DDI. This module is used wherever citations are permitted within DDI 3.0 (like citations of a study description or of other material). DDI 3.0 is aligned with other metadata standards as well: with SDMX (time-series data) for exchanging aggregate data, with ISO/IEC 11179 (metadata registry) for building data registries such as question, variable, and concept banks, and with FGDC and ISO 19115 (geographic standards) for supporting GIS users. DDI 3.0 is described in a conceptual model which is also expressed in the Unified Modeling Language (UML). Modular XML Schemas are derived from the conceptual model. Many elements support computer processing - that is, the documentation goes beyond being "human readable" and moves toward the goal of being "machine-actionable". The final release of DDI 3.0 was published on April 28, 2008. The standard was developed by the DDI Alliance, an international group encompassing data archives and research institutions from several countries in Western Europe and North America. Earlier versions of DDI provide examples of institutions and applications: the Inter-university Consortium for Political and Social Research (ICPSR) Data Catalog, the Council of European Social Science Data Services (CESSDA) Data Portal, the Dataverse Network, the International Household Survey Network (IHSN), NESSTAR Software for publishing data on the Web and online analysis, and the Microdata Management Toolkit (by the World Bank Data Group for IHSN).
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  12. Hsu, C.-N.; Chang, C.-H.; Hsieh, C.-H.; Lu, J.-J.; Chang, C.-C.: Reconfigurable Web wrapper agents for biological information integration (2005) 0.06
    0.05576373 = product of:
      0.11152746 = sum of:
        0.11152746 = sum of:
          0.076203 = weight(_text_:data in 5263) [ClassicSimilarity], result of:
            0.076203 = score(doc=5263,freq=14.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.46216056 = fieldWeight in 5263, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5263)
          0.035324458 = weight(_text_:22 in 5263) [ClassicSimilarity], result of:
            0.035324458 = score(doc=5263,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.19345059 = fieldWeight in 5263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5263)
      0.5 = coord(1/2)
    
    Abstract
    A variety of biological data is transferred and exchanged in overwhelming volumes on the World Wide Web. How to rapidly capture, utilize, and integrate the information on the Internet to discover valuable biological knowledge is one of the most critical issues in bioinformatics. Many information integration systems have been proposed for integrating biological data. These systems usually rely on an intermediate software layer called wrappers to access connected information sources. Wrapper construction for Web data sources is often specially hand coded to accommodate the differences between each Web site. However, programming a Web wrapper requires substantial programming skill, and is time-consuming and hard to maintain. In this article we provide a solution for rapidly building software agents that can serve as Web wrappers for biological information integration. We define an XML-based language called Web Navigation Description Language (WNDL), to model a Web-browsing session. A WNDL script describes how to locate the data, extract the data, and combine the data. By executing different WNDL scripts, we can automate virtually all types of Web-browsing sessions. We also describe IEPAD (Information Extraction Based on Pattern Discovery), a data extractor based on pattern discovery techniques. IEPAD allows our software agents to automatically discover the extraction rules to extract the contents of a structurally formatted Web page. With a programming-by-example authoring tool, a user can generate a complete Web wrapper agent by browsing the target Web sites. We built a variety of biological applications to demonstrate the feasibility of our approach.
    Date
    22. 7.2006 14:36:42
  13. Rapp, B.A.; Wheeler, D.L.: Bioinformatics resources from the National Center for Biotechnology Information : an integrated foundation for discovery (2005) 0.06
    0.05576373 = product of:
      0.11152746 = sum of:
        0.11152746 = sum of:
          0.076203 = weight(_text_:data in 5265) [ClassicSimilarity], result of:
            0.076203 = score(doc=5265,freq=14.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.46216056 = fieldWeight in 5265, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5265)
          0.035324458 = weight(_text_:22 in 5265) [ClassicSimilarity], result of:
            0.035324458 = score(doc=5265,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.19345059 = fieldWeight in 5265, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5265)
      0.5 = coord(1/2)
    
    Abstract
    The National Center for Biotechnology Information (NCBI) provides access to more than 30 publicly available molecular biology resources, offering an effective discovery space through high levels of data integration among large-scale data repositories. The foundation for many services is GenBank®, a public repository of DNA sequences from more than 133,000 different organisms. GenBank is accessible through the Entrez retrieval system, which integrates data from the major DNA and protein sequence databases, along with resources for taxonomy, genome maps, sequence variation, gene expression, gene function and phenotypes, protein structure and domain information, and the biomedical literature via PubMed®. Computational tools allow scientists to analyze vast quantities of diverse data. The BLAST® sequence similarity programs are instrumental in identifying genes and genetic features. Other tools support mapping disease loci to the genome, identifying new genes, comparing genomes, and relating sequence data to model protein structures. A basic research program in computational molecular biology enhances the database and software tool development initiatives. Future plans include further data integration, enhanced genome annotation and protein classification, additional data types, and links to a wider range of resources.
    Date
    22. 7.2006 14:58:34
  14. Agosto, D.E.: Bounded rationality and satisficing in young people's Web-based decision making (2002) 0.06
    0.055757105 = product of:
      0.11151421 = sum of:
        0.11151421 = sum of:
          0.06912486 = weight(_text_:data in 177) [ClassicSimilarity], result of:
            0.06912486 = score(doc=177,freq=8.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.4192326 = fieldWeight in 177, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.046875 = fieldNorm(doc=177)
          0.04238935 = weight(_text_:22 in 177) [ClassicSimilarity], result of:
            0.04238935 = score(doc=177,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.23214069 = fieldWeight in 177, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=177)
      0.5 = coord(1/2)
    
    Abstract
    This study investigated Simon's behavioral decision-making theories of bounded rationality and satisficing in relation to young people's decision making in the World Wide Web, and considered the role of personal preferences in Web-based decisions. It employed a qualitative research methodology involving group interviews with 22 adolescent females. Data analysis took the form of iterative pattern coding using QSR NUD*IST Vivo qualitative data analysis software. Data analysis revealed that the study participants did operate within the limits of bounded rationality. These limits took the form of time constraints, information overload, and physical constraints. Data analysis also uncovered two major satisficing behaviors - reduction and termination. Personal preference was found to play a major role in Web site evaluation in the areas of graphic/multimedia and subject content preferences. This study has related implications for Web site designers and for adult intermediaries who work with young people and the Web.
  15. Reitsma, R.F.; Thabane, L.; MacLeod, J.M.B.: Spatialization of Web Sites Using a Weighted Frequency Model of Navigation Data (2004) 0.06
    0.055757105 = product of:
      0.11151421 = sum of:
        0.11151421 = sum of:
          0.06912486 = weight(_text_:data in 2064) [ClassicSimilarity], result of:
            0.06912486 = score(doc=2064,freq=8.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.4192326 = fieldWeight in 2064, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.046875 = fieldNorm(doc=2064)
          0.04238935 = weight(_text_:22 in 2064) [ClassicSimilarity], result of:
            0.04238935 = score(doc=2064,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.23214069 = fieldWeight in 2064, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2064)
      0.5 = coord(1/2)
    
    Abstract
    A common problem in the spatialization of information systems is the determination of geometry; i.e., dimensionality and metric. Such geometry is either chosen a priori or is inferred a posteriori from secondary data. Recent work emphasizes the use of geometric information latent in a system's navigational record. Resolving this information from its noisy background, however, requires an unambiguous criterion of selection. In this paper we use a previously published, statistical method for resolving a Web-based information system's geometry from navigational data. However, because of the method's (theoretical) sensitivity to data selection, a weighted frequency correction based on empirical probability distributions is applied. The effect of this correction on the Web-space geometry is investigated. Results indicate that the inferred geometry is robust; i.e., it does not significantly change under this probabilistic correction.
    Source
    Journal of the American Society for Information Science and technology. 55(2004) no.1, S.13-22
  16. Schrodt, R.: Tiefen und Untiefen im wissenschaftlichen Sprachgebrauch (2008) 0.06
    0.05521311 = product of:
      0.11042622 = sum of:
        0.11042622 = product of:
          0.33127865 = sum of:
            0.33127865 = weight(_text_:3a in 140) [ClassicSimilarity], result of:
              0.33127865 = score(doc=140,freq=2.0), product of:
                0.44208363 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.052144732 = queryNorm
                0.7493574 = fieldWeight in 140, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0625 = fieldNorm(doc=140)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    See also: https://studylibde.com/doc/13053640/richard-schrodt. See also: http%3A%2F%2Fwww.univie.ac.at%2FGermanistik%2Fschrodt%2Fvorlesung%2Fwissenschaftssprache.doc&usg=AOvVaw1lDLDR6NFf1W0-oC9mEUJf.
  17. Rice, R.: Applying DC to institutional data repositories (2008) 0.05
    0.054039046 = product of:
      0.10807809 = sum of:
        0.10807809 = sum of:
          0.079818524 = weight(_text_:data in 2664) [ClassicSimilarity], result of:
            0.079818524 = score(doc=2664,freq=24.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.48408815 = fieldWeight in 2664, product of:
                4.8989797 = tf(freq=24.0), with freq of:
                  24.0 = termFreq=24.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.03125 = fieldNorm(doc=2664)
          0.028259566 = weight(_text_:22 in 2664) [ClassicSimilarity], result of:
            0.028259566 = score(doc=2664,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.15476047 = fieldWeight in 2664, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=2664)
      0.5 = coord(1/2)
    
    Abstract
    DISC-UK DataShare (2007-2009), a project led by the University of Edinburgh and funded by JISC (Joint Information Systems Committee, UK), arises from an existing consortium of academic data support professionals working in the domain of social science datasets (Data Information Specialists Committee-UK). We are working together across four universities with colleagues engaged in managing open access repositories for e-prints. Our project supports 'early adopter' academics who wish to openly share datasets and presents a model for depositing 'orphaned datasets' that are not being deposited in subject-domain data archives/centres. Outputs from the project are intended to help to demystify data as complex objects in repositories, and assist other institutional repository managers in overcoming barriers to incorporating research data. By building on lessons learned from recent JISC-funded data repository projects such as SToRe and GRADE, the project will help realize the vision of the Digital Repositories Roadmap, e.g. the milestone under Data, "Institutions need to invest in research data repositories" (Heery and Powell, 2006). Application of appropriate metadata is an important area of development for the project. Datasets are no different from other digital materials in that they need to be described, not just for discovery but also for preservation and re-use. The GRADE project found that for geo-spatial datasets, Dublin Core metadata (with geo-spatial enhancements such as a bounding box for the 'coverage' property) was sufficient for discovery within a DSpace repository, though more in-depth metadata or documentation was required for re-use after downloading. The project partners are examining other metadata schemas such as the Data Documentation Initiative (DDI) versions 2 and 3, used primarily by social science data archives (Martinez, 2008). Crosswalks from the DDI to qualified Dublin Core are important for describing research datasets at the study level (as opposed to the variable level, which is largely out of scope for this project). DataShare is benefiting from the work of the DRIADE project (application profile development for evolutionary biology) (Carrier et al., 2007), eBank UK (which developed an application profile for crystallography data) and GAP (Geospatial Application Profile, in progress) in defining interoperable Dublin Core qualified metadata elements and their application to datasets for each partner repository. The solution devised at Edinburgh for DSpace will be covered in the poster.
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  18. Mochel, K.: Search in the Web shopping environment (2006) 0.05
    0.053239673 = product of:
      0.10647935 = sum of:
        0.10647935 = sum of:
          0.057025105 = weight(_text_:data in 5301) [ClassicSimilarity], result of:
            0.057025105 = score(doc=5301,freq=4.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.34584928 = fieldWeight in 5301, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5301)
          0.049454242 = weight(_text_:22 in 5301) [ClassicSimilarity], result of:
            0.049454242 = score(doc=5301,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.2708308 = fieldWeight in 5301, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5301)
      0.5 = coord(1/2)
    
    Abstract
    The author presents a design case study of a search user interface for Web catalogs in the context of online shopping for consumer products such as clothing, furniture, and sporting goods. The case study provides a review of the user data for the user interface (UI), and the resulting redesign recommendations. Based on the case study and its user data, a set of common user requirements for searching in the context of online shopping is provided.
    Date
    22. 7.2006 18:23:19
  19. Vellucci, S.L.: Metadata and authority control (2000) 0.05
    0.053239673 = product of:
      0.10647935 = sum of:
        0.10647935 = sum of:
          0.057025105 = weight(_text_:data in 180) [ClassicSimilarity], result of:
            0.057025105 = score(doc=180,freq=4.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.34584928 = fieldWeight in 180, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0546875 = fieldNorm(doc=180)
          0.049454242 = weight(_text_:22 in 180) [ClassicSimilarity], result of:
            0.049454242 = score(doc=180,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.2708308 = fieldWeight in 180, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=180)
      0.5 = coord(1/2)
    
    Abstract
    A variety of information communities have developed metadata schemes to meet the needs of their own users. The ability of libraries to incorporate and use multiple metadata schemes in current library systems will depend on the compatibility of imported data with existing catalog data. Authority control will play an important role in metadata interoperability. In this article, I discuss factors for successful authority control in current library catalogs, which include operation in a well-defined and bounded universe, application of principles and standard practices to access point creation, reference to authoritative lists, and bibliographic record creation by highly trained individuals. Metadata characteristics and environmental models are examined and the likelihood of successful authority control is explored for a variety of metadata environments.
    Date
    10. 9.2000 17:38:22
  20. Naun, C.C.: FRBR principles applied to a local online journal finding aid (2007) 0.05
    0.053239673 = product of:
      0.10647935 = sum of:
        0.10647935 = sum of:
          0.057025105 = weight(_text_:data in 2279) [ClassicSimilarity], result of:
            0.057025105 = score(doc=2279,freq=4.0), product of:
              0.16488427 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.052144732 = queryNorm
              0.34584928 = fieldWeight in 2279, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2279)
          0.049454242 = weight(_text_:22 in 2279) [ClassicSimilarity], result of:
            0.049454242 = score(doc=2279,freq=2.0), product of:
              0.18260197 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052144732 = queryNorm
              0.2708308 = fieldWeight in 2279, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2279)
      0.5 = coord(1/2)
    
    Abstract
    This paper presents a case study in the development of an online journal finding aid at the University of Illinois at Urbana-Champaign (UIUC), with particular emphasis on cataloging issues. Although not consciously designed according to Functional Requirements for Bibliographic Records (FRBR) principles, the Online Research Resources (ORR) system has proved amenable to FRBR analysis. The FRBR model was helpful in examining the user tasks to be served by the system, the appropriate data structure for the system, and the feasibility of mapping the required data from existing sources. The application of the FRBR model to serial publications, however, raises important questions for the model itself, particularly concerning the treatment of work-to-work relationships.
    Date
    10. 9.2000 17:38:22
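
The numeric breakdowns attached to each hit above are Lucene "explain" trees for the ClassicSimilarity (TF-IDF) ranking model: each clause score is queryWeight (idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), and partially matched clause groups are scaled by a coord factor. As a minimal, illustrative sketch only (not the search engine's actual code), the following Python recomputes the score of the first hit from the constants printed in its explain tree for doc 562; everything beyond those constants is assumed from standard Lucene ClassicSimilarity behaviour.

    from math import sqrt

    QUERY_NORM = 0.052144732   # queryNorm from the explain output
    FIELD_NORM = 0.046875      # fieldNorm(doc=562)

    def clause_score(freq, idf):
        # ClassicSimilarity clause score: queryWeight * fieldWeight
        tf = sqrt(freq)                        # e.g. 1.4142135 for freq=2.0
        query_weight = idf * QUERY_NORM        # e.g. 8.478011 * 0.052144732 = 0.44208363
        field_weight = tf * idf * FIELD_NORM   # e.g. 0.56201804 for _text_:3a
        return query_weight * field_weight

    s_3a   = clause_score(freq=2.0, idf=8.478011)    # weight(_text_:3a in 562)
    s_data = clause_score(freq=4.0, idf=3.1620505)   # weight(_text_:data in 562)
    s_22   = clause_score(freq=2.0, idf=3.5018296)   # weight(_text_:22 in 562)

    # coord(1/3): only one of three clauses in the first group matched
    total = s_3a * (1.0 / 3.0) + (s_data + s_22)
    print(round(total, 8))   # ~0.17408767, the score shown for the first hit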

Types

  • a 2084
  • m 272
  • el 160
  • s 104
  • b 26
  • x 19
  • i 9
  • n 7
  • r 7
