Search (2 results, page 1 of 1)

Näppilä, T.; Järvelin, K.; Niemi, T.: A tool for data cube construction from structurally heterogeneous XML documents (2008) 0.02
```
0.020074995 = product of:
  0.04014999 = sum of:
    0.04014999 = sum of:
      0.00894975 = weight(_text_:a in 1369) [ClassicSimilarity], result of:
        0.00894975 = score(doc=1369,freq=14.0), product of:
          0.053105544 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.046056706 = queryNorm
          0.1685276 = fieldWeight in 1369, product of:
            3.7416575 = tf(freq=14.0), with freq of:
              14.0 = termFreq=14.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1369)
      0.03120024 = weight(_text_:22 in 1369) [ClassicSimilarity], result of:
        0.03120024 = score(doc=1369,freq=2.0), product of:
          0.16128273 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046056706 = queryNorm
          0.19345059 = fieldWeight in 1369, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1369)
  0.5 = coord(1/2)
```
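The explain tree above can be checked by hand. The following is a minimal sketch that recomputes the score of document 1369, assuming the classic TF-IDF formulas that the printed numbers are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names are illustrative and are not part of any Lucene API. The input values are copied from the explain output.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency, in the form the explain values agree with (assumed).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, mirroring the two "product of" branches.
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

# Constants copied from the explain output for doc 1369.
MAX_DOCS = 44218
QUERY_NORM = 0.046056706
FIELD_NORM = 0.0390625

score_a = classic_term_score(14.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = classic_term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The two term scores are summed, then scaled by the coord(1/2) factor.
doc_1369 = (score_a + score_22) * 0.5

print(f"a: {score_a:.8f}  22: {score_22:.8f}  total: {doc_1369:.9f}")
```

The results agree with the listed values only up to rounding, since the explain output appears to have been computed in single precision.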
Abstract

Data cubes for OLAP (On-Line Analytical Processing) often need to be constructed from data located in several distributed and autonomous information sources. Such a data integration process is challenging because of semantic, syntactic, and structural heterogeneity among the data. While XML (Extensible Markup Language) is the de facto standard for data exchange, the three types of heterogeneity remain. Moreover, popular path-oriented XML query languages, such as XQuery, require the user to know the structure of the documents to be processed in great detail and are thus impractical in many real-world data integration tasks. Several Lowest Common Ancestor (LCA)-based XML query evaluation strategies have recently been introduced to provide a more structure-independent way to access XML documents. We show, however, that for certain not uncommon types of XML documents this approach leads to undesirable results. This article introduces a novel high-level data extraction primitive that utilizes the purpose-built Smallest Possible Context (SPC) query evaluation strategy. We demonstrate, through a system prototype for OLAP data cube construction and a sample application in informetrics, that our approach has real advantages in data integration.

Date

9. 2.2008 17:22:42

Type

a

Moilanen, K.; Niemi, T.; Näppilä, T.; Kuru, M.: A visual XML dataspace approach for satisfying ad hoc information needs (2015) 0.00
```
0.002269176 = product of:
  0.004538352 = sum of:
    0.004538352 = product of:
      0.009076704 = sum of:
        0.009076704 = weight(_text_:a in 2269) [ClassicSimilarity], result of:
          0.009076704 = score(doc=2269,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.1709182 = fieldWeight in 2269, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2269)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
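In this second tree only the term "a" matches, and the term score is scaled by two nested coord(1/2) factors. The sketch below repeats that arithmetic using the values printed in the explain output. It assumes tf = sqrt(freq), which matches the listed tf for freq = 10.0, and the variable names are illustrative only.

```python
import math

# Values copied from the explain output for doc 2269.
idf = 1.153047
query_norm = 0.046056706
field_norm = 0.046875
freq = 10.0

tf = math.sqrt(freq)                   # term frequency factor (assumed sqrt form)
query_weight = idf * query_norm        # query-side weight of the term
field_weight = tf * idf * field_norm   # document-side weight of the term
term_score = query_weight * field_weight

# Each coord(1/2) halves the score: one of two clauses matched at that level.
inner = term_score * 0.5
final_2269 = inner * 0.5

print(f"term: {term_score:.9f}  final: {final_2269:.9f}")
```

Both halvings together explain why this record scores roughly a quarter of its single term weight and is displayed as 0.00 in the result list.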
Abstract

Dataspace systems constitute a recent data management approach that supports better cooperation among autonomous and heterogeneous data sources with which the user is initially unfamiliar. A central idea is to gradually increase the user's knowledge about the contents, structures, and semantics of the data sources in the dataspace. Without this knowledge, the user cannot formulate sophisticated queries. The dataspace systems proposed so far are usually application specific. In contrast, our idea in this paper is to develop an application-independent Extensible Markup Language (XML) dataspace system with versatile facilities. Unlike the other proposed dataspace systems, ours shows that it is possible to build an interface based on conventional visual tools through which the user can satisfy his or her sophisticated information needs. In our system, the user needs to master neither programming techniques nor XML syntax, which provides a good starting point for its declarative use.

Type

a
