XML Prague 2009

Produced by:
XMLPrague.cz & Institute for Theoretical Computer Science

Posters

Here are the posters for XML Prague 2009.

Current Support of XML by the "Big Three", Irena Mlýnková and Martin Nečaský
Application of Xdefinitions in solving the problems of business consistency of data descriptions in large information systems, Jiří Měska
Using XSLT for C code generation and testing, Tony Graham
Phantom nodes, Dmitriy Shabanov
The SDOM 1.0 library, O'Neil Delpratt (joint work with Rajeev Raman and Naila Rahman)
EXQuery: Collaboratively Defining Open Standards for Portable XQuery Applications, Adam Retter
XSL-FO 2.0, Tony Graham

Current Support of XML by the "Big Three"
Irena Mlýnková and Martin Nečaský

XML technologies have become a standard for data representation and manipulation, so efficient techniques for managing and processing XML data are needed. A natural approach is to exploit the tools and functions offered by relational database systems. Even though native XML databases are undoubtedly more efficient, relational databases remain more popular among XML users because of their long history, maturity and reliability.

In this paper we provide an overview of the XML-processing functions currently supported by the so-called "Big Three", i.e. Oracle 11g, IBM DB2 9, and Microsoft SQL Server 2008. We first identify the key features a user may require from an XML-enabled database. We then survey the support for these features in each system. Finally, we compare and contrast the findings so that the advantages and disadvantages of each system are apparent.

Application of Xdefinitions in solving the problems of business consistency of data descriptions in large information systems
Jiří Měska
Syntea

Using XSLT for C code generation and testing
Tony Graham
Menteith Consulting Limited

xmlroff is a fast, free, high-quality, multi-platform XSL formatter that aims to excel at DocBook formatting and that integrates easily with other programs and with scripting languages.

xmlroff is written in C, but much of the C code is generated from the XML source for the XSL recommendation. What started out as a simple stylesheet to extract FO and property names from the XSL spec to avoid transcription errors has grown (and shrunk) over time to generate:

- C code for GObjects representing FOs and properties

- C code to verify allowed property values

- A single enumeration of all the 'enumerated token' values defined by XSL

- Structured comments in the C code that are used by gtk-doc to generate documentation

- XML entity declarations and references for assembling the generated documentation into one DocBook document

The code generation has been toned down in places over time. The initial object hierarchy was very flat because it was easy to generate a unique class for each FO and each property. But just because you can doesn't mean you should: there is now more use of common superclasses and of that single, common enumeration of all of XSL's 'enumerated token' values.
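As a rough illustration of the "single enumeration" idea, the sketch below generates one C enum from an XML description of properties. It is not the xmlroff stylesheet: it uses Python instead of XSLT, and the input markup (`spec`, `property`, `value`), the `FO_ENUM_` prefix and the `FoEnumToken` name are all hypothetical stand-ins chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the XSL spec source; the real markup differs.
SPEC_XML = """
<spec>
  <property name="text-align">
    <value>start</value>
    <value>center</value>
    <value>end</value>
  </property>
  <property name="display-align">
    <value>auto</value>
    <value>center</value>
  </property>
</spec>
"""


def c_identifier(token: str) -> str:
    """Turn a token such as 'text-align' into a C-style identifier."""
    return token.upper().replace("-", "_")


def generate_enum(spec_xml: str, enum_name: str = "FoEnumToken") -> str:
    """Emit one C enum holding every distinct token, in first-seen order."""
    root = ET.fromstring(spec_xml)
    seen = []
    for value in root.iter("value"):
        token = (value.text or "").strip()
        if token and token not in seen:
            seen.append(token)  # shared tokens such as 'center' appear once
    lines = ["typedef enum {"]
    lines += [f"  FO_ENUM_{c_identifier(t)}," for t in seen]
    lines.append(f"}} {enum_name};")
    return "\n".join(lines)


print(generate_enum(SPEC_XML))
```

Because every property draws its values from the same enum, a token shared by several properties is declared only once, which is the point of collapsing per-property enumerations into a common one.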

Other uses of XSLT and XPath in the xmlroff project include:

- Pruning unsupported FOs and properties from the input FO document

- Generating test scripts for any XSL formatter from a W3C-format XSL testsuite's XML description

- Generating HTML reports from test results XML

- Generating a W3C-format XSL testsuite from FOP "layoutengine"

Phantom nodes
Dmitriy Shabanov

Imagine that you have to write a well-structured, well-defined application. You can easily describe all of its primary data, that is, all the forms used to enter information.

Next you need to make all the reports work. For that you would write a lot of manifold ... but stop: what if we describe them through a relation instead:

Element0 = function(Element1, Element2, ...)    (1)
where
Element* is an Item or xs:anyType from the data model defined in [Data Model];
function is a logical operation on elements.

Programming can then proceed differently:
1. Describe all primary structures.
2. Complete the structure by adding relations. Structural integrity can be guaranteed by coding the relations. Information completeness can be established from the part of the information that is available, or the reason why it cannot be reached can be highlighted.
3. [XSLT] can transform one structure into another.
Calculated data can be used as elements in other functions.
Dependencies can be calculated from the relation definitions. Parallelizable calculations can be optimized based on those dependencies.
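The idea of deriving the evaluation order from the relation definitions can be sketched as follows. This is only an illustration under assumptions: it uses plain Python values rather than XML nodes, and the accounting elements (`net`, `vat_rate`, `vat`, `gross`) and the `evaluate` helper are invented for the example.

```python
from graphlib import TopologicalSorter

# Primary data entered by the user (hypothetical accounting example).
primary = {"net": 100.0, "vat_rate": 0.25}

# Each derived element is defined as: name -> (function, input element names).
relations = {
    "vat": (lambda net, vat_rate: net * vat_rate, ("net", "vat_rate")),
    "gross": (lambda net, vat: net + vat, ("net", "vat")),
}


def evaluate(primary, relations):
    """Compute every derived element in dependency order."""
    # The dependency graph is read directly off the relation definitions.
    graph = {name: set(inputs) for name, (_, inputs) in relations.items()}
    values = dict(primary)
    for name in TopologicalSorter(graph).static_order():
        if name in values:
            continue  # primary element, nothing to compute
        if name not in relations:
            # Report why the information cannot be completed.
            raise KeyError(f"missing primary element: {name}")
        func, inputs = relations[name]
        values[name] = func(*(values[i] for i in inputs))
    return values


result = evaluate(primary, relations)
print(result["vat"], result["gross"])
```

Here `gross` uses the calculated `vat` as an input, so calculated data feeds other functions. For parallel evaluation, `TopologicalSorter` also offers `prepare()`, `get_ready()` and `done()`, which hand out all elements whose inputs are already available.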

It is easy to imagine using this approach in accounting. The same method can be applied to linguistic and other structures, for instance in a content management system.

The SDOM 1.0 library
O'Neil Delpratt (joint work with Rajeev Raman and Naila Rahman)
University of Leicester - LRA

The poster abstract and the full paper download.

The Succinct DOM (SDOM) library is a DOM implementation written in C++. Currently SDOM supports the read-only operations of the core DOM API and the DOM extended TreeWalker API. In addition, SDOM supports Saxon's DocumentInfo and NodeInfo interfaces. SDOM is suitable for in-memory representation of large XML documents, as it avoids the use of pointers to represent the structure of node objects, and is based upon succinct data structures, which use an information-theoretically minimum amount of space to represent an object.
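To give a feel for how tree structure can be stored without pointers, here is a minimal sketch of one well-known succinct encoding, balanced parentheses. The abstract does not say which encoding SDOM uses, so this is an assumption made for illustration. The function names are invented, and the linear scans stand in for the constant-time index structures a real succinct library would use.

```python
def encode(tree):
    """Encode a nested-tuple tree (label, [children]) as balanced parentheses.

    Returns the parenthesis string plus the labels in document (preorder) order.
    """
    parens, labels = [], []

    def visit(node):
        label, children = node
        parens.append("(")
        labels.append(label)
        for child in children:
            visit(child)
        parens.append(")")

    visit(tree)
    return "".join(parens), labels


def find_close(parens, i):
    """Index of the ')' matching the '(' at position i (linear scan)."""
    depth = 0
    for j in range(i, len(parens)):
        depth += 1 if parens[j] == "(" else -1
        if depth == 0:
            return j
    raise ValueError("unbalanced parentheses")


def first_child(parens, i):
    """Position of the first child of the node at i, or None for a leaf."""
    return i + 1 if parens[i + 1] == "(" else None


def next_sibling(parens, i):
    """Position of the next sibling of the node at i, or None."""
    j = find_close(parens, i) + 1
    return j if j < len(parens) and parens[j] == "(" else None


def label_of(parens, labels, i):
    """Label of the node at i: its preorder rank is the count of '(' before i."""
    return labels[parens[:i].count("(")]


doc = ("book", [("title", []), ("chapter", [("para", []), ("para", [])])])
parens, labels = encode(doc)
title = first_child(parens, 0)
chapter = next_sibling(parens, title)
print(parens, label_of(parens, labels, title), label_of(parens, labels, chapter))
```

A node is identified by the position of its opening parenthesis, so the structure costs two bits per node instead of several pointers per node.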

SDOM gives a space-efficient in-memory representation with stable and predictable memory usage. Experimental evaluations of the main components of SDOM were performed in [Delpratt et al., EDBT '08], comparing performance against the standard C++ DOM implementation Xerces (memory usage and running time) and Saxon's TinyTree (memory usage only). The memory usage of SDOM is an order of magnitude lower than that of Xerces and Saxon, yet SDOM is still extremely fast: navigation is in some cases faster than with a pointer-based representation such as Xerces (even for moderate-sized documents that Xerces can comfortably load into main memory).

Our discussion above is based on the library libSDOM (SDOM, with uncompressed text nodes). SDOM also includes the library libSDOMCT (SDOM-CT, with compressed text nodes). This variant applies bzip-based compression to textual and attribute data, and its space usage is comparable with that of "queryable" XML compressors. Some of these compressors support navigation and/or querying (e.g. subpath queries) of the compressed file. SDOM-CT does not support querying directly, but remains extremely fast: it is several orders of magnitude faster for navigation than queryable XML compressors that support navigation (and only a few times slower than, say, Xerces).

EXQuery: Collaboratively Defining Open Standards for Portable XQuery Applications
Adam Retter

After developing extensions for XQuery to enable the development of entire applications, we came to realise that many vendors were also tackling these same issues in their own ways.

Conversely, as XQuery Application developers we want our applications to run on all XQuery vendor implementations; if we step outside of the XQuery specification, then today this is not possible!

EXQuery will attempt to address the issues of Extensibility and Portability in XQuery Applications. Our approach is to encourage and support collaboration within the XQuery community, culminating in the development of open standards.

Link to the new EXQuery website

XSL-FO 2.0
Tony Graham
Menteith Consulting Limited

The W3C is in the process of developing the second major version of XSL-FO, the formatting specification component of XSL. XSL-FO is widely deployed in industry and academia where multiple output formats (typically print and online) are needed from single-source XML. It is used in many diverse applications and countries, on a large number of implementations, to create technical documentation, reports, contracts, terms and conditions, and invoices, and for other forms processing such as driver’s licenses and postal forms. XSL-FO is also widely used for heavy multilingual work because of the internationalization features provided in 1.0 to accommodate the multiple and mixed writing modes (writing directions such as left-to-right, top-to-bottom, right-to-left, etc.) of the world's languages.

Extensible Stylesheet Language (XSL) Requirements Version 2.0 (http://www.w3.org/TR/xslfo20-req/) introduces planned new work for XSL-FO 2.0.

The primary goals of the W3C XSL Working Group in developing XSL 2.0 are to provide more sophisticated formatting and layout, to enhance internationalization with special formatting objects for Japanese and other Asian and non-Western languages and scripts, and to improve integration with other technologies such as SVG and MathML.

A number of XSL 1.0 implementations already support dynamic inclusion of vector graphics using W3C SVG. The XSL and SVG WGs want to define a tighter interface between XSL-FO and SVG to provide enhanced functionality. Experiments with the use of SVG paths to create non-rectangular text regions, or "run-arounds", have helped to motivate further work on deeper integration of SVG graphics inside XSL-FO documents, and to work with the SVG WG on specifying the meaning of XSL-FO markup inside SVG graphics. A similar level of integration with MathML is contemplated.

If you would like to participate in developing XSL-FO 2.0, please contact the FO subgroup chair, Liam Quin (liam@w3.org).