Sister events

The markupforum is a platform for technically minded people and decision makers who want to learn about new and established XML technologies. This year our speakers present technologies and possibilities around publishing with XML and are happy to discuss these approaches with you.

Balisage - The Markup Conference

TMRA: International Conference on Topic Maps Research and Applications - Linked Topic Maps


[1] Holistic approach for XML Structural-Indexes
Amir Averbuch, Tel Aviv University, Shachar Harrusi, Tel Aviv University, Yaniv Shmeuli, Tel Aviv University, Guy Wolf, Tel Aviv University

XML query processing is an important task. Structural-indexing techniques have been proposed to speed up XML querying: a structural index summarizes the XML tree structure offline and then, at query time, processes the structural patterns that make up a query against that summary. This poster focuses on optimizing the processing of tree structural patterns on XML indexes. We present a construction, based on tree automata, that processes a tree structural pattern rather than a path structural pattern, as existing structural indexes do. We use the term holistic for this construction because it processes the tree structural pattern as a whole. Existing indexes join the partial solutions of the path structural patterns, and this join operation is inefficient when the data or the pattern is complex. The proposed holistic technique eliminates the need for such a join.

Furthermore, the holistic index processes both 'vertical' information (labels on paths from root to leaf) and 'horizontal' information (labels of children of the same parent). Other structural indexes decompose the tree structural pattern into paths and process only the vertical information. Our experiments show that the proposed holistic structural-index algorithm reduces data retrieval by 50% on average compared to a local index. The reduction becomes more substantial as the structural pattern grows more complex: for certain twig patterns, the holistic index selects 500 times fewer nodes than the local index does for the same pattern.
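The contrast between path-by-path joining and holistic matching can be sketched on a toy tree. This is an illustrative Python model only, not the authors' tree-automaton construction; the node representation and function names are made up:

```python
# Toy model: a node is {"label": ..., "children": [...]}.
# Twig pattern //a[b][c]: an 'a' node with a 'b' child and a 'c' child.

def descendants(node):
    yield node
    for child in node.get("children", []):
        yield from descendants(child)

def path_matches(root, parent, child):
    """All (parent, child) node pairs matching //parent/child."""
    return [(n, c) for n in descendants(root) if n["label"] == parent
            for c in n.get("children", []) if c["label"] == child]

def join_twig(root):
    """Path decomposition: solve //a/b and //a/c, then join on the 'a' node."""
    ab = path_matches(root, "a", "b")   # partial solutions are...
    ac = path_matches(root, "a", "c")   # ...materialized before the join
    return [a for a, _ in ab if any(a is a2 for a2, _ in ac)]

def holistic_twig(root):
    """Match the whole twig at each node; no intermediate partial solutions."""
    return [n for n in descendants(root) if n["label"] == "a"
            and any(c["label"] == "b" for c in n.get("children", []))
            and any(c["label"] == "c" for c in n.get("children", []))]
```

Both return the same answer, but the join variant first materializes every `//a/b` and `//a/c` pair, which is exactly the intermediate-result cost the holistic approach avoids.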


[2] Fast Faceted Search in XML, Using XQuery and Indexes in eXist-db
Anne Schuth, University of Amsterdam, Maarten Marx, University of Amsterdam

We present and compare three implementations of faceted search in eXist-db, an XML database. The bit-vector based implementation outperforms the other two: its performance remains nearly constant as the database grows. We investigate this method in detail to pinpoint the source of the speedup, using a micro-benchmark based on XMark designed for evaluating faceted search.
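To illustrate why bit vectors make facet counting cheap (a language-agnostic Python sketch, not the eXist-db implementation): each facet value keeps a bitmap over document ids, so counting within a result set reduces to a bitwise AND plus a popcount.

```python
class FacetIndex:
    """Toy bit-vector facet index: one bitmap (a Python int) per facet value."""

    def __init__(self):
        self.bitmaps = {}                     # (facet, value) -> int bitset

    def add(self, doc_id, facet, value):
        key = (facet, value)
        self.bitmaps[key] = self.bitmaps.get(key, 0) | (1 << doc_id)

    def counts(self, facet, result_bitmap):
        """Facet counts restricted to a query's result set (also a bitmap)."""
        return {value: bin(bitmap & result_bitmap).count("1")
                for (f, value), bitmap in self.bitmaps.items() if f == facet}
```

The AND-and-popcount step touches only the precomputed bitmaps, which is why the cost stays nearly flat as the number of documents grows.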


[3] Treetank, Designing A Versioned XML Storage
Sebastian Graf, University of Konstanz, Marc Kramis, Seabix AG, Marcel Waldvogel, University of Konstanz, Lukas Lewandowski, University of Konstanz, Johannes Lichtenberger, University of Konstanz

XML is subject to the same constant-modification scenarios as any other resource, especially in flexible environments like the WWW. Intelligent handling of versioned XML is therefore mandatory. Due to the structural nature of XML, efficiently storing changes in the data, and therefore in the tree, requires new paradigms for efficient storage and effective retrieval operations. We present Treetank, a node-granular XML versioning approach that relies on the independence of the storage and the versioning system. This independence is guaranteed by distinct layers, each satisfying a specific aspect of a node-granular versioning store. Results show that Treetank handles consecutive changes efficiently across all modification scenarios without restricting how XML is used. Hence, Treetank handles even huge XML instances while ensuring equal access to each revision of the data.
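A minimal sketch of the node-granular idea: each revision stores only the changed nodes, and reads fall back through earlier revisions. This Python toy is not Treetank's actual layer architecture or encoding; it only illustrates per-revision node deltas with equal access to every revision:

```python
class VersionedStore:
    """Toy node-granular versioned store: one delta map per revision."""

    def __init__(self):
        self.revisions = []                 # list of {node_id: payload} deltas

    def commit(self, changes):
        """Store only the nodes changed in this revision."""
        self.revisions.append(dict(changes))
        return len(self.revisions) - 1      # revision number

    def read(self, node_id, revision):
        """Equal access to each revision: walk the deltas backwards."""
        for delta in reversed(self.revisions[:revision + 1]):
            if node_id in delta:
                return delta[node_id]
        raise KeyError(node_id)
```

Because a commit costs only the size of its change set, consecutive small modifications stay cheap even on a huge instance.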


[4] Tag Libraries for XSLT and XQuery
Erik Hennum, MarkLogic Inc., Vyacheslav Zholudev, Jacobs University Bremen

Java Tag Library technology has seen wide adoption as a templating strategy because of its separation of concerns between presentation and data manipulation. In particular, because a tag document establishes bindings on components, a tag library removes the need to write binding logic to add or change the libraries used in the document.

We demonstrate the feasibility of adapting the Java Tag Library approach for use with XML technologies. Tag libraries can support a set of data retrieval and manipulation functions as well as UI components. Tag documents can support these functions by embedding tags within a presentation vocabulary such as XHTML or XSL-FO. In particular, tag documents can pass handlers to tag libraries for parameterization of document content by the tag library. Tag library processors can be written in either XSLT or XQuery.
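The binding idea can be sketched in a few lines. The poster's processors are written in XSLT or XQuery; this Python sketch, with a made-up namespace and handler, only shows how a tag document binds embedded tags to library functions:

```python
import xml.etree.ElementTree as ET

TAG_NS = "urn:example:tags"               # hypothetical tag-library namespace

def tag_today(tag_elem):
    """A 'data retrieval' tag: replaces itself with computed content."""
    span = ET.Element("span")
    span.text = "2011-03-26"              # stand-in for a dynamic value
    return span

TAG_LIBRARY = {f"{{{TAG_NS}}}today": tag_today}   # tag name -> handler binding

def expand(elem):
    """Expand library tags embedded in a presentation vocabulary (XHTML)."""
    for i, child in enumerate(list(elem)):
        handler = TAG_LIBRARY.get(child.tag)
        if handler is None:
            expand(child)
        else:
            replacement = handler(child)
            replacement.tail = child.tail  # keep surrounding text intact
            elem[i] = replacement
    return elem
```

Swapping or extending the library means editing only the `TAG_LIBRARY` binding table, not the tag documents, which is the separation of concerns the abstract describes.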


[5] Using (XML) databases with SynDef
Václav Trojan, Syntea, Michal Valenta, FIT ČVUT v Praze

We present the SynDef/Xdefinition technology as a tool for accessing (XML) databases. Although there are already broadly accepted and standardized technologies for processing (XML) data stored in various kinds of databases (XQuery and XSLT in particular), we believe our approach can serve as a useful supplement to, and partial extension of, these technologies.

Advantages of our approach are:

  1. It unifies two data-processing stages, model checking (data validation) and data transformation, into one (logical) unit.
  2. It is easier to adapt an application when the data structure changes: either the model or the script part of the Xdefinition is changed.
  3. Similarly in the case of data-source changes (for example, when we decide to use an XML database instead of a relational one).
  4. It is easier for a programmer to understand and grasp the important parts of the data/database structure without having to understand the whole structure in detail, because only the important parts are mentioned in the model-checking stage.
  5. SynDef/Xdefinition allows the use of traditional technologies such as XPath, XQuery, and SQL statements.
  6. The technology has been under continuous development since 1993 and is applied in practice in a relatively large application.
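Point 1 can be illustrated with a toy model. The actual Xdefinition syntax is an XML-based language and is not reproduced here; this plain-Python sketch only shows a single declarative model driving both validation and transformation:

```python
# Hypothetical partial model: each field named here is both validated and
# transformed; fields not mentioned in the model are simply passed through.
MODEL = {
    "name":  (lambda v: isinstance(v, str),
              lambda v: v.strip()),
    "price": (lambda v: isinstance(v, (int, float)) and v >= 0,
              lambda v: round(float(v), 2)),
}

def check_and_transform(record):
    """Model checking and transformation as one (logical) unit."""
    out = dict(record)                    # unmentioned fields pass through
    for field, (valid, transform) in MODEL.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not valid(record[field]):
            raise ValueError(f"invalid value for {field!r}: {record[field]!r}")
        out[field] = transform(record[field])
    return out
```

Note how the model mentions only the fields that matter, mirroring point 4: the rest of the structure never needs to be described.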


[6] Animo: limits we need and flexibilities we want
Dmitriy Shabanov, Evgeny Gazdovsky, Amir Akhmedov

A pluggable structure, information as a single reference context, the mixing of processing flow and context, code and data in one topological space ... these and many more features make the animo language a little different.

It is based on the cell's life function R = F(C), where F is the function (flow), C is the force the environment exerts on the cell, and R is the force the cell exerts on the environment, together with the relations between context and processing flow. A processing flow is a chain of functions with a tree structure, following a "from output to inputs" model.

Namespaces are used to highlight the object type: "the" marks an instance, "an" a reference to an instance, "is" and "have" types of relations, "any" a query over the is-structure, "ptrn" a selection of a processing-flow path based on context, "use" an instruction to manipulate or affect "any" or "ptrn", and so on. The full list of operands can be found in the specification.

The relation model based on is-have relations is a built-in feature of the language. It limits programmers' dreams and forces a clear structure. These relations are used to build universal algorithms; for example, the form generator is around 150 lines of code, yet it covers quite complex situations. Small code helps minimize possible bugs.

The language has an option to include XQuery, XSLT, or any other computer language, while at the same time covering query and transformation needs itself.

For now, the language has an XML representation, but it would be quite simple to describe a scripting parser. Moreover, the language should become a bridge to natural language processing.

Animotron is the first animo language interpreter, written in XQuery on the eXist XML database. The next steps are optimization, speed-ups, and the evolution of an instance repository. We are looking for more ideas ...


[7] A library of data cleaning functions for XQuery
Bruno Martins, INESC-ID and Technical University of Lisbon, Helena Galhardas, INESC-ID and Technical University of Lisbon, Daniela Florescu, Oracle, Markos Zacharioudakis, Oracle, Sorin Nasoi, Enea

Data cleaning aims at obtaining high-quality data by converting source data into target data without errors, inconsistencies, or duplicates. This activity is crucial in several application contexts, such as data warehousing, data integration, and data migration. In the last decade, effective and efficient relational data cleaning techniques have been studied exhaustively. We intend to address the problem of adequately cleaning XML data, which has received considerably less attention than its relational counterpart. When dealing with XML data, data cleaning is even more challenging than with relational data. First, the number of heterogeneous XML data sources available on the Web is huge. Second, the structure and schemas of XML data tend to be more complex than relational ones.

However, XML data cleaning is also facilitated by the possibility of finding reference data on the Web more easily, and by the existence of XML web services for normalization, formatting, and extraction. We believe the right direction is to equip the XQuery language with additional mechanisms so that one can write concise and efficient data cleaning programs with it. This poster shows the first step towards this goal, proposing a library of data cleaning functions that can be invoked in XQuery expressions. We envisage the following groups of data cleaning functions as fundamental: (i) normalization; (ii) conversion; (iii) string similarity; (iv) set similarity; (v) consolidation; and (vi) thesaurus-based matching. This poster lists some of the functions considered in each group.
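Two of the proposed groups, string similarity and set similarity, can be sketched as plain functions. The poster proposes them as an XQuery function library; the Python signatures below are illustrative stand-ins for the standard primitives such libraries provide:

```python
def levenshtein(a, b):
    """String similarity: edit distance between two strings (classic DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def jaccard(s, t):
    """Set similarity: |intersection| / |union| of two token sets."""
    s, t = set(s), set(t)
    return len(s & t) / len(s | t) if s | t else 1.0
```

In a cleaning pipeline, such functions typically feed a threshold test that decides whether two records are duplicates before consolidation.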

[8] MicroXML
John Cowan
  • Two blog posts by James Clark in December 2010: (1) Defining MicroXML as a concept, (2) Specifying what's in, what's out, and syntax
  • About ten other blog posts by various people
  • More than 200 postings to xml-dev
  • Available in February 2011: a parser and a self-contained editor's draft specification
  • W3C involvement probable in the future

[9] XCase: Reliable Way to Design and Maintain XML schemas
Jakub Klimek, Faculty of Mathematics and Physics, Charles University in Prague, Jakub Malý, Faculty of Mathematics and Physics, Charles University in Prague, Irena Mlýnkova, Faculty of Mathematics and Physics, Charles University in Prague, Martin Nečaský, Faculty of Mathematics and Physics, Charles University in Prague

Conceptual modeling of XML data, management of sets of XML schemas, and XML document revalidation are made easier by our conceptual model, which applies the MDA (Model-Driven Architecture) idea of multi-level modeling. Our tool XCase implements this model and these approaches, enabling users to model their problem domain as a platform-independent model (PIM) schema, from which platform-specific model (PSM) schemas, XML schemas in our case, can be created easily and quickly. The main advantage of this approach is the maintainability of multiple XML schemas describing the same data from different views: our tool maintains connections between the PIM and PSM levels, which are later exploited to propagate a change to a concept to all occurrences of that concept in all maintained XML schemas in a well-defined and semantically correct way.
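The PIM-to-PSM change propagation can be sketched in miniature. The class names below are made up for illustration and are not XCase's actual data model; the point is only that each PSM occurrence stays connected to its PIM concept, so one change reaches every derived schema:

```python
class PIMClass:
    """A platform-independent concept, e.g. 'Customer'."""

    def __init__(self, name):
        self.name = name
        self.psm_occurrences = []        # PSM elements derived from this concept

class PSMElement:
    """An occurrence of a PIM concept in one platform-specific (XML) schema."""

    def __init__(self, pim_class):
        self.pim_class = pim_class       # the maintained PIM-PSM connection
        pim_class.psm_occurrences.append(self)

    @property
    def element_name(self):
        return self.pim_class.name       # derived, never copied

def rename_concept(pim_class, new_name):
    """One change at the PIM level is reflected in every PSM schema."""
    pim_class.name = new_name
```

Because the element name is derived through the connection rather than copied, renaming the concept once updates all schemas that mention it.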