Add comments at XMLToObjectsDiscussion, please.
Contact
Chris McDonough
(Credit to Shane Hathaway for this idea)
Problem
XML is an object-friendly data format. Unlike flat data formats
such as CSV, it can have an arbitrarily complex data containment
hierarchy. Any well-formed XML document can be expressed as a
Document Object Model (DOM) tree. The DOM represents the
hierarchical structure of the document, and methods can be called
on it to return various pieces of information about that
structure. Having a DOM tree that can be poked and prodded for
information makes it slightly easier to transform XML-encoded
information into other things, like instances of Python classes.
But it would be even
more convenient to have a generically useful facility for
transforming XML data into Python objects less circuitously. The
Zope Product XMLDocument allows you to import an XML document into
Zope and have it converted into nodes which are instances of a
single generic Python class. But a facility to transform a
well-understood XML document into instances of arbitrary Python
classes does not exist.
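To make the "poke and prod the DOM" approach concrete, here is a minimal sketch of converting XML into instances of a Python class by hand, using the standard library's `xml.dom.minidom`. The sample document, the `Person` class, and the helper names are all hypothetical and exist only for illustration; they are not part of Zope or XMLDocument.

```python
from xml.dom.minidom import parseString

# Hypothetical input document, made up for this illustration.
SAMPLE_XML = """<addressbook>
  <person name="Alice"><email>alice@example.com</email></person>
  <person name="Bob"><email>bob@example.com</email></person>
</addressbook>"""


class Person:
    """Hypothetical target class; not part of Zope or XMLDocument."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


def text_of(node):
    """Concatenate the text nodes directly under a DOM element."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def people_from_dom(xml_text):
    """Walk the DOM by hand and build one Person per <person> element."""
    dom = parseString(xml_text)
    people = []
    for element in dom.getElementsByTagName("person"):
        email_nodes = element.getElementsByTagName("email")
        email = text_of(email_nodes[0]) if email_nodes else ""
        people.append(Person(element.getAttribute("name"), email))
    return people


people = people_from_dom(SAMPLE_XML)
```

Every new document type needs its own hand-written walker like `people_from_dom`, which is the circuitous step this proposal would like to avoid.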
Proposed Solutions
It sure would be keen if we could transform a well-understood XML
document into another structured format which directly represented
a hierarchy of Python/Zope objects.
A format already exists for representing Zope objects in XML in
the "xml pickle" ("ppml") format used by the XML import/export
machinery. This format encodes Python "pickles" (serialized
Python object representations) into XML and vice versa.
It is conceivable that a transform operation on a given XML
file could turn it into the "xml pickle" format, after which it
could be imported into Zope as a set of Zope objects.
The transform could be performed any number of ways, including via
XSLT or via Python. The transforming operation would need to know
enough about the detail of the input XML to associate elements
with "known" Zope classes and other object types (strings,
integers, etc.). It would obviously also need to know the output
format.
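As a rough illustration of what the transform "needs to know", the sketch below keeps that knowledge in two lookup tables: one associating element names with classes, and one associating element names with simple types. It is only a sketch under stated assumptions: the `Folder` and `Document` classes, the table names, and the sample XML are hypothetical, and it builds objects directly with `xml.etree.ElementTree` rather than emitting ppml, whose exact format is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document, made up for this illustration.
SAMPLE_XML = """<folder id="docs">
  <document id="intro">
    <title>Introduction</title>
    <size>1024</size>
  </document>
  <document id="faq">
    <title>FAQ</title>
    <size>2048</size>
  </document>
</folder>"""


class Folder:
    """Hypothetical stand-in for a "known" container class."""

    def __init__(self, id, children=()):
        self.id = id
        self.children = list(children)


class Document:
    """Hypothetical stand-in for a "known" content class."""

    def __init__(self, id, title="", size=0):
        self.id = id
        self.title = title
        self.size = size


# What the transform knows about the input (both tables are assumptions):
# which elements become class instances, and which become simple values.
CLASS_FOR_TAG = {"folder": Folder, "document": Document}
CONVERTER_FOR_TAG = {"title": str, "size": int}


def build(element):
    """Recursively turn an element into an instance of its mapped class."""
    cls = CLASS_FOR_TAG[element.tag]
    kwargs = dict(element.attrib)  # attributes become constructor arguments
    children = []
    for child in element:
        if child.tag in CLASS_FOR_TAG:
            children.append(build(child))
        elif child.tag in CONVERTER_FOR_TAG:
            convert = CONVERTER_FOR_TAG[child.tag]
            kwargs[child.tag] = convert((child.text or "").strip())
        else:
            raise ValueError("no mapping for element %r" % child.tag)
    if children:
        kwargs["children"] = children
    return cls(**kwargs)


root = build(ET.fromstring(SAMPLE_XML))
```

An XSLT-based transform would carry the same knowledge in its templates; the tables here just make it explicit in Python.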
A Zope Product could be made to help perform this type of
transform and to handle the details of actually importing the
generated ppml into Zope.
Risk Factors
- Overgeneralization: creating a given XML-to-ppml transform
method might end up in many cases being more complicated than just
parsing the input XML and generating objects manually. If
possible, we should shoot for a way to let people transform very
simple XML into very simple Python structures easily.
- Undergeneralization: such a beast would be useful outside Zope,
within raw Python. "ppml" may not be the best format to use for
this purpose, as it contains Zope-specific data. We may want to
create another XML format to represent generic Python objects, and
give the option to choose between this format and ppml at
transform time. Someone may have already worked out a more
generic format.
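The Overgeneralization risk above asks for an easy path from very simple XML to very simple Python structures. One possible shape for that path, sketched here with the standard library's `xml.etree.ElementTree`, is a generic converter that needs no per-document configuration. The conventions it uses (leaf elements become strings, repeated child elements become lists, attributes and children share one dictionary) are assumptions made for this illustration, not a format defined by this proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document, made up for this illustration.
SAMPLE_XML = """<order id="42">
  <customer>Alice</customer>
  <item>apple</item>
  <item>pear</item>
</order>"""


def to_simple(element):
    """Convert an element into strings, dicts, and lists.

    Assumed conventions: a leaf element with no attributes becomes its
    stripped text; any other element becomes a dict of its attributes
    and children; a child tag that repeats becomes a list. Text mixed
    in between child elements is ignored.
    """
    children = list(element)
    if not children and not element.attrib:
        return (element.text or "").strip()
    result = dict(element.attrib)
    for child in children:
        value = to_simple(child)
        if child.tag in result:
            existing = result[child.tag]
            if not isinstance(existing, list):
                # Second occurrence of this tag: promote to a list.
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


order = to_simple(ET.fromstring(SAMPLE_XML))
# order == {"id": "42", "customer": "Alice", "item": ["apple", "pear"]}
```

All values come out as strings; converting them to integers or other types would still need per-document knowledge of the kind the transform is meant to hold.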
Scope
The project will provide a Python package which facilitates the
transform of arbitrary XML into an XML format which directly
represents Python objects.
Additionally, the project will provide a Zope product which wraps
the Python package and provides a mechanism for importing
arbitrary XML as Python objects. It will also provide management
functionality for the code involved in the transform. It will
also provide a (possibly) "wizard-like" mechanism for creating and
performing relatively simple transforms.
Deliverables
A set of use cases for XMLToObjects.
The DTD for a suitable format to represent Python objects (if not ppml).
A Zope product wrapping XMLToObjects, which includes common-case
wizardry for transforming "simple" XML into "simple" Python instances.
A sample transformer and input XML document.
Documentation.
k_vertigo
Some interesting links: the first uses XML Schema to generate Java objects; the second is a Python xml_pickle format that is currently in progress.