Example Walk-through
A walk-through of the usage example
pyXSD was designed as a tool for working with XML files and a master schema file in the Computational Materials Science (CMS) Group at Oak Ridge National Laboratory. This usage example goes through what might be a typical use of pyXSD in CMS. The XML file specifies a primitive cell, which contains three vectors expressed in a Cartesian unit (in this case, angstroms), definitions for the few atom types used in the cell, and a specification for each atom in the cell, with positions given in terms of the vectors. The schema has validation information for many different cells, including many cases that are much more complex than the following example.
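To make "positions given in terms of the vectors" concrete, here is a minimal Python sketch that converts such a position into angstroms. The lattice values and the name `to_cartesian` are hypothetical and chosen only for illustration; they are not taken from initial.xml or from pyXSD.

```python
# Hypothetical cell vectors in angstroms (illustrative values only).
LATTICE = [
    (4.0, 0.0, 0.0),
    (0.0, 4.0, 0.0),
    (0.0, 0.0, 4.0),
]

def to_cartesian(fractional, lattice=LATTICE):
    """Convert a position given in units of the cell vectors to angstroms."""
    return tuple(
        sum(fractional[i] * lattice[i][axis] for i in range(3))
        for axis in range(3)
    )

# An atom halfway along the first two vectors.
print(to_cartesian((0.5, 0.5, 0.0)))  # (2.0, 2.0, 0.0)
```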
The example uses an XML file that specifies the cell, referred to here as initial.xml, and a schema file, referred to as schema.xsd. The XML file is available for viewing; however, the group's schema file has not been posted. The user has specified a number of transform operations to perform during the run in a file, which will be referred to as TransformFile. The user then enters the command to run the program:
./pyXSD.py -i initial.xml -s schema.xsd -o final.xml -T TransformFile
Although the user could have let pyXSD find the schema file from the schemaLocation attribute in the XML file, the user followed good practice and specified the location explicitly.
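For readers unfamiliar with the schemaLocation mechanism, the following sketch shows one way such a hint can be read with the standard ElementTree library. It is not pyXSD's own code: the function name `find_schema_location` and the sample `cell` element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

def find_schema_location(root):
    """Return the schema location hinted at by the root element, or None."""
    no_ns = root.get("{%s}noNamespaceSchemaLocation" % XSI)
    if no_ns:
        return no_ns.strip()
    pairs = root.get("{%s}schemaLocation" % XSI)
    if pairs:
        tokens = pairs.split()
        # schemaLocation holds "namespace location" pairs; take the first location.
        if len(tokens) >= 2:
            return tokens[1]
    return None

# Hypothetical sample document carrying a schema hint.
SAMPLE = (
    '<cell xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="schema.xsd"/>'
)
root = ET.fromstring(SAMPLE)
print(find_schema_location(root))  # schema.xsd
```

Passing the schema explicitly, as in the command above, avoids depending on whether the hint is present or still points at the right file.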
The program generates a class for each type definition in the schema and reads in initial.xml with the ElementTree library. It then walks through the schema classes and builds a new tree of instances of these classes. It creates an instance only when the XML contains an element whose name matches the schema element being examined at that point. The data from the XML is used, but it must meet the schema's specifications for the data in that element. In this example, the program reports one error message but continues to run, since errors are treated as non-fatal whenever possible:
Parser Error: Order Error - Expected element name 'programSpecific' in different position.
In this situation, the schema specified a required element 'programSpecific', but the element was not in the XML file. The program raises this error to let the user know, but the error is non-fatal, so the program continues. The user did not need to supply data in the 'programSpecific' field to accomplish the desired task, so the user does not correct the error. The error is reported as an order error because pyXSD checks that the elements in the XML appear in the same order as in the schema. If an element is not in the correct place, or is missing entirely, the program raises an order error.
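The tree-building and order-checking behavior described above can be sketched roughly as follows. This is a simplified illustration, not pyXSD's implementation: the class `SchemaElement`, the helper `parse_vector`, and every element name other than 'programSpecific' are hypothetical, and the sketch assumes every child listed in the schema is required.

```python
import xml.etree.ElementTree as ET

ORDER_ERROR = ("Parser Error: Order Error - Expected element name "
               "'%s' in different position.")

def parse_vector(text):
    """Validate and convert text such as '4.0 0.0 0.0' into three floats."""
    parts = tuple(float(p) for p in text.split())
    if len(parts) != 3:
        raise ValueError("expected three components")
    return parts

class SchemaElement:
    """Hypothetical stand-in for a class generated from one schema type."""
    name = None      # element name this class matches
    children = ()    # child classes, in schema order
    convert = None   # optional callable that validates and converts text

    def __init__(self, xml_node, errors):
        self.value = None
        self.kids = []
        text = (xml_node.text or "").strip()
        converter = type(self).convert
        if converter is not None and text:
            self.value = converter(text)  # ValueError if the data is invalid
        found = list(xml_node)
        position = 0  # index of the next unread child in the XML
        for child_cls in self.children:
            matched = False
            # Consume the run of consecutive children with the expected name.
            while position < len(found) and found[position].tag == child_cls.name:
                self.kids.append(child_cls(found[position], errors))
                position += 1
                matched = True
            if not matched:
                # Missing or misplaced: record the error and keep going.
                errors.append(ORDER_ERROR % child_cls.name)

class Vector(SchemaElement):
    name = "latticeVec"
    convert = parse_vector

class ProgramSpecific(SchemaElement):
    name = "programSpecific"

class Atom(SchemaElement):
    name = "atom"

class Cell(SchemaElement):
    name = "cell"
    children = (Vector, ProgramSpecific, Atom)

XML = "<cell><latticeVec>4.0 0.0 0.0</latticeVec><atom/><atom/></cell>"
errors = []
tree = Cell(ET.fromstring(XML), errors)
print([kid.name for kid in tree.kids])  # ['latticeVec', 'atom', 'atom']
print(errors)  # one order error naming 'programSpecific'
```

Collecting the messages in a list rather than raising an exception is what lets the run continue after a non-fatal error.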
The program then carries out the calls made in the TransformFile. The page "Transform Operations" explains what each of the transforms in this example does.
Once the transform operations are complete, the program writes the Pythonic tree out to an XML file. If a name for the output file is specified, the program uses that name, which in this case was final.xml. If no name is specified, the program writes to stdout; if the -d option is used, the program would in this case use "initialTransformed.xml" as the name. The new XML file contains much more data than initial.xml did.
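The output-naming rules can be summarized in a short sketch. The function `choose_output` and its parameters are hypothetical, and the precedence of an explicit name over the -d default is an assumption rather than documented behavior.

```python
import os

def choose_output(input_path, output_name=None, use_default=False):
    """Return the output file name, or None to mean 'write to stdout'."""
    if output_name:
        # An explicit name (the -o option) is assumed to take precedence.
        return output_name
    if use_default:
        # Default name (the -d option): input stem plus "Transformed".
        stem, ext = os.path.splitext(os.path.basename(input_path))
        return stem + "Transformed" + ext
    return None

print(choose_output("initial.xml", output_name="final.xml"))  # final.xml
print(choose_output("initial.xml", use_default=True))         # initialTransformed.xml
print(choose_output("initial.xml"))                           # None
```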