[Xml-compile] XML::Compile 0.87
Mark Overmeer
mark at overmeer.net
Fri Jul 4 14:24:57 BST 2008
You don't like dashes in HASH keys? See this!
key_rewrite => 'UNDERSCORES'
version 0.87: Fri Jul 4 15:12:17 CEST 2008
Changes:
- removed /el() and #el(name) from location of error message,
which makes the location simpler and probably not less clear.
Improvements:
- implemented key_rewrite, tests in t/73rewrite.t
- removed double check for existence of required value in writer.
- renamed ::Schema::new() option 'output_namespaces' into
the nicer 'prefixes'. Old name still available.
Read the attachment (excerpt from the manual-page) about the new option
key_rewrite, with only mild compile-time penalty and no run-time burden.
--
Enjoy,
MarkOv
------------------------------------------------------------------------
Mark Overmeer MSc MARKOV Solutions
Mark at Overmeer.net solutions at overmeer.net
http://Mark.Overmeer.net http://solutions.overmeer.net
-------------- next part --------------
===== Key rewrite
[Added in release 0.87] The standard practice is to use the localName
of the XML elements as key in the Perl HASH; the key rewrite mechanism
is used to change that, sometimes to separate elements which have the
same localName within different name-spaces, in other cases just for
fun or convenience.
Rewrite rules are interpreted at "compile-time", which means that they
do not slow down the XML construction or deconstruction. The rules
work the same for readers and writers, because they are applied to the
names found in the schema.
Key rewrite rules can be set during schema object initiation, with
new(key_rewrite), or to an existing schema object with addKeyRewrite().
The last defined rewrite rules will be applied first. These rules will
be used in all calls to compile().
Besides, you can use compile(key_rewrite) to add rules which are only
used for a single compilation. These are applied before the global
rules. All rules will always be attempted, and each change will be
made to the key as left by the previous change.
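
As an illustration of the ordering described above, here is a minimal
sketch in plain Perl. It does not call XML::Compile at all: apply_rules()
is a hypothetical helper, and the name-space and element name are made
up. It only shows how one rule's result becomes the next rule's input.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of XML::Compile: run the rules in the
# given order, feeding the result of each rule into the next one.
sub apply_rules {
    my ($ns, $local, @rules) = @_;
    my $key = $local;
    $key = $_->($ns, $key) for @rules;
    return $key;
}

# Stand-in for the global rules of a schema object.
my @global = ( sub { lc $_[1] } );

# A rule which is defined later gets applied earlier.
unshift @global, sub { (my $k = $_[1]) =~ s/-/_/g; $k };

# Stand-in for rules passed to one compile() call: applied before
# the global rules.
my @once = ( sub { "x_$_[1]" } );

my $key = apply_rules('urn:example', 'My-Elem', @once, @global);
print "$key\n";    # x_my_elem
```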
== rewrite via table
When a HASH is provided as rule, then the XML element name is looked up.
If found, the value is used as translated key.
First the full name of the element is tried, and then the localName of
the element. The full name can be created with
XML::Compile::Util::pack_type() or by hand:
    use XML::Compile::Util qw/pack_type/;
    my %table =
      ( pack_type($myns, 'el1') => 'nice_name1'
      , "{$myns}el2"            => 'alsoNice'
      , el3                     => 'in any namespace'
      );
    $schema->addKeyRewrite( \%table );
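
The lookup order can be mimicked in a few lines of plain Perl. This is
only a sketch of the behaviour described above, not the code of the
module: lookup_key() is a hypothetical helper and the name-space is made
up.

```perl
use strict;
use warnings;

my $myns  = 'http://example.com/ns';    # made-up name-space
my %table =
  ( "{$myns}el1" => 'nice_name1'
  , el3          => 'in any namespace'
  );

# Hypothetical helper: try the full name first, then the localName.
sub lookup_key {
    my ($table, $ns, $local) = @_;
    my $full = "{$ns}$local";
    return $table->{$full}  if exists $table->{$full};
    return $table->{$local} if exists $table->{$local};
    return $local;                      # no rule matched: key unchanged
}

print lookup_key(\%table, $myns, 'el1'), "\n";        # nice_name1
print lookup_key(\%table, $myns, 'el3'), "\n";        # in any namespace
print lookup_key(\%table, 'urn:other', 'el1'), "\n";  # el1
```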
== rewrite via function
When a CODE reference is provided, it will get called for each key
which is found in the schema. Passed are the name-space of the element
and its local-name. Returned is the key, which may be the local-name
or something else.
For instance, some people use capitals in element names and personally
I do not like them:
    sub dont_like_capitals($$)
    {   my ($ns, $local) = @_;
        lc $local;
    }
    $schema->addKeyRewrite( \&dont_like_capitals );
or, for short:
    my $schema = XML::Compile::Schema->new( ...,
        key_rewrite => sub { lc $_[1] } );
== rewrite when localName collides
Let's start with an apology: we cannot auto-detect when these rewrite
rules are needed, because the colliding keys are within the same HASH,
but the processing is fragmented over various (sequence) blocks: the
parser does not have the overview on which keys of the HASH are used
for which elements.
The problem occurs when one complex type or substitutionGroup contains
multiple elements with the same localName, but from different
name-spaces. In the Perl representation of the data, the name-spaces get
ignored (to make the programmer's life simple) but that may cause these
nasty conflicts.
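
A rewrite table with full names is one way to keep such elements apart.
The sketch below uses two made-up name-spaces and a made-up element name
"address"; the new key names are only a suggestion. The first part shows
the conflict in a plain HASH, the second part the table you could pass
to addKeyRewrite().

```perl
use strict;
use warnings;

my $ns_a = 'http://example.com/a';    # made-up name-spaces
my $ns_b = 'http://example.com/b';

# Without rewrite, both elements end up under the same key: the
# second value silently replaces the first.
my %data;
$data{address} = 'value of the element from name-space a';
$data{address} = 'value of the element from name-space b';
print scalar(keys %data), "\n";       # 1

# With full names in the table, each element gets its own key.
my %table =
  ( "{$ns_a}address" => 'a_address'
  , "{$ns_b}address" => 'b_address'
  );
# Then: $schema->addKeyRewrite( \%table );
```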
== rewrite for convenience
In XML, we often see names like "my-elem-name", which in Perl would be
accessed as
$h->{'my-elem-name'}
In this case, you cannot leave out the quotes in your Perl code, which
is quite inconvenient, because only 'barewords' can be used as keys
unquoted. When you use option "key_rewrite" for compile() or new(),
you could decide to map dashes onto underscores.
    key_rewrite
      => sub { my ($ns, $local) = @_; $local =~ s/\-/_/g; $local }

    # or shorter, modifying the argument in place
    key_rewrite => sub { $_[1] =~ s/\-/_/g; $_[1] }
then "my-elem-name" in XML will get mapped onto "my_elem_name" in Perl,
in both the READER and the WRITER. Be warned that the substitute
command returns the success, not the modified value!
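
The warning about the substitute command is easy to check in plain Perl.
This small sketch assumes Perl 5.14 or newer for the /r flag, which makes
s/// return the modified copy instead; the variable names are only for
illustration.

```perl
use strict;
use warnings;
use 5.014;                              # needed for s///r

my $local = 'my-elem-name';
my $count = ($local =~ s/-/_/g);        # number of substitutions made
print "$count\n";                       # 2
print "$local\n";                       # my_elem_name

# With /r the original is left alone and the new string is returned.
my $orig = 'my-elem-name';
my $new  = $orig =~ s/-/_/gr;
print "$new\n";                         # my_elem_name
```

So a rule written as sub { $_[1] =~ s/-/_/g } would return the count (or
an empty string when nothing matched), not the key.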
== pre-defined rewrite rules
UNDERSCORES
Replace dashes (-) with underscores (_).
SIMPLIFIED
Rewrite rule with the constant name (STRING) "SIMPLIFIED" will
replace all dashes with underscores, translate capitals into
lowercase, and remove all other characters which are non-bareword
(if possible, I am too lazy to check)
PREFIXED
This requires a table for prefix to name-space translations, via
compile(prefixes), which defines at least one non-empty (default)
prefix. The keys which represent elements in any name-space which
has a prefix defined will have that prefix and an underscore
prepended.
Be warned that the name-spaces which you provide are used, not the
ones used in the schema. Example:
    my $r = $schema->compile
      ( READER      => $type
      , prefixes    => [ mine => $myns ]
      , key_rewrite => 'PREFIXED'
      );
    my $xml = $r->( <<__XML );
<data xmlns="$myns"><x>42</x></data>
__XML
    print join ' => ', %$xml;    # mine_x => 42
PREFIXED(...)
Like the previous, but now only use a selected sub-set of the
available prefixes. This is particularly useful in writers, when
explicit prefixes are also used to beautify the output.
The prefixes are not checked against the prefix list, and may have
surrounding blanks.
key_rewrite => 'PREFIXED(opt,sar)'
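
To get a feeling for what UNDERSCORES and SIMPLIFIED do to a key, here
is a rough equivalent written as ordinary rewrite functions. This is a
sketch based on the descriptions above, not the code inside the module:
the function names are made up, and the real rules may treat some
characters differently.

```perl
use strict;
use warnings;

# Roughly UNDERSCORES: dashes become underscores.
sub like_underscores {
    my ($ns, $local) = @_;
    $local =~ s/-/_/g;
    return $local;
}

# Roughly SIMPLIFIED: dashes to underscores, lowercase, and drop
# everything which cannot be part of a bareword.
sub like_simplified {
    my ($ns, $local) = @_;
    $local =~ s/-/_/g;
    $local = lc $local;
    $local =~ s/[^a-z0-9_]//g;
    return $local;
}

print like_underscores(undef, 'My-Elem.Name'), "\n";   # My_Elem.Name
print like_simplified(undef, 'My-Elem.Name'), "\n";    # my_elemname
```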