[Xml-compile] XML::Compile 0.87

Fri Jul 4 14:24:57 BST 2008

You don't like dashes in HASH keys?  See this!

   key_rewrite => 'UNDERSCORES'

  version 0.87: Fri Jul  4 15:12:17 CEST 2008

        Changes:
        - removed /el() and #el(name) from location of error message,
          which makes the location simpler and probably not less clear.

        Improvements:
        - implemented key_rewrite, tests in t/73rewrite.t
        - removed double check for existence of required value in writer.
        - renamed ::Schema::new() option 'output_namespaces' into
          the nicer 'prefixes'.  Old name still available.

Read the attachment (excerpt from the manual-page) about the new option
key_rewrite, with only mild compile-time penalty and no run-time burden.
-- 
Enjoy,
               MarkOv

------------------------------------------------------------------------
       Mark Overmeer MSc                                MARKOV Solutions
       Mark at Overmeer.net                          solutions at overmeer.net
http://Mark.Overmeer.net                   http://solutions.overmeer.net

-------------- next part --------------
===== Key rewrite

[Added in release 0.87] The standard practice is to use the localName
of the XML elements as key in the Perl HASH; the key rewrite mechanism
is used to change that, sometimes to separate elements which have the
same localName within different name-spaces, in other cases just for
fun or convenience.

Rewrite rules are interpreted at "compile-time", which means that they
do not slow down the XML construction or deconstruction.  The rules
work the same for readers and writers, because they are applied to the
names found in the schema.

Key rewrite rules can be set during schema object initialization, with
new(key_rewrite), or to an existing schema object with addKeyRewrite().
The last defined rewrite rules will be applied first.  These rules will
be used in all calls to compile().

In addition, you can use compile(key_rewrite) to add rules which are
only used for a single compilation.  These are applied before the global
rules.  All rules will always be attempted, and each change will be
made to the key as the previous rule left it.
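
The chaining described above can be pictured with a small stand-alone
sketch.  It does not use XML::Compile: apply_rules() is a hypothetical
helper, and the two rule lists only stand in for rules passed to
new(key_rewrite) and compile(key_rewrite).

```perl
use strict;
use warnings;

# Hypothetical helper, NOT XML::Compile's code: every rule is tried in
# turn, and each one receives the key as the previous rule left it.
sub apply_rules {
    my ($ns, $local, @rules) = @_;
    my $key = $local;
    $key = $_->($ns, $key) for @rules;
    return $key;
}

# Stand-ins for rules given to new(key_rewrite) and compile(key_rewrite).
my @global      = ( sub { lc $_[1] } );
my @per_compile = ( sub { (my $k = $_[1]) =~ s/-/_/g; $k } );

# Per-compilation rules come before the global ones.
my $key = apply_rules('http://example.com/ns', 'My-Elem', @per_compile, @global);
print "$key\n";    # my_elem
```

Because every rule sees the result of the one before it, the order in
which rules are defined can matter when they touch the same characters.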

== rewrite via table

When a HASH is provided as rule, the XML element name is looked up in
it.  If found, the value is used as translated key.

First the full name of the element is tried, and then the localName of
the element.  The full name can be created with
XML::Compile::Util::pack_type() or by hand:

  use XML::Compile::Util qw/pack_type/;

  my %table =
    ( pack_type($myns, 'el1') => 'nice_name1'
    , "{$myns}el2" => 'alsoNice'
    , el3          => 'in any namespace'
    );
  $schema->addKeyRewrite( \%table );
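
The lookup order can be sketched without the module.  lookup_key() is a
hypothetical helper written for this illustration only, and the
name-spaces are made up; the "{name-space}localName" form is the same
one which is written by hand in the table above.

```perl
use strict;
use warnings;

my $myns = 'http://example.com/ns';    # assumed name-space

my %table =
  ( "{$myns}el1" => 'nice_name1'
  , el3          => 'in any namespace'
  );

# Hypothetical helper: try the full name first, then the localName,
# and fall back to the unchanged localName when neither is found.
sub lookup_key {
    my ($table, $ns, $local) = @_;
    my $full = "{$ns}$local";
    return $table->{$full}  if exists $table->{$full};
    return $table->{$local} if exists $table->{$local};
    return $local;
}

print lookup_key(\%table, $myns,       'el1'), "\n";   # nice_name1
print lookup_key(\%table, 'urn:other', 'el3'), "\n";   # in any namespace
print lookup_key(\%table, $myns,       'el4'), "\n";   # el4
```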

== rewrite via function

When a CODE reference is provided, it will get called for each key
which is found in the schema.  Passed are the name-space of the element
and its local-name.  Returned is the key, which may be the local-name
or something else.

For instance, some people use capitals in element names and personally
I do not like them:

  sub dont_like_capitals($$)
  {   my ($ns, $local) = @_;
      lc $local;
  }
  $schema->addKeyRewrite( \&dont_like_capitals );

or, for short:

  my $schema = XML::Compile::Schema->new( ...,
      key_rewrite => sub { lc $_[1] } );

== rewrite when localName collides

Let's start with an apology: we cannot auto-detect when these rewrite
rules are needed, because the colliding keys are within the same HASH,
but the processing is fragmented over various (sequence) blocks: the
parser has no overview of which keys of the HASH are used for which
elements.

The problem occurs when one complex type or substitutionGroup contains
multiple elements with the same localName, but from different
name-spaces.  In the Perl representation of the data, the name-spaces
get ignored (to make the programmer's life simple) but that may cause
these nasty conflicts.
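
A stand-alone sketch of such a conflict, and of a rewrite table which
avoids it.  The name-spaces, the element list and the chosen keys are
all invented for the illustration, and the loop is not how the module
itself processes elements.

```perl
use strict;
use warnings;

my ($ns1, $ns2) = ('urn:example:a', 'urn:example:b');   # assumed name-spaces
my @elements    = ( [ $ns1, 'id', 1 ], [ $ns2, 'id', 2 ] );

# With only the localName as key, both elements land on the same key
# of this plain HASH, so one value overwrites the other.
my %plain;
$plain{ $_->[1] } = $_->[2] for @elements;

# A rewrite table keyed on the full name keeps the two apart.
my %table = ( "{$ns1}id" => 'a_id', "{$ns2}id" => 'b_id' );
my %rewritten;
for my $el (@elements) {
    my ($ns, $local, $value) = @$el;
    my $key = $table{"{$ns}$local"} // $local;
    $rewritten{$key} = $value;
}

print join(', ', map { "$_=$plain{$_}" } sort keys %plain), "\n";
# id=2
print join(', ', map { "$_=$rewritten{$_}" } sort keys %rewritten), "\n";
# a_id=1, b_id=2
```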

== rewrite for convenience

In XML, we often see names like "my-elem-name", which in Perl would be
accessed as

  $h->{'my-elem-name'}

In this case, you cannot leave out the quotes in your Perl code,
because only 'barewords' can be used as keys unquoted, which is quite
inconvenient.  When you use option "key_rewrite" for compile() or new(),
you could decide to map dashes onto underscores.

  key_rewrite
     => sub { my ($ns, $local) = @_; $local =~ s/\-/_/g; $local }

or shorter:

  key_rewrite => sub { $_[1] =~ s/\-/_/g; $_[1] }

Then "my-elem-name" in XML will get mapped onto "my_elem_name" in Perl,
both in the READER and the WRITER.  Be warned that the substitute
command (s///) returns whether it succeeded, not the modified value, so
the key must be returned explicitly!
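
The difference is easy to see in plain Perl.  The three subs below are
illustrations only (their names are made up); the /r modifier used in
the last one needs Perl 5.14 or newer.

```perl
use strict;
use warnings;

# Wrong: the last statement is s///, so the sub returns the number of
# substitutions made instead of the new key.
my $wrong = sub { my ($ns, $local) = @_; $local =~ s/-/_/g };

# Right: return the modified copy explicitly.
my $right = sub { my ($ns, $local) = @_; $local =~ s/-/_/g; $local };

# Also right: /r returns the modified string and leaves $_[1] untouched.
my $also = sub { $_[1] =~ s/-/_/gr };

print $wrong->('urn:x', 'my-elem-name'), "\n";   # 2
print $right->('urn:x', 'my-elem-name'), "\n";   # my_elem_name
print $also ->('urn:x', 'my-elem-name'), "\n";   # my_elem_name
```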

== pre-defined rewrite rules

UNDERSCORES
    Replace dashes (-) with underscores (_).

SIMPLIFIED
    Rewrite rule with the constant name (STRING) "SIMPLIFIED" will
    replace all dashes with underscores, translate capitals into
    lowercase, and remove all other characters which are non-bareword
    (if possible, I am too lazy to check)

PREFIXED
    This requires a table for prefix to name-space translations, via
    compile(prefixes), which defines at least one non-empty (default)
    prefix.  The keys which represent elements in any name-space which
    has a prefix defined will have that prefix and an underscore
    prepended.

    Be warned that the name-spaces which you provide are used, not the
    ones used in the schema.  Example:

      my $r = $schema->compile
        ( READER => $type
        , prefixes    => [ mine => $myns ]
        , key_rewrite => 'PREFIXED'
        );

      my $xml = $r->( <<__XML );
    <data xmlns="$myns"><x>42</x></data>
    __XML

      print join ' => ', %$xml;    #   mine_x => 42

PREFIXED(...)
    Like the previous, but now only use a selected sub-set of the
    available prefixes.  This is particularly useful in writers, when
    explicit prefixes are also used to beautify the output.

    The prefixes are not checked against the prefix list, and may have
    surrounding blanks.

      key_rewrite => 'PREFIXED(opt,sar)'
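
To give an idea of what such a rule does, here is a rough stand-alone
sketch which builds a CODE rule from a 'PREFIXED(...)' string.  It is
not the module's implementation: prefixed_rule() is a hypothetical
helper, the prefix table and name-spaces are invented, and silently
skipping a prefix which is not in the table is an assumption.

```perl
use strict;
use warnings;

# Assumed prefix to name-space table, as one could pass to compile(prefixes).
my %prefixes =
  ( opt => 'urn:example:options'
  , sar => 'urn:example:sar'
  , xyz => 'urn:example:other'
  );

# Hypothetical helper: turn 'PREFIXED(a, b)' into a CODE rewrite rule
# which prepends "prefix_" for the selected name-spaces only.
sub prefixed_rule {
    my ($spec, $prefixes) = @_;
    my ($list) = $spec =~ /^PREFIXED\((.*)\)$/
        or die "not a PREFIXED(...) rule: $spec";

    my %ns2prefix;
    for my $p (split /,/, $list) {
        $p =~ s/^\s+|\s+$//g;              # surrounding blanks are allowed
        $ns2prefix{ $prefixes->{$p} } = $p if defined $prefixes->{$p};
    }

    return sub {
        my ($ns, $local) = @_;
        my $p = $ns2prefix{$ns};
        defined $p ? "${p}_$local" : $local;
    };
}

my $rule = prefixed_rule('PREFIXED( opt , sar )', \%prefixes);
print $rule->('urn:example:options', 'x'), "\n";   # opt_x
print $rule->('urn:example:other',   'x'), "\n";   # x
```

The "xyz" prefix is in the table but not in the selection, so keys from
its name-space are left alone.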