NAME

Data::Traverse - Perl extension for traversing complex data structures

SYNOPSIS

Data::Traverse provides two interfaces: a simple procedural interface and an object-oriented interface.

The procedural interface

The procedural interface can be used to retrieve parts of a complex data structure.

It is enabled by importing the :lists tag:

  use Data::Traverse qw(:lists);
  
  my $data= ...;                           # a complex data structure

  my @scalars= scalars( $data);             # all scalars in the structure
  my @objects= refs( $data, 'LWP::Simple'); # all LWP::Simple objects in the structure

The OO interface

The OO interface is used to write iterators that go through a data structure.

  my $iter= Data::Traverse->new( $data);
  $iter->traverse( sub { my( $iter, $item)= @_;
                         print "$item\n" if( $iter->item_key eq 'id');
                         return $iter->prune if( $iter->path_matches( '/foo/bar'));
                       }
                  );
                  
  $iter->traverse( sub { $_[1]++ if( $_[0]->is_scalar); }); # changes the data

More methods are available to get information on the current context.

DESCRIPTION

Data::Traverse lets you traverse complex data structures without needing to know all about them.

It can be used, for example, with the data structure created by XML::Simple.

Procedural Interface

refs ($data, $ref, $optional_level)

Returns the list of references of type $ref (as determined by UNIVERSAL::isa( $field, $ref)) found in the data structure, in the order of traversal (hashes are traversed in the dictionary order of their keys).

The $optional_level argument can be used to limit the depth in the data structure, 0 being $data itself.
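Since the type test is described in terms of UNIVERSAL::isa, the short sketch below shows what that function matches when called directly: an object of the class, an object of a subclass, and also a plain unblessed reference when the type is a built-in name such as ARRAY. The Animal and Dog classes are made up for this illustration and have nothing to do with Data::Traverse.

```perl
use strict;
use warnings;

# Two hypothetical classes, used only for this illustration.
package Animal;
sub new { return bless {}, shift }

package Dog;
our @ISA = ('Animal');    # Dog inherits from Animal

package main;

my $dog = Dog->new;

# A subclass instance matches its parent class.
my $dog_is_animal = UNIVERSAL::isa( $dog, 'Animal') ? 1 : 0;

# An unblessed array reference matches the built-in type name ARRAY.
my $array_is_array = UNIVERSAL::isa( [ 1, 2 ], 'ARRAY') ? 1 : 0;

# A hash reference does not match ARRAY.
my $hash_is_array = UNIVERSAL::isa( {}, 'ARRAY') ? 1 : 0;

print "dog isa Animal: $dog_is_animal\n";
print "arrayref isa ARRAY: $array_is_array\n";
print "hashref isa ARRAY: $hash_is_array\n";
```

So a $ref of 'ARRAY' or 'HASH' would presumably select plain containers, while a class name selects objects of that class and of its subclasses.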

scalars ($data, $optional_level)

Returns the list of scalar values in the data structure, in the order of traversal (hashes are traversed in the dictionary order of their keys).

The $optional_level argument can be used to limit the depth in the data structure, 0 being $data itself.
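To make the description concrete, here is a minimal sketch of a scalar collector with an optional depth limit. It is not the module's code: collect_scalars is a hypothetical helper, and it assumes a breadth-first walk, sorted hash keys, and a limit that is inclusive (items deeper than the limit are ignored).

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Data::Traverse.
# Collects plain scalars breadth-first; $max_level is optional and inclusive.
sub collect_scalars {
    my( $data, $max_level)= @_;
    my @found;
    my @queue= ( [ $data, 0 ] );              # pairs of [ item, level ]

    while( my $entry= shift @queue) {
        my( $item, $level)= @$entry;
        next if( defined $max_level && $level > $max_level);

        if( ref $item eq 'HASH') {
            # hash values are queued in the dictionary order of their keys
            push @queue, map { [ $item->{$_}, $level + 1 ] } sort keys %$item;
        }
        elsif( ref $item eq 'ARRAY') {
            push @queue, map { [ $_, $level + 1 ] } @$item;
        }
        elsif( !ref $item) {
            push @found, $item;               # a plain scalar
        }
        # other reference types (objects, code refs...) are skipped here
    }
    return @found;
}

my $data= { name => 'root', list => [ 1, 2, { deep => 'x' } ] };

my @all=     collect_scalars( $data);       # no depth limit
my @shallow= collect_scalars( $data, 1);    # level 0 and level 1 only

print "all: @all\n";
print "shallow: @shallow\n";
```

With the limit set to 1, only 'root' is returned: the numbers in the list sit at level 2 and 'x' at level 3.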

refs_level ($data, $ref, $level)

Returns the list of references of type $ref found at $level in the data structure.

scalars_level ($data, $level)

Returns the list of scalar values found at $level in the data structure.

Object-Oriented Interface

The Object-Oriented interface provides iterators on arbitrary data structures. A handler is associated with the iterator and is called for every item in the data structure. Within the handler a host of methods can be called to get information about the current context.

new ($data)

Creates an iterator on $data.

traverse ($handler)

Traverses the data structure and applies the handler to every item in it. An item is anything in the data structure: a scalar, an arrayref or a hashref.

The handler is called with the iterator and the current item as arguments. Use $_[1] if you want to update the original data structure.

The handler also receives the item as $_.
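The reason for using $_[1] is that Perl's @_ aliases the caller's arguments, whereas my( $iter, $item)= @_ copies them. The sketch below shows the difference with plain code references and no iterator; the string passed as first argument is only a placeholder for the iterator.

```perl
use strict;
use warnings;

# Works on a private copy: the caller's data is left alone.
my $copy_handler=  sub { my( $iter, $item)= @_; $item++ };

# Works on the alias: the caller's data is modified in place.
my $alias_handler= sub { $_[1]++ };

my @data= ( 1, 2, 3);

$copy_handler->( 'iterator placeholder', $_) for @data;
my @after_copy= @data;

$alias_handler->( 'iterator placeholder', $_) for @data;
my @after_alias= @data;

print "after copy handler:  @after_copy\n";
print "after alias handler: @after_alias\n";
```

Only the second handler changes @data, which is why a handler that is meant to update the structure has to go through $_[1].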

In the handler you can call the following methods on the iterator:

path

The current path to an item in the data structure is built from the hash keys or array indexes that lead to the item, joined with '/'.

For example, if $data= { foo => [ qw/a b c/], bar => []}, then when the iterator gets to b the path ($iter->path in the handler) will be /foo/1.
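The way such a path can be built is easy to sketch with a small recursive walk. The walk_paths function below is a hypothetical stand-in, not the module's code, and it assumes that the path of the root is the empty string.

```perl
use strict;
use warnings;

# Hypothetical walker, NOT part of Data::Traverse.
# Calls $visit->( $path, $item) for every item, building the path on the way down.
sub walk_paths {
    my( $item, $path, $visit)= @_;
    $visit->( $path, $item);

    if( ref $item eq 'HASH') {
        walk_paths( $item->{$_}, "$path/$_", $visit) for sort keys %$item;
    }
    elsif( ref $item eq 'ARRAY') {
        walk_paths( $item->[$_], "$path/$_", $visit) for 0 .. $#$item;
    }
}

my $data= { foo => [ qw/a b c/], bar => [] };

# record the path of every scalar
my %path_of;
walk_paths( $data, '', sub { my( $path, $item)= @_;
                             $path_of{$item}= $path unless ref $item;
                           }
          );

print "$_ is at $path_of{$_}\n" for sort keys %path_of;
```

Each step appends '/' and the hash key or array index, so b ends up at /foo/1 as in the example above.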

path_matches ($exp)

$exp is a regular expression. Returns true if the current path matches it.
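Because the expression is a regexp, anchoring matters. The documentation does not say whether the method adds anchors, so the sketch below assumes the expression is applied as given and uses plain Perl matching on a few sample path strings.

```perl
use strict;
use warnings;

# Sample paths, made up for this illustration.
my @paths= ( '/foo/bar', '/foo/bar/0', '/baz/foo/bar', '/foo/barn');

# Unanchored: matches anywhere in the path.
my @unanchored= grep { m{/foo/bar} } @paths;

# Fully anchored: matches this exact path only.
my @exact= grep { m{^/foo/bar$} } @paths;

# Anchored at the start, followed by '/' or the end: the item and its descendants.
my @subtree= grep { m{^/foo/bar(?:/|$)} } @paths;

print "unanchored: @unanchored\n";
print "exact:      @exact\n";
print "subtree:    @subtree\n";
```

The unanchored form also picks up /baz/foo/bar and /foo/barn, which is rarely what is wanted when the goal is to prune one specific branch.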

parent

The parent of the current item (the hashref or arrayref that contains the item).

ancestors

The list (root first) of ancestors of the current item.

item_index

If the item is an element of an array, this is its index in the array; otherwise -1 is returned.

item_key

If the item is a value in a hash, this is its key in the hash; otherwise the empty string is returned.

parent_key

If the parent of the item is a value in a hash, this is the associated key; otherwise the empty string is returned.

parent_index

If the parent of the item is an element of an array, this is its index; otherwise -1 is returned.

level

The depth at which the item is found in the data structure (the size of the ancestors stack).

is_scalar

Returns true if the item is a scalar (this is just !ref, but it reads better).

prune

If the handler returns $iter->prune, the children of the current item are not traversed.

finish

Ends the traversal and returns.
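The effect of prune and finish can be illustrated with a small depth-first walker in which the handler's return value steers the traversal. This is only a sketch of the idea: walk and the two sentinel strings $PRUNE and $FINISH are hypothetical, and the module itself may signal these conditions differently.

```perl
use strict;
use warnings;

# Hypothetical sentinels and walker, NOT part of Data::Traverse.
my( $PRUNE, $FINISH)= ( '__prune__', '__finish__');

sub walk {
    my( $item, $handler)= @_;

    my $signal= $handler->( $item);
    $signal= '' unless defined $signal;
    return $FINISH if( $signal eq $FINISH);   # stop everything
    return         if( $signal eq $PRUNE);    # skip the children of this item

    my @children;
    if(    ref $item eq 'HASH')  { @children= map { $item->{$_} } sort keys %$item; }
    elsif( ref $item eq 'ARRAY') { @children= @$item; }

    foreach my $child (@children) {
        my $result= walk( $child, $handler);
        return $FINISH if( defined $result && $result eq $FINISH);
    }
    return;
}

my $data= { keep => [ 1, 2 ], skip => [ 3, 4 ], z => 5 };

# prune: do not descend into the array that starts with 3
my @seen;
walk( $data, sub { my( $item)= @_;
                   return $PRUNE if( ref $item eq 'ARRAY' && @$item && $item->[0] == 3);
                   push @seen, $item unless ref $item;
                   return;
                 }
    );

# finish: stop the whole traversal once the scalar 2 has been seen
my @until_two;
walk( $data, sub { my( $item)= @_;
                   return if ref $item;
                   push @until_two, $item;
                   return $item == 2 ? $FINISH : undef;
                 }
    );

print "with prune:  @seen\n";
print "with finish: @until_two\n";
```

Pruning removes 3 and 4 from the first result but still reaches 5, while finishing stops before anything after 2 is visited.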

BUGS/TODO

At this point cycles in the data structure are not properly processed:

- the procedural interface will most likely enter deep recursion,

- the OO interface reaches each item only once, so the handler sees it in only one context (the first path that leads to it)
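A common way to keep a recursive walk from looping on a cyclic structure is to remember the address of every reference already visited. The sketch below does this with refaddr from Scalar::Util; count_items is a hypothetical function written for this illustration, not something the module provides.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical function, NOT part of Data::Traverse.
# Counts the items in a structure, visiting each reference only once.
sub count_items {
    my( $item, $seen)= @_;
    $seen ||= {};

    if( ref $item) {
        # a reference that was already visited is not counted or entered again
        return 0 if( $seen->{ refaddr( $item) }++);
    }

    my $count= 1;
    if( ref $item eq 'HASH') {
        $count += count_items( $item->{$_}, $seen) for sort keys %$item;
    }
    elsif( ref $item eq 'ARRAY') {
        $count += count_items( $_, $seen) for @$item;
    }
    return $count;
}

my $data= { name => 'loop', children => [] };
push @{ $data->{children} }, $data;      # the structure now contains itself

my $count= count_items( $data);
print "items visited: $count\n";
```

The walk terminates after counting the hash, the array and the string; the second encounter with the hash is recognised and skipped.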

The procedural interface does a breadth-first traversal of the data, while the OO interface does a depth-first traversal. It would be nice to be able to choose the algorithm.
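The difference between the two orders shows up as soon as a scalar sits next to a nested container. The sketch below compares them on the same data; children, depth_first and breadth_first are hypothetical helpers that only illustrate the two algorithms and are not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helpers, NOT part of Data::Traverse.

# The direct children of an item, hash values in the dictionary order of their keys.
sub children {
    my( $item)= @_;
    return map { $item->{$_} } sort keys %$item if( ref $item eq 'HASH');
    return @$item                               if( ref $item eq 'ARRAY');
    return;
}

# Depth-first: finish each branch before moving to the next one.
sub depth_first {
    my( $item)= @_;
    return $item unless ref $item;
    return map { depth_first( $_) } children( $item);
}

# Breadth-first: visit everything at one level before going deeper.
sub breadth_first {
    my @queue= @_;
    my @scalars;
    while( @queue) {
        my $item= shift @queue;
        if( ref $item) { push @queue, children( $item); }
        else           { push @scalars, $item;          }
    }
    return @scalars;
}

my $data= { a => [ 1, 2 ], b => 3 };

my @depth=   depth_first( $data);
my @breadth= breadth_first( $data);

print "depth-first:   @depth\n";
print "breadth-first: @breadth\n";
```

Depth-first yields 1 2 3, while breadth-first yields 3 1 2, because 3 is at a shallower level than the contents of the array.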

More tests need to be written (isn't this always the case? ;--)

It would be nice to have more generic methods to query the context (XPath-based?).

SEE ALSO

AUTHOR

Michel Rodriguez, <mirod@cpan.org>

Feedback and comments welcome!

COPYRIGHT AND LICENSE

Copyright (C) 2004 by Michel Rodriguez

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.