PLISAXC Subroutine

Purpose

Invokes the XML parser for processing an XML document residing in one or more program buffers.

Syntax

CALL PLISAXC(e, p, x, n, c)

Parameters

e is an event structure.

p is a pointer value or token that the parser passes to the event functions.

x is the address of the initial buffer that contains the XML document for processing.

n is the number of bytes of data in the buffer specified by x.

c is a numeric expression that specifies the code page of the XML document for processing.

Description

The PLISAXC built-in subroutine provides internal SAX parsing based on the libxml2 SAX parser. It provides 19 distinct events: 16 are shared with the PLISAXA and PLISAXB built-ins, although with slightly different parameters and actions, and three are unique to PLISAXC. It operates on a pointer to a memory address that contains an XML string. That string can be in ASCII or EBCDIC.

PLISAXC has no special environmental requirements except that it is not supported in AMODE 24. It executes in all the principal run-time environments, including CICS, IMS, WebSphere MQ, and TSO.

If the XML is contained in a CHARACTER VARYING or a WIDECHAR VARYING string, then the ADDRDATA built-in function should be used to obtain the address of the first data byte.

If the XML is contained in a WIDECHAR string, the value for the number of bytes is twice the value returned by the LENGTH built-in function.
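
The two preceding rules can be combined when preparing the x and n arguments for a WIDECHAR VARYING document. The following is a minimal sketch only; the names wideXml, xmlAddr, and byteCount are hypothetical and do not appear elsewhere in this topic, and wideXml is assumed to have been filled with the document already.

dcl wideXml    widechar(2000) var;
dcl xmlAddr    pointer;
dcl byteCount  fixed bin(31);

/* ADDRDATA returns the address of the first data byte,   */
/* past the length prefix of the VARYING string           */
xmlAddr   = addrdata(wideXml);

/* LENGTH counts widechars; each widechar is two bytes    */
byteCount = 2 * length(wideXml);

xmlAddr and byteCount would then be passed as the x and n arguments of PLISAXC.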

Examples

This example does not use namespaces, and all the input is passed when PLISAXC is first invoked, so the end_of_input event should not be invoked. The example shows only the main routine with a call to PLISAXC; it does not show the event structure or the type declarations.

dcl token        char(8);
dcl xmlDocument  char(4000) var;
xmlDocument =
    '<?xml version="1.0" standalone="yes"?>'
||  '<!--This document is just an example-->'
||  '<sandwich>'
||  '<bread type="baker''s best"/>'
||  '<?spread please use real mayonnaise ?>'
||  '<meat>Ham &amp; turkey</meat>'
||  '<filling>Cheese, lettuce, tomato, etc.</filling>'
||  '<![CDATA[We should add a <relish> element in future!]]>'
||  '</sandwich>'
||  ' ';
call plisaxc( eventHandler,
              addr(token),
              addrdata(xmlDocument),
              length(xmlDocument) );
end;

Restrictions

If you compile your program with the -ebcdic option, the callback event logic must account for the fact that the data passed back to the callbacks is in ASCII, even if the input was in EBCDIC. The data must therefore be translated before it is used within the callbacks. In addition, all reference values passed as FIXED BIN(31) are for ASCII character encoding.

On UNIX, if your XML input is EBCDIC and contains open and close square brackets ([ and ]) with the values X'BA' and X'BB', you need to create your own custom codeset module to translate them, and you need to set the environment variable MFCODESET to point to your custom codeset in the environment where you execute your program. This does not apply to ASCII input.

It is assumed that your program is built using the -bigendian compiler option if it runs on an Intel chip. If you compile your program without the -bigendian compiler option, you must convert the parameter types to and from big-endian format prior to use.

If you use the FLAGS parameter of the content_characters event as a BIT(8) array, you must take into account the bit ordering differences between Intel and z/OS. If FLAGS is evaluated as a single byte and not as a bit array, the bit ordering matches z/OS bit ordering. See the online help topic Bit (n) for a discussion of bit ordering in a bit array on little endian platforms.

If XML content contains predefined reference characters, separate content_characters events are driven for the characters preceding the reference character, the reference character itself, and the characters following it. For example, the XML snippet <meat>Ham &amp; turkey</meat> generates three content_characters events using PLISAXC: one for Ham, one for &, and one for turkey.
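
Because the text of one element can arrive in several pieces, an event handler that needs the complete text has to reassemble it. The following is a minimal sketch of one way to do that, not part of the PLISAXC interface: mealText and appendText are hypothetical names, and the sketch assumes that the content_characters event function receives the address and byte length of each fragment and passes them on as textPtr and textLen.

dcl mealText  char(200) var init('');

/* Hypothetical helper: append one fragment to mealText.  */
/* The bytes are appended as received; no code page       */
/* translation is performed here.                         */
appendText: proc( textPtr, textLen );
   dcl textPtr  pointer;
   dcl textLen  fixed bin(31);
   dcl chars    char(32767) based;

   mealText = mealText || substr( textPtr->chars, 1, textLen );
end appendText;

Calling appendText once from each of the three events in the preceding example would leave the reassembled element text in mealText.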

For the FLAGS parameter of the content_characters event, the flag that indicates the next event contains content_characters ('80'x) currently indicates this only when the current event contains a character that must be escaped back to XML. Also, the flag that indicates that no characters are escaped back to XML ('40'x) is set only if the character string contains one of the predefined reference characters for PLISAXA.