UNSTRING Statement

The UNSTRING statement separates a data item into one or more receiving fields. Delimiters may be used to specify the ends of fields. Substring values are assigned to unique destination data items.

Note: This manual entry includes code examples and highlights for first-time users following the General Rules section.

General Format

UNSTRING source

   [ DELIMITED BY [ALL] delim 

                  [ OR [ALL] delim ] ... ]

     INTO { dest [ DELIMITER IN delim-dest ]

                 [ COUNT IN counter ] } ...

   [ WITH POINTER ptr-var ]

   [ TALLYING IN tally-var ]

   [ ON OVERFLOW statement-1 ]

   [ NOT ON OVERFLOW statement-2 ]

   [ END-UNSTRING ]

Syntax Rules

  1. source is an alphanumeric data item. source may be reference modified.
  2. dest is a USAGE DISPLAY data item. It may not be edited.
  3. delim is a nonnumeric literal or an alphanumeric data item. The ALL literal construct may not be used.
  4. The compiler allows source and delim to be numeric literals, in which case it treats them as string literals, displaying the following Warning at compile time:
    Warning: Literal is numeric - treated as alphanumeric

    In such cases, leading zeros are stripped from the numeric literal to form the string literal.

  5. delim-dest is an alphanumeric data item.
  6. counter, ptr-var, and tally-var are integer numeric data items.
  7. statement-1 and statement-2 are imperative statements.
  8. ptr-var must be large enough to contain a value one greater than the size of source.
  9. The DELIMITER IN and COUNT IN phrases can appear only if there is a DELIMITED BY phrase.

General Rules

  1. UNSTRING breaks up source into the various dest fields. source is the sending field and dest is the receiving field. Up to 50 dest items are allowed.
  2. counter represents the count of the number of characters within source isolated by the delimiters for the move to dest. This does not include a count of the delimiter characters.
  3. ptr-var represents the relative character position within source to move from. The leftmost position is position "1". If no POINTER phrase is specified, examination begins with the leftmost character position.
  4. tally-var is a counter which is incremented by 1 for each dest item accessed during the UNSTRING operation.
  5. Neither ptr-var nor tally-var is initialized by the UNSTRING statement.
  6. Each delim represents one delimiter. When a delimiter contains two or more characters, all the characters must be present in contiguous positions in source to be recognized as a delimiter. When delim is a figurative constant, it stands for a single-character nonnumeric literal.
  7. When the ALL phrase is specified, one or more contiguous occurrences of delim in source are treated as if they were only one occurrence for the remaining General Rules. Only one occurrence of delim is moved to delim-dest in this case.
  8. When two or more delimiters are specified, an OR condition exists between them. Each delimiter is compared to the sending field in the order written. If a match occurs, the characters in the sending field are considered to be a single delimiter. No characters in source can be considered a part of more than one delimiter.
  9. When an examination encounters two contiguous delimiters, the current receiving area is space-filled if it is alphabetic or alphanumeric, or zero-filled if it is numeric.
  10. When the UNSTRING statement initiates, the current receiving area is the first dest item. Data is transferred from source to the receiving area according to the following rules:
    1. Examination starts at the character position indicated by ptr-var, or the leftmost position if ptr-var is not specified.
    2. If the DELIMITED BY phrase is specified, the examination proceeds left-to-right until a delimiter is encountered. If the DELIMITED BY phrase is not specified, the number of characters examined is equal to the size of the receiving area. The sign character of the receiving item (if any) is not included in the size. If the end of source is encountered before the delimiting condition is met, the examination stops with the last character of source.
    3. The characters examined (excluding the delimiting characters, if any) are treated as an elementary alphanumeric item. These characters are moved to the current receiving field according to the rules for the MOVE statement, including space filling.
    4. If the DELIMITER IN phrase is specified, the delimiting characters are moved to delim-dest as if they were the alphanumeric source of a MOVE statement. If the delimiting condition is the end of source, then delim-dest is space-filled.
    5. If the COUNT IN phrase is specified, the number of characters examined (excluding the delimiter) is moved to counter as if the count were the numeric source of a MOVE statement.
    6. If the DELIMITED BY phrase is specified, the source item is further examined beginning with the first character to the right of the delimiter found. If the DELIMITED BY phrase is not specified, the source item is further examined beginning with the character to the right of the last character examined.
    7. The current receiving area is then set to the next dest item and the cycle specified in steps 2 through 7 is repeated until either all the characters in source are examined or there are no more dest items.
  11. The ptr-var (if any) is incremented by 1 for each character in source examined.
  12. An overflow condition occurs in either of the following situations:
    1. The value of ptr-var is less than one or greater than the size of source when the UNSTRING statement starts.
    2. During execution, all dest items have been acted upon and source contains unexamined characters.
  13. When the overflow condition exists, statement-1 (if any) executes and the UNSTRING statement terminates.
  14. If statement-2 is specified, it executes after the UNSTRING statement has finished if the overflow condition has not occurred.
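
General Rules 4, 5, and 9 can be seen together in a small program. The following is a minimal sketch, not part of the original entry: the program name UNSTR-EMPTY and the data names COUNT-2 and FIELD-TALLY are hypothetical, and the source is laid out in ANSI fixed format, so a different source format may need the columns adjusted.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTR-EMPTY.
      * Illustrative sketch: all program and data names here are
      * hypothetical and do not come from the rules above.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  COLOR-LIST   PIC X(9)  VALUE "RED,,BLUE".
       01  COLOR-1      PIC X(6)  VALUE "??????".
       01  COLOR-2      PIC X(6)  VALUE "??????".
       01  COLOR-3      PIC X(6)  VALUE "??????".
       01  COUNT-2      PIC 9     VALUE 9.
       01  FIELD-TALLY  PIC 99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * General Rule 5: tally-var is not initialized by UNSTRING.
           MOVE 0 TO FIELD-TALLY.
      * No ALL phrase, so the two commas are two delimiters and
      * the field between them is empty (General Rule 9).
           UNSTRING COLOR-LIST
              DELIMITED BY ","
              INTO COLOR-1
                   COLOR-2 COUNT IN COUNT-2
                   COLOR-3
              TALLYING IN FIELD-TALLY
           END-UNSTRING.
           DISPLAY "COLOR-1     = [" COLOR-1 "]".
           DISPLAY "COLOR-2     = [" COLOR-2 "]".
           DISPLAY "COLOR-3     = [" COLOR-3 "]".
           DISPLAY "COUNT-2     = " COUNT-2.
           DISPLAY "FIELD-TALLY = " FIELD-TALLY.
           STOP RUN.
```

Under the rules above, COLOR-1 should receive "RED   ", COLOR-2 should be space-filled with COUNT-2 set to 0, COLOR-3 should receive "BLUE  ", and FIELD-TALLY should end at 3, because the empty field still counts as a dest item accessed.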

Code Examples

Use UNSTRING to decompose strings containing multiple data elements. For example, a string data item might contain a person's name, using commas to separate the name fields: last-name,first-name,middle-initial. Using UNSTRING, and specifying "," (comma) as the delimiter, you could separate the name string into three data items, each containing an element of the full name.

Example 1

Assume the following data items:

01  CUSTOMER-NAME    PIC X(40)  VALUE ALL SPACES.
01  LAST-NAME        PIC X(25)  VALUE ALL SPACES.
01  FIRST-NAME       PIC X(14)  VALUE ALL SPACES.
01  MIDDLE-I         PIC X      VALUE ALL SPACES.
{ . . . }
PROCEDURE DIVISION.
{ . . . }
    DISPLAY 'Enter name: LAST,FIRST,MIDDLE-INITIAL'.
    DISPLAY 'Use a comma to separate each name entry'.
    ACCEPT CUSTOMER-NAME.

{ . . . }

UNSTRING CUSTOMER-NAME
   DELIMITED BY ","
   INTO LAST-NAME,   |characters to first comma
        FIRST-NAME,  |characters to second comma
        MIDDLE-I     |gets only the first character
                     |of the remaining string.  No
                     |overflow is raised. 
                     |See general rule 12.
   ON OVERFLOW
      DISPLAY 'OVERFLOW on UNSTRING'
END-UNSTRING.

For Examples 2 and 3, assume the following data items:

01  COLOR-LIST  PIC X(22) VALUE "RED:BLUE/GREEN  YELLOW".
01  COLOR-1     PIC X(6)  VALUE ALL SPACES.
01  COLOR-2     PIC X(6)  VALUE ALL SPACES.
01  COLOR-3     PIC X(6)  VALUE ALL SPACES.
01  COLOR-4     PIC X(6)  VALUE ALL SPACES.
01  DELIMIT-1   PIC X(3)  VALUE ALL SPACES.
01  COUNT-1     PIC 9     VALUE 0.

Example 2

UNSTRING COLOR-LIST 
   DELIMITED BY ":" OR "/" OR ALL SPACE
*ALL SPACE treats contiguous spaces 
*as one delimiter.
   INTO COLOR-1,
        COLOR-2,
        COLOR-3,
        COLOR-4 
END-UNSTRING.
*COLOR-1 = "RED   "
*COLOR-2 = "BLUE  "
*COLOR-3 = "GREEN "
*COLOR-4 = "YELLOW"

Example 3

MOVE 0 TO COUNT-1.

UNSTRING COLOR-LIST
   DELIMITED BY ":" OR "/" OR ALL SPACE
*DELIMIT-1 and COUNT-1 will hold only
*the values associated with COLOR-1.
   INTO COLOR-1
         DELIMITER IN DELIMIT-1
         COUNT IN COUNT-1,
         COLOR-2,
         COLOR-3,
         COLOR-4
   ON OVERFLOW 
      DISPLAY "overflow: unstring colors"
   NOT ON OVERFLOW
*do when UNSTRING succeeds.
      PERFORM SORT-COLORS
END-UNSTRING.
*COLOR-1 = "RED   "
*COLOR-2 = "BLUE  "
*COLOR-3 = "GREEN "
*COLOR-4 = "YELLOW"
*DELIMIT-1 = ":  "
*COUNT-1 = 3 (the number of characters in RED)
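
Example 3 captures the delimiter and count for the first receiving field only. Because the General Format allows DELIMITER IN and COUNT IN after every dest item, each field can have its own pair. The following is a minimal sketch of that variation. The program name UNSTR-EACH and the table COLOR-TABLE with its subordinate items are hypothetical, and the layout assumes ANSI fixed format.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTR-EACH.
      * Illustrative sketch: the table and its item names are
      * hypothetical, chosen only for this example.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  COLOR-LIST  PIC X(22) VALUE "RED:BLUE/GREEN  YELLOW".
       01  COLOR-TABLE.
           05  COLOR-ENTRY OCCURS 4 TIMES.
               10  COLOR-VAL    PIC X(6).
               10  COLOR-DELIM  PIC X.
               10  COLOR-LEN    PIC 9.
       01  IDX         PIC 9.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Each dest item carries its own DELIMITER IN and COUNT IN.
           UNSTRING COLOR-LIST
              DELIMITED BY ":" OR "/" OR ALL SPACE
              INTO COLOR-VAL(1) DELIMITER IN COLOR-DELIM(1)
                                COUNT IN COLOR-LEN(1)
                   COLOR-VAL(2) DELIMITER IN COLOR-DELIM(2)
                                COUNT IN COLOR-LEN(2)
                   COLOR-VAL(3) DELIMITER IN COLOR-DELIM(3)
                                COUNT IN COLOR-LEN(3)
                   COLOR-VAL(4) DELIMITER IN COLOR-DELIM(4)
                                COUNT IN COLOR-LEN(4)
           END-UNSTRING.
      * Show value, delimiter, and length for every field.
           PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 4
              DISPLAY COLOR-VAL(IDX) " [" COLOR-DELIM(IDX) "] "
                      COLOR-LEN(IDX)
           END-PERFORM.
           STOP RUN.
```

Following General Rules 7 and 10, the delimiters recorded should be ":", "/", a single space (only one occurrence is moved when ALL applies), and a space again for the last field, because the delimiting condition there is the end of source. The counts should be 3, 4, 5, and 6.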

Example 4

When the string does not contain delimiters between the data elements, but the size and position of each string data element is known, the string can be deconstructed without a DELIMITED BY phrase.

Assume the following data items:

01  COLOR-LIST   PIC X(7) VALUE "REDBLUE".
01  COLOR-1      PIC X(3) VALUE ALL SPACES.
01  COLOR-2      PIC X(4) VALUE ALL SPACES.
{ . . . }
PROCEDURE DIVISION.
{ . . . }
UNSTRING COLOR-LIST
   INTO COLOR-1,
*first substring must be three characters.
        COLOR-2
*second substring must be four characters.
END-UNSTRING.
*COLOR-1 = "RED"
*COLOR-2 = "BLUE"

Example 5

Use POINTER and a PERFORM loop to extract and process string elements.

Assume the following data items:

01  COLOR-LIST       PIC X(21)  VALUE "RED BLUE GREEN YELLOW".
01  COLOR-LIST-SIZE  PIC 999.
01  COLOR-1          PIC X(6)   VALUE SPACES.
01  STRING-PTR       PIC 99.
01  FLAGS.
    05  COLOR-STRING-EMPTY   PIC X VALUE "N".
        88 NO-MORE-COLORS          VALUE "Y".
{ . . . }
PROCEDURE DIVISION.
{ . . . }
*string pointer must be initialized
MOVE 1 TO STRING-PTR.
SET COLOR-LIST-SIZE TO SIZE OF COLOR-LIST.
PERFORM PROCESS-COLOR UNTIL NO-MORE-COLORS.
{ . . . }
PROCESS-COLOR.
   UNSTRING COLOR-LIST 
      DELIMITED BY ALL SPACE
      INTO COLOR-1
      POINTER STRING-PTR
      ON OVERFLOW
*An OVERFLOW condition will be raised every time
*through the loop, except when extracting the last
*substring.  When the overflow is the result of
*having unexamined characters at the end of the
*input string, take no action.  When the overflow
*is due to the pointer value exceeding the length
*of the string, set COLOR-STRING-EMPTY.
         IF STRING-PTR > COLOR-LIST-SIZE THEN
            MOVE "Y" TO COLOR-STRING-EMPTY
         END-IF
*process the value (still within ON OVERFLOW)
         PERFORM STORE-COLOR-1
*initialize COLOR-1 before fetching the next color
         MOVE SPACES TO COLOR-1
   END-UNSTRING.
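
Example 5 drives its loop from the overflow condition. An alternative is to test the pointer in the PERFORM condition and omit ON OVERFLOW, in which case an overflow is simply ignored. The following is a minimal sketch of that approach, not a replacement for Example 5. The program name UNSTR-LOOP is hypothetical, the sketch assumes the compiler supports the LENGTH intrinsic function, and the layout assumes ANSI fixed format.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTR-LOOP.
      * Illustrative sketch: the program name is hypothetical and
      * FUNCTION LENGTH is assumed to be available.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  COLOR-LIST  PIC X(21) VALUE "RED BLUE GREEN YELLOW".
       01  COLOR-1     PIC X(6).
       01  STRING-PTR  PIC 99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The pointer must be initialized before the first UNSTRING.
           MOVE 1 TO STRING-PTR.
      * Stop once the pointer has moved past the end of the source.
           PERFORM UNTIL STRING-PTR > FUNCTION LENGTH(COLOR-LIST)
              UNSTRING COLOR-LIST
                 DELIMITED BY ALL SPACE
                 INTO COLOR-1
                 WITH POINTER STRING-PTR
              END-UNSTRING
              DISPLAY "COLOR: " COLOR-1
           END-PERFORM.
           STOP RUN.
```

Each pass extracts one color and leaves STRING-PTR just past the delimiter, so the loop should display RED, BLUE, GREEN, and YELLOW in turn. COLOR-1 does not need to be cleared between passes, because the implied MOVE space-fills it each time.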

Highlights for first-time users

  1. UNSTRING is best suited for separating string components that share a common delimiter. The delimiter must not appear as an element of the components' values.
  2. DELIMITED BY is optional. If it's omitted, each destination data item is completely filled. In effect, the size of each destination data item acts as its delimiter.
  3. Assignment to the destination data item is done with an implied MOVE. The MOVE operation will truncate the substring or space fill the destination data item, as required. Truncation of the substring, or space filling of the destination data item resulting from the implicit MOVE, does not raise an OVERFLOW condition.
  4. The OVERFLOW condition is raised if: (a) all destination data items are used and characters still remain in the source data item; or (b) POINTER is used and the value of the pointer variable is less than 1 or greater than the length of the source data item.
  5. Use the ALL option to treat contiguous occurrences of a delimiter, such as spaces, as a single occurrence.
  6. Use DELIMITER IN to place the delimiting character(s) of the current substring into the named data item.
  7. Use the COUNT IN option to save the length of the current substring into the named data item.
  8. Use TALLYING to tally the number of destination data items assigned by the UNSTRING statement.
  9. Use the POINTER option to specify a numeric holder (ptr-var) for the current position in the source data item. By pre-assigning a value to the pointer variable, you can start the examination of the source data item at any position in the string. ptr-var is incremented by one for each character in the source data item that is examined. POINTER allows the programmer to use multiple UNSTRING statements to process the source data item. Note, however, that an overflow condition will be raised if unexamined characters remain when the UNSTRING statement terminates, that is, if the value of ptr-var has not advanced past the length of the string.

    You must initialize the tallying and pointer variables or results are unpredictable.

  10. Use the OVERFLOW option to do special processing when the UNSTRING process does not examine every character in the source data item, or when the pointer variable has a value of less than one or more than the length of the source data item. When the overflow condition exists, the associated imperative statement (if any) executes and program execution continues immediately after the UNSTRING statement.
  11. Use the NOT ON OVERFLOW option to do special processing when the UNSTRING statement processes the entire source data item.
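
Highlights 8 and 9 can be combined: one UNSTRING reads a leading label, and a second UNSTRING continues from the same pointer with a different delimiter while tallying the fields it fills. The following is a minimal sketch under stated assumptions. The program name UNSTR-TWO and the data names REC-IN, REC-LABEL, REC-PTR, and FIELD-TALLY are hypothetical, and the layout assumes ANSI fixed format.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTR-TWO.
      * Illustrative sketch: all program and data names here are
      * hypothetical, chosen only for this example.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  REC-IN       PIC X(21) VALUE "COLORS:RED,BLUE,GREEN".
       01  REC-LABEL    PIC X(8).
       01  COLOR-1      PIC X(6).
       01  COLOR-2      PIC X(6).
       01  COLOR-3      PIC X(6).
       01  REC-PTR      PIC 99.
       01  FIELD-TALLY  PIC 99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Neither the pointer nor the tally is initialized by UNSTRING.
           MOVE 1 TO REC-PTR.
           MOVE 0 TO FIELD-TALLY.
      * First pass: take the label up to the colon. Characters
      * remain unexamined, but with no ON OVERFLOW phrase the
      * overflow condition is ignored.
           UNSTRING REC-IN
              DELIMITED BY ":"
              INTO REC-LABEL
              WITH POINTER REC-PTR
           END-UNSTRING.
      * Second pass: resume where the first pass stopped.
           UNSTRING REC-IN
              DELIMITED BY ","
              INTO COLOR-1 COLOR-2 COLOR-3
              WITH POINTER REC-PTR
              TALLYING IN FIELD-TALLY
              ON OVERFLOW
                 DISPLAY "More than three colors in record"
           END-UNSTRING.
           DISPLAY REC-LABEL " " COLOR-1 " " COLOR-2 " " COLOR-3.
           DISPLAY "Fields filled: " FIELD-TALLY.
           STOP RUN.
```

After the first statement REC-PTR should be 8, the position just past the colon. The second statement should then fill the three color fields, leave FIELD-TALLY at 3, and raise no overflow, because the whole source has been examined.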