
Chapter 8: Sorting Files

In addition to the run-time COBOL sort module which is used by default to execute a SORT statement in your COBOL program, this COBOL system provides two further methods of sorting and merging files:

8.1 Mfsort Utility

Mfsort enables you to sort and merge data files. It is invoked from the command line in one of the following ways:

mfsort instructions
mfsort take filename

where the parameters are:

instructions

See the section Mfsort Instructions below.

When specifying instructions on the command line, remember to:

  • Observe the maximum command line length imposed by the operating system.

  • Prevent the shell from interpreting special characters, such as parentheses or the comment character *, in one of the following ways:

    • Use quotes "" to enclose parentheses.

    • For parentheses, place an escape character \ directly in front of each parenthesis. For a comment, place an escape character \ directly in front of *.

filename

A text file containing Mfsort instructions. See the section Mfsort Instructions below. Use this method if you need to specify a lot of instructions.
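The two invocation forms can be sketched in Python as follows. This is only an illustration of the quoting rule, not part of Mfsort: the helper names and the instruction file name sortgolf.txt are hypothetical. It relies on the standard shlex.quote function, which wraps a word in single quotes when it contains characters the shell would otherwise interpret; a POSIX shell is assumed.

```python
import shlex

def build_inline_command(instructions):
    """Return an mfsort command line with each word quoted for a POSIX shell."""
    # shlex.quote leaves plain words alone and single-quotes any word that
    # contains shell-special characters such as ( ) or *.
    return " ".join(["mfsort"] + [shlex.quote(word) for word in instructions])

def build_take_command(instruction_file):
    """Return the command line for the 'mfsort take filename' form."""
    # instruction_file is a hypothetical name chosen by the caller.
    return " ".join(["mfsort", "take", shlex.quote(instruction_file)])

print(build_inline_command(
    ["sort", "fields(1,6,nu,a)", "use", "golf.dat", "record", "f,28",
     "org", "rl", "give", "members.dat"]))
print(build_take_command("sortgolf.txt"))
```

Only the word containing parentheses is quoted; the remaining words pass through unchanged.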

8.2 Mfsort Workfile

During a sort/merge operation, Mfsort uses a temporary workfile. This workfile is paged to disk in the current directory or, if it is set, in the directory specified by the TMP environment variable.

Mfsort copies all the records from each of the input files, truncated or padded as appropriate, to the temporary workfile. The workfile is then sorted/merged according to its key description. After being sorted/merged in the workfile, the records are copied to each of the output files and truncated or padded as appropriate.
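The truncation and padding step can be pictured with a short Python sketch. It is not Mfsort's implementation. The function name is hypothetical, and the pad character (a space) is an assumption, because this chapter does not state which character Mfsort uses.

```python
def fit_record(record: bytes, length: int, pad: bytes = b" ") -> bytes:
    """Truncate or pad a record so that it is exactly `length` bytes long."""
    if len(record) >= length:
        # Record is too long (or already the right size): keep the leading bytes.
        return record[:length]
    # Record is too short: append the pad byte (assumed to be a space).
    return record + pad * (length - len(record))

print(fit_record(b"000042Smith", 6))   # truncated to the first 6 bytes
print(fit_record(b"000042", 10))       # padded on the right to 10 bytes
```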

8.3 Mfsort Instructions

The general format of an mfsort command is:

mfsort [*] [CHAR-EBCDIC] [SIGN-EBCDIC] {SORT|MERGE}
   fields(instructions,...) [record definition]
   {USE input-file}...
   {GIVE output-file}...

Instruction            Meaning
*                      The rest of the line is treated as a comment. This is useful if you are supplying instructions via a text file, because you can add comments to the file that explain the purpose of each instruction.
CHAR-EBCDIC            Specifies EBCDIC data. CHAR-EBCDIC must precede all SORT, MERGE, USE or GIVE instructions.
SIGN-EBCDIC            Specifies that numeric DISPLAY items with included signs are interpreted according to the EBCDIC convention. SIGN-EBCDIC is not required when CHAR-EBCDIC is specified. Use it for data that is otherwise ASCII, such as data created by a program that was compiled with the SIGN"EBCDIC" Compiler directive. SIGN-EBCDIC must precede all SORT, MERGE, USE or GIVE instructions.
SORT                   Specifies a sort operation and must be followed by a FIELDS instruction specifying the field(s) to be used for the sort operation. This may optionally be followed by a RECORD instruction specifying the record size and format of the sort workfile. SORT and MERGE are mutually exclusive.
MERGE                  Specifies a merge operation and must be followed by a FIELDS instruction specifying the field(s) to be used for the merge operation. This may optionally be followed by a RECORD instruction specifying the record size and format of the sort workfile. MERGE and SORT are mutually exclusive.
fields (instructions)  The fields on which the file is to be sorted/merged. See the section Fields Instruction below.
record definition      Record size and format. A RECORD instruction can be used to specify these details for the workfile, input file(s) and output file(s). See the section RECORD Instruction below.
USE input-file         Each USE instruction specifies an input file. You must specify all USE instructions before any GIVE instructions. See the section Mfsort Input and Output Files below.
GIVE output-file       Each GIVE instruction specifies an output file. See the section Mfsort Input and Output Files below.

8.3.1 Fields Instruction

A SORT or MERGE instruction must be followed by a fields instruction which specifies the fields on which the input file is to be sorted/merged.

A fields instruction takes the following form:

fields({start,length,type,order},...)

start    Starting position of the field within the record, counting in bytes from 1.
length   Length of the field (in bytes).
type     Type of data in the field. See the section Field Types below.
order    Ordering of output, which can be either of:
         A - ascending
         D - descending

Up to 16 fields can be specified by repeating the parameter set (start, length, type, and order). The parameters and the parameter sets must be separated by commas.
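To make the layout of the parameter sets concrete, the following Python sketch splits a fields instruction into (start, length, type, order) tuples and applies the rules stated above. It is an illustration only: parse_fields is a hypothetical helper, not an Mfsort interface, and it does not check the type against the list of field types.

```python
import re

MAX_FIELDS = 16  # limit stated in this section

def parse_fields(text):
    """Parse 'fields(start,length,type,order,...)' into a list of tuples."""
    match = re.fullmatch(r"\s*fields\s*\((.*)\)\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError("not a fields instruction")
    parts = [part.strip() for part in match.group(1).split(",")]
    if len(parts) % 4 != 0:
        raise ValueError("each field needs start, length, type and order")
    fields = []
    for i in range(0, len(parts), 4):
        start, length, ftype, order = parts[i:i + 4]
        order = order.upper()
        if order not in ("A", "D"):
            raise ValueError("order must be A or D")
        fields.append((int(start), int(length), ftype.upper(), order))
    if len(fields) > MAX_FIELDS:
        raise ValueError("more than 16 fields specified")
    return fields

print(parse_fields("fields(7,15,ch,a,37,3,nu,d)"))
```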

8.3.1.1 Field Types

The field type must be one of the following:

Type  Definition
CH    PIC X DISPLAY
NU    PIC 9 DISPLAY
PD    PIC S9 COMP-3
FI    PIC S9.99 DISPLAY
BI    COMP
C5    COMP-5
C6    COMP-6
S5    S9 COMP-5
CX    COMP-X
LS    PIC S9 LEADING SEPARATE
TS    PIC S9 TRAILING SEPARATE
LI    PIC S9 LEADING INCLUDED
TI    PIC S9 TRAILING INCLUDED (compiled SIGN"EBCDIC")
SB    S9 COMP

For example, you can define a relative file called golf.dat within your COBOL program as follows:

file-control.
select members-file
   assign to "/mydir/golf.dat"
   organization is relative
   access mode is random
   relative key is relative-key.
data division.
file section.
fd members-file
   record contains 28 characters.
01 members-record.
   03 members-number pic 9(6).
   03 members-lname pic x(10).
   03 members-fname pic x(10).
   03 members-handicap pic 9(2).

You can then use the following mfsort command to sort the file golf.dat on the field containing the membership number:

mfsort sort fields"(1,6,nu,a)"
   use golf.dat record f,28 org rl
   give members.dat

The sorted version of the file is written to the file members.dat.
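The effect of a key such as (1,6,nu,a) on 28-byte records can be imitated in Python, as a rough model rather than a description of how Mfsort works internally. The function names and the sample member data are invented for illustration. The sketch compares raw bytes, which gives the expected order only for ASCII character fields and for unsigned numeric DISPLAY fields of equal length; it does not handle signed, packed or binary types, or EBCDIC data.

```python
def make_record(number, lname, fname, handicap):
    """Build one 28-byte record laid out like members-record."""
    return f"{number:06d}{lname:<10}{fname:<10}{handicap:02d}".encode("ascii")

def sort_records(records, fields):
    """Sort byte records on a list of (start, length, order) keys."""
    result = list(records)
    # Apply the keys from last to first. Python's sort is stable, so the
    # first key in the list ends up as the most significant one.
    for start, length, order in reversed(fields):
        result.sort(key=lambda rec: rec[start - 1:start - 1 + length],
                    reverse=(order == "D"))
    return result

# Sample data, made up for this example.
records = [
    make_record(300, "Palmer", "Ann", 12),
    make_record(17, "Jones", "Bob", 4),
    make_record(120, "Smith", "Cy", 20),
]
for rec in sort_records(records, [(1, 6, "A")]):
    print(rec.decode("ascii"))
```

Start positions count from 1, as in the fields instruction, so the slice begins at start - 1.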

8.3.2 Mfsort Input and Output Files

The following instructions are used to define each input file:

USE input-file [record definition]
   [org organization] [key structure]

and the following instructions are used to define each output file:

GIVE output-file [record definition] 
   [org organization] [key structure]

These instructions should be placed immediately after the associated USE or GIVE instruction. If you omit any of the above instructions, the last specified values are used. This means that if the input and output files are all of the same type and format, you need only specify values for the first file.
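The rule that omitted instructions take the last specified values can be modelled as follows. This Python sketch is illustrative only: resolve_files and the dictionary layout are hypothetical, and it assumes that the RECORD, ORG and KEY settings are each carried forward independently, which this chapter does not state explicitly. The starting organization of SQ is the default given in the section ORG Instruction.

```python
def resolve_files(files):
    """Fill in omitted record/org/key settings from the last specified values."""
    current = {"record": None, "org": "sq", "key": None}  # ORG defaults to SQ
    resolved = []
    for entry in files:
        # A setting given on this USE or GIVE replaces the carried value.
        for name in current:
            if entry.get(name) is not None:
                current[name] = entry[name]
        resolved.append({"kind": entry["kind"], "file": entry["file"], **current})
    return resolved

files = [
    {"kind": "use", "file": "north.dat", "org": "ix", "record": "f,39", "key": "(1,6,p)"},
    {"kind": "use", "file": "south.dat"},
    {"kind": "give", "file": "members.dat", "org": "rl"},
]
for item in resolve_files(files):
    print(item)
```

In this model south.dat inherits every setting from north.dat, while members.dat overrides only the organization.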

8.3.2.1 RECORD Instruction

The RECORD instruction is optional. It is used to specify the format and length of records both in the sort workfile and, when it follows the associated USE or GIVE instruction, in the input and output files.

The RECORD instruction takes the following form:

record format,rec-len,max-len

format   The record format, one of:
         F - fixed-length records of length rec-len
         V - variable-length records with a minimum length of rec-len and a maximum length of max-len

If you do not specify a RECORD instruction for the sort workfile, it defaults to fixed-length record format, with the record size equal to the largest record specified in the USE or GIVE instructions.

8.3.2.2 ORG Instruction

The ORG instruction specifies the file organization, one of:

IX - indexed
RL - relative
SQ - sequential
LS - line sequential

The default value of ORG is SQ.

8.3.2.3 KEY Instruction

The KEY instruction specifies the key structure for the file and is therefore mandatory for indexed files. It is not relevant to other file organizations.

The format of the KEY instruction is:

key ({start,length,ixkey},...)

start    Starting position of the key within the record, counting in bytes from 1.
length   Number of bytes in the key.
ixkey    Type of key, one of:
         P - Primary key (must always be defined first)
         A - Alternate key
         AD - Alternate key with duplicates
         C - Component of the last-specified primary or alternate key

The parameter set (start, length, ixkey) should be repeated as required to describe the entire key structure. The parameters and the parameter sets should be separated by commas.

You must define the keys in order of importance, the primary key first, followed by all its components if it is split, then the first alternate key and all of its components, and so on.

The following example defines three keys:

key (4,5,p,10,5,c,20,2,ad,40,2,a,46,10,c)

The first key, the primary key, is split with its first component starting at character position 4 with a length of 5 bytes and its second component starting at character position 10 with a length of 5 bytes.

The second key (alternate key) enables duplicates, starts at character position 20 and is 2 bytes in length.

The third key is a split alternate key, with the first component starting at character position 40 with a length of 2 bytes and the second component starting at character position 46 with a length of 10 bytes.
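The grouping of components under the key they belong to can be checked with a small Python sketch. It is an illustration of the rules in this section, not Mfsort code: parse_key and the dictionary layout are hypothetical, and the error text is borrowed from the list of Mfsort error messages purely as a label.

```python
import re

def parse_key(text):
    """Group a key instruction into keys, each with one or more components."""
    match = re.fullmatch(r"\s*key\s*\((.*)\)\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError("not a key instruction")
    parts = [part.strip() for part in match.group(1).split(",")]
    if len(parts) % 3 != 0:
        raise ValueError("each key needs start, length and ixkey")
    keys = []
    for i in range(0, len(parts), 3):
        start, length = int(parts[i]), int(parts[i + 1])
        kind = parts[i + 2].upper()
        if kind not in ("P", "A", "AD", "C"):
            raise ValueError("ixkey must be P, A, AD or C")
        # The primary key must come first, and only once.
        if (kind == "P") != (not keys):
            raise ValueError("Prime key not specified first")
        if kind == "C":
            # A component extends the last-specified key.
            keys[-1]["parts"].append((start, length))
        else:
            keys.append({"type": kind, "parts": [(start, length)]})
    return keys

for key in parse_key("key (4,5,p,10,5,c,20,2,ad,40,2,a,46,10,c)"):
    print(key)
```

Run against the example above, the sketch yields three keys: a split primary key, an alternate key with duplicates, and a split alternate key.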

8.4 Example Mfsort Commands

Imagine four indexed files (north.dat, south.dat, east.dat and west.dat) that hold the scores achieved in a national competition by members of a national organisation, with one file for each of the north, south, east and west of the country. The COBOL syntax used to define north.dat is shown below:

file-control.
   select idxfile assign to "north.dat"
      organization is indexed
      record key is member-id.
data division.
file section.
fd idxfile
   record contains 39 characters.
01 idxfile-record.
   03 member-id  pic 9(6).
   03 surname    pic x(15).
   03 first-name pic x(15).
   03 score      pic 9(3).

Each of the other files has been created in the same way and the results of the competition have been entered in the files.

The following mfsort command takes each of the four files, sorts them on the member's surname and outputs the result to a relative file, members.dat:

mfsort sort fields"(7,15,ch,a)"
   use north.dat org ix record f,39 key"(1,6,p)"
   use south.dat
   use east.dat
   use west.dat
   give members.dat org rl

The following mfsort command takes each of the four files, sorts them on the member's score (highest score first) and outputs the result to a relative file, scores.dat:

mfsort sort fields"(37,3,nu,d)"
   use north.dat org ix record f,39 key"(1,6,p)"
   use south.dat
   use east.dat
   use west.dat
   give scores.dat org rl

The following mfsort command takes each of the four files, sorts them on the membership number (which is the primary key) and outputs the result to an indexed file, national.dat:

mfsort sort fields"(1,6,nu,a)"
   use north.dat org ix record f,39 key"(1,6,p)"
   use south.dat
   use east.dat
   use west.dat
   give national.dat

8.5 Mfsort Error Messages

A list of Mfsort error messages is given below.

Instructions are needed
SORT or MERGE already specified
Failed to open filename
Failed to read filename
Invalid or illegal syntax
Unsupported syntax
Unable to get enough memory, aborting
Too many input files
Too many output files specified
All input files must be specified before specifying output files
Record format not specified
SORT failure, file status code: status
Please specify EBCDIC before other arguments
SIGN-EBCDIC incompatible with CHAR-EBCDIC
Prime key not specified first
Key description missing for ISAM file


Copyright © 2000 MERANT International Limited. All rights reserved.
This document and the proprietary marks and names used herein are protected by international law.
