This COBOL system provides the following methods of sorting and merging files:
Sorting Method | Description |
---|---|
The run-time system COBOL sort module | The default module that executes a SORT statement in your COBOL program. It can also be called directly using the CALL statement. For details, see the section The Callable Sort Module in the chapter File Handler and Callable Sort APIs. |
mfsort utility | A utility, which you can invoke from the command line, that enables you to sort and merge data files. |
This chapter describes how to use the mfsort utility.
Mfsort enables you to sort and merge data files. It almost completely emulates IBM's Dfsort product, Release 14 and includes support for:
Details of these functions can be found on IBM's Dfsort websites which can be reached from the DFSORT home page: IBM DFSORT/MVS Overview.
The mfsort utility is provided as the file mfsort.exe.
You can invoke mfsort from the Net Express Command Prompt in one of the following ways:
mfsort instructions
mfsort take filename
where the parameters are:
Parameter | Description |
---|---|
instructions | Mfsort instructions. See the section Instructions. When specifying instructions on the command line, remember to observe the maximum command line length imposed by the operating system. |
filename | A text file containing mfsort instructions. See the section Instructions. Use this method if you need to specify a lot of instructions. |
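As an illustration of the second form, the following Python sketch writes a set of instructions to a text file and builds the corresponding `mfsort take` command line. The file name `golfsort.txt` and the helper `write_take_file` are hypothetical, the instructions are modeled on the golf.dat example in this chapter, and placing one instruction per line is an assumption based on the multi-line jobstreams shown in the examples section. The sketch only builds the command; it does not run mfsort.

```python
from pathlib import Path
import tempfile


def write_take_file(path, instructions):
    """Write one instruction (or comment) per line; return the argv to run."""
    Path(path).write_text("\n".join(instructions) + "\n", encoding="ascii")
    return ["mfsort", "take", str(path)]


instructions = [
    "* sort members by membership number (a leading * starts a comment)",
    "sort fields(1,6,nu,a)",
    "use golf.dat record f,28 org rl",
    "give members.dat",
]

# Hypothetical location for the instruction file.
take_path = Path(tempfile.gettempdir()) / "golfsort.txt"
argv = write_take_file(take_path, instructions)
print(" ".join(argv))
```

To run the command, the resulting list could be passed to `subprocess.run` on a machine where mfsort is on the path.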
The following is a list of valid mfsort instructions:
Instruction | Meaning |
---|---|
* | The rest of the line is treated as a comment. This is useful if you are supplying instructions via a text file as you can add comments to the file which explain the purpose of each instruction. |
CHAR-EBCDIC | Specifies that the data is EBCDIC. CHAR-EBCDIC must precede all SORT, MERGE, USE or GIVE instructions. |
SIGN-EBCDIC | Numeric DISPLAY items with included signs are interpreted according to the EBCDIC convention. SIGN-EBCDIC is not required when CHAR-EBCDIC is specified; use it for data that is otherwise ASCII, for example when the program that created the data was compiled with the SIGN"EBCDIC" Compiler directive. SIGN-EBCDIC must precede all SORT, MERGE, USE or GIVE instructions. |
SORT/MERGE | These instructions specify either a sort or a merge option and must be followed by a FIELDS instruction specifying the field(s) to be used. The FIELDS instruction may optionally be followed by a RECORD instruction specifying the record size and format of the workfile. SORT and MERGE are mutually exclusive. |
FIELDS (instructions) | The fields on which the file is to be sorted or merged. See the section Fields Instruction. |
RECORD definition | Record size and format. A RECORD instruction can be used to specify these details for the workfile, input file(s) and output file(s). See the section RECORD Instruction. |
USE input-file | Each USE instruction specifies an input file. You must specify all USE instructions before any GIVE instructions. See the section Defining Input and Output Files. |
GIVE output-file | Each GIVE instruction specifies an output file. See the section Defining Input and Output Files. |
INCLUDE/OMIT | Specifies conditions in which records will be included or omitted from the sort process. For details, see the IBM documentation to be found at Using DFSORT Program Control Statements. INCLUDE and OMIT are mutually exclusive. |
INREC | Reformats records before the SORT/MERGE process. |
OUTREC | Reformats records following the SORT/MERGE process. |
MODS | Specifies external procedures (user exits) that are executed each time a record is released to or returned from the SORT/MERGE process. This implementation supports the E15 and E35 user exits. |
SUM | Specifies that records with the same key value are returned as a single record. Optionally, a field may be specified to accumulate totals for all records with equal keys. |
OUTFIL | This is used to specify complex editing and reporting to one or more output files. Each output file should be specified using a GIVE command. Otherwise, OUTFIL works as described in the IBM documentation to be found at Using DFSORT Program Control Statements. |
OPTION | This can be used to specify various options. One of these options is COPY which results in records being copied, rather than sorted, to the output file. |
A SORT or MERGE instruction must be followed by a FIELDS instruction which specifies the fields on which the input file is to be sorted or merged.
A FIELDS instruction takes the following form:
fields({start,length,type,order},...)
where the parameters are:
Parameter | Description |
---|---|
start | The starting position of the field in the record, counting in bytes from 1 |
length | The length of the field (bytes) |
type | The type of data in the field. See the section Field Types . |
order | The ordering of output, which can be either of: A - ascending; D - descending |
You can specify up to 16 fields by repeating the parameter set (start, length, type and order). Use commas to separate the parameters and the parameter sets.
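The meaning of the parameter sets can be pictured with a small Python emulation. This is only a sketch of the semantics described above, not of how mfsort is implemented: it handles just the CH and NU field types, treats records as strings of single-byte characters, and the names `parse_fields`, `sort_records` and `member` are hypothetical.

```python
def parse_fields(spec):
    """Split 'start,length,type,order,...' into (start, length, type, order)."""
    parts = [p.strip() for p in spec.split(",")]
    if len(parts) % 4 != 0:
        raise ValueError("each field needs start, length, type and order")
    fields = []
    for i in range(0, len(parts), 4):
        start, length = int(parts[i]), int(parts[i + 1])
        fields.append((start, length, parts[i + 2].lower(), parts[i + 3].lower()))
    if len(fields) > 16:
        raise ValueError("at most 16 fields may be specified")
    return fields


def sort_records(records, spec):
    """Sort string records on the given fields (CH and NU types only)."""
    result = list(records)
    # Apply keys from least to most significant; Python's sort is stable.
    for start, length, ftype, order in reversed(parse_fields(spec)):
        def key(rec, s=start - 1, n=length, t=ftype):
            raw = rec[s:s + n]  # positions count from 1, so subtract 1
            return int(raw) if t == "nu" else raw
        result.sort(key=key, reverse=(order == "d"))
    return result


def member(number, lname, fname, handicap):
    """Build a 28-character record: number(6) lname(10) fname(10) handicap(2)."""
    return f"{number:06d}{lname:<10}{fname:<10}{handicap:02d}"


members = [
    member(300, "Smith", "Ann", 20),
    member(100, "Jones", "Bob", 7),
    member(200, "Smith", "Al", 12),
]
by_number = sort_records(members, "1,6,nu,a")
by_name_then_handicap = sort_records(members, "7,10,ch,a,27,2,nu,d")
```

The second call sorts on surname ascending and, within equal surnames, on handicap descending, which shows how a repeated parameter set acts as a secondary key.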
The following is a list of some of the available field types:
Field Type | Definition |
---|---|
AQ | Character with alternate collating sequence. |
BI | COMP |
C5 | COMP-5 |
C6 | COMP-6 |
CH | PIC X DISPLAY |
CX | COMP-X |
FL | Floating point, signed. |
FS/CSF | Signed numeric, with optional leading floating sign. |
LI/OL/CLO | PIC S9 LEADING INCLUDED |
LS/CSL | PIC S9 LEADING SEPARATE |
NU | PIC 9 DISPLAY |
PD | PIC S9 COMP-3 |
PD0 | Packed decimal with first semi-byte and sign semi-byte ignored. |
SB/FI | PIC S9 COMP |
S5 | S9 COMP-5 |
SS | Substring. Used in conditions only. |
TS/CST | PIC S9 TRAILING SEPARATE |
TI/ZD/OT/CTO | PIC S9 TRAILING INCLUDED |
Y2B | Two-digit, one-byte binary year data. |
Y2C/Y2Z | Two-digit, two-byte year data, with optional trailing included sign. PIC 99 or PIC S99. |
Y2D | Two-digit, one-byte packed decimal year data. PIC 99 COMP-6. |
Y2P | Two-digit, two-byte packed decimal year data. PIC 99 COMP-3. |
Y2S | Two-digit, two-byte character year data with special indicators. Binary zeros, blanks and binary ones are treated as special cases. |
Y2T | Full date format, yyx... |
Y2U | Full date format, yyx..., COMP-3. |
Y2V | Full date format, yyx..., COMP-3. Ignores first semi-byte. |
Y2W | Full date format, x...yy. |
Y2X | Full date format, x...yy, COMP-3. |
Y2Y | Full date format, x...yy, COMP-3. Ignores first semi-byte. |
You can find other field types defined in the IBM documentation at SORT Control Statement.
Suppose that golf.dat is a relative file defined in a COBOL program as follows:
file-control.
    select members-file assign to "d:\netexpress\base\workarea\golf.dat"
        organization is relative
        access mode is random
        relative key is relative-key.
data division.
file section.
fd  members-file
    record contains 28 characters.
01  members-record.
    03 members-number    pic 9(6).
    03 members-lname     pic x(10).
    03 members-fname     pic x(10).
    03 members-handicap  pic 9(2).
You can then use the following mfsort command to sort the file golf.dat on the field containing the membership number in ascending order:
mfsort sort fields(1,6,nu,a) use golf.dat record f,28 org rl give members.dat
The sorted version of the file is written to the file members.dat.
You need to give instructions to define both the input and output files:
File | Instructions |
---|---|
Input | USE input-file [record definition] [org organization] [key structure] |
Output | GIVE output-file [record definition] [org organization] [key structure] |
Notes:
You use the RECORD instruction to specify the format and length of records in the workfile, the input file(s) and the output file(s).
The RECORD instruction takes the following form:
RECORD format,rec-len,max-len
where the parameters are:
Parameter | Description |
---|---|
format | The record format, one of: F - fixed length records of length rec-len; V - variable length records with a minimum length of rec-len and a maximum length of max-len |
rec-len | If format is set to F, the record length. If format is set to V, the minimum record length |
max-len | If format is set to V, the maximum record length |
If you do not specify a RECORD instruction for the sort workfile, the format defaults to fixed record format, with the record size equal to the largest record specified in the USE or GIVE instructions.
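The default can be sketched in Python as follows. The helper names are hypothetical, and the pad character (a space) is an assumption: this chapter states only that records are truncated or padded as appropriate when they are copied to and from the workfile.

```python
def default_workfile_length(record_lengths):
    """Largest record length named in any USE or GIVE instruction."""
    return max(record_lengths)


def fit(record, length, pad=" "):
    """Truncate or pad a record to exactly `length` characters.

    The pad character is an assumption made for this sketch.
    """
    return record[:length].ljust(length, pad)


# Hypothetical job: inputs of 28 and 39 bytes, output of 80 bytes.
workfile_length = default_workfile_length([28, 39, 80])
padded = fit("000100Jones", workfile_length)
truncated = fit("000100Jones", 6)
```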
You do not need to specify a RECORD instruction for input files that are either variable length or indexed files as the file characteristics can be deduced from the file itself.
The ORG instruction specifies the file organization, and can be one of:
ORG Instruction | File Organization |
---|---|
IX | indexed |
RL | relative |
SQ | sequential (default value) |
LS | line sequential |
You do not need to specify an ORG instruction for input files that are either variable length or indexed files as the file characteristics can be deduced from the file itself.
The KEY instruction specifies the key structure for an indexed file. It is used when an output file is indexed and its key structure is not the same as that of the indexed input file.
The format of the KEY instruction is:
KEY ({start,length,ixkey},...)
where the parameters are:
Parameter | Description |
---|---|
start | The starting position of the key in a record, counting in bytes from 1 |
length | The number of bytes in the key |
ixkey | One of: P - Primary key (this must always be defined first); A - Alternate key; AD - Alternate key with duplicates; C - Component of the last-specified primary or alternate key |
You can repeat the KEY instruction as often as required to describe the entire key structure. Use commas to separate the parameters and parameter sets (start, length, ixkey).
You must define the keys in order of importance with the primary key first, followed by all its components if it is split, then the first alternate key and all of its components and so on.
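The ordering rule can be made concrete with a small Python parser that groups the parameter sets of a KEY instruction into keys and their components. This is a sketch of the rules stated above, not mfsort's own parser; the function name `parse_key` is hypothetical, and rejecting a second primary key is an assumption.

```python
def parse_key(spec):
    """Group 'start,length,ixkey,...' triples into keys with components."""
    parts = [p.strip().lower() for p in spec.split(",")]
    if len(parts) % 3 != 0:
        raise ValueError("each entry needs start, length and ixkey")
    keys = []
    for i in range(0, len(parts), 3):
        start, length, kind = int(parts[i]), int(parts[i + 1]), parts[i + 2]
        if kind == "c":
            # A component extends the most recently defined key.
            if not keys:
                raise ValueError("a component must follow a key definition")
            keys[-1]["components"].append((start, length))
        elif kind in ("p", "a", "ad"):
            # The primary key comes first (and, by assumption, only once).
            if (kind == "p") != (not keys):
                raise ValueError("the primary key must be defined first")
            keys.append({"kind": kind, "components": [(start, length)]})
        else:
            raise ValueError(f"unknown ixkey value: {kind}")
    return keys


keys = parse_key("4,5,p,10,5,c,20,2,ad,40,2,a,46,10,c")
for key in keys:
    print(key["kind"], key["components"])
```

Each key's first component comes from the P, A or AD entry itself, and any C entries that follow add further components to that key.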
The following example defines three keys:
KEY (4,5,p,10,5,c,20,2,ad,40,2,a,46,10,c)
where:
4,5,p,10,5,c represents the first (primary) key, which is split. Its first component starts at character position 4 with a length of 5 bytes and its second component starts at character position 10 with a length of 5 bytes.
20,2,ad represents the second (alternate) key, which can have duplicates and starts at character position 20 with a length of 2 bytes.
40,2,a,46,10,c represents the third key. This is a split alternate key, with the first component starting at character position 40 with a length of 2 bytes and the second component starting at character position 46 with a length of 10 bytes.
This section gives some examples of mfsort commands and jobstreams.
You can find other examples at the IBM document page, Examples of DFSORT Job Streams.
Imagine four indexed files (north.dat, south.dat, east.dat and west.dat) which contain for the north, south, east and west of the country the scores achieved by members of a national organisation in a national competition. The COBOL syntax used to define north.dat is shown below:
file-control.
    select idxfile assign to "north.dat"
        organization is indexed
        record key is member-id.
data division.
file section.
fd  idxfile
    record contains 39 characters.
01  idxfile-record.
    03 member-id   pic 9(6).
    03 surname     pic x(15).
    03 first-name  pic x(15).
    03 score       pic 9(3).
Each of the other files has been created in the same way and the results of the competition have been entered in the files. The following examples use these files.
The following mfsort command takes all of the records from each of the four files, sorts them on the member's surname in ascending order and outputs the result to the relative file members.dat:
mfsort sort fields(7,15,ch,a) use north.dat use south.dat use east.dat use west.dat give members.dat org rl
The following mfsort command takes each of the four files, sorts them on the member's score (highest score first) and outputs the result to the relative file scores.dat:
mfsort sort fields(37,3,nu,d) use north.dat use south.dat use east.dat use west.dat give scores.dat org rl
The following mfsort command takes each of the four files, sorts them on the membership number (which is the primary key) and outputs the result to the indexed file national.dat. All records for which the score field is less than 20 are omitted:
mfsort sort fields(1,6,nu,a) use north.dat use south.dat use east.dat use west.dat give national.dat omit cond (37,3,nu,lt,20)
The following mfsort command takes a line sequential file, sortin.dat, and sorts its records on a character field starting at position 11 with a length of 4 bytes. The results are output to the file sortout.dat, which will include only records for which the sub-string, starting at position 15 of length 3 bytes, is equal to any three consecutive characters in the string 'J69,L92,J82'.
mfsort sort fields=(11,4,ch,a) use sortin.dat org ls record (f 80) give sortout.dat include cond=(15,3,ss,eq,c'J69,L92,J82')
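The substring test can be emulated in Python as shown below. This is a sketch of the behavior described above, with a hypothetical helper name `ss_eq`; it is not taken from the mfsort implementation.

```python
def ss_eq(record, start, length, constant):
    """True if the field at (start, length) occurs anywhere in `constant`."""
    field = record[start - 1:start - 1 + length]  # positions count from 1
    return len(field) == length and field in constant


CONSTANT = "J69,L92,J82"


def make_record(code):
    """Hypothetical 80-byte record with a 3-byte code at position 15."""
    return (" " * 14 + code).ljust(80)


included = [c for c in ("J69", "L92", "X00", "J82", "69,")
            if ss_eq(make_record(c), 15, 3, CONSTANT)]
print(included)
```

Because the test matches any three consecutive characters of the constant, a field containing `69,` also satisfies it. The separators in the constant should therefore be characters that cannot occur in the data.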
The following mfsort instructions transform records containing a field of format cyymmdd to the format yyyymmdd.
Sort C'cyymmdd'
SORT FIELDS=(1,7,BI,A)                * sort C'cyymmdd'
use mfs110a.in org ls record (f 40)
* Transform C'cyymmdd' to C'yyyymmdd'
OUTFIL OUTREC=(1,1,CHANGE=(2,         * change C'c' as follows:
                 C'0',C'19',          * C'0' to C'19'
                 C'1',C'20',          * C'1' to C'20'
                 C'2',C'21'),         * C'2' to C'21'
               NOMATCH=(C'99')
               2,6)                   * copy C'yymmdd'
give sortout.dat
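The effect of the CHANGE lookup on a single record can be pictured with the following Python sketch. It mirrors the table in the instructions above: the century character in position 1 is replaced by a two-character century, unmatched values become '99', and positions 2 to 7 are copied unchanged. The function name `expand_cyymmdd` is hypothetical.

```python
# Lookup table taken from the CHANGE parameter: one input byte, two output bytes.
CENTURY = {"0": "19", "1": "20", "2": "21"}


def expand_cyymmdd(record):
    """Build the 8-character output record C'yyyymmdd' from C'cyymmdd'."""
    century = CENTURY.get(record[0:1], "99")  # NOMATCH value is '99'
    return century + record[1:7]              # then copy 'yymmdd'


for sample in ("0991231", "1000101", "9000101"):
    print(sample, "->", expand_cyymmdd(sample))
```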
The following is an example of how to use the OUTFIL command to produce a complex report, in this case a Profit and Loss report for one of four divisions. The input file, mfs121a.dat, is sorted on the first two fields and only records from the western region are output. The SECTIONS instruction produces a page throw when the field starting in position 3, length 10 bytes, changes. The input file, the mfsort instructions and the resulting report are shown below:
  Chips        San Martin     0088902203 West
  Chips        Oakland        0023412432 West
  Chips        San Jose       0123213335 West
  Ice Cream    Marin          0054234123 West
  Chips        Gilroy         0055484342 West
  Ice Cream    Napa           0085734283 West
  Pretzels     San Jose       0123488534 West
  Ice Cream    San Francisco  0092231245 West
  Chips        San Francisco  000324343q West
  Chips        San Jose       0123213335 South
  Ice Cream    San Martin     0100346730 West
  Pretzels     Marin          0534332344 West
  Chips        Gilroy         0055484342 South
  Chips        Morgan Hill    0098732232 West
  Pretzels     Morgan Hill    0084384340 West
  Ice Cream    San Jose       000002345u West
  Pretzels     Napa           0531234856 West
  Chips        Oakland        0023412432 South
  Pretzels     San Martin     000023438r West
  Chips        Los Angeles    000223401t West
  Ice Cream    Marin          0054234123 South
  Pretzels     San Francisco  0541230005 West
  Ice Cream    Napa           0085734283 South
  Pretzels     San Jose       0123488534 South
  Ice Cream    San Francisco  0092231245 South
  Chips        San Francisco  000324343q South
  Ice Cream    San Martin     0100346730 South
  Pretzels     Marin          0534332344 South
  Chips        Morgan Hill    0098732232 South
  Pretzels     Morgan Hill    0084384340 South
  Ice Cream    San Jose       000002345u South
  Pretzels     Napa           0531234856 South
  Pretzels     San Martin     000023438r South
  Chips        Los Angeles    0002234014 South
  Pretzels     San Francisco  0541230005 South
SORT FIELDS=(3,10,A,16,13,A),FORMAT=CH
use mfs121a.dat org ls record (f 80)
OUTFIL INCLUDE=(42,6,CH,EQ,C'West'),
  HEADER1=(5/,18:' Western Region',3/,
           18:'Profit and Loss Report',3/,
           18:' for ',&DATE,3/,
           18:' Page',&PAGE),
  OUTREC=(6:16,13,24:31,10,ZD,M5,LENGTH=20,75:X),
  SECTIONS=(3,10,SKIP=P,
    HEADER3=(2:'Division: ',3,10,5X,'Page:',&PAGE,2/,
             6:'Branch Office',24:' Profit/(Loss)',/,
             6:'-------------',24:'--------------------'),
    TRAILER3=(6:'=============',24:'====================',/,
              6:'Total',24:TOTAL=(31,10,ZD,M5,LENGTH=20),/,
              6:'Lowest',24:MIN=(31,10,ZD,M5,LENGTH=20),/,
              6:'Highest',24:MAX=(31,10,ZD,M5,LENGTH=20),/,
              6:'Average',24:AVG=(31,10,ZD,M5,LENGTH=20),/,
              3/,2:'Average for all Branch Offices so far:',
              X,SUBAVG=(31,10,ZD,M5))),
  TRAILER1=(8:'Page ',&PAGE,5X,'Date: ',&DATE,5/,
            8:'Total Number of Branch Offices Reporting: ',
            COUNT,2/,
            8:'Summary of Profit/(Loss) for all',
            ' Western Division Branch Offices',2/,
            12:'Total:',
            22:TOTAL=(31,10,ZD,M5,LENGTH=20),/,
            12:'Lowest:',
            22:MIN=(31,10,ZD,M5,LENGTH=20),/,
            12:'Highest:',
            22:MAX=(31,10,ZD,M5,LENGTH=20),/,
            12:'Average:',
            22:AVG=(31,10,ZD,M5,LENGTH=20))
give outfil1.dat
                 Western Region
                 Profit and Loss Report
                 for 11/05/95
                 Page 1
**************************************************************************
 Division: Chips          Page: 2

     Branch Office      Profit/(Loss)
     -------------     --------------------
     Gilroy                     554,843.42
     Los Angeles                (22,340.14)
     Morgan Hill                987,322.32
     Oakland                    234,124.32
     San Francisco              (32,434.31)
     San Jose                 1,232,133.35
     San Martin                 889,022.03
     =============     ====================
     Total                    3,842,670.99
     Lowest                     (32,434.31)
     Highest                  1,232,133.35
     Average                    548,952.99

 Average for all Branch Offices so far: 548,952.99
**************************************************************************
 Division: Ice Cream      Page: 3

     Branch Office      Profit/(Loss)
     -------------     --------------------
     Marin                      542,341.23
     Napa                       857,342.83
     San Francisco              922,312.45
     San Jose                      (234.55)
     San Martin               1,003,467.30
     =============     ====================
     Total                    3,325,229.26
     Lowest                        (234.55)
     Highest                  1,003,467.30
     Average                    665,045.85

 Average for all Branch Offices so far: 597,325.02
**************************************************************************
 Division: Pretzels       Page: 4

     Branch Office      Profit/(Loss)
     -------------     --------------------
     Marin                    5,343,323.44
     Morgan Hill                843,843.40
     Napa                     5,312,348.56
     San Francisco            5,412,300.05
     San Jose                 1,234,885.34
     San Martin                  (2,343.82)
     =============     ====================
     Total                   18,144,356.97
     Lowest                      (2,343.82)
     Highest                  5,412,300.05
     Average                  3,024,059.49

 Average for all Branch Offices so far: 1,406,236.51
**************************************************************************
       Page 5     Date: 11/05/95

       Total Number of Branch Offices Reporting: 18

       Summary of Profit/(Loss) for all Western Division Branch Offices

           Total:          25,312,257.22
           Lowest:            (32,434.31)
           Highest:         5,412,300.05
           Average:         1,406,236.51
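To see how the ten-byte profit field in the input relates to the figures in the report, the following Python sketch decodes a field and formats it in the style shown above. The decoding rule is inferred from the sample data and report only (a trailing letter p to y stands for a negative final digit 0 to 9, and the last two digits are decimal places); it is an assumption, not a statement of how mfsort implements the ZD type or the M5 edit mask. The function names are hypothetical.

```python
def decode_zd(field):
    """Decode a trailing-sign numeric field into a signed integer (in cents)."""
    last = field[-1]
    if "p" <= last <= "y":
        # Inferred from the sample data: 'p'..'y' mean negative 0..9.
        digit, sign = ord(last) - ord("p"), -1
    else:
        digit, sign = int(last), 1
    return sign * int(field[:-1] + str(digit))


def format_m5(value):
    """Format cents with thousands separators; negatives in parentheses."""
    whole, cents = divmod(abs(value), 100)
    text = f"{whole:,}.{cents:02d}"
    return f"({text})" if value < 0 else text


# Profit fields of the West region Chips records from the input file.
chips_west = ["0055484342", "000223401t", "0098732232", "0023412432",
              "000324343q", "0123213335", "0088902203"]
amounts = [decode_zd(f) for f in chips_west]
print([format_m5(a) for a in amounts])
print("Total", format_m5(sum(amounts)))
```

Summing the decoded values reproduces the Total line of the Chips section of the report.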
During a sort or merge operation, mfsort uses a temporary workfile. This workfile is paged to disk in the current directory or, if it is set, in the directory specified by the TMP environment variable.
Mfsort copies all the records from each of the input files to the temporary workfile, truncated or padded as appropriate. The workfile is then sorted or merged according to its key description. After being sorted or merged in the workfile, the records are copied to each of the output files and truncated or padded as appropriate.
During this operation:
A full list of mfsort error messages is given in the Net Express online help. (Click Help Topics on the Help menu. Then, on the Index tab, double-click Mfsort, error messages.)
Copyright © 2000 MERANT International Limited. All rights reserved.
This document and the proprietary marks and names
used herein are protected by international law.