
Overview

This document examines the functions that CitSORT supports and includes examples intended to ease your adoption of, or transition to, CitSORT.

CitSORT is a command-line data transformation utility. It provides a high-performance Sort alternative and powerful data transformation capabilities.

Functionally, CitSORT provides syntax for making use of multiple processors, if available; naming the input files; naming the rules for interpreting the input files, which can include transformation rules; and naming the output files, for which different formats can be described.

To understand CitSORT, it helps to follow how the utility logically parses a typical command. In the simplest case, CitSORT sorts a file on a key given by the “sort fields” parameter and writes the result to an output file. In the example below, we begin with a sequential file that lists the Presidents and Vice Presidents of the United States from first to most recent. Our simple sort reverses the order of the list and generates the output in line sequential format.

Setting the Temporary Directory

CitSORT uses the system’s temporary directory to store the temporary files that are required for the intermediate stages of a SORT. Windows, Linux and UNIX operating systems each designate a default temporary directory, which you can override with an environment variable.

On Windows

The TMP or TEMP environment variable specifies the directory to be used for temporary files. If neither TMP nor TEMP is defined, or if the variable names a directory that does not exist, temporary files are created in the current working directory.

A typical path is:

%USERPROFILE%\AppData\Local\Temp

Windows places no limit on the size of the temporary folder. You are limited only by the amount of free disk space available.

To change the temporary directory:

SET TMP=C:\NEWPATH\TMP

On Linux/UNIX

The TMPDIR environment variable specifies the directory to be used for temporary files. A typical path is /tmp or /var/tmp. If you need to change the maximum size of a file, or to raise a limit on the size of the temporary folder in a Linux/UNIX operating environment, consult your System Administrator.

To change the temporary directory:

export TMPDIR=/usr/newpath/tmp
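
As a minimal sketch of the step above, the snippet below creates the scratch directory before exporting TMPDIR, so that the variable never names a directory that is missing, and then reports the free space on that volume. The path /tmp/citsort-work is a hypothetical example, not a CitSORT default; substitute a location on a volume with enough room for your sort.

```shell
# Hypothetical scratch location; replace with a directory on a volume
# that has enough free space for the intermediate sort files.
scratch="/tmp/citsort-work"

# Create the directory first (no error if it already exists).
mkdir -p "$scratch"

# Point TMPDIR at it for this shell session and any programs started from it.
export TMPDIR="$scratch"

# Report free space (in KB) on the volume that will hold the temporary files.
df -k "$TMPDIR"
```

The export applies only to the current shell session and the programs started from it; add the lines to your shell profile if the setting should persist.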

Command-line examples, Windows & Linux/Unix Considerations

The command-line examples in the Reference Manual were executed on a Windows platform, where the shell (CMD.exe) does not interpret parentheses.

If you execute the same command line in a Linux/UNIX environment, where the shell interprets parentheses, you may see an error such as:

-bash: syntax error near unexpected token `('

This is a shell error, not a citsort error. To correct it, escape each parenthesis with a backslash (\).

See the difference below:

In a Windows environment

        >citsort use presidents.dat  
        record F 85  
        sort fields (1,2,nu,d)  
        give pres2.txt org ls

In a Linux/Unix environment

        >citsort use presidents.dat  
        record F 85  
        sort fields \(1,2,nu,d\)  
        give pres2.txt org ls

In a Linux/UNIX environment, you can also avoid this error by placing the commands in a parameter-file and using the syntax:

>citsort take parameter-file.txt

Transferring commands into a parameter file

Transferring commands into a parameter-file is recommended when running citsort in Linux and UNIX environments.

Transferring a command line into a parameter-file is also useful if your command lines are very long, or if you wish to add comments to the different parts of the command. Comments can be inserted into a parameter-file after the “*” character.

>citsort take president-params.txt
(where president-params.txt contains the following:)

Command                   Description
Use presidents.dat        input file to be sorted
Record F 85               fixed length, 85 bytes, sequential
Sort Fields (1,2,nu,d)    sort key bytes 1-2/numeric,descending
Give pres2.txt org ls     output to pres2.txt, line sequential
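
One way to create such a parameter-file from a Linux/UNIX shell is a quoted here-document, which writes the parentheses and the “*” character literally with no escaping. This is a sketch under two assumptions: that a “*” comment may follow a statement on the same line (the text above says only that comments come after “*”), and that citsort is on the PATH. The citsort call is skipped when the utility is not installed.

```shell
# Write the four statements from the table above into a parameter file.
# The quoted 'EOF' delimiter stops the shell from interpreting the contents.
# ASSUMPTION: a "*" comment may share a line with a statement; if your
# CitSORT version does not accept that, remove the trailing comments.
cat > president-params.txt <<'EOF'
use presidents.dat        * input file to be sorted
record F 85               * fixed length, 85 bytes, sequential
sort fields (1,2,nu,d)    * sort key: bytes 1-2, numeric, descending
give pres2.txt org ls     * output to pres2.txt, line sequential
EOF

# Run the sort only if the citsort executable can be found.
if command -v citsort >/dev/null 2>&1; then
    citsort take president-params.txt
fi
```

Because the statements live in a file, the shell never sees the parentheses, so the same parameter-file should work unchanged on Windows and on Linux/UNIX.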

The CitSORT Flow of Control for a SORT

As noted above, the general flow of logic is:

    Description                     Statement/Clause
1.  Allow for multiprocessing       -processmax, -processrec clauses
2.  Name the input file(s)          USE Statement
3.  Re-format the input file(s)     INREC Statement
4.  Name the Sort Key(s)            SORT Statement
5.  Apply Sum algorithm(s)          SUM Statement
6.  Define Sort Filter(s)           INCLUDE/OMIT Statements
7.  Name the output file            GIVE Statement
8.  Allow different output files    OUTFIL Statement
9.  Re-format output file(s)        OUTREC Statement