fpSetConfig()

This function enables and configures options before a document is filtered, such as supplying a password for a file or enabling hidden text extraction.

Syntax

KVErrorCode pascal fpSetConfig(
    KVFilterSession 	session,
    int  		nType,
    int  		nValue,
    void 		*pData );

Arguments

session

A KeyView Filter session that you initialized by calling fpInit().

nType

The configuration flag. This is a symbolic constant defined in kvtypes.h. The available options are described in the Filter Configuration Flags table.

nValue

The integer value for the configuration flag. Its meaning depends on the flag set in nType; see the Filter Configuration Flags table.

pData

The data for the configuration flag. Its type depends on the flag set in nType; pass NULL for flags that take no data.

Returns

The return value is an error code.

  • If the call is successful, the return value is KVError_Success.
  • If the call is unsuccessful, the return value is an error code that indicates the problem.
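
The return-value check described above can be sketched as a small wrapper. This is a minimal sketch, not SDK code: the typedefs, the numeric values of KVError_Success and KVFLT_SHOWHIDDENTEXT, the SetConfigFn pointer type, and the set_flag_checked() and mock_set_config() helpers are all hypothetical stand-ins so that the fragment compiles without the KeyView headers. In a real application, the types and constants come from the KeyView headers and fpSetConfig is the function pointer obtained from the SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins so the sketch compiles without the KeyView SDK.
 * In a real build these come from the KeyView headers, and the numeric
 * values below are NOT the SDK's values. */
typedef int   KVErrorCode;
typedef void *KVFilterSession;
enum { KVError_Success = 0 };
enum { KVFLT_SHOWHIDDENTEXT = 1 };
#ifndef TRUE
#define TRUE 1
#endif

/* Pointer type matching the four-argument shape of fpSetConfig(). */
typedef KVErrorCode (*SetConfigFn)(KVFilterSession, int, int, void *);

/* Set one flag. Logs the error code and returns 0 on failure; returns 1 on success. */
int set_flag_checked(SetConfigFn fpSetConfig, KVFilterSession session,
                     int nType, int nValue, void *pData)
{
    KVErrorCode rc;

    assert(fpSetConfig != NULL);
    rc = (*fpSetConfig)(session, nType, nValue, pData);
    if (rc != KVError_Success) {
        fprintf(stderr, "fpSetConfig: flag %d failed with error code %d\n",
                nType, (int)rc);
        return 0;
    }
    return 1;
}

/* Mock used only to exercise the helper: it fails when the session is NULL. */
KVErrorCode mock_set_config(KVFilterSession session, int nType, int nValue,
                            void *pData)
{
    (void)nType; (void)nValue; (void)pData;
    return (session != NULL) ? KVError_Success : 1;
}
```

With the real SDK, a call would look like set_flag_checked(fpSetConfig, session, KVFLT_SHOWHIDDENTEXT, TRUE, NULL), and the caller decides whether a failed flag should abort filtering.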

Discussion

  • Although fpSetConfig() does not run out of process, any configuration flags that are set through fpSetConfig() are passed to the out-of-process session.
  • The configuration flags are described in the following table.

Filter Configuration Flags

Flag Description
KVFLT_SETOOPSRCFILE

If you set this flag to TRUE, the input file name is reported in the out-of-process error log when the file generates an error in stream mode. See Report the File Name in Stream Mode. The default is FALSE.

nValue is TRUE or FALSE.

pData is the name of the input file generating errors.

KVFLT_SETTEMPDIRECTORY

This flag enables you to specify the directory where temporary files created during filtering processes are stored.

nValue is set to 0.

pData is the path name of the directory where temporary files are stored. This value must be null terminated. On Windows, pData must be in the local Windows code page.

KVFLT_SETXMLCONFIGINFO

This flag enables you to define which elements and attributes are extracted from XML documents with a specified format ID or root element. You can use this option to override the default settings for the supported XML formats (see Filter XML Files), or to define settings for custom XML document types.

The settings are defined in the KVXConfigInfo structure. To set custom settings for more than one document type, call the fpSetConfig() function once for each type.

You can also modify element extraction settings by using the kvxconfig.ini file. See Configure Element Extraction for XML Documents.

nValue is set to 0.

pData is a pointer to the KVXConfigInfo structure.

KVFLT_SETSRCPASSWORD

This flag enables you to define a password used to open a password-protected file for filtering. See Filter Password Protected Files.

nValue is the length of the password.

pData is the source file password, which can have a maximum length of 255 characters.

To remove the configured password, set KVFLT_SETSRCPASSWORD and pass in nValue=0 and pData=NULL.

KVFLT_LOGICALPDF

This flag extracts paragraphs from a PDF file in the order in which they appear on the page (logical reading order). The nValue argument specifies the paragraph direction. See Filter PDF Files.

nValue is one of the paragraph direction options defined in the LPDF_DIRECTION enumerated type in kvtypes.h.

pData is NULL.

KVFLT_INCLREVISIONMARK

If you set this flag to TRUE, text that was deleted from a document with revision tracking enabled is extracted from the document and included in the filtered output.

To reset the flag and exclude deleted text from the filtered output, set the flag to FALSE (the default). See Extract Deleted Text Marked by Tracked Changes.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_NOEMBEDDEDOBJECT

If you set this flag to TRUE, the text of embedded previews in Microsoft Word (DOC, DOCX), Excel (XLSX), PowerPoint (PPT, PPTX), and Visio (VSDX) documents is not included in the filter output.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_SHOWHIDDENTEXT

If you set this flag to TRUE, hidden text is extracted from Microsoft Word, Excel, and PowerPoint documents.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_NOCOMMENTS

If you set this flag to TRUE, comments from Microsoft Word, PowerPoint, or Excel documents are not extracted.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_SHOWDATEFIELDCODE

If you set this flag to TRUE, date/time field codes are extracted from Microsoft Word, PowerPoint, and Rich Text Format documents instead of the date/time values.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_SKIPEMBEDDEDFONT

If you set this flag to TRUE, text that contains embedded fonts is not filtered from PDF documents. See Filter PDF Files.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_SHOWFILENAMEFIELDCODE

If you set this flag to TRUE, file name field codes are extracted from Microsoft Word documents.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_KEEPSOFTHYPHEN

If you set this flag to TRUE, soft hyphens are retained when text is filtered from PDF documents. See Filter PDF Files.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_SetConfigurableArguments

This is used to configure hidden information for Microsoft Excel files. See Hidden Data in Microsoft Excel Documents.

If you set this flag to TRUE, the pData is a variable of configurable arguments.

nValue is TRUE or FALSE.

pData is a variable of configurable arguments.

KVFLT_EXPORTALLMETADATA

If you set this flag to TRUE, all custom metadata is filtered from PDF documents when the metadata APIs are used. See Extract Custom Metadata from PDF Files.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_EXPORTTAGGEDCONTENT

If you set this flag to TRUE, tagged PDF content is filtered from PDF documents. See Filter Tagged PDF Content.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_FILTERLOGICALORDER

If you set this flag to TRUE, the text filtered from Microsoft PowerPoint files is output in logical (reading) order. See Filter Presentation Files to a Logical Reading Order.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_SETOUTPUTCHARSET

This flag enables the output character set to be changed.

pData is one of the character encodings defined in the KVCharSet enumerated type in kvcharset.h.

KVFLT_EXTRACTIMAGES

If you set this flag to TRUE, the extract API also extracts images contained within the file. See Extract Images for more details.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_STANDARDIZECELLFORMATS

If you set this flag to TRUE, standardization of cell formats in Microsoft Excel files is enabled. See Standardize Cell Formats.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_TABLEDETECTION

If you set this flag to TRUE, table detection for PDF files is enabled. See Table Detection for PDF Files.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_TABDELIMITED

If you set this flag to TRUE, tables in spreadsheets and word processing formats are output in tab-delimited format. This format includes tab delimiters for all cells, including empty cells. For more information, see Tab Delimited Output for Spreadsheets and Embedded Tables.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_OUTPUTTABLEDELIMITERS

If you set this flag to TRUE, table delimiters are enabled for IDOL Eduction (see Tab Delimited Output for Spreadsheets and Embedded Tables).

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_NOPHONETICGUIDES

If you set this flag to TRUE, KeyView does not output Japanese phonetic guide text from Microsoft Excel files. See Exclude Japanese Guide Text.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_SOURCECODEDETECTION

If you enable this option, KeyView attempts to identify the programming language of any source code files that it finds. The nValue argument specifies the level of identification to attempt. See Source Code Identification.

nValue is KVSOURCECODE_OFF, KVSOURCECODE_ENABLED, or KVSOURCECODE_EXTENDED.

pData is NULL.

KVFLT_CHARSETDETECTION

KeyView attempts to detect the character set of an input file. Some character sets (including ANSI, UTF-8, and UTF-16) can be detected by core KeyView functionality but others can only be detected if your license includes advanced character set detection.

If your license includes advanced character set detection, it is enabled by default. However, it can increase the time required to filter some documents. To disable advanced character set detection, set this flag to FALSE.

KeyView cannot perform character set conversion unless it detects the character set of the source file, or you call fpSetSrcCharSet(). For more information, see Convert Character Sets.

KVFLT_OCR

KeyView performs Optical Character Recognition (OCR) on raster image files, to attempt to filter text that might be visible in the image.

OCR is available only on certain platforms (see Optical Character Recognition in the platform differences section). If your license includes OCR, it is enabled by default. To disable OCR, set nValue to FALSE and pData to NULL. To re-enable it, set nValue to TRUE and pData to NULL.

KeyView can perform OCR on a wide range of documents. If you know in advance what you will be processing (for example, if you know that the text in all of your documents is in English) you can specify OCR options to improve performance. To do this, set nValue to TRUE, and pData to a pointer to the KVOcrOptions structure. For more information, see Optical Character Recognition.

KVFLT_SETOOPUSERWIN

You can use this flag to run the out-of-process server (KVOOP) as a specific user, so that you can run KVOOP with different privileges to your main application. See Run KeyView with Reduced Privileges.

This flag can be used only on Windows. To specify the user for KVOOP on non-Windows platforms, see KVFLT_SETOOPUSERUNIX.

To enable this option, set nValue to TRUE and pData to a pointer to a KV_SET_OOP_USER structure, which must have been initialized with KVStructInit and cannot be NULL.

To disable this option, set nValue to FALSE and pData to NULL.

KVFLT_SETOOPUSERUNIX

You can use this flag to run the out-of-process server (KVOOP) with a specific user and group ID, so that you can run KVOOP with different privileges to your main application. See Run KeyView with Reduced Privileges.

This flag can be used only on non-Windows platforms. To specify the user for KVOOP on Windows, see KVFLT_SETOOPUSERWIN.

nValue is FALSE.

pData is a pointer to a KV_SET_OOP_USER structure, which must have been initialized with KVStructInit and cannot be NULL.

KVFLT_FILTERPIPESTREAMING

If you set this flag to TRUE, it enables pipe-streaming mode for filtering. See Configure Pipe-Streaming.

This change does not take effect until the next time fpOpenStream is called. It does not affect any fpFilterStream calls on streams that are already open.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_EXTRACTPIPESTREAMING

If you set this flag to TRUE, it enables pipe-streaming mode for extraction (see Configure Pipe-Streaming).

This change does not take effect until the next time fpOpenFile is called. Any file that is already open when you change the setting uses the pre-existing setting.

nValue is TRUE or FALSE.

pData is NULL.

KVFLT_UNEXPECTEDZIPDETECTION

If you set this flag to TRUE, fpMainFileInfo() returns an error if it detects a ZIP file that has been concatenated onto the end of the file in an attempt to hide it. For more information, see Unexpected ZIP Detection.

nValue is TRUE or FALSE.

pData is NULL.
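
Many of the flags in the table above take TRUE or FALSE in nValue and NULL in pData, so an application that sets several of them can drive the calls from a small array. The following is a minimal sketch of that idea under stated assumptions: the typedefs, the numeric flag values, the BoolFlag structure, and the apply_bool_flags() and mock_set_config() helpers are hypothetical and exist only so that the fragment compiles without the KeyView headers. Only the flag names and the TRUE/FALSE, NULL calling convention are taken from the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so the sketch compiles without the KeyView SDK.
 * The numeric values are NOT the SDK's values. */
typedef int   KVErrorCode;
typedef void *KVFilterSession;
enum { KVError_Success = 0 };
enum { KVFLT_SHOWHIDDENTEXT = 1, KVFLT_NOCOMMENTS = 2, KVFLT_TABLEDETECTION = 3 };
#ifndef TRUE
#define TRUE 1
#endif
#ifndef FALSE
#define FALSE 0
#endif

/* Pointer type matching the four-argument shape of fpSetConfig(). */
typedef KVErrorCode (*SetConfigFn)(KVFilterSession, int, int, void *);

/* One TRUE/FALSE setting; pData is always NULL for these flags. */
typedef struct {
    int nType;
    int nValue;
} BoolFlag;

/* Apply each flag in order. Stops at the first failure and returns its
 * error code; returns KVError_Success if every call succeeds. */
KVErrorCode apply_bool_flags(SetConfigFn fpSetConfig, KVFilterSession session,
                             const BoolFlag *flags, size_t count)
{
    size_t i;

    for (i = 0; i < count; ++i) {
        KVErrorCode rc = (*fpSetConfig)(session, flags[i].nType,
                                        flags[i].nValue, NULL);
        if (rc != KVError_Success) {
            return rc;
        }
    }
    return KVError_Success;
}

/* Mock used only to exercise the helper: counts calls and rejects
 * flag values it does not know or a non-NULL pData. */
int mock_calls = 0;

KVErrorCode mock_set_config(KVFilterSession session, int nType, int nValue,
                            void *pData)
{
    (void)session; (void)nValue;
    ++mock_calls;
    if (pData != NULL || nType < 1 || nType > 3) {
        return 1;
    }
    return KVError_Success;
}
```

Stopping at the first failure is a design choice made for this sketch; an application might instead log the failing flag and continue with the remaining ones.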

Examples

  • To specify a password to open a password-protected file for filtering:

    (*fpSetConfig)(session, KVFLT_SETSRCPASSWORD, 8, "password");

  • To extract hidden text from Microsoft Word, Excel, or PowerPoint files:

    (*fpSetConfig)(session, KVFLT_SHOWHIDDENTEXT, TRUE, NULL);

  • To extract all custom metadata fields from PDF documents:

    (*fpSetConfig)(session, KVFLT_EXPORTALLMETADATA, TRUE, NULL);
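
The password example hard-codes the length 8. When the password comes from user input, the length can be computed and checked against the 255-character limit stated for KVFLT_SETSRCPASSWORD, and the same helper can clear the password with nValue=0 and pData=NULL. The following is a minimal sketch under stated assumptions: the typedefs, the numeric constant values, and the set_source_password() and mock_set_config() helpers are hypothetical stand-ins that let the fragment compile without the KeyView headers, and it assumes the length is counted as strlen() reports it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch compiles without the KeyView SDK.
 * The numeric values are NOT the SDK's values. */
typedef int   KVErrorCode;
typedef void *KVFilterSession;
enum { KVError_Success = 0 };
enum { KVFLT_SETSRCPASSWORD = 1 };

/* Limit stated in the flag table for KVFLT_SETSRCPASSWORD. */
#define MAX_PASSWORD_LENGTH 255

/* Pointer type matching the four-argument shape of fpSetConfig(). */
typedef KVErrorCode (*SetConfigFn)(KVFilterSession, int, int, void *);

/* Set the source password, or clear it when password is NULL or empty.
 * Returns 1 on success, 0 if the password is too long or the call fails. */
int set_source_password(SetConfigFn fpSetConfig, KVFilterSession session,
                        const char *password)
{
    size_t length;

    if (password == NULL || password[0] == '\0') {
        /* Remove the configured password: nValue=0 and pData=NULL. */
        return (*fpSetConfig)(session, KVFLT_SETSRCPASSWORD, 0, NULL)
               == KVError_Success;
    }

    length = strlen(password);
    if (length > MAX_PASSWORD_LENGTH) {
        return 0; /* reject before calling into the SDK */
    }

    /* pData is declared void *, so the const qualifier is cast away here. */
    return (*fpSetConfig)(session, KVFLT_SETSRCPASSWORD, (int)length,
                          (void *)password) == KVError_Success;
}

/* Mock used only to exercise the helper: records the last arguments. */
int   mock_calls = 0;
int   mock_last_value = -1;
void *mock_last_data = NULL;

KVErrorCode mock_set_config(KVFilterSession session, int nType, int nValue,
                            void *pData)
{
    (void)session; (void)nType;
    ++mock_calls;
    mock_last_value = nValue;
    mock_last_data = pData;
    return KVError_Success;
}
```

Treating an empty string as a request to clear the password is a choice made for this sketch; the source only specifies the nValue=0, pData=NULL form for removal.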