define({"0":{y:0,u:"../Content/Part_Overview.htm",l:-1,t:"Overview of Filter SDK",i:0.00125561630257064,a:"Overview of Filter SDK This section provides an overview of the OpenText KeyView Filter SDK and describes how to get started with the C API. Introducing Filter SDK Getting Started"},"1":{y:0,u:"../Content/filter_shared/filtersdk_intro/intro_filtersdk.htm",l:-1,t:"Introducing Filter SDK",i:0.00203655366324296,a:"Introducing Filter SDK This section describes the KeyView Filter SDK. "},"2":{y:0,u:"../Content/filter_shared/filtersdk_intro/Overview.htm",l:-1,t:"Overview",i:0.00150291365292478,a:"OpenText KeyView Filter SDK enables you to incorporate text extraction functionality into your own applications. It extracts text and metadata from a wide variety of file formats on numerous platforms, and can automatically recognize over 1900 document types. It supports both file-based and ..."},"3":{y:0,u:"../Content/Shared/_KV_Platform.htm",l:-1,t:"Platforms, Compilers, and Dependencies",i:0.00132181446217191,a:"Platforms, Compilers, and Dependencies This section lists the supported platforms, supported compilers, and software dependencies for the KeyView software."},"4":{y:0,u:"../Content/Shared/_KV_Platform_Supported_Platforms.htm",l:-1,t:"Supported Platforms",i:0.00125561630257064,a:"The following sections list supported operating systems for each platform. Windows (x86-64) Microsoft Windows Server 2022 Microsoft Windows Server 2019 Microsoft Windows Server 2016 Microsoft Windows Server 2012 Microsoft Windows 11 Microsoft Windows 10 Windows (x86) Microsoft Windows 10 Windows ..."},"5":{y:0,u:"../Content/Shared/_KV_Platform_Compilers.htm",l:-1,t:"Supported Compilers",i:0.00132181446217191,a:"Supported Compilers The following table lists the supported compilers for the C Filter SDK."},"6":{y:0,u:"../Content/Shared/_KV_Platform_Dependencies.htm",l:-1,t:"Software Dependencies",i:0.00178925631288882,a:"To run KeyView on Windows requires the Microsoft Visual C++ 2019 redistributables to be installed. The redistributables are provided in the vcredist folder of the KeyView SDK but you can  download the latest installers from Microsoft  to get the latest security, reliability, and performance ..."},"7":{y:0,u:"../Content/Shared/_KV_install_Windows.htm",l:-1,t:"Windows Installation",i:0.00125561630257064,a:"To install the SDK on Windows, use the following procedure. To install the SDK  Run the installation program, KeyViewProductNameSDK_VersionNumber_OS.exe, where ProductName is the name of the product, VersionNumber is the product version number, and OS is the operating system. For example: ..."},"8":{y:0,u:"../Content/Shared/_KV_install_UNIX.htm",l:-1,t:"UNIX Installation",i:0.00125561630257064,a:"To install the SDK, use one of the following procedures. To install the SDK from the graphical interface Run the installation program and follow the on-screen instructions. To install the SDK from the console Run the installation program from the console as follows: ..."},"9":{y:0,u:"../Content/filter_shared/filtersdk_intro/Package_Contents.htm",l:-1,t:"Package Contents",i:0.00125561630257064,a:"The Filter SDK installation contains: All the libraries and executables necessary for extracting text from a wide variety of formats. The  include files that define the C API. These files can be found in the include directory. The Java API implemented in the package com.verity.api.filter contained ..."},"10":{y:0,u:"../Content/Shared/_KV_License_Update.htm",l:-1,t:"License Information",i:0.018757325487311,a:"Your license key controls whether you have the full version of the KeyView SDK, or a trial version. It also determines whether the following advanced features are enabled: Advanced character set detection with the character set detection library (kvlangdetect). Advanced document readers: Microsoft ..."},"11":{y:0,u:"../Content/Chapter_GettingStarted.htm",l:-1,t:"Getting Started",i:0.00178925631288882,a:"Getting Started This section provides information about how to get started with the KeyView Filter C API."},"12":{y:0,u:"../Content/C/gettingstarted/Getting_Started_Tutorial.htm",l:-1,t:"Getting Started with KeyView Filter and the C API",i:0.00125561630257064,a:"To start using KeyView Filter, you can use the following introduction tutorial: KeyView Filter SDK Introduction . This tutorial helps you use the out-of-the-box command line tools filter and tstxtract to develop your understanding of the basic capabilities and key features of the KeyView Filter SDK. ..."},"13":{y:0,u:"../Content/C/gettingstarted/KeyViewTutorial_Introduction.htm",l:-1,t:"KeyView Filter SDK Introduction",i:0.00203655366324296,a:"You can use the KeyView Filter SDK library by calling it from your own applications through one of its APIs. However, to help you get started it includes some sample applications, filter and tstxtract. which allow you to explore some of the functionality. This section is an introductory tutorial ..."},"14":{y:0,u:"../Content/C/gettingstarted/Programming_Tutorial_Basic.htm",l:-1,t:"C API Programming Tutorial",i:0.00210275182284422,a:"The KeyView Filter SDK allows you to embed KeyView functionality into other services. This tutorial helps you to: familiarize yourself with the Filter SDK C API create a sample program that replicates a common use case of the Filter SDK Setup Resources You must download the following resources ..."},"15":{y:0,u:"../Content/Part_UseSDK.htm",l:-1,t:"Use Filter SDK",i:0.00125561630257064,a:"Use Filter SDK This section explains how to perform some basic tasks by using the File Extraction and Filter APIs, and describes the sample programs. Use the File Extraction API Use the Filter API Use the Metadata API Sample Programs Advanced Topics"},"16":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c.htm",l:-1,t:"Use the File Extraction API",i:0.00196269087213876,a:"Use the File Extraction API This section describes how to extract subfiles from a container file by using the File Extraction API. "},"17":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Introduction.htm",l:-1,t:"Introduction",i:0.00125561630257064,a:"To  filter  a file, you must first determine whether the file contains any subfiles (attachments, embedded OLE objects, and so on). A file that contains subfiles is called a  container file. A container file has a main file (parent) and subfiles (children) embedded in the main file.  The following ..."},"18":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_Sub_Files.htm",l:-1,t:"Extract Subfiles",i:0.00137420297153023,a:"To  filter  all files in a container file, you must open the container and extract its subfiles by using the File Extraction API. The extraction process is done repeatedly until all subfiles are extracted and exposed for  filtering . After a subfile is extracted, you can call  Filter  API functions ..."},"19":{y:0,u:"../Content/Shared/_KV_xtract_pwd_c.htm",l:-1,t:"Open Password Protected Container Files",i:0.00125561630257064,a:"This section describes how to extract password-protected container files by using the C API. The following guidelines apply to specific file types. Lotus Notes NSF files. If you are running a Notes client with an active user connected to a Domino server, you must specify the user’s password as a ..."},"20":{y:0,u:"../Content/_KV_xtract_sanitize_paths.htm",l:-1,t:"Sanitize Absolute Paths",i:0.00614930935309923,a:"When you extract a subfile from a container and write it to disk, you specify an extract directory  and a path to extract the file to. To set the path, you might use the path in the container file that you are extracting from, as returned  from the function  fpGetSubFileInfo() .  However, if the ..."},"21":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_SubFileInputStream.htm",l:-1,t:"Access a Subfile without Extracting the Entire File",i:0.00125561630257064,a:"Some operations do not require all of the data contained within a subfile. For example, format detection can often be performed using only the beginning and end of a file, without needing to extract the entire file to disk or memory. When filtering, you might process part of a file before deciding ..."},"22":{y:0,u:"../Content/Shared/_KV_xtract_Extract_Images.htm",l:-1,t:"Extract Images",i:0.00271101773684813,a:"You can use the File Extraction API  to extract images within a file. If you use this feature, images within the file  behave in the same way as any other subfile. Extracted images  have the name image[X].[Y], where [X] is an integer, and [Y] is the extension. The format of the image is the same as ..."},"23":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Recreate_a_Files_Hierarchy.htm",l:-1,t:"Recreate a File’s Hierarchy",i:0.00447770076633405,a:"When you extract a container file, any relationships between the subfiles in the container are not maintained. However, the File Extraction interface provides information that enables you to recreate the hierarchy. You can use the  hierarchy to create a directory structure in a file system, or to ..."},"24":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_Outlook_Files.htm",l:-1,t:"Extract Subfiles from Outlook Files",i:0.00125561630257064,a:"Extract  Subfiles from Outlook Files When you extract an Outlook file (MSG) to disk, the message text and header information (To, From, Sent, and so on) is extracted to a text file. If you do not want to extract the header information, set the flag  KVExtractionFlag_ExcludeMailHeader  when you call  ..."},"25":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_OutlookExpress.htm",l:-1,t:"Extract Subfiles from Outlook Express Files",i:0.00125561630257064,a:"Extract Subfiles from Outlook  Express Files When you extract an Outlook Express (EML) file to disk, the message text and header information (To, From, Sent, and so on) is extracted to a text file. If you do not want to extract the header information, set the flag  KVExtractionFlag_ExcludeMailHeader ..."},"26":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_MailboxFiles.htm",l:-1,t:"Extract Subfiles from Mailbox Files",i:0.00125561630257064,a:"Extract  Subfiles from Mailbox Files A Mailbox (MBX) file is a collection of individual emails compiled with RFC 822 and RFC 2045 - 2049 (MIME), and divided by message separators. There are many mail applications that export to an MBX format, such as Eudora Email and Mozilla Thunderbird.  When an ..."},"27":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_OutlookPersonalFolder.htm",l:-1,t:"Extract Subfiles from Outlook Personal Folders Files",i:0.00125561630257064,a:"Extract  Subfiles from Outlook Personal Folders Files KeyView can extract Outlook items such as messages, appointments, contacts, tasks, notes, and journal entries from a PST file. When a PST file is extracted to disk, the text and header information (To, From, Sent, and so on) from each Outlook ..."},"28":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_LotusDominoXML.htm",l:-1,t:"Extract Subfiles from Lotus Domino XML Language Files",i:0.00125561630257064,a:"Extract  Subfiles from Lotus Domino XML Language Files When you extract a Lotus Domino XML Language (.DXL) file, the message text and header information (To, From, Sent, and so on) is extracted to a text file. If you do not want to extract the header information, set the flag  ..."},"29":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_LotusNotes_DB.htm",l:-1,t:"Extract Subfiles from Lotus Notes Database Files",i:0.00125561630257064,a:"Extract  Subfiles from Lotus Notes Database Files A Lotus Notes database is a single file that contains multiple documents called notes. Notes include design notes (such as forms, views, folders, navigators, outlines, pages, framesets, agents, and resources), data document notes, profile document ..."},"30":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_PDF_Files.htm",l:-1,t:"Extract Subfiles from PDF Files",i:0.002322896323207,a:"Extract  Subfiles from PDF Files KeyView can extract document-level and page-level attachments from a PDF document. Document-level attachments are added by using the Attach A File tool, and can include links to or from the parent document or to other file attachments. Page-level attachments are ..."},"31":{y:0,u:"../Content/Shared/_KV_PDF_ImprovePerformanceWithSmallImages.htm",l:-1,t:"Improve Performance for PDFs with Many Small Images",i:0.00125561630257064,a:"To improve performance when processing  PDF files that contain many small images, you can choose to ignore images unless they exceed a minimum width and/or height. If an image is smaller than the minimum width or height, KeyView does not extract the image.  For example, to ignore images that are ..."},"32":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_Embedded_OLE_Objects.htm",l:-1,t:"Extract Embedded OLE Objects",i:0.00125561630257064,a:"The File Extraction API can extract embedded OLE objects from the following types of documents: Lotus Notes (DXL) Microsoft Excel Microsoft Word Microsoft PowerPoint Microsoft Outlook Microsoft Visio Microsoft Project OASIS Open Document Rich Text Format (RTF) When an embedded OLE object is ..."},"33":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Default_Filenames_Extracted_Subfiles.htm",l:-1,t:"Default File Names for Extracted Subfiles",i:0.00178004588518229,a:"Default File Names for  Extracted Subfiles When you do not specify a file name in the call to   fpExtractSubFile() , in some cases a default file name is applied to the extracted subfile. Default File Name for Mail Formats To avoid naming conflicts and problems with long file names, KeyView applies ..."},"34":{y:0,u:"../Content/Chapter_UseFilterAPI.htm",l:-1,t:"Use the Filter API",i:0.00196269087213876,a:"Use the Filter API This section describes how to perform some basic filtering tasks by using the Filter API."},"35":{y:0,u:"../Content/filter_shared/filter_detection/Extract_Format_Info.htm",l:-1,t:"Obtain Format Information",i:0.00431198603198645,a:"The KeyView format detection module (kwad) detects a file\u0027s format, and reports the information to your application. You can obtain format information from a document by using the  fpGetDocInfo()  function. This extracts the file format, file class, version, and document attributes, and populates an ..."},"36":{y:0,u:"../Content/Shared/_KV_Code_Identification.htm",l:-1,t:"Source Code Identification",i:0.00252991854609526,a:"When KeyView auto-detects a file that contains source code, it can attempt to identify the programming language that it is written in. When you do not enable source code identification, files containing source code may be identified as ASCII text files, causing the application to treat them in the ..."},"37":{y:0,u:"../Content/filter_shared/filter_detection/Determine_Doc_Reader.htm",l:-1,t:"File Formats and Document Readers",i:0.00413871608050016,a:"The KeyView configuration file  formats.ini contains a section named [Formats]. Each line in this section matches a file format with the reader to use to parse the format. For most file formats there is only one suitable reader, but for some formats you can choose a reader to use. Each file format ..."},"38":{y:0,u:"../Content/Shared/_KV_Refine_Detection_of_Text.htm",l:-1,t:"Refine Detection of Text Files",i:0.00125561630257064,a:"During text detection, KeyView analyses the first 1 kB and last 1 kB of data in a document. If less than 10% of that data consists of non-ASCII characters, KeyView detects the document as a text file. However, depending on the type of documents you are working with, the default settings might not ..."},"39":{y:0,u:"../Content/Shared/_KV_AdditionalFormatInfo.htm",l:-1,t:"Additional Format Information",i:0.00125561630257064,a:"KeyView returns basic information about a document\u0027s format, but sometimes it can be useful to have additional information. The file formats_description.tsv, which can be found in the bin directory, provides a mapping between file format ID, human-readable format description, and the format\u0027s MIME ..."},"40":{y:0,u:"../Content/filter_shared/Convert_Character_Sets.htm",l:-1,t:"Character Encoding",i:0.00370313856562001,a:"To ensure that all filtered text is output in the same character encoding, KeyView performs character encoding conversion. In most cases, if your license includes advanced character set detection, KeyView can detect the character encoding used in a source file, and automatically outputs filtered ..."},"41":{y:0,u:"../Content/C/filter_api/Set_CharSet_Extraction.htm",l:-1,t:"Set the Character Set During Subfile Extraction",i:0.00125561630257064,a:"Set the Character Set During  Subfile Extraction You can convert the character set of a subfile at the time the subfile is extracted from the container and before it is filtered. This is most often used to set the character set of a mail message\u0027s body text. To specify the source character set of a ..."},"42":{y:0,u:"../Content/filter_shared/Filter_Deleted_Text.htm",l:-1,t:"Filter Deleted Text",i:0.00265342775085069,a:"Some applications have revision tracking features—such as Microsoft Word\u0027s Track Changes—that identify changes to a document. When these features are used, text that was deleted from a document might still be stored in the file. KeyView does not filter deleted text by default, but the Filter API ..."},"43":{y:0,u:"../Content/filter_shared/Filter_PDF_Files.htm",l:-1,t:"Filter PDF Files",i:0.00265342775085069,a:"Filter PDF Files Filter has special configuration options that allow greater control over the conversion of Adobe Acrobat PDF files."},"44":{y:0,u:"../Content/filter_shared/pdf2sr.htm",l:-1,t:"Use the pdf2sr Reader",i:0.00233783153276767,a:"The pdf2sr reader is an alternative that can be used instead of pdfsr for filtering PDF files. It uses a different parsing technology and may yield better results for some files. The pdf2sr reader has the following features: supports standard and custom metadata (non-XMP) supports basic text ..."},"45":{y:0,u:"../Content/filter_shared/Filter_PDF_LogicalOrder.htm",l:-1,t:"Filter PDF Files to a Logical Reading Order",i:0.00230270925336805,a:"The order of the text inside a PDF file has no relation to the layout of the text on the page or screen. By default, KeyView extracts paragraphs in the order in which they are stored in the file, not the order in which they appear on the page. For example, a three-column article could be output with ..."},"46":{y:0,u:"../Content/filter_shared/Rotated_Text.htm",l:-1,t:"Rotated Text",i:0.00125561630257064,a:"When a PDF that contains rotated text is filtered, the rotated text is extracted after the text at the end of the PDF page on which the rotated text appears. If the PDF is filtered with logical order enabled, and the amount of rotated text on a page surpasses a predefined threshold,  the page is ..."},"47":{y:0,u:"../Content/filter_shared/Filter_Tagged_PDF_Content.htm",l:-1,t:"Filter Tagged PDF Content",i:0.00246372038649399,a:"A tagged PDF contains an additional layer of text for visually impaired readers. This text is used in text-to-speech features in various PDF viewing programs. You can enable filtering of tagged PDF text in the API. Filtering the extra layer of tagged content might result in duplicate text in the ..."},"48":{y:0,u:"../Content/filter_shared/Skip_Embedded_Fonts.htm",l:-1,t:"Skip Embedded Fonts",i:0.00241710421259728,a:"Text in PDF files sometimes contains embedded fonts. If you experience difficulties filtering embedded fonts, you can skip this type of text. If you skip embedded fonts, none of the content that contains embedded fonts is included in the output. To skip text that uses embedded fonts In the C API, ..."},"49":{y:0,u:"../Content/filter_shared/Control_Hyphenation.htm",l:-1,t:"Control Hyphenation",i:0.00241710421259728,a:"There are two types of hyphens in a PDF document: A soft hyphen is added to a word by a word processor to divide the word across two lines. This is a discretionary hyphen and is used to ensure proper text flow in justified text. A hard hyphen is intentionally added to a word regardless of the word\u0027s ..."},"50":{y:0,u:"../Content/filter_shared/Filter_Portfolio_PDF.htm",l:-1,t:"Filter Portfolio PDF Files",i:0.00125561630257064,a:"Filter Portfolio PDF Files Portfolio PDF files contain subfiles and an ActionScript interface for navigating between them. You can use the extraction API to extract the subfiles.  See  Extract Subfiles from PDF Files ."},"51":{y:0,u:"../Content/Shared/_KV_Table_Detection_PDF.htm",l:-1,t:"Table Detection for PDF Files",i:0.00246372038649399,a:"PDF files often contain data presented in a tabular form. However, there is no information about the table stored within the PDF itself – the text is simply placed in  an arrangement that looks like a table to the human eye. When this data is filtered, it can be very difficult to reconstruct the ..."},"52":{y:0,u:"../Content/C/filter_api/Filter_RMS_PDF_Files.htm",l:-1,t:"Filter RMS Protected PDF Files",i:0.00125561630257064,a:"RMS-protected PDF files have two parts. The first is an unencrypted \"outer\" PDF, which contains standard text stating that the document is protected. The second is an encrypted \"inner\" PDF, which is attached to the outer PDF and contains the actual content. To filter both parts separately, filter ..."},"53":{y:0,u:"../Content/filter_shared/Filter_Spreadsheet_Files.htm",l:-1,t:"Filter Spreadsheet Files",i:0.00125561630257064,a:"Filter Spreadsheet Files Filter has special configuration options that enable greater control over the conversion of spreadsheet files."},"54":{y:0,u:"../Content/filter_shared/Filter_Worksheet_Names.htm",l:-1,t:"Filter Worksheet Names",i:0.00125561630257064,a:"Filter  Worksheet Names Normally, Filter does not extract worksheet names from a spreadsheet because it is assumed that the text should not be exposed. To extract worksheet names, add the following lines to the  formats.ini file: [Options]\ngetsheetnames=1"},"55":{y:0,u:"../Content/filter_shared/Filter_Hidden_Text_Excel.htm",l:-1,t:"Filter Hidden Text in Microsoft Excel Files",i:0.00324412816555058,a:"Filter  Hidden Text in Microsoft Excel Files Normally, Filter does not filter hidden text from a Microsoft Excel spreadsheet because it is assumed the text should not be exposed. You can change this default behavior, and extract text from hidden rows, columns, and sheets from Excel spreadsheets by ..."},"56":{y:0,u:"../Content/filter_shared/Specify_Date_and_Time_Fo.htm",l:-1,t:"Specify Date and Time Format on UNIX Systems",i:0.00125561630257064,a:"In Microsoft Excel you can choose to format dates and times according to the system locale.  On Windows, KeyView uses the system locale settings to determine how these dates and times should be formatted.  In other operating systems, KeyView uses the U.S. short date format (mm/dd/yyyy).  You can ..."},"57":{y:0,u:"../Content/filter_shared/large_numbers_excel.htm",l:-1,t:"Filter Very Large Numbers in Spreadsheet Cells to Precision Numbers",i:0.00125561630257064,a:"Numbers in Microsoft Excel files can be extracted and written to the output without formatting. By default, numbers are extracted in the format specified by the Excel file (for example, General, Currency and Date). Spreadsheets might contain cells that have very large numbers in them. Excel displays ..."},"58":{y:0,u:"../Content/filter_shared/Extract_Excel_Formulas.htm",l:-1,t:"Extract Microsoft Excel Formulas",i:0.00324412816555058,a:"When you filter a Microsoft Excel spreadsheet, KeyView extracts the value of each cell. The value of a cell might be calculated from a formula, but the formula is not included in the output unless you configure KeyView to include it. You can extract the cell value, the formula, or both. For example, ..."},"59":{y:0,u:"../Content/Shared/_KV_Standardize_Cell_Formats.htm",l:-1,t:"Standardize Cell Formats",i:0.00246372038649399,a:"In Microsoft Excel you can format cell values. For example, the date \"15/09/2021\" could be formatted as \"15 September 2021\" or \"2021-09-15\". By default, KeyView extracts cell values with formatting, as they would appear in Excel. If you prefer, you can configure KeyView to standardize cell values. ..."},"60":{y:0,u:"../Content/filter_shared/Tab_Delimited_Output.htm",l:-1,t:"Tab Delimited Output for Spreadsheets and Embedded Tables",i:0.00246372038649399,a:"You can use KeyView to convert spreadsheets, embedded tables in Word Processing documents (for example, Microsoft Word documents), and tables detected by Optical Character Recognition (OCR), to tab-delimited form. In this format, KeyView inserts a tab character between each cell, and a line break ..."},"61":{y:0,u:"../Content/filter_shared/Presentation_LogicalOrder.htm",l:-1,t:"Filter Presentation Files to a Logical Reading Order",i:0.00246372038649399,a:"With some file formats, for example Microsoft PowerPoint presentations, the order of the text inside the file has no relation to the layout of the text on the page or screen. Recently modified text might appear at the end of a file, even though that text belongs at the beginning of the document. You ..."},"62":{y:0,u:"../Content/filter_shared/Filter_XML_Files.htm",l:-1,t:"Filter XML Files",i:0.00265342775085069,a:"KeyView can detect many types of XML file, including: Generic XML Microsoft Office 2003 XML (Word, Excel, and Visio) StarOffice/OpenOffice XML (text document, presentation, and spreadsheet) When you filter XML, you can tell KeyView which elements to treat as content and metadata, or to treat the ..."},"63":{y:0,u:"../Content/C/filter_api/Configure_Element_Extrac.htm",l:-1,t:"Configure Element Extraction for XML Documents",i:0.00493571190362908,a:"When filtering XML files, you can specify which elements and attributes to extract according to the file\u0027s format ID or root element. This is useful when you want to extract only relevant text elements, such as abstracts from reports, or a list of authors from an anthology.  A  root element is an ..."},"64":{y:0,u:"../Content/filter_shared/Filter_Hidden_Data.htm",l:-1,t:"Filter Hidden Data",i:0.00125561630257064,a:"Filter Hidden Data Some documents contain hidden information, which is not filtered by default. Depending on the type of hidden data that you want to filter and the type of document that you are filtering, you can either use the API or set parameters in the formats.ini file."},"65":{y:0,u:"../Content/C/filter_api/Hidden_Data_Excel.htm",l:-1,t:"Hidden Data in Microsoft Excel Documents",i:0.0093575756349474,a:"There are several types of hidden data in Microsoft Excel documents, each of which has a corresponding flag in the  KV_CONFIG_Arg  structure, which you can toggle to determine whether the hidden data is shown. The following table lists each data type, its default behavior, and its corresponding ..."},"66":{y:0,u:"../Content/filter_shared/Hidden_Data_HTML.htm",l:-1,t:"Hidden Data in HTML Documents",i:0.00125561630257064,a:"KeyView can filter comments from HTML documents. To enable comment filtering, you must set a flag in the formats.ini file. To enable filtering of comments from HTML files Open the formats.ini file in a text editor. Under [Options], set the following flag. GetHTMLHiddenInfo= 1"},"67":{y:0,u:"../Content/Shared/_KV_No_Phonetic_Guides.htm",l:-1,t:"Exclude Japanese Guide Text",i:0.00246372038649399,a:"This option prevents output of Japanese phonetic guide text when Microsoft Excel (.xlsx) files are processed. To prevent output of Japanese phonetic guide text In the C API, call the function fpSetConfig and set the flag KVFLT_NOPHONETICGUIDES. In  formats.ini, set the following parameter.  (This is ..."},"68":{y:0,u:"../Content/filter_shared/Optical_Character_Recognition.htm",l:-1,t:"Optical Character Recognition",i:0.00125561630257064,a:"When processing raster image files, KeyView can perform Optical Character Recognition (OCR) to attempt to filter text that might be visible in the image. If text is detected to form part of a table, it will be filtered in the same way as tables in Word Processing documents. KeyView performs OCR only ..."},"69":{y:0,u:"../Content/filter_shared/OCR.htm",l:-1,t:"Optimize OCR Performance",i:0.00743931225525996,a:"The default settings for OCR attempt to detect as much text as possible. For example, KeyView attempts to detect text in multiple languages and alphabets, and rotated text in increments of 90 degrees from upright. This increases the amount of text that can be detected, prioritizing recall over ..."},"70":{y:0,u:"../Content/filter_shared/OCR_Config_Examples.htm",l:-1,t:"Configure OCR",i:0.00125561630257064,a:"In the following examples, OCR is configured to process scanned pages that contain only English or only Japanese text. Providing information about the input can result in a performance improvement, but OCR may fail to recognize text that does not match your configuration. For more information about ..."},"71":{y:0,u:"../Content/kv_RMS/_KV_RMS_ConfigureProxyForRMS.htm",l:-1,t:"Configure the Proxy for RMS",i:0.00125561630257064,a:"When KeyView needs to access contents that are protected by the Microsoft Rights Management System (RMS), it must make HTTP requests. By default, KeyView uses the system proxy settings for these requests.  To use different proxy settings, you can configure them  in the [RMS] section of the ..."},"72":{y:0,u:"../Content/filter_shared/DocumentRestrictions.htm",l:-1,t:"Document Restrictions",i:0.00420908179145449,a:"Some applications, and corresponding file formats, allow users to restrict the ways in which a document can be used. For example, you might be able to read a document but additional credentials (such as a password) could be required to modify the document content, add comments, or print the ..."},"73":{y:0,u:"../Content/C/filter_api/Unexpected_ZIP_Detection.htm",l:-1,t:"Unexpected ZIP Detection",i:0.00246372038649399,a:"Concatenating a ZIP file onto another file, such as a JPEG, is a well-known method for attempting to hide files from inspection. Users can zip up their sensitive files, then concatenate them on to the other file by using something like the Windows copy command-line tool. The result is a file that ..."},"74":{y:0,u:"../Content/kv_security/_KV_SecurityBestPractises.htm",l:-1,t:"Security Best Practices",i:0.00132181446217191,a:"This section outlines some security best practices to consider when using KeyView. Run Filter Out-of-Process. By default, Filter processes documents in a separate process, which protects the stability of the calling application. OpenText strongly recommends that you use this default. See  The Filter ..."},"75":{y:0,u:"../Content/filter_shared/The_Filter_Process_Model.htm",l:-1,t:"The Filter Process Model",i:0.00220586890937194,a:"By default, Filter runs independently from the calling application process. This is called out-of-process filtering. Out-of-process filtering protects the stability of the calling application in the rare case when a malformed document causes Filter to fail. You can configure Filter  to run in the ..."},"76":{y:0,u:"../Content/filter_shared/Persist_the_Child_Proces.htm",l:-1,t:"Persist the Child Process",i:0.00125561630257064,a:"By default, in out-of-process filtering, the parent process maintains a persistent connection with the child server after each file is filtered. When the connection is preserved in this way, subsequent filtering requests are processed more quickly because the server is already prepared to receive ..."},"77":{y:0,u:"../Content/C/filter_api/Run_Filter_In_Process.htm",l:-1,t:"Run Filter In Process",i:0.00144532366692734,a:"By default, Filter runs out of process. However, you can enable in-process filtering through the API or in the formats.ini file. If the type of process is not specified in the formats.ini or in the API, Filter is run out of process. If the type of process is specified in the formats.ini and in the ..."},"78":{y:0,u:"../Content/C/filter_api/Run_File_Extraction_Func.htm",l:-1,t:"Run File Extraction Functions Out of Process",i:0.00178925631288882,a:"Run File  Extraction Functions Out of  Process The out-of-process setting specified in the call to  fpInit()  or in the formats.ini file is automatically propagated to the File Extraction API in the call to  KVGetExtractInterface(). In KVGetExtractInterface(), you pass a context pointer that was ..."},"79":{y:0,u:"../Content/filter_shared/Out_of_Process_Logging.htm",l:-1,t:"Out-of-Process Session Logging",i:0.00836655332078951,a:"Logging is available for out-of-process filtering. The kvoop server can create a log file that captures information on the files being processed, storing one entry per session (process). The generated log file is called  xxxx_kvoop.log, where  xxxx is a unique number identifying the session.  In the ..."},"80":{y:0,u:"../Content/C/filter_api/Run_File_Detection_In_or.htm",l:-1,t:"Run File Detection In or Out of Process",i:0.00220586890937194,a:"By default, detection runs in out-of-process mode. However, you can enable in-process detection through the API or in the formats.ini file. If the type of process is not specified in the formats.ini or in the API,  detection runs in out-of-process mode. If the type of process is specified in the ..."},"81":{y:0,u:"../Content/kv_security/_KV_ProtectTempDir.htm",l:-1,t:"Protect the Temporary Directory",i:0.00208417739744357,a:"Filter writes temporary files to the temporary directory. These temporary files frequently include the contents of files that Filter is processing, including decrypted parts of encrypted files. Sensitive information is therefore exposed in the temporary directory, so it is important that only users ..."},"82":{y:0,u:"../Content/kv_security/_KV_RunMinimalPrivileges.htm",l:-1,t:"Run Filter with Minimal Privileges",i:0.00141612328505515,a:"OpenText recommends that you run Filter with only those privileges that are necessary for it to function correctly, which follows best practice for any application. In particular, Filter needs access only to the following directories: the Filter bin directory. any input and output locations. the ..."},"83":{y:0,u:"../Content/C/filter_api/Run_KeyView_Reduced_Privileges.htm",l:-1,t:"Run KeyView with Reduced Privileges",i:0.0043541635927509,a:"KeyView, by default, runs as the same user and has the same privileges as the application that calls it. When you run KeyView in-process this cannot be changed. When you run KeyView out-of-process, you can choose to run KVOOP (the out-of-process server) as a different user with reduced privileges. ..."},"84":{y:0,u:"../Content/kv_security/_KV_DLLPreloading.htm",l:-1,t:"Mitigate Against DLL Pre-Loading",i:0.00141612328505515,a:"When an  application loads a shared library such as  kvfilter.dll or kvfilter.so , the Operating System or runtime linker might search several locations. This search can allow DLL pre-loading attacks if an attacker is able to place a malicious binary in one of the locations searched. It might also ..."},"85":{y:0,u:"../Content/kv_metadata_api_c/_KV_Using_Metadata_API.htm",l:-1,t:"Use the Metadata API",i:0.0161671164488485,a:"Use the Metadata API This section describes how to use KeyView to access metadata."},"86":{y:0,u:"../Content/kv_metadata_api_c/_KV_What_Is_Metadata.htm",l:-1,t:"What is Metadata?",i:0.00125561630257064,a:"Documents may contain information about the document itself: we call this metadata. For instance, a raster image file contains metadata recording the image\u0027s width and height; a word processing document may contain metadata recording the document\u0027s author and title. Metadata can be represented by ..."},"87":{y:0,u:"../Content/kv_metadata_api_c/_KV_Understanding_Metadata.htm",l:-1,t:"Understanding Metadata Fields in KeyView",i:0.00367844720723486,a:"Field Standardization Common metadata fields such as \"Title\", \"Author\", and \"Subject\" exist in many different file formats, but can be stored in different ways. For instance, one raster image format may store the image width as a key-value pair with key Width. Another format may store the image ..."},"88":{y:0,u:"../Content/kv_metadata_api_c/_KV_Process_Fields.htm",l:-1,t:"Access Metadata Fields",i:0.00125561630257064,a:"This section explains how to process metadata fields using the KeyView API. Standardized Fields When KeyView understands the meaning of a metadata field in a document, it outputs that data in a standardized field. Standardized fields are represented as  KVMetadataElement  objects with an eKey set to ..."},"89":{y:0,u:"../Content/kv_metadata_api_c/_KV_Metadata_Examples.htm",l:-1,t:"Metadata Examples",i:0.00125561630257064,a:"If you want to process both standardized and non-standardized metadata fields, you can loop through a  KVMetadataList  without checking the eKey member – both standardized and non-standardized metadata can be handled in the same way. However, standardization allows you to handle particular metadata ..."},"90":{y:0,u:"../Content/kv_metadata_api_c/_KV_Standardized_Metadata_Fields.htm",l:-1,t:"Standardized Metadata Fields",i:0.00277248963897921,a:"The following tables describe the standardized metadata fields that are supported by KeyView. Audio Metadata Date Metadata These dates are retrieved from values stored within the document, and are typically set by the creating application. For documents that exist on disk, this may differ from what ..."},"91":{y:0,u:"../Content/Chapter_Samples.htm",l:-1,t:"Sample Programs",i:0.00171636965705205,a:"Sample Programs This section describes the sample programs provided with Filter SDK."},"92":{y:0,u:"../Content/C/samples/Introduction.htm",l:-1,t:"Introduction",i:0.00125561630257064,a:"The C sample programs demonstrate how to use the C implementation of the Filter API. The sample code is intended to provide a starting point for your own applications or to be used for reference purposes. The following C sample programs are provided: tstxtract filter The source code and makefile ( ..."},"93":{y:0,u:"../Content/Shared/_KV_xtract_samples_c.htm",l:-1,t:"tstxtract",i:0.00178925631288882,a:"The tstxtract sample program demonstrates the File Extraction API. It opens a file, extracts subfiles from the file, and repeats the extraction process until all subfiles are extracted. It also demonstrates how to extract the default set of metadata and pass integer or string names to extract ..."},"94":{y:0,u:"../Content/C/samples/filter.htm",l:-1,t:"filter",i:0.00267819831084909,a:"The filter sample program demonstrates the advanced functionality of the Filter API. It is composed of the following files: filter.c—command line interface filtersupport.c—contains core functionality, such as file filtering, stream filtering, metadata extraction, and format detection. ..."},"95":{y:0,u:"../Content/Chapter_AdvancedTopics.htm",l:-1,t:"Advanced Topics",i:0.00146907230669791,a:"Advanced Topics This section describes some advanced topics that apply to both the Filter API and the Extraction API."},"96":{y:0,u:"../Content/filter_shared/Architectural_Overview.htm",l:-1,t:"Architectural Overview",i:0.002322896323207,a:"Architectural Overview The general architecture of the KeyView Filter technology is the same across all supported platforms and is illustrated in the following diagram. Each component is described in the following table."},"97":{y:0,u:"../Content/filter_shared/File_Caching.htm",l:-1,t:"File Caching",i:0.00125561630257064,a:"To reduce the frequency of I/O operations, and consequently improve performance, the KeyView readers load file data into memory. The readers then read the data from the cache rather than the physical disk. You can configure the amount of memory used for file caching through the formats.ini file. ..."},"98":{y:0,u:"../Content/filter_shared/Use_Streaming_Mode.htm",l:-1,t:"Configure Pipe-Streaming",i:0.011909489276117,a:"This section describes advanced options for configuring the streaming method that Filter uses, to optimize performance. By default, when you run Filter out-of-process,  and pass file streams to the API (instead of file names),  Filter uses temporary files for much of the communication with ..."},"99":{y:0,u:"../Content/filter_shared/Generate_an_Error_Log.htm",l:-1,t:"Generate an Out-of-Process Error Log",i:0.00429211064490209,a:"You can monitor and debug out-of-process filtering operations by enabling a detailed error log. This enables you to see errors that are generated at run time, and to track problem files in stream or file mode. Error logs are not generated when in-process filtering is enabled. The out-of-process ..."},"100":{y:0,u:"../Content/filter_shared/Enable_Disable_Error_Logging.htm",l:-1,t:"Enable or Disable Out-of-Process Error Logging",i:0.00345118107711409,a:"You can enable or disable out-of-process error logging by using either the API or environment variables. By default, a file called kvoop.log is created in the system temporary directory; however, you can change the path and file name of this file (see  Configure the Out-of-Process Error Log ). To ..."},"101":{y:0,u:"../Content/filter_shared/Configure_Error_Log.htm",l:-1,t:"Configure the Out-of-Process Error Log",i:0.00393849628450853,a:"Configure the Out-of-Process Error Log To configure the out-of-process error log, set the following configuration parameters in the [kvooplog] section of the formats.ini configuration file. For example: [kvooplog]\nKvoopLogName=filepath\nLogFileSize=1024\nOverWriteLog=1"},"102":{y:0,u:"../Content/C/filter_api/Report_the_Filename_in_S.htm",l:-1,t:"Report the File Name in Stream Mode",i:0.00386953953672065,a:"When you run Filter in file mode, the file name is always reported in the log file. To report the file name in stream mode, you must extract it through the API.  To add the input file name to the log, call the  fpSetConfig()  function with the following arguments: For example: ..."},"103":{y:0,u:"../Content/Part_C_API_Ref.htm",l:-1,t:"C API Reference",i:0.00125561630257064,a:"C API Reference This section provides detailed reference information for the C-language implementation of the File Extraction and Filter APIs. File Extraction API Functions File Extraction API Structures Filter API Functions Filter API Structures Enumerated Types"},"104":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_functions.htm",l:-1,t:"File Extraction API Functions",i:0.00301198246995933,a:"This section describes the functions in the File Extraction API. The File Extraction functions open a container file, and extract the container’s subfiles so that the subfiles are exposed and available for  filtering . Subfiles can be files within a Zip archive, messages in a mail store, attachments ..."},"105":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpCloseFile.htm",l:-1,t:"fpCloseFile()",i:0.010117063443598,a:"This function frees the memory allocated by  fpOpenFile()  and closes the file. Syntax int (pascal *fpCloseFile) (void *pFile); Arguments Returns If the file is closed, the return value is KVERR_Success.  If the file is not closed, the return value is an error code. Example ..."},"106":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpCloseSubFile.htm",l:-1,t:"fpCloseSubFile()",i:0.00245641820172727,a:"fpCloseSubFile() Closes a stream opened by  fpOpenSubFile() . Syntax int (pascal *fpCloseSubFile) (\n        KVInputStream *stream); Arguments Returns If the subfile is closed, the return value is KVERR_Success If the subfile is not closed, the return value is an error code."},"107":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpExtractSubFile.htm",l:-1,t:"fpExtractSubFile()",i:0.0203837176660325,a:"This function extracts a subfile from a container file to a user-defined path or output stream. This call returns file format information when file is extracted to a path. Syntax int (pascal *fpExtractSubFile)  (\n\t            void                          *pFile, \n\t\t    ..."},"108":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpFreeStruct.htm",l:-1,t:"fpFreeStruct()",i:0.00777120794220441,a:"This function frees the memory allocated by fpGetMainFileInfo(), fpGetSubFileInfo(), fpGetSubFileMetadata(), and fpExtractSubFile(). Syntax int (pascal *fpFreeStruct) (\n    void      *pFile, \n    void      *obj);  Arguments Returns If the allocated memory is freed, the return value is KVERR_Success. ..."},"109":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpGetExtractInfo.htm",l:-1,t:"fpGetExtractInfo()",i:0.00125561630257064,a:"This function returns information about a stream opened by  fpOpenSubFile() . Syntax int (pascal *fpGetExtractInfo) (\n       KVInputStream *stream,\n       KVSubFileExtractInfo *extractInfo);\n Arguments Returns If an issue occurs when obtaining the extraction information, the return value is an error ..."},"110":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpGetExtractStatus.htm",l:-1,t:"fpGetExtractStatus()",i:0.00137420297153023,a:"This function returns the status of an input stream opened by  fpOpenSubFile() . Syntax int (pascal *fpGetExtractStatus) (\n       KVInputStream *stream);\n Arguments Returns If an error occurred when one of the input stream function pointers was last called, this function should return the associated ..."},"111":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpGetMainFileInfo.htm",l:-1,t:"fpGetMainFileInfo()",i:0.00284695526270943,a:"This function  determines whether a file is a container file—that is, whether it contains subfiles—and should be extracted further.  Syntax int (pascal *fpGetMainFileInfo) (\n    void               *pFile, \n    KVMainFileInfo     *fileInfo);  Arguments Returns If the file information is retrieved, ..."},"112":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpGetSubFileInfo.htm",l:-1,t:"fpGetSubFileInfo()",i:0.00586653467263444,a:"This function gets  information about a subfile in a container file. Syntax int (pascal *fpGetSubFileInfo)  (\n    void                    *pFile, \n    int                      index,\n    KVSubFileInfo           *subFileInfo); Arguments Returns If the file information is retrieved, the return value ..."},"113":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpGetSubFileMetadataList.htm",l:-1,t:"fpGetSubFileMetadataList()",i:0.00545015278216941,a:"Containers can store metadata about their subfiles that is independent of the metadata stored within those subfiles. This function allows you to retrieve the metadata stored within the container about a particular subfile. Syntax KVErrorCode pascal fpGetSubFileMetadataList(\n    void* const pFile,\n   ..."},"114":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpGetSubFileMetaData.htm",l:-1,t:"fpGetSubFileMetaData()",i:0.00232092728552868,a:"The function fpGetSubFileMetaData() is deprecated in KeyView 23.2.0 and later. OpenText recommends that you use the function  fpGetSubFileMetadataList()  instead. This function is still available for existing implementations, but it might be incompatible with new functionality and might be removed ..."},"115":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpOpenFile.htm",l:-1,t:"fpOpenFile()",i:0.0302838634858988,a:"This function opens a file to make the file accessible for subfile extraction. Syntax int (pascal *fpOpenFile) (\n    void                      *pContext,\n    KVOpenFileArg              openArg,\n    void                      **pFile); Arguments Returns If the file is opened, the return value is ..."},"116":{y:0,u:"../Content/kv_xtract_functions/_KV_fpOpenFileFromFilterSession.htm",l:-1,t:"fpOpenFileFromFilterSession()",i:0.00132181446217191,a:"This function opens a container file so that you can extract its subfiles. Syntax KVErrorCode (pascal *fpOpenFileFromFilterSession)(\n    KVFilterSession session,\n    KVOpenFileArg openArg,\n    void** pFile\n); Arguments Returns If the file is opened successfully, the return value is KVERR_Success.  ..."},"117":{y:0,u:"../Content/kv_xtract_functions/_KV_XTRACT_funct_fpOpenSubFile.htm",l:-1,t:"fpOpenSubFile()",i:0.00509274061383577,a:"This function opens a subfile as a stream, which can be used directly or passed to other KeyView interfaces. Syntax int (pascal *fpOpenSubFile) (\n        void                  *pFile,\n        KVExtractSubFileArg    extractArg,\n        KVInputStream        **stream); Arguments Returns If the subfile ..."},"118":{y:0,u:"../Content/kv_xtract_structures/_KV_XTRACT_struct.htm",l:-1,t:"File Extraction API Structures",i:0.00146907230669791,a:"File Extraction API Structures This section provides information on the structures used by the File Extraction API. These structures define the input and output parameters required to extract subfiles from a container file, and are defined in kvxtract.h. "},"119":{y:0,u:"../Content/kv_xtract_structures/_KV_XTRACT_struct_KVCredential.htm",l:-1,t:"KVCredential",i:0.00312216898919328,a:"This structure contains a count of the number of  credential elements, and a pointer to the first element of the array of individual elements. The structure is initialized by calling  fpOpenFile() , and is defined in kvxtract.h. typedef struct  ..."},"120":{y:0,u:"../Content/kv_xtract_structures/_KV_XTRACT_struct_KVCredentialComponent.htm",l:-1,t:"KVCredentialComponent",i:0.0131411676379871,a:"This structure contains the value of a credential item. The structure is defined in  kvxtract.h. typedef struct  tag_KVCredentialComponent\n{\n    KVCredKeyType      keytype;\n    union\n    {\n        void           *pkey;\n        char           *skey;\n        unsigned ..."},"121":{y:0,u:"../Content/kv_xtract_structures/_KV_XTRACT_struct_KVExtractInterface.htm",l:-1,t:"KVExtractInterface",i:0.00206633729420158,a:"The members of this structure are pointers to the file extraction functions described in  File Extraction API Functions . Calling the  fpGetExtractInterface()  function assigns the function pointers in the structure. The structure is defined in kvxtract.h.  typedef struct  ..."},"122":{y:0,u:"../Content/kv_xtract_structures/_KV_XTRACT_struct_KVExtractSubFileArg.htm",l:-1,t:"KVExtractSubFileArg",i:0.0329429937134864,a:"This structure defines the input parameters  required to extract a subfile. See  fpExtractSubFile() . The structure is defined in kvxtract.h. typedef struct ..."},"123":{y:0,u:"../Content/kv_xtract_structures/_KV_XTRACT_struct_KVGetSubFileMetaArg.htm",l:-1,t:"KVGetSubFileMetaArg",i:0.00209546599822334,a:"This structure defines the  metadata tags whose values are retrieved by  fpGetSubFileMetaData() . This structure is defined in kvxtract.h. typedef struct  ..."},});