define({"0":{y:0,u:"../Content/Part_Overview.htm",l:-1,t:"Overview of Filter SDK",i:0.00203862280282815,a:"Overview of Filter SDK This section provides an overview of the OpenText KeyView Filter SDK and describes how to get started with the C++ API. Introducing Filter SDK Getting Started"},"1":{y:0,u:"../Content/filter_shared/filtersdk_intro/intro_filtersdk.htm",l:-1,t:"Introducing Filter SDK",i:0.00290507352024183,a:"Introducing Filter SDK This section describes the KeyView Filter SDK. "},"2":{y:0,u:"../Content/filter_shared/filtersdk_intro/Overview.htm",l:-1,t:"Overview",i:0.00203862280282815,a:"OpenText KeyView Filter SDK enables you to incorporate text extraction functionality into your own applications. It extracts text and metadata from a wide variety of file formats on numerous platforms, and can automatically recognize over 1900 document types. It supports both file-based and ..."},"3":{y:0,u:"../Content/Shared/_KV_Platform.htm",l:-1,t:"Platforms, Compilers, and Dependencies",i:0.00203862280282815,a:"Platforms, Compilers, and Dependencies This section lists the supported platforms, supported compilers, and software dependencies for the KeyView software."},"4":{y:0,u:"../Content/Shared/_KV_Platform_Supported_Platforms.htm",l:-1,t:"Supported Platforms",i:0.00203862280282815,a:"The following sections list supported operating systems for each platform. Windows (x86-64) Microsoft Windows Server 2022 Microsoft Windows Server 2019 Microsoft Windows Server 2016 Microsoft Windows Server 2012 Microsoft Windows 11 Microsoft Windows 10 Windows (x86) Microsoft Windows 10 Windows ..."},"5":{y:0,u:"../Content/Shared/_KV_Platform_Compilers.htm",l:-1,t:"Supported Compilers",i:0.00203862280282815,a:"Supported Compilers The following table lists the supported compilers for the C++ Filter SDK."},"6":{y:0,u:"../Content/Shared/_KV_Platform_Dependencies.htm",l:-1,t:"Software Dependencies",i:0.00290507352024183,a:"To run KeyView on Windows requires the Microsoft Visual C++ 2019 redistributables to be installed. The redistributables are provided in the vcredist folder of the KeyView SDK but you can  download the latest installers from Microsoft  to get the latest security, reliability, and performance ..."},"7":{y:0,u:"../Content/Shared/_KV_install_Windows.htm",l:-1,t:"Windows Installation",i:0.00203862280282815,a:"To install the SDK on Windows, use the following procedure. To install the SDK  Run the installation program, KeyViewProductNameSDK_VersionNumber_OS.exe, where ProductName is the name of the product, VersionNumber is the product version number, and OS is the operating system. For example: ..."},"8":{y:0,u:"../Content/Shared/_KV_install_UNIX.htm",l:-1,t:"UNIX Installation",i:0.00203862280282815,a:"To install the SDK, use one of the following procedures. To install the SDK from the graphical interface Run the installation program and follow the on-screen instructions. To install the SDK from the console Run the installation program from the console as follows: ..."},"9":{y:0,u:"../Content/filter_shared/filtersdk_intro/Package_Contents.htm",l:-1,t:"Package Contents",i:0.00203862280282815,a:"The Filter SDK installation contains: All the libraries and executables necessary for extracting text from a wide variety of formats. The  include files that define the C API. These files can be found in the include directory. The Java API implemented in the package com.verity.api.filter contained ..."},"10":{y:0,u:"../Content/Shared/_KV_License_Update.htm",l:-1,t:"License Information",i:0.0294025541939552,a:"Your license key controls whether you have the full version of the KeyView SDK, or a trial version. It also determines whether the following advanced features are enabled: Advanced character set detection with the character set detection library (kvlangdetect). Advanced document readers: Microsoft ..."},"11":{y:0,u:"../Content/Chapter_GettingStarted.htm",l:-1,t:"Getting Started",i:0.00533768761098803,a:"Getting Started This section provides information about how to get started with the KeyView Filter C++ API."},"12":{y:0,u:"../Content/C++/gettingstarted/Using_the_C++_Imp.htm",l:-1,t:"Use the C++ Language Implementation of the API",i:0.00203862280282815,a:"The C++ API is designed to make extraction of content from documents as straightforward as possible. The primary advantage over the C API is the use of C++ features to provide a simpler interface that is easy to use.\n         The API consists of: Header files that define all of the classes and ..."},"13":{y:0,u:"../Content/C++/gettingstarted/Building.htm",l:-1,t:"Build the C++ API",i:0.00237424080120022,a:"This section describes the build process for Windows and Linux. To build the C++ API on Windows To build on Windows, you need at least Microsoft Visual Studio 2015.  Switch to the cppapi/bin directory.  At the Visual Studio command prompt, run nmake -f Makefile.  This command creates a file called ..."},"14":{y:0,u:"../Content/C++/gettingstarted/Start_Using_KeyView.htm",l:-1,t:"Start Using KeyView",i:0.00434306435164209,a:"Create a KeyView Session To use the C++ Filter SDK, link the library built in  Build the C++ API , and include the following headers in your code: Copy #include \"Keyview_FilterSDK.hpp\" #include \"Keyview_IO.hpp\" To use the SDK, you must create a KeyView session: Copy auto session = ..."},"15":{y:0,u:"../Content/C++/gettingstarted/Using_Documents.htm",l:-1,t:"Using Document Objects",i:0.00237424080120022,a:"OpenText recommends that you create a  Document  object for each document that you want to process, as described in  Start Using KeyView . The document object provides access to the data within a document, including its format, metadata, text, and subfiles.  This topic provides some additional ..."},"16":{y:0,u:"../Content/C++/gettingstarted/Exceptions.htm",l:-1,t:"Exceptions",i:0.00203862280282815,a:"All of the C++ API methods can throw exceptions. KeyView errors take the form of an instance of keyview_error, which is itself derived from std::exception. The exceptions that can be thrown are defined in Keyview_Errors.hpp. In application code, it is possible to catch and correctly handle many of ..."},"17":{y:0,u:"../Content/C++/gettingstarted/GenericIO.htm",l:-1,t:"Input and Output Methods",i:0.00258875919307398,a:"The previous examples show KeyView taking input from a file (keyview::io::InputFile) and sending output to a file (keyview::io::OutputFile). The C++ API allows you to take input from and write output to other data sources by making use of generic types for input and output. For example, the ..."},"18":{y:0,u:"../Content/Part_UseSDK.htm",l:-1,t:"Use Filter SDK",i:0.00203862280282815,a:"Use Filter SDK This section explains how to perform some basic tasks by using the File Extraction and Filter APIs, and describes the sample programs. Use the File Extraction API Use the Filter API Use the Metadata API Sample Programs Advanced Topics"},"19":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c.htm",l:-1,t:"Use the File Extraction API",i:0.00238520308979362,a:"Use the File Extraction API This section describes how to extract subfiles from a container file by using the File Extraction API. "},"20":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Introduction.htm",l:-1,t:"Introduction",i:0.00203862280282815,a:"To  filter  a file, you must first determine whether the file contains any subfiles (attachments, embedded OLE objects, and so on). A file that contains subfiles is called a  container file. A container file has a main file (parent) and subfiles (children) embedded in the main file.  The following ..."},"21":{y:0,u:"../Content/kv_xtract_api_cpp/_KV_xtract_cpp_Extract_Sub_Files.htm",l:-1,t:"Extract Subfiles",i:0.00203862280282815,a:"To filter all files in a container file, you must open the container and extract its subfiles to either a file or a stream by using the File Extraction API. The extraction process is done repeatedly until all subfiles are extracted and exposed for filtering. After a subfile is extracted, you can ..."},"22":{y:0,u:"../Content/Shared/_KV_xtract_Extract_Images.htm",l:-1,t:"Extract Images",i:0.00203862280282815,a:"You can use the File Extraction API  to extract images within a file. If you use this feature, images within the file  behave in the same way as any other subfile. Extracted images  have the name image[X].[Y], where [X] is an integer, and [Y] is the extension. The format of the image is the same as ..."},"23":{y:0,u:"../Content/kv_xtract_api_cpp/_KV_xtract_cpp_Extract_SubFiles_OutlookExpress.htm",l:-1,t:"Extract Subfiles from Outlook Express Files",i:0.00203862280282815,a:"Extract Subfiles from Outlook  Express Files If the Outlook file contains a non-mail attachment, the attachment is extracted in its native format to the same directory as the message text file. If the Outlook file contains a mail attachment, the complete attachment (including message text and ..."},"24":{y:0,u:"../Content/kv_xtract_api_cpp/_KV_xtract_cpp_Extract_SubFiles_MailboxFiles.htm",l:-1,t:"Extract Subfiles from Mailbox Files",i:0.00203862280282815,a:"Extract  Subfiles from Mailbox Files A Mailbox (MBX) file is a collection of individual emails compiled with RFC 822 and RFC 2045 - 2049 (MIME), and divided by message separators. There are many mail applications that export to an MBX format, such as Eudora Email and Mozilla Thunderbird.  In Eudora ..."},"25":{y:0,u:"../Content/kv_xtract_api_cpp/_KV_xtract_cpp_Extract_SubFiles_OutlookPersonalFolder.htm",l:-1,t:"Extract Subfiles from Outlook Personal Folders Files",i:0.00203862280282815,a:"Extract  Subfiles from Outlook Personal Folders Files KeyView can extract Outlook items such as messages, appointments, contacts, tasks, notes, and journal entries from a PST file.  If an Outlook item contains a non-mail attachment, the attachment is extracted in its native format to a subdirectory. ..."},"26":{y:0,u:"../Content/kv_xtract_api_cpp/_KV_xtract_cpp_Extract_SubFiles_LotusDominoXML.htm",l:-1,t:"Extract Subfiles from Lotus Domino XML Language Files",i:0.00203862280282815,a:"Extract  Subfiles from Lotus Domino XML Language Files When you extract a Lotus Domino XML Language (.DXL) file, the message text and header information (To, From, Sent, and so on) is extracted to a text file. You can make sure that dates and times extracted from Lotus Domino .DXL files are ..."},"27":{y:0,u:"../Content/kv_xtract_api_cpp/_KV_xtract_cpp_Extract_SubFiles_LotusNotes_DB.htm",l:-1,t:"Extract Subfiles from Lotus Notes Database Files",i:0.00203862280282815,a:"Extract  Subfiles from Lotus Notes Database Files A Lotus Notes database is a single file that contains multiple documents called notes. Notes include design notes (such as forms, views, folders, navigators, outlines, pages, framesets, agents, and resources), data document notes, profile document ..."},"28":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_SubFiles_PDF_Files.htm",l:-1,t:"Extract Subfiles from PDF Files",i:0.0037715242376555,a:"Extract  Subfiles from PDF Files KeyView can extract document-level and page-level attachments from a PDF document. Document-level attachments are added by using the Attach A File tool, and can include links to or from the parent document or to other file attachments. Page-level attachments are ..."},"29":{y:0,u:"../Content/Shared/_KV_PDF_ImprovePerformanceWithSmallImages.htm",l:-1,t:"Improve Performance for PDFs with Many Small Images",i:0.00203862280282815,a:"To improve performance when processing  PDF files that contain many small images, you can choose to ignore images unless they exceed a minimum width and/or height. If an image is smaller than the minimum width or height, KeyView does not extract the image.  For example, to ignore images that are ..."},"30":{y:0,u:"../Content/kv_xtract_api_c/_KV_xtract_c_Extract_Embedded_OLE_Objects.htm",l:-1,t:"Extract Embedded OLE Objects",i:0.00203862280282815,a:"The File Extraction API can extract embedded OLE objects from the following types of documents: Lotus Notes (DXL) Microsoft Excel Microsoft Word Microsoft PowerPoint Microsoft Outlook Microsoft Visio Microsoft Project OASIS Open Document Rich Text Format (RTF) When an embedded OLE object is ..."},"31":{y:0,u:"../Content/Chapter_UseFilterAPI.htm",l:-1,t:"Use the Filter API",i:0.00238520308979362,a:"Use the Filter API This section describes how to perform some basic filtering tasks by using the Filter API."},"32":{y:0,u:"../Content/filter_shared/filter_detection/Extract_Format_Info.htm",l:-1,t:"Obtain Format Information",i:0.00203862280282815,a:"The KeyView format detection module (kwad) detects a file\u0027s format, and reports the information to your application. You can detect the format of a file by using the  info  method on a  document  object. For example: Copy auto myinput = keyview::io::InputFile{ std::string(\"InputFile.docx\") }; auto ..."},"33":{y:0,u:"../Content/Shared/_KV_Code_Identification.htm",l:-1,t:"Source Code Identification",i:0.00203862280282815,a:"When KeyView auto-detects a file that contains source code, it can attempt to identify the programming language that it is written in. When you do not enable source code identification, files containing source code may be identified as ASCII text files, causing the application to treat them in the ..."},"34":{y:0,u:"../Content/filter_shared/filter_detection/Determine_Doc_Reader.htm",l:-1,t:"File Formats and Document Readers",i:0.00348270733185094,a:"The KeyView configuration file  formats.ini contains a section named [Formats]. Each line in this section matches a file format with the reader to use to parse the format. For most file formats there is only one suitable reader, but for some formats you can choose a reader to use. Each file format ..."},"35":{y:0,u:"../Content/Shared/_KV_Refine_Detection_of_Text.htm",l:-1,t:"Refine Detection of Text Files",i:0.00203862280282815,a:"During text detection, KeyView analyses the first 1 kB and last 1 kB of data in a document. If less than 10% of that data consists of non-ASCII characters, KeyView detects the document as a text file. However, depending on the type of documents you are working with, the default settings might not ..."},"36":{y:0,u:"../Content/Shared/_KV_AdditionalFormatInfo.htm",l:-1,t:"Additional Format Information",i:0.00203862280282815,a:"KeyView returns basic information about a document\u0027s format, but sometimes it can be useful to have additional information. The file formats_description.tsv, which can be found in the bin directory, provides a mapping between file format ID, human-readable format description, and the format\u0027s MIME ..."},"37":{y:0,u:"../Content/filter_shared/Convert_Character_Sets.htm",l:-1,t:"Character Encoding",i:0.00232039763731722,a:"To ensure that all filtered text is output in the same character encoding, KeyView performs character encoding conversion. In most cases, if your license includes advanced character set detection, KeyView can detect the character encoding used in a source file, and automatically outputs filtered ..."},"38":{y:0,u:"../Content/C++/filter_api/Set_CharSet_Extraction.htm",l:-1,t:"Set the Character Set During Subfile Extraction",i:0.00203862280282815,a:"Set the Character Set During  Subfile Extraction You can convert the character set of a subfile at the time the subfile is extracted from the container and before it is filtered. This is most often used to set the character set of a mail message\u0027s body text. To specify the source character set of a ..."},"39":{y:0,u:"../Content/filter_shared/Filter_Deleted_Text.htm",l:-1,t:"Filter Deleted Text",i:0.00203862280282815,a:"Some applications have revision tracking features—such as Microsoft Word\u0027s Track Changes—that identify changes to a document. When these features are used, text that was deleted from a document might still be stored in the file. KeyView does not filter deleted text by default, but the Filter API ..."},"40":{y:0,u:"../Content/filter_shared/Filter_PDF_Files.htm",l:-1,t:"Filter PDF Files",i:0.00203862280282815,a:"Filter PDF Files Filter has special configuration options that allow greater control over the conversion of Adobe Acrobat PDF files."},"41":{y:0,u:"../Content/filter_shared/pdf2sr.htm",l:-1,t:"Use the pdf2sr Reader",i:0.00203862280282815,a:"The pdf2sr reader is an alternative that can be used instead of pdfsr for filtering PDF files. It uses a different parsing technology and may yield better results for some files. The pdf2sr reader has the following features: supports standard and custom metadata (non-XMP) supports basic text ..."},"42":{y:0,u:"../Content/filter_shared/Filter_PDF_LogicalOrder.htm",l:-1,t:"Filter PDF Files to a Logical Reading Order",i:0.00203862280282815,a:"The order of the text inside a PDF file has no relation to the layout of the text on the page or screen. By default, KeyView extracts paragraphs in the order in which they are stored in the file, not the order in which they appear on the page. For example, a three-column article could be output with ..."},"43":{y:0,u:"../Content/filter_shared/Rotated_Text.htm",l:-1,t:"Rotated Text",i:0.00203862280282815,a:"When a PDF that contains rotated text is filtered, the rotated text is extracted after the text at the end of the PDF page on which the rotated text appears. If the PDF is filtered with logical order enabled, and the amount of rotated text on a page surpasses a predefined threshold,  the page is ..."},"44":{y:0,u:"../Content/filter_shared/Filter_Tagged_PDF_Content.htm",l:-1,t:"Filter Tagged PDF Content",i:0.00203862280282815,a:"A tagged PDF contains an additional layer of text for visually impaired readers. This text is used in text-to-speech features in various PDF viewing programs. You can enable filtering of tagged PDF text in the API. Filtering the extra layer of tagged content might result in duplicate text in the ..."},"45":{y:0,u:"../Content/filter_shared/Skip_Embedded_Fonts.htm",l:-1,t:"Skip Embedded Fonts",i:0.00203862280282815,a:"Text in PDF files sometimes contains embedded fonts. If you experience difficulties filtering embedded fonts, you can skip this type of text. If you skip embedded fonts, none of the content that contains embedded fonts is included in the output. To skip text that uses embedded fonts In the C++ API, ..."},"46":{y:0,u:"../Content/filter_shared/Control_Hyphenation.htm",l:-1,t:"Control Hyphenation",i:0.00203862280282815,a:"There are two types of hyphens in a PDF document: A soft hyphen is added to a word by a word processor to divide the word across two lines. This is a discretionary hyphen and is used to ensure proper text flow in justified text. A hard hyphen is intentionally added to a word regardless of the word\u0027s ..."},"47":{y:0,u:"../Content/filter_shared/Filter_Portfolio_PDF.htm",l:-1,t:"Filter Portfolio PDF Files",i:0.00203862280282815,a:"Filter Portfolio PDF Files Portfolio PDF files contain subfiles and an ActionScript interface for navigating between them. You can use the extraction API to extract the subfiles.  See  Extract Subfiles from PDF Files ."},"48":{y:0,u:"../Content/Shared/_KV_Table_Detection_PDF.htm",l:-1,t:"Table Detection for PDF Files",i:0.00203862280282815,a:"PDF files often contain data presented in a tabular form. However, there is no information about the table stored within the PDF itself – the text is simply placed in  an arrangement that looks like a table to the human eye. When this data is filtered, it can be very difficult to reconstruct the ..."},"49":{y:0,u:"../Content/filter_shared/Filter_Spreadsheet_Files.htm",l:-1,t:"Filter Spreadsheet Files",i:0.00203862280282815,a:"Filter Spreadsheet Files Filter has special configuration options that enable greater control over the conversion of spreadsheet files."},"50":{y:0,u:"../Content/filter_shared/Filter_Worksheet_Names.htm",l:-1,t:"Filter Worksheet Names",i:0.00203862280282815,a:"Filter  Worksheet Names Normally, Filter does not extract worksheet names from a spreadsheet because it is assumed that the text should not be exposed. To extract worksheet names, add the following lines to the  formats.ini file: [Options]\ngetsheetnames=1"},"51":{y:0,u:"../Content/filter_shared/Filter_Hidden_Text_Excel.htm",l:-1,t:"Filter Hidden Text in Microsoft Excel Files",i:0.00203862280282815,a:"Filter  Hidden Text in Microsoft Excel Files Normally, Filter does not filter hidden text from a Microsoft Excel spreadsheet because it is assumed the text should not be exposed. You can change this default behavior, and extract text from hidden rows, columns, and sheets from Excel spreadsheets by ..."},"52":{y:0,u:"../Content/filter_shared/Specify_Date_and_Time_Fo.htm",l:-1,t:"Specify Date and Time Format on UNIX Systems",i:0.00203862280282815,a:"In Microsoft Excel you can choose to format dates and times according to the system locale.  On Windows, KeyView uses the system locale settings to determine how these dates and times should be formatted.  In other operating systems, KeyView uses the U.S. short date format (mm/dd/yyyy).  You can ..."},"53":{y:0,u:"../Content/filter_shared/large_numbers_excel.htm",l:-1,t:"Filter Very Large Numbers in Spreadsheet Cells to Precision Numbers",i:0.00203862280282815,a:"Numbers in Microsoft Excel files can be extracted and written to the output without formatting. By default, numbers are extracted in the format specified by the Excel file (for example, General, Currency and Date). Spreadsheets might contain cells that have very large numbers in them. Excel displays ..."},"54":{y:0,u:"../Content/filter_shared/Extract_Excel_Formulas.htm",l:-1,t:"Extract Microsoft Excel Formulas",i:0.00203862280282815,a:"When you filter a Microsoft Excel spreadsheet, KeyView extracts the value of each cell. The value of a cell might be calculated from a formula, but the formula is not included in the output unless you configure KeyView to include it. You can extract the cell value, the formula, or both. For example, ..."},"55":{y:0,u:"../Content/Shared/_KV_Standardize_Cell_Formats.htm",l:-1,t:"Standardize Cell Formats",i:0.00203862280282815,a:"In Microsoft Excel you can format cell values. For example, the date \"15/09/2021\" could be formatted as \"15 September 2021\" or \"2021-09-15\". By default, KeyView extracts cell values with formatting, as they would appear in Excel. If you prefer, you can configure KeyView to standardize cell values. ..."},"56":{y:0,u:"../Content/filter_shared/Tab_Delimited_Output.htm",l:-1,t:"Tab Delimited Output for Spreadsheets and Embedded Tables",i:0.00203862280282815,a:"You can use KeyView to convert spreadsheets, embedded tables in Word Processing documents (for example, Microsoft Word documents), and tables detected by Optical Character Recognition (OCR), to tab-delimited form. In this format, KeyView inserts a tab character between each cell, and a line break ..."},"57":{y:0,u:"../Content/filter_shared/Filter_Hidden_Data.htm",l:-1,t:"Filter Hidden Data",i:0.00203862280282815,a:"Filter Hidden Data Some documents contain hidden information, which is not filtered by default. Depending on the type of hidden data that you want to filter and the type of document that you are filtering, you can either use the API or set parameters in the formats.ini file."},"58":{y:0,u:"../Content/filter_shared/Hidden_Data_HTML.htm",l:-1,t:"Hidden Data in HTML Documents",i:0.00203862280282815,a:"KeyView can filter comments from HTML documents. To enable comment filtering, you must set a flag in the formats.ini file. To enable filtering of comments from HTML files Open the formats.ini file in a text editor. Under [Options], set the following flag. GetHTMLHiddenInfo= 1"},"59":{y:0,u:"../Content/Shared/_KV_No_Phonetic_Guides.htm",l:-1,t:"Exclude Japanese Guide Text",i:0.00203862280282815,a:"Exclude Japanese Guide Text This option prevents output of Japanese phonetic guide text when Microsoft Excel (.xlsx) files are processed. To prevent output of Japanese phonetic guide text In  formats.ini, set the following parameter.  [Options]\nNoPhoneticGuides=TRUE"},"60":{y:0,u:"../Content/filter_shared/Optical_Character_Recognition.htm",l:-1,t:"Optical Character Recognition",i:0.00203862280282815,a:"When processing raster image files, KeyView can perform Optical Character Recognition (OCR) to attempt to filter text that might be visible in the image. If text is detected to form part of a table, it will be filtered in the same way as tables in Word Processing documents. KeyView performs OCR only ..."},"61":{y:0,u:"../Content/filter_shared/OCR.htm",l:-1,t:"Optimize OCR Performance",i:0.0037715242376555,a:"The default settings for OCR attempt to detect as much text as possible. For example, KeyView attempts to detect text in multiple languages and alphabets, and rotated text in increments of 90 degrees from upright. This increases the amount of text that can be detected, prioritizing recall over ..."},"62":{y:0,u:"../Content/filter_shared/OCR_Config_Examples.htm",l:-1,t:"Configure OCR",i:0.00203862280282815,a:"In the following examples, OCR is configured to process scanned pages that contain only English or only Japanese text. Providing information about the input can result in a performance improvement, but OCR may fail to recognize text that does not match your configuration. For more information about ..."},"63":{y:0,u:"../Content/filter_shared/DocumentRestrictions.htm",l:-1,t:"Document Restrictions",i:0.0725200503162725,a:"Some applications, and corresponding file formats, allow users to restrict the ways in which a document can be used. For example, you might be able to read a document but additional credentials (such as a password) could be required to modify the document content, add comments, or print the ..."},"64":{y:0,u:"../Content/kv_metadata_api_c/_KV_Using_Metadata_API.htm",l:-1,t:"Use the Metadata API",i:0.00238520308979362,a:"Use the Metadata API This section describes how to use KeyView to access metadata."},"65":{y:0,u:"../Content/kv_metadata_api_c/_KV_What_Is_Metadata.htm",l:-1,t:"What is Metadata?",i:0.00203862280282815,a:"Documents may contain information about the document itself: we call this metadata. For instance, a raster image file contains metadata recording the image\u0027s width and height; a word processing document may contain metadata recording the document\u0027s author and title. Metadata can be represented by ..."},"66":{y:0,u:"../Content/kv_metadata_api_c/_KV_Understanding_Metadata.htm",l:-1,t:"Understanding Metadata Fields in KeyView",i:0.00484605709196568,a:"Field Standardization Common metadata fields such as \"Title\", \"Author\", and \"Subject\" exist in many different file formats, but can be stored in different ways. For instance, one raster image format may store the image width as a key-value pair with key Width. Another format may store the image ..."},"67":{y:0,u:"../Content/kv_metadata_api_c/_KV_Process_Fields.htm",l:-1,t:"Access Metadata Fields",i:0.00203862280282815,a:"This section explains how to process metadata fields using the KeyView API. Standardized Fields When KeyView understands the meaning of a metadata field in a document, it outputs that data in a standardized field. Standardized fields are represented as  MetadataElement  objects where: is_standard() ..."},"68":{y:0,u:"../Content/kv_metadata_api_c/_KV_Metadata_Examples_CPP.htm",l:-1,t:"Metadata Examples",i:0.00203862280282815,a:"If you want to process both standardized and non-standardized metadata fields, you can loop through  Metadata  without checking  is_standard()  or  standard_key()  – both standardized and non-standardized metadata can be handled in the same way. However, standardization allows you to handle ..."},"69":{y:0,u:"../Content/kv_metadata_api_c/_KV_Standardized_Metadata_Fields.htm",l:-1,t:"Standardized Metadata Fields",i:0.00330269030247585,a:"The following tables describe the standardized metadata fields that are supported by KeyView. Audio Metadata Date Metadata These dates are retrieved from values stored within the document, and are typically set by the creating application. For documents that exist on disk, this may differ from what ..."},"70":{y:0,u:"../Content/Chapter_Samples.htm",l:-1,t:"Sample Programs",i:0.00325165380720729,a:"Sample Programs This section describes the sample programs provided with Filter SDK."},"71":{y:0,u:"../Content/C++/samples/Introduction.htm",l:-1,t:"Introduction",i:0.00203862280282815,a:"The C++ sample programs demonstrate basic usage of the C++ implementation of the Filter API. The sample code is intended to provide a starting point for your own more advanced applications or to be used for reference purposes. The sample programs share a single header, to assist with parsing ..."},"72":{y:0,u:"../Content/C++/samples/detect.htm",l:-1,t:"detect",i:0.00232743970863271,a:"KeyView can provide information about a very large number of file formats. This sample program makes use of the file format detection API method.   The sample program takes a path to a file and prints the information reported by the API. For example, the following output is produced when you run the ..."},"73":{y:0,u:"../Content/C++/samples/extract.htm",l:-1,t:"extract",i:0.00637436845642828,a:"Some files can contain embedded subfiles, including archive formats such as zip and rar, email containers, and Office formats. This sample program makes use of the subfile extraction API methods.   The sample program takes two positional arguments:  a path to a file a path to an output directory The ..."},"74":{y:0,u:"../Content/C++/samples/filter_document.htm",l:-1,t:"filter_document",i:0.00232743970863271,a:"Filtering is the extraction of text from a document. This sample program makes use of the filter API method.  The program takes two positional arguments: an input file an output text file By default, the ouput is encoded in UTF-8. $ ./filter_document input_file output.txt Not all document formats ..."},"75":{y:0,u:"../Content/C++/samples/metadata.htm",l:-1,t:"metadata",i:0.00232743970863271,a:"Some file formats contain additional documentation (metadata) about document contents. This sample program makes use of the metadata_map API method to provide metadata fields and values.  The fields that are present vary depending on file type and the individual document. For example, running the ..."},"76":{y:0,u:"../Content/C++/samples/subfiles.htm",l:-1,t:"subfiles",i:0.00232743970863271,a:"Like the  extract  sample program, the subfiles sample program uses the subfile extraction API methods. However, rather than copying the files to disk, it prints the number of embedded subfiles, and the information that could be obtained about each one, such as the file name and size. The API ..."},"77":{y:0,u:"../Content/C++/samples/filter_container.htm",l:-1,t:"filter_container",i:0.00232743970863271,a:"This sample program is a slightly more advanced example that combines several API methods. The sample program takes an input file and an output text file. The program  writes detection information and the filtered text to the output file. It then recursively extracts  all subfiles in the input file, ..."},"78":{y:0,u:"../Content/Chapter_AdvancedTopics.htm",l:-1,t:"Advanced Topics",i:0.00238520308979362,a:"Advanced Topics This section describes some advanced topics that apply to both the Filter API and the Extraction API."},"79":{y:0,u:"../Content/filter_shared/File_Caching.htm",l:-1,t:"File Caching",i:0.00203862280282815,a:"To reduce the frequency of I/O operations, and consequently improve performance, the KeyView readers load file data into memory. The readers then read the data from the cache rather than the physical disk. You can configure the amount of memory used for file caching through the formats.ini file. ..."},"80":{y:0,u:"../Content/filter_shared/Use_Streaming_Mode.htm",l:-1,t:"Configure Pipe-Streaming",i:0.00203862280282815,a:"This section describes advanced options for configuring the streaming method that Filter uses, to optimize performance. By default, when you run Filter out-of-process, Filter uses temporary files for much of the communication with kvoop.exe. You can instead configure KeyView to use pipe-streaming. ..."},"81":{y:0,u:"../Content/filter_shared/Generate_an_Error_Log.htm",l:-1,t:"Generate an Out-of-Process Error Log",i:0.00203862280282815,a:"You can monitor and debug out-of-process filtering operations by enabling a detailed error log. This enables you to see errors that are generated at run time, and to track problem files in stream or file mode. Error logs are not generated when in-process filtering is enabled. The out-of-process ..."},"82":{y:0,u:"../Content/filter_shared/Enable_Disable_Error_Logging.htm",l:-1,t:"Enable or Disable Out-of-Process Error Logging",i:0.00290507352024183,a:"You can enable or disable out-of-process error logging by using either the API or environment variables. By default, a file called kvoop.log is created in the system temporary directory; however, you can change the path and file name of this file (see  Configure the Out-of-Process Error Log ). To ..."},"83":{y:0,u:"../Content/filter_shared/Configure_Error_Log.htm",l:-1,t:"Configure the Out-of-Process Error Log",i:0.00413978605730871,a:"Configure the Out-of-Process Error Log To configure the out-of-process error log, set the following configuration parameters in the [kvooplog] section of the formats.ini configuration file. For example: [kvooplog]\nKvoopLogName=filepath\nLogFileSize=1024\nOverWriteLog=1"},"84":{y:0,u:"../Content/Part_C++_API_Ref.htm",l:-1,t:"C++ API Reference",i:0.00290507352024183,a:"C++ API Reference This section provides detailed reference information for the C++ implementation of the File Extraction and Filter APIs. InputTypes and OutputTypes The keyview Namespace The keyview::io Namespace"},"85":{y:0,u:"../Content/C++/reference/InputOutput.htm",l:-1,t:"InputTypes and OutputTypes",i:0.00286176449420608,a:"InputTypes and OutputTypes Some of the methods in the C++SDK are templated on InputType,  OutputType, or both. You can pass these methods instances of the input and output types defined in the keyview::io namespace. See  Getting Started  for more details and examples."},"86":{y:0,u:"../Content/C++/reference/KVNamespace.htm",l:-1,t:"The keyview Namespace",i:0.00286176449420608,a:"The keyview Namespace This section provides details of the classes in the keyview namespace."},"87":{y:0,u:"../Content/C++/reference/SessionClass/SessionClass.htm",l:-1,t:"The Session Class",i:0.00237424080120022,a:"Defined in: Keyview_Session.hpp The Session class is the entry point to the C++ API. The Session class has methods to configure the session and  open a document . Options can be set by the Configuration class. This can be used in the constructor, or modified after construction. See  The ..."},"88":{y:0,u:"../Content/C++/reference/SessionClass/Constructor.htm",l:-1,t:"Constructor",i:0.00203862280282815,a:"Constructor Constructs a new Session with the specified parameters. Syntax Session::Session(\n        const std::string\u0026 license,\n        const std::string\u0026 bin_path,\n        bool in_process,\n        Configuration config \n    ) Arguments"},"89":{y:0,u:"../Content/C++/reference/SessionClass/Config.htm",l:-1,t:"config",i:0.00203862280282815,a:"config Get a reference to the configuration. This can be used to configure the next API call. Syntax const Configuration\u0026 Session::config() const\nConfiguration\u0026 Session::config()"},"90":{y:0,u:"../Content/C++/reference/SessionClass/Detect.htm",l:-1,t:"detect",i:0.00203862280282815,a:"Find the autodetected format of a file. The Session::detect method is deprecated in KeyView 23.3.0 and later. OpenText recommends that you create a  Document  object to represent each document and use the Document::info method instead. This method is still available for existing implementations, but ..."},"91":{y:0,u:"../Content/C++/reference/SessionClass/Filter.htm",l:-1,t:"filter",i:0.00203862280282815,a:"Filter a file to the provided output type. The Session::filter method is deprecated in KeyView 23.3.0 and later. OpenText recommends that you create a  Document  object to represent each document and use the Document::filter method instead. This method is still available for existing ..."},"92":{y:0,u:"../Content/C++/reference/SessionClass/get_metadata.htm",l:-1,t:"get_metadata",i:0.0030494819731441,a:"Get document metadata (including summary information and XMP metadata) from a file, preserving the ability to access it in the original type. The Session::get_metadata method is deprecated in KeyView 23.3.0 and later. OpenText recommends that you create a  Document  object to represent each document ..."},"93":{y:0,u:"../Content/C++/reference/SessionClass/get_restrictions.htm",l:-1,t:"get_restrictions",i:0.00203862280282815,a:"Gets information about any restrictions that exist on a file. For more information about this feature, see  Document Restrictions . The Session::get_restrictions method is deprecated in KeyView 23.3.0 and later. OpenText recommends that you create a  Document  object to represent each document and ..."},"94":{y:0,u:"../Content/C++/reference/SessionClass/GetSummaryInformation.htm",l:-1,t:"get_summary_information",i:0.00203862280282815,a:"Get document summary information metadata from a file, preserving the ability to access it in the original type. The get_summary_information method is deprecated in KeyView 23.2.0 and later. OpenText recommends that you create a  Document  object to represent each document and use the ..."},"95":{y:0,u:"../Content/C++/reference/SessionClass/MetadataMap.htm",l:-1,t:"metadata_map",i:0.00574993355225165,a:"Get document metadata from a file. This function converts all metadata values of any type are converted to UTF-8 strings. It outputs date/time values in UTC in the ISO-8601 date format YYYY-MM-DDThh:mm:ssZ. For example 2016-02-09T16:15:51Z. The Session::metadata_map method is deprecated in KeyView ..."},"96":{y:0,u:"../Content/C++/reference/SessionClass/open.htm",l:-1,t:"open",i:0.00304772105335627,a:"Opens a document and returns a KeyView  Document  object that you can use to access information from the document - for example its format information, text, metadata, and subfiles. A Document must be created and used on the same thread that created the parent session. A Document object maintains a ..."},"97":{y:0,u:"../Content/C++/reference/SessionClass/Subfiles.htm",l:-1,t:"subfiles",i:0.00203862280282815,a:"Obtain information about subfiles. The Container holds references to the session and input. The Session::subfiles method is deprecated in KeyView 23.3.0 and later. OpenText recommends that you create a  Document  object to represent each document and use the Document::subfiles method instead. This ..."},"98":{y:0,u:"../Content/C++/reference/ConfigurationClass/ConfigurationClass.htm",l:-1,t:"The Configuration Class",i:0.0089432631045481,a:"The Configuration Class Defined in: Keyview_Configuration.hpp The Configuration class allows you to set a wide variety of options. You can use this class to construct a KeyView Session, and to modify a Session that has already been constructed. Each option has a setter method."},"99":{y:0,u:"../Content/C++/reference/ConfigurationClass/Constructor.htm",l:-1,t:"Constructor",i:0.00203862280282815,a:"Constructor Create a new Configuration object. Syntax Configuration::Configuration()\nConfiguration::Configuration(const Configuration\u0026)"},"100":{y:0,u:"../Content/C++/reference/ConfigurationClass/character_set_detection.htm",l:-1,t:"character_set_detection",i:0.00232039763731722,a:"character_set_detection Setting character_set_detection to true enables advanced character set detection. Setting it to false disables it. Default value: true Syntax \nConfiguration\u0026 character_set_detection( bool character_set_detection);"},"101":{y:0,u:"../Content/C++/reference/ConfigurationClass/custom_pdf_metadata.htm",l:-1,t:"custom_pdf_metadata",i:0.00203862280282815,a:"custom_pdf_metadata Setting custom_pdf_metadata to true results in all custom metadata being filtered from PDF documents when the metadata APIs are used. Default value: false Syntax \nConfiguration\u0026 Configuration::custom_pdf_metadata(bool emit_custom_metadata)"},"102":{y:0,u:"../Content/C++/reference/ConfigurationClass/date_time_field_codes.htm",l:-1,t:"date_time_field_codes",i:0.00203862280282815,a:"date_time_field_codes If you set date_time_field_codes to true, date/time field codes are extracted from Microsoft Word, PowerPoint, and RTF documents, instead of date/time values. Default value: false Syntax \nConfiguration\u0026 Configuration::date_time_field_codes(bool use_fieldcode)"},"103":{y:0,u:"../Content/C++/reference/ConfigurationClass/extract_pipe_streaming.htm",l:-1,t:"extract_pipe_streaming",i:0.00290507352024183,a:"extract_pipe_streaming Enables pipe-streaming mode for extraction.\n Default value: false Syntax Configuration\u0026 extract_pipe_streaming(bool use_pipe_streaming);"},"104":{y:0,u:"../Content/C++/reference/ConfigurationClass/extraction_timeout.htm",l:-1,t:"extraction_timeout",i:0.00203862280282815,a:"(Out-of-process only) Sets the timeout for extracting one document. If the process times out, KeyView shuts down the out-of-process process, which might add some time before the function returns. Default value: 350 seconds Syntax \nConfiguration\u0026 Configuration::extraction_timeout(long seconds)"},"105":{y:0,u:"../Content/C++/reference/ConfigurationClass/filename_field_code.htm",l:-1,t:"filename_field_code",i:0.00203862280282815,a:"filename_field_code If you set filename_field_code to true, file name field codes are extracted from Microsoft Word documents. Default value: false Syntax \nConfiguration\u0026 Configuration::filename_field_code(bool use_fieldcode)"},"106":{y:0,u:"../Content/C++/reference/ConfigurationClass/filter_pipe_streaming.htm",l:-1,t:"filter_pipe_streaming",i:0.00290507352024183,a:"filter_pipe_streaming Enables pipe-streaming mode for filtering. Default value: false Syntax Configuration\u0026 filter_pipe_streaming(bool use_pipe_streaming);"},"107":{y:0,u:"../Content/C++/reference/ConfigurationClass/formatted_mail.htm",l:-1,t:"formatted_mail",i:0.00203862280282815,a:"formatted_mail If you set formatted_mail to true, the formatted version of the message body (HTML or RTF) is extracted from mail files where possible. Default value: false Syntax \nConfiguration\u0026 Configuration::formatted_mail(bool extract_formatted)"},"108":{y:0,u:"../Content/C++/reference/ConfigurationClass/header_and_footer.htm",l:-1,t:"header_and_footer",i:0.00203862280282815,a:"header_and_footer Extracts headers and footers. Default value: false Syntax \nConfiguration\u0026 Configuration::header_and_footer(bool emit_header_text)"},"109":{y:0,u:"../Content/C++/reference/ConfigurationClass/header_and_footer_tags.htm",l:-1,t:"header_and_footer_tags",i:0.00203862280282815,a:"header_and_footer_tags Puts tags around header and footer data. Default value: false Syntax \nConfiguration\u0026 Configuration::header_and_footer_tags(bool tag_headers)"},"110":{y:0,u:"../Content/C++/reference/ConfigurationClass/hidden_text.htm",l:-1,t:"hidden_text",i:0.00203862280282815,a:"hidden_text If you set hidden_text to true, hidden text in Microsoft Word, Excel, and PowerPoint documents is extracted. Default value: false Syntax \nConfiguration\u0026 Configuration::hidden_text(bool emit_hidden_text)"},"111":{y:0,u:"../Content/C++/reference/ConfigurationClass/no_encoding_conversion.htm",l:-1,t:"no_encoding_conversion",i:0.00232039763731722,a:"Setting no_encoding_conversion to true prevents the default conversion of the text encoding. Filter  retains the original character encoding of the document if it is available. Default value: false Syntax \nConfiguration\u0026 Configuration::no_encoding_conversion(bool suppress_conversion)"},"112":{y:0,u:"../Content/C++/reference/ConfigurationClass/ocr.htm",l:-1,t:"ocr",i:0.00203862280282815,a:"Configure Optical Character Recognition (OCR) on raster image files to attempt to filter text.  This option is available only if OCR is included in your license. Setting enable to false disables it. The default OCR settings attempt to recognize as much text as possible, prioritizing recall over ..."},"113":{y:0,u:"../Content/C++/reference/ConfigurationClass/oop_error_log.htm",l:-1,t:"oop_error_log",i:0.0037715242376555,a:"oop_error_log Setting this to true enables out of process error logging. Default value: false Syntax \nConfiguration\u0026 Configuration::oop_error_log(bool use_log)"},"114":{y:0,u:"../Content/C++/reference/ConfigurationClass/out_of_process_log.htm",l:-1,t:"out_of_process_log",i:0.00203862280282815,a:"out_of_process_log The out_of_process_log function is deprecated in KeyView version 24.1. Use  oop_error_log  instead. Setting this to true enables out of process logging. Default value: false Syntax \nConfiguration\u0026 Configuration::out_of_process_log(bool use_log)"},"115":{y:0,u:"../Content/C++/reference/ConfigurationClass/output_table_delimiters.htm",l:-1,t:"output_table_delimiters",i:0.00261625661443727,a:"Specifies to insert delimiters between tables that are understood by IDOL Eduction. For this to take effect, you must also enable  tab_delimited , and the target encoding must be KVCS_UTF8. Default value: false Syntax \nConfiguration\u0026 Configuration::output_table_delimiters(bool delimiters)"},"116":{y:0,u:"../Content/C++/reference/ConfigurationClass/password.htm",l:-1,t:"password",i:0.00203862280282815,a:"password Specifies a password to open a password-protected file for filtering. Default value: empty string Syntax \nConfiguration\u0026 Configuration::password(std::string document_password)"},"117":{y:0,u:"../Content/C++/reference/ConfigurationClass/pdf_logical_reading.htm",l:-1,t:"pdf_logical_reading",i:0.00203862280282815,a:"pdf_logical_reading Specifies the order in which the user would like paragraphs in PDF file to be extracted (logical reading order). Default value: raw Syntax \nConfiguration\u0026 Configuration::pdf_logical_reading(LogicalPDFDirection mode)"},"118":{y:0,u:"../Content/C++/reference/ConfigurationClass/revision_marks.htm",l:-1,t:"revision_marks",i:0.0037715242376555,a:"revision_marks If you set revision_marks to true, text that was deleted from  documents with revision tracking enabled is included in the filtered output. Default value: false Syntax \nConfiguration\u0026 Configuration::revision_marks(bool emit_revision_marks)"},"119":{y:0,u:"../Content/C++/reference/ConfigurationClass/skip_comments.htm",l:-1,t:"skip_comments",i:0.00203862280282815,a:"skip_comments If you set skip_comments to true, comments from Microsoft Word, PowerPoint, or Excel documents are not extracted. Default value: false Syntax \nConfiguration\u0026 Configuration::skip_comments(bool no_comments)"},"120":{y:0,u:"../Content/C++/reference/ConfigurationClass/skip_embedded_fonts.htm",l:-1,t:"skip_embedded_fonts",i:0.00290507352024183,a:"skip_embedded_fonts If you set skip_embedded_fonts to true, text that contains embedded fonts is not filtered from PDF documents. Default value: false Syntax \nConfiguration\u0026 Configuration::skip_embedded_fonts(bool no_embedded_fonts)"},"121":{y:0,u:"../Content/C++/reference/ConfigurationClass/skip_thumbnail.htm",l:-1,t:"skip_thumbnail",i:0.00203862280282815,a:"skip_thumbnail If you set skip_thumbnail to true, text from thumbnail images for embedded objects in Microsoft Word documents is not  extracted. Default value: false Syntax \nConfiguration\u0026 Configuration::skip_thumbnail(bool no_thumbnail)"},"122":{y:0,u:"../Content/C++/reference/ConfigurationClass/soft_hyphens.htm",l:-1,t:"soft_hyphens",i:0.00290507352024183,a:"soft_hyphens If you set soft_hyphens to true, soft hyphens are retained when text is filtered from PDF documents. Default value: false Syntax \nConfiguration\u0026 Configuration::soft_hyphens(bool emit_softhyphens)"},"123":{y:0,u:"../Content/C++/reference/ConfigurationClass/source_encoding.htm",l:-1,t:"source_encoding",i:0.00232039763731722,a:"source_encoding Specifies the character encoding of the input file. Default value: KVCS_UNKNOWN Syntax \nConfiguration\u0026 Configuration::source_encoding(Encoding encoding_of_source)"},"124":{y:0,u:"../Content/C++/reference/ConfigurationClass/tab_delimited.htm",l:-1,t:"tab_delimited",i:0.00484017380880223,a:"tab_delimited Enables tab-delimited mode for spreadsheets and embedded tables. Default value: false Syntax \nConfiguration\u0026 Configuration::tab_delimited(bool tabs)"},"125":{y:0,u:"../Content/C++/reference/ConfigurationClass/tagged_pdf_content.htm",l:-1,t:"tagged_pdf_content",i:0.00203862280282815,a:"tagged_pdf_content If you set tagged_pdf_content to true, tagged PDF content is filtered from PDF documents. Default value: false Syntax \nConfiguration\u0026 Configuration::tagged_pdf_content(bool emit_tagged_content)"},"126":{y:0,u:"../Content/C++/reference/ConfigurationClass/target_encoding.htm",l:-1,t:"target_encoding",i:0.00232039763731722,a:"target_encoding Sets the encoding of the text in the output file. Default value: KVCS_UTF8 Syntax \nConfiguration\u0026 Configuration::target_encoding(Encoding encoding_of_target)"},"127":{y:0,u:"../Content/C++/reference/ConfigurationClass/string_temporary_directory.htm",l:-1,t:"temporary_directory",i:0.00203862280282815,a:"temporary_directory Sets the path of the temporary directory where temporary files are created during filtering. Default value: empty string Syntax \nConfiguration\u0026 Configuration::temporary_directory(std::string temp_dir)"},"128":{y:0,u:"../Content/C++/reference/ConfigurationClass/timeout.htm",l:-1,t:"timeout",i:0.00203862280282815,a:"(Out-of-process only) Sets the timeout used when filtering a particular document.  If the process times out, KeyView shuts down the out-of-process process, which might add some time before the function returns. Note that this timeout is also used for SummaryInformation and Detection. Default value: ..."},"129":{y:0,u:"../Content/C++/reference/ConfigurationClass/unicode_byte_order_marker.htm",l:-1,t:"unicode_byte_order_marker",i:0.00203862280282815,a:"unicode_byte_order_marker If you set unicode_byte_order_marker to true, the  filtered text in the output begins with a Unicode byte order marker. Default value: false Syntax \nConfiguration\u0026 Configuration::unicode_byte_order_marker(bool emit_bom)"},"130":{y:0,u:"../Content/C++/reference/DetectionInfoClass/DetectionInfoClass.htm",l:-1,t:"The DetectionInfo Class",i:0.00454783506342201,a:"The DetectionInfo Class Provides information about the format of a file. Defined in: Keyview_Detect.hpp"},"131":{y:0,u:"../Content/C++/reference/DetectionInfoClass/appleDoubleEncoded.htm",l:-1,t:"appleDoubleEncoded",i:0.00203862280282815,a:"appleDoubleEncoded Returns true if the file is AppleDouble encoded Syntax bool DetectionInfo::appleDoubleEncoded() const"},});