Filter SDK enables you to filter many different types of documents. Filtering is the process of extracting the text from a document without the application-specific markup. In addition to text extraction, the filtering process can include the following:
File format detection: this process detects a file’s format and reports it to the API, which in turn reports it to the developer’s application. See File Format Detection.
Metadata extraction: this process extracts selected metadata (document properties) from a file. See Extract Metadata.
Character set conversion: this process controls the character set of both the input text and the output text. See Convert Character Sets.
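
The following Python sketch illustrates the general ideas behind format detection, text extraction, and character set conversion. It is not the Filter SDK API: the names detect_format, filter_markup, and SIGNATURES are hypothetical, the signature table is a tiny sample, and HTML stands in for an application-specific format. Metadata extraction is not shown.

```python
# Illustrative only: these helpers are hypothetical and are NOT part of Filter SDK.
from html.parser import HTMLParser

# A small sample of well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),  # ZIP container, also used by OOXML files such as .docx
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy Microsoft Office container
]


def detect_format(data: bytes) -> str:
    """Return a coarse format label based on the first bytes of the file."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    # Crude fallback (an assumption): content starting with '<' is treated as markup.
    if data.lstrip()[:1] == b"<":
        return "markup"
    return "unknown"


class _TextCollector(HTMLParser):
    """Collects character data and ignores the tags around it."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def filter_markup(data: bytes, source_charset: str, target_charset: str) -> bytes:
    """Strip markup from `data` and re-encode the remaining text."""
    text = data.decode(source_charset)   # interpret input in its character set
    collector = _TextCollector()
    collector.feed(text)
    collector.close()                    # flush any buffered text
    plain = " ".join("".join(collector.parts).split())  # normalize whitespace
    return plain.encode(target_charset)  # convert to the output character set


if __name__ == "__main__":
    doc = "<html><body><p>Caf\u00e9 <b>menu</b></p></body></html>".encode("latin-1")
    print(detect_format(doc))
    print(filter_markup(doc, "latin-1", "utf-8").decode("utf-8"))
```

In this sketch the caller supplies both character sets explicitly, which mirrors the idea that character set conversion governs the input text as well as the output text.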