Built-in grammars and entities
File Analysis Suite includes grammars and entities derived from Micro Focus IDOL Eduction. The following table is a reference for the types of information that the built-in grammars and entities support.
Grammar | Grammar description | Entity information type |
---|---|---|
Contact Data |
Any information that can be used to contact an individual, such as postal addresses, phone numbers, and email addresses. This grammar includes information for multiple languages and countries. |
Addresses
A postal address. This entity returns the addresses in a normalized format by default. The normalized form standardizes apartment and house numbers, removes additional punctuation, and converts the text to uppercase. For example, "ABIDEI HURRIYET CD TANER PALAS APT 9" or "KAT:7, D:9, 34437 ISTANBUL". The exact order depends on the country. For CJKVT, this entity also returns the addresses in a normalized format, but only Romanized text is converted to uppercase. CJKVT native script is not normalized to ASCII, and Romanized text is not normalized to CJKVT native script.
Email Addresses
An email address. For example, "jsmith@mailserver.com". An email address with the mailto: prefix. For example, "mailto:jsmith@mailserver.com".
Phone Numbers
A telephone number with context. For example, "Tel: +44 1234 224050", "Telephone: (204)-243-9955", or "numéro de téléphone: +1-902-861-7000". For CJKVT, numbers can be ASCII or full-width numbers. |
Devices | Any information that can identify electronic devices, such as IP and MAC addresses. |
Device ID
An identification number for a computing device (such as a computer, tablet, or smart phone). The following device IDs are included.
|
Financial Data |
Any personal financial data, such as bank accounts, IBANs, and salary information. This grammar includes information for multiple languages and countries. |
Bank Account Numbers
A bank account number. The following bank account patterns are included.
Bank Details
A name of a bank. Major bank names for the following countries are included.
Credit Card Numbers
Any credit card number. The following credit card formats are included.
IBAN (International Bank Account Number)
Undelimited or space-delimited International Bank Account Number (IBAN) for each supported country. For more information on IBAN formatting requirements for each country, see https://www.iban.com/structure.html.
Sort Codes
A bank sort code. The following sort code formats are included.
|
Government ID |
Government-issued identification information, such as driver's license, passport, and social security information. This grammar includes information for multiple languages and countries. |
Driving License Numbers
A driving license number with context. For example, "australian automobile association: 103 805 501" or "driver's license: A234567890". This entity matches both the driving license number and the personal number or driver number, if present. On the standard European driving license, these are fields 5 and 4d.
Machine Readable Passport
A machine readable passport number. For example, "P<GBRUK<SPECIMEN<<ANGELA<ZOE<<<<<<<<<<<<<<<< 5333244280GBR8812049F2509286<<<<<<<<<<<<<<00". A CJKVT machine readable passport line. For example, "P<JPN<<<<<<<KEIKO<INOUE<<<<<<<<<<<<<<<<<<<<<".
Machine Readable TD-1 Travel Document
A machine readable TD1-size travel document number. For example, "IDD<<T220001293<<<<<<<<<<<<<<< 6408125<2010315D<<<<<<<<<<<<<4 MUSTERMANN<<ERIKA<<<<<<<<<<<<<". A CJKVT machine readable TD1-size travel document line. For example, "KEIKO<<INOUE<<<<<<<<<<<<<<<<<<".
National ID
A national identity number with context. For example, "SSN 111-22-3333", "National Insurance Number AB 12 34 56 C", "Code INSEE 187090100100141", or "ImmiCard AMS123456". NOTE: Possible Turkish national identity numbers are identified without context. Each country has its own format.
Passport Numbers
A passport number with context. For example, "Passport number: 533324428", "Passport Number: P4366918", or "italian passaporti AA5275702".
Social Security Tax ID
A tax identification number (TIN or ITIN) with context. For example, "ITIN: 911-92-3333" or "TIN-numre: 101111113". Each country has its own format.
VAT number
A value added tax identification number (VATIN) with context. For example, "NUIS: ALK99999999L" or "VAT Reg No GB 980 7806 84". |
Identification Data |
Any personal data closely related to the identity of an individual, such as name, date of birth, gender, salutation, and title. This grammar includes information for multiple languages and countries. |
Date of Birth
A date of birth, written numerically or using words. For example, "date of birth 1/1/2018" or "GEBOORTEDATUM: 01/01/2018".
Genders
A gender or family relation in the English, French, or German language, either in a word or in context. For example, "lady", "father", "Dame", "voisines", "Frau", or "mensch".
Names
A full personal name, in title case or upper case. For example, "John Smith", "KEIKO NAKAMURA", or "山田恵". For CJKVT, a full personal name, in Romanized text or CJKVT native script. Romanized names can be in title case or upper case, and can be in the order given name surname or surname given name. CJKVT native script names must be in the order surname given name. For Japanese, either form can include honorifics. |
Nationalities |
Any nationality. This grammar matches nationalities written in English or French, such as "French" or "Francais". |
Nationalities
Any combination of nationality adjective and noun landmark and value, with context. For example, "Country: British", or "Nationality: British". |
Sensitive Data |
Any personal information that defines the racial or ethnic origin of an individual. This grammar matches racial or ethnic origin written in English or French, such as "caucasian" or "caucasien". |
Racial Ethnic Origin
A reference to ethnicity or race identification. For example, "White", "Fijian", "Inuit", or "Irish". A United Kingdom identity code. For example, "IC1" or "IC2". An ethnic group in the French language. For example, "Africain" or "Autres". |
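
Two of the identifier formats in the table carry publicly documented checksums: machine readable passport lines use the ICAO 9303 check digit, and IBANs use the ISO 13616 mod-97 check. The following Python sketch shows how those checksums work, which can help when preparing test data for these entities. It is illustrative only. The function names `mrz_check_digit` and `iban_checksum_ok` are hypothetical and are not part of File Analysis Suite or IDOL Eduction, and this reference does not state whether the built-in entities perform the same checks.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO 9303 check digit for one machine readable field.

    Digits keep their value, the filler '<' counts as 0, and letters
    A-Z count as 10-35. Values are weighted 7, 3, 1 repeating, summed,
    and reduced modulo 10.
    """
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif ch == "<":
            value = 0
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        else:
            raise ValueError(f"unexpected MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


def iban_checksum_ok(iban: str) -> bool:
    """Check the mod-97 checksum of an undelimited or space-delimited IBAN.

    This sketch does not check country-specific lengths or layouts.
    """
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    # Move the country code and check digits to the end, then map
    # letters to numbers (A=10 ... Z=35) and test the remainder.
    rearranged = compact[4:] + compact[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


if __name__ == "__main__":
    # Start of the second line of the specimen passport example above.
    line2 = "5333244280GBR8812049F2509286"
    assert mrz_check_digit(line2[0:9]) == int(line2[9])      # passport number
    assert mrz_check_digit(line2[13:19]) == int(line2[19])   # date of birth
    assert mrz_check_digit(line2[21:27]) == int(line2[27])   # expiry date
    # A commonly cited sample IBAN, in both delimiter styles.
    assert iban_checksum_ok("GB82 WEST 1234 5698 7654 32")
    assert iban_checksum_ok("GB82WEST12345698765432")
```

The specimen passport line from the Machine Readable Passport row passes all three field checks, and the sample IBAN passes in both its undelimited and space-delimited forms, matching the two delimiter styles named in the IBAN row.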