Features and Capabilities
- The Web Connector retrieves information from Web sites over HTTP.
- The embedded web browser in Web Connector 12.8 is Chromium 78.0.3886.1.
- The connector can crawl a Web site starting from a page that you specify, or retrieve the pages listed in a site map. When crawling a Web site, the connector can follow links that exist in the HTML source of the page or in Adobe Flash content.
- The connector can extract information contained in data URIs. For example, images might be base-64 encoded and included in the source of an HTML page:
  <img src="data:image/png;base64,...">
- The connector can log on to Web sites to retrieve content. The connector supports:
  - Basic authentication
  - HTTP Digest authentication
  - NTLMv2 authentication
- You can configure the connector to populate and submit HTML forms. This feature is useful when the connector must log on to a Web site.
- The connector uses canonical links in web pages and response headers to de-duplicate content retrieved from the web.
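As an illustration of de-duplication by canonical link, the following Python sketch groups retrieved pages by their canonical URL. It is not the connector's implementation: the names `canonical_url` and `deduplicate`, the `(url, html, headers)` page tuples, and the rule of keeping the first page seen for each canonical URL are all assumptions made for this example. It reads the canonical link from either an HTTP `Link` response header or a `<link rel="canonical">` element in the page.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated values; href may be missing.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def canonical_url(page_url, html, headers=None):
    """Return the canonical URL for a page, or page_url if none is declared.

    Hypothetical helper: checks the Link response header first, then the HTML.
    """
    link_header = (headers or {}).get("Link", "")
    match = re.search(r'<([^>]+)>\s*;\s*rel="?canonical"?', link_header, re.I)
    if match:
        return urljoin(page_url, match.group(1))
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical:
        # Resolve relative canonical links against the page's own URL.
        return urljoin(page_url, parser.canonical)
    return page_url


def deduplicate(pages):
    """Map each canonical URL to the first retrieved URL that declared it."""
    kept = {}
    for url, html, headers in pages:
        kept.setdefault(canonical_url(url, html, headers), url)
    return kept
```

With this sketch, two URLs that differ only by a tracking parameter but declare the same canonical link collapse to a single entry, while a page that declares no canonical link is keyed by its own URL.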