Parsed URLs

Silk Performer offers an extensible HTML parser. In addition to embedded objects, frames, links, and forms, the parser can detect parsed URLs (also known as custom URLs) as a further category of HTML element.

Extending the HTML parser: WebPageParseUrl

To enable the HTML parser to parse custom URLs, you need to specify a parsing rule before the page-level function call that downloads the page to be parsed. The API function for this purpose is WebPageParseUrl, which works much like the WebParseDataBound function.

Each parsed URL is given a name, which corresponds to the name of a link. The name is specified in the first parameter of WebPageParseUrl. The second and third parameters are the left and right boundaries. The fourth parameter specifies options (for example, ignoring white space when searching for the boundaries).

A single custom URL parsing specification can result in multiple parsed URLs since parsing does not stop after the first URL is found. More than one WebPageParseUrl statement can be placed before a page-level API call, resulting in multiple parsing rules applied concurrently during page download.
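
The behavior described above can be illustrated outside of Silk Performer with a minimal Python sketch. It is not Silk Performer's implementation; the page content, the "Help" rule, and the helper names find_between and apply_rules are hypothetical and exist only for this illustration. The sketch shows that one boundary rule can yield several matches and that several rules can be applied to the same page.

```python
def find_between(text, left, right):
    """Return every substring of text enclosed by the left and right boundaries."""
    matches = []
    pos = 0
    while True:
        start = text.find(left, pos)
        if start == -1:
            break
        start += len(left)
        end = text.find(right, start)
        if end == -1:
            break
        matches.append(text[start:end])
        # Continue searching after this match instead of stopping.
        pos = end + len(right)
    return matches


def apply_rules(text, rules):
    """Apply every (left, right) rule to the same text, keyed by rule name."""
    return {name: find_between(text, left, right)
            for name, (left, right) in rules.items()}


# Hypothetical page content, invented for this sketch.
PAGE = (
    "<a href=\"JavaScript:window.open('account.asp')\">Edit Account</a>\n"
    "<a href=\"JavaScript:window.open('orders.asp')\">Orders</a>\n"
    "<a href=\"JavaScript:ShowHelp('help.asp')\">Help</a>\n"
)

# Two rules, comparable to two WebPageParseUrl statements before one page call.
RULES = {
    "Javascript window open": ("open('", "'"),
    "Help": ("ShowHelp('", "'"),
}

parsed = apply_rules(PAGE, RULES)
for name, urls in parsed.items():
    print(name, urls)
```

In this sketch the first rule matches twice and the second rule once, which mirrors how a single parsing specification can produce multiple parsed URLs.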

Parsing relative URLs

In parsing custom URLs, the HTML parser applies the same rules that are in effect for resolving relative URLs for links and frames. This means that you can parse, for example, only for a filename and receive a complete absolute URL.

Example: The following HTML document was generated by submitting a login form. It has the base URL http://www4.company.com/user/6543/navigation.asp:

<a href="JavaScript:window.open('account.asp')">Edit Account</a>

When the user clicks this link, the browser opens the URL http://www4.company.com/user/6543/account.asp in a new browser window.

This can be modeled in Silk Performer as:

WebPageParseUrl("Javascript window open", "open('", "'");
WebPageSubmit("login", FORM_LOGIN, "LoginPageTimer");
// now the parsed URL is available under
// the name "Javascript window open" and can be used.
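
The combination of boundary parsing and relative URL resolution in this example can be sketched in Python. This is only an illustration under the assumption that resolution follows the standard rules implemented by urllib.parse.urljoin; the helper name parse_url is hypothetical and is not part of any Silk Performer API.

```python
from urllib.parse import urljoin

# Base URL and link taken from the example above.
BASE_URL = "http://www4.company.com/user/6543/navigation.asp"
HTML = "<a href=\"JavaScript:window.open('account.asp')\">Edit Account</a>"


def parse_url(html, left, right, base_url):
    """Extract the first string between the boundaries and resolve it
    against base_url. Returns None if the boundaries are not found."""
    start = html.find(left)
    if start == -1:
        return None
    start += len(left)
    end = html.find(right, start)
    if end == -1:
        return None
    # Only the file name is parsed; urljoin supplies scheme, host, and path.
    return urljoin(base_url, html[start:end])


resolved = parse_url(HTML, "open('", "'", BASE_URL)
print(resolved)
```

Only the file name account.asp lies between the boundaries; the absolute URL results from resolving it against the base URL of the document.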

Custom URLs, once parsed, can be used in a variety of ways.

Parsed URLs and WebPageLink

A URL that is parsed using the WEB_FLAG_PARSE_LINK flag can be used anywhere that a link can be used (for example, in the WebPageLink or WebPageQueryLink functions). Note that if you specify neither the WEB_FLAG_PARSE_LINK flag nor the WEB_FLAG_PARSE_URL flag, both flags are in effect by default.

Now the preceding example can be completed:

WebPageParseUrl("Javascript window open", "open('", "'");
WebPageSubmit("login", FORM_LOGIN, "LoginPageTimer");
WebPageLink("Javascript window open");

Note: The Silk Performer recorder can generate a WebPageParseUrl call and use the parsed URL for a WebPageLink call. The recorder does this automatically whenever possible to avoid context-less function calls.

To enable this feature, select the Dynamic link parsing checkbox on the Advanced Context Management Settings dialog box at Settings > Active Profile > Web > Recording tab.

Parsed URLs and context-less functions: WebPageQueryParsedUrl

Parsed URLs can be retrieved and saved in string variables using the WebPageQueryParsedUrl function.

Such a string variable can then be used as a parameter for all page-level or low-level functions that require URL parameters, in addition to other purposes (for example, diagnostics output and StrSearchDelimited).

The following example HTML document resulted from submitting a login form. It has the base URL http://www4.company.com/user/6543/navigation.asp:

<!--
function ShowContent(url, category, vendor)
{
  top.frames["content"].location.href=
    url + "?cat=" + category + "&vendor=" + vendor;
}
// end of script -->
…
<a href="JavaScript:ShowContent('products.asp', 'HD', 'IBM')">
   hard discs by IBM</a>
<a href="JavaScript:ShowContent('products.asp', 'HD', 'WD')">
   hard discs by Western Digital</a>
<a href="JavaScript:ShowContent('products.asp', 'Mon', 'Sony')">
   Monitors by Sony</a>

Assume that the API call that led to this page looks like this:

WebPageSubmit("login", FORM_LOGIN, "LoginPageTimer");

When the user clicks the second link, the browser loads the URL http://www4.company.com/user/6543/products.asp?cat=HD&vendor=WD into a frame named content. The parsed URLs and WebPageLink approach does not work for modeling this in BDL, because it would require parsing boundaries for one of the following strings:

  • http://www4.company.com/user/6543/products.asp?cat=HD&vendor=WD
  • /user/6543/products.asp?cat=HD&vendor=WD
  • products.asp?cat=HD&vendor=WD

None of these strings occurs in the HTML code, however, so parsing the complete URL does not seem possible. Consider the following script:

WebPageSubmit("login", FORM_LOGIN, "LoginPageTimer");
WebPageUrl(
  "http://www4.company.com/user/6543/products.asp",
  "ProductTimer", FORM_PRODUCT_SELECT);
...
dclform
  FORM_PRODUCT_SELECT:
    "cat"    := "HD",
    "vendor" := "WD";

The result is a context-less function call that, in this example, incorporates dynamic data in the URL. This can be improved. While http://www4.company.com/user/6543/products.asp?cat=HD&vendor=WD cannot be parsed, the shorter URL http://www4.company.com/user/6543/products.asp, which now appears in the script as a parameter of the WebPageUrl function, can be. Using the boundaries "ShowContent('" and "'" parses the string products.asp, which yields the required URL after relative URL resolution.

The parsed URL can be copied into a string variable, and the variable can be used in place of the hard-coded URL parameter in the script. This yields the following script:

var
  sParsedUrl : string;
...
WebPageParseUrl("ShowContent", "ShowContent('", "'");
WebPageSubmit("login", FORM_LOGIN, "LoginPageTimer");
WebPageQueryParsedUrl(sParsedUrl, sizeof(sParsedUrl),
                  "ShowContent");
WebPageUrl(sParsedUrl, "ProductTimer", FORM_PRODUCT_SELECT);
...
dclform
  FORM_PRODUCT_SELECT:
    "cat"    := "HD",
    "vendor" := "WD";
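
The reasoning behind this script can be checked with a small Python sketch. It is a conceptual illustration, not Silk Performer's implementation: the helper name first_between is hypothetical, and the dictionary FORM_PRODUCT_SELECT merely stands in for the dclform entry of the same name. The sketch confirms that the complete URL never appears in the page, while the parsed file name plus the static query string reproduces it.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "http://www4.company.com/user/6543/navigation.asp"

# The first two links from the example page above.
HTML = (
    "<a href=\"JavaScript:ShowContent('products.asp', 'HD', 'IBM')\">\n"
    "   hard discs by IBM</a>\n"
    "<a href=\"JavaScript:ShowContent('products.asp', 'HD', 'WD')\">\n"
    "   hard discs by Western Digital</a>\n"
)

# The strings that the WebPageLink approach would have to find in the page.
CANDIDATES = [
    "http://www4.company.com/user/6543/products.asp?cat=HD&vendor=WD",
    "/user/6543/products.asp?cat=HD&vendor=WD",
    "products.asp?cat=HD&vendor=WD",
]

# Stand-in for the dclform FORM_PRODUCT_SELECT (static, context-less part).
FORM_PRODUCT_SELECT = {"cat": "HD", "vendor": "WD"}


def first_between(text, left, right):
    """Return the first substring enclosed by the boundaries."""
    start = text.find(left)
    if start == -1:
        raise ValueError("left boundary not found")
    start += len(left)
    end = text.find(right, start)
    if end == -1:
        raise ValueError("right boundary not found")
    return text[start:end]


# None of the candidate strings occurs in the page.
full_url_in_page = any(candidate in HTML for candidate in CANDIDATES)

# The URL part is parsed and resolved; the query string is appended as-is.
parsed_url = urljoin(BASE_URL, first_between(HTML, "ShowContent('", "'"))
request_url = parsed_url + "?" + urlencode(FORM_PRODUCT_SELECT)

print(full_url_in_page)
print(parsed_url)
print(request_url)
```

Note that the boundaries match every ShowContent link on the page. In this example all of them contain the same file name, so the first match is sufficient for the sketch.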

The advantage here is that a context-less function call has been made, in a sense, "semi-context-full." While the query string is still context-less, the URL is parsed and so the dynamic data in the URL can be handled properly.

Note that in other cases the URL may be static while the query string contains dynamic data. In such cases, the technique shown here does not improve context management.

Note: The Silk Performer recorder can automatically generate WebPageParseUrl calls and query the parsed URLs for use with context-less function calls. The recorder does this whenever this technique can be used in conjunction with a context-less page-level API function call. To enable this feature, select the Dynamic link parsing checkbox on the Advanced Context Management Settings dialog box at Settings > Active Profile > Web > Recording tab.