Data Extraction / Screen Scraping Using XPath

Posted on: May 16, 2007

Google AJAXSLT is an implementation of XSL-T in JavaScript, intended for use in fat web pages, which are nowadays referred to as AJAX applications. Because XSL-T uses XPath, it is also an implementation of XPath that can be used independently of XSL-T.

Selenium Core uses AJAXSLT's XPath function to locate elements in plain HTML. Selenium IDE can generate XPath expressions in much the same way as Solvent.
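To make the idea of locating an element by XPath concrete, here is a minimal sketch of a locator that prefers the browser's built-in XPath engine and otherwise hands off to a JavaScript implementation such as AJAXSLT. This is not Selenium's actual code: `locateByXPath` and its `fallback` parameter are hypothetical names chosen for illustration.

```javascript
// Hypothetical locator: use the native DOM XPath API when the document
// provides it, otherwise delegate to a caller-supplied JavaScript fallback.
function locateByXPath(xpath, doc, fallback) {
  if (doc && typeof doc.evaluate === "function") {
    // 9 is XPathResult.FIRST_ORDERED_NODE_TYPE in the DOM Level 3 XPath API.
    return doc.evaluate(xpath, doc, null, 9, null).singleNodeValue;
  }
  // Assumption: fallback(xpath, doc) returns the first matching node or null.
  return fallback(xpath, doc);
}
```

In a page that loads AJAXSLT, the fallback would be a function that parses and evaluates the expression in JavaScript, so the same call works whether or not the browser has native XPath support.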

Here is a sample that extracts data from a web page with XPath, using AJAXSLT.

<script language="JavaScript" type="text/javascript" src="xpath/misc.js"></script>
<script language="JavaScript" type="text/javascript" src="xpath/dom.js"></script>
<script language="JavaScript" type="text/javascript" src="xpath/xpath.js"></script>
<script language="JavaScript" type="text/javascript">
// Evaluate an XPath expression against a document and return the first match.
function findElementUsingFullXPath(xpath, inDocument) {
  var context = new ExprContext(inDocument);
  var xpathObj = xpathParse(xpath);
  var xpathResult = xpathObj.evaluate(context);
  if (xpathResult && xpathResult.value) {
    return xpathResult.value[0];
  }
  return null;
}

function start() {
  // The start of this call was cut off in the original post; alert() is assumed.
  alert(findElementUsingFullXPath("//*[.='b']", window.document).innerHTML);
}
</script>
<body onload="start()">
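The sample reads `.innerHTML` straight off the first match, which throws when the expression matches nothing. As a minimal sketch, the helpers below guard against that and also collect every match. They assume only what the sample already relies on, namely that the evaluated result exposes its matches as an array in `.value`; the names `firstNodeOrNull` and `allInnerHtml` are hypothetical.

```javascript
// Return the first matched node, or null when the result is missing or empty.
// Assumes matches are exposed as an array in `.value`, as in the sample.
function firstNodeOrNull(xpathResult) {
  if (!xpathResult || !xpathResult.value || xpathResult.value.length === 0) {
    return null;
  }
  return xpathResult.value[0];
}

// Hypothetical helper: collect the innerHTML of every matched node.
function allInnerHtml(xpathResult) {
  var nodes = (xpathResult && xpathResult.value) || [];
  var out = [];
  for (var i = 0; i < nodes.length; i++) {
    out.push(nodes[i].innerHTML);
  }
  return out;
}
```

With these, a caller can check the return value of `firstNodeOrNull` for null before touching `.innerHTML`, or loop over the array from `allInnerHtml` when a page contains several matching elements.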



