Originally Posted by anu sandhanam in Web Performance Management

Hello Readers! It gives me immense pleasure to meet you with yet another article on WebDriver. WebDriver has been creating a lot of buzz in the Selenium community lately, and Neustar, an active member of that community, wasted no time in announcing its support for the WebDriver API. I hope you had the chance to read the great introductory article on the WebDriver API by fellow blogger Ben. In this article I would like to touch on the basics of scripting with WebDriver and the Neustar API. I’m sure many of you who are new to this API are asking yourselves the same questions I did.

  • What is WebDriver?
  • Why is it needed?

WebDriver is a framework for automated testing of web applications. It is part of Selenium 2.0, which is a major departure from the 1.0 model. Selenium 1.0 has a client-server architecture: tests written with the RC libraries communicate with a server, and the server relays their commands to a browser. The primary downside of the 1.0 model is that the part that drives the browser is written in JavaScript and runs inside the browser. Browsers impose a strict security model on any JavaScript they execute in order to protect users from malicious scripts, and that model limits what such tests can do. Additionally, the RC API has grown so much over time that users find it hard to keep up with its ever-growing dictionary of methods. WebDriver overcomes both hurdles with an object-oriented API that controls the browser itself rather than running as a JavaScript application within the browser. The WebDriver interface has five nested interfaces (ImeHandler, Navigation, Options, TargetLocator and Timeouts), and its methods fall into the following three categories.

  • Control of the browser itself, e.g. get(), navigate(), close()
  • Selection of WebElements, e.g. findElement(), findElements()
  • Debugging aids, e.g. getCurrentUrl(), getPageSource(), getTitle()

Now that we have a clear understanding of what WebDriver is and what its benefits are, let’s take a deep dive into the various useful WebDriver methods exposed via Neustar through some example scripts. Let’s look at this simple script below, to begin with.

var driver = browserMob.openBrowserWebDriver();
var tx = browserMob.beginTransaction();
var step = browserMob.beginStep("Home Page");
// implicit wait: later findElement calls poll the DOM every 500 ms, for up to ten seconds,
// until the element they are looking for is found
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
// Content validation - find element by xpath
driver.findElement(By.xpath("//div[@id='test-design-considerations']/h1[contains(text(), 'Test Design Considerations')]"));

Deciphering the script…

  • var driver = browserMob.openBrowserWebDriver();
    The openBrowserWebDriver() method of the browserMob object starts a WebDriver browser session and returns the driver.
  • driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
    driver.findElement(By.xpath("//div[@id='test-design-considerations']/h1[contains(text(), 'Test Design Considerations')]"));
    WebDriver has a blocking API: a ‘get’ call normally waits until the page has fully loaded (that is, the “onload” event has fired) before returning control to your test or script. Under some conditions, however, the page is not really ready when ‘get’ returns. The classic example is JavaScript that starts running after the page has loaded (triggered by onload). Browsers (e.g. Firefox) notify WebDriver when the basic HTML content has been loaded, which is when WebDriver returns. It is difficult to know when JavaScript has finished executing, since JS code may schedule functions to be called in the future, depend on a server response, etc. WebDriver cannot wait for all of these conditions because it does not know them.
    The way to ensure that the element involved in the next interaction is present and ready is to use the ‘Wait’ class to wait for that specific element to appear. This class simply calls findElement over and over until the element is found or a timeout expires. Since this is the behavior many users want by default, a mechanism for implicitly waiting for elements to appear has been implemented. It is accessible through the Timeouts interface of WebDriver (driver.manage().timeouts()). In the example script above, the implicitlyWait call sets a 10-second limit, and the findElement call that follows polls the DOM every 500 ms until the heading containing ‘Test Design Considerations’ is found or the 10 seconds run out.


  • Optionally, you can reset the implicit timeout to zero using the code below; once set, it remains in effect for the life of the WebDriver instance. The default setting is zero, i.e. no waiting.
    driver.manage().timeouts().implicitlyWait(0,  TimeUnit.SECONDS);
  • Alternatively, browserMob’s waitFor() method can be used in place of implicitlyWait() for content validation. Modifying our script with waitFor()…
    var step = browserMob.beginStep("Home Page");
    // declared up front so the result is visible after waitFor() returns
    var check = false;
    browserMob.waitFor(function() {
          check = driver.findElement(By.xpath("//div[@id='test-design-considerations']/h1")).getText().toLowerCase().startsWith("test design considerations");
          return check;
        }, 10000);
    if (!check)
         throw "Content validation failed";
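
The retry behavior described above, calling findElement repeatedly until it succeeds or a timeout expires, can be sketched in a few lines. This is only an illustration of the idea, not the actual implementation of the Wait class: pollUntilFound, its parameters, and the injected now and sleep functions are hypothetical names introduced here so that the loop can run in any JavaScript environment.

```javascript
// Hypothetical helper (not part of WebDriver or the Neustar API).
// find:  function that returns the element, or throws if it is missing,
//        e.g. function() { return driver.findElement(By.id("x")); }
// now:   function returning the current time in milliseconds
// sleep: function that pauses for the given number of milliseconds
function pollUntilFound(find, timeoutMs, intervalMs, now, sleep) {
  var deadline = now() + timeoutMs;
  var lastError = null;
  while (true) {
    try {
      return find();          // success: hand the element back
    } catch (e) {
      lastError = e;          // not there yet: remember why
    }
    if (now() >= deadline) {
      throw new Error("Timed out after " + timeoutMs + " ms: " + lastError);
    }
    sleep(intervalMs);        // pause before the next attempt
  }
}
```

In a Rhino-based script, now could be function() { return new Date().getTime(); } and sleep could wrap java.lang.Thread.sleep; both are assumptions about the scripting environment rather than documented Neustar behavior.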

Great! We just wrote our first WebDriver script. We now know how to locate page elements with WebDriver’s findElement method using XPath. As the script shows, findElement is a great way to validate page content before proceeding to the next step. But does WebDriver provide other ways to locate elements? Absolutely. Below are a few alternatives.

By.cssSelector: driver.findElement(By.cssSelector("div#test-design-considerations > h1"));

By.linkText: driver.findElement(By.linkText("Data Driven Testing"));

By.partialLinkText: driver.findElement(By.partialLinkText("Data Driven"));

driver.findElements: In situations where you have to iterate over multiple elements that share a class, tag name, etc., or have to pick one of them, the findElements method comes in handy. The example script below uses it to pick a suggestion from an AutoSuggest list.

var elem = driver.findElement(By.cssSelector("input[id*='KeywordTextBox']"));
// clear the search box first
elem.clear();
// type Carlsbad in the search box
elem.sendKeys("Carlsbad");
// wait up to 10 seconds for the autosuggest list to populate
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
var autoSuggest = driver.findElements(By.xpath("//div[@class='ac_results']/ul/li"));
// 'autoSuggest' from above is a Java List. Either use size() to get its length
// or convert this to an array - autoSuggest.toArray().length
// click on one of the suggestions, picking the last element in this case
// (XPath positions start at 1, so size() is the position of the last li)
driver.findElement(By.xpath("//div[@class='ac_results']/ul/li[" + autoSuggest.size() + "]")).click();
// throw an error if content validation returns false
if (!driver.getTitle().contains("Search Results"))
     throw 'Content error: "Search Results" not found';

As you can see, I’ve also managed to introduce a few other useful methods through this script. The comments above each one of them describe their purpose.
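
The script above always clicks the last suggestion. To pick one at random instead, choose an index between 0 and size() - 1. The sketch below assumes only that the list behaves like a Java List (size() and get()), which is what findElements returns; pickRandom and its optional random parameter are hypothetical names added for illustration.

```javascript
// Hypothetical helper: returns a random item from a Java-style List.
// random is optional and defaults to Math.random; it must return a
// number in the range [0, 1).
function pickRandom(list, random) {
  var size = list.size();
  if (size === 0) {
    throw new Error("Cannot pick from an empty list");
  }
  var rand = random || Math.random;
  var index = Math.floor(rand() * size);   // 0 .. size - 1
  return list.get(index);
}
```

With the autoSuggest list from the script, pickRandom(autoSuggest).click(); would click one suggestion chosen at random.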


  • sendKeys can also be used to simulate key presses by using the “Keys” class. It is possible to call sendKeys on any element. The code below triggers an “ENTER” key press to force a form submission. The full list of supported keys is in the documentation of the Keys class.
    elem.sendKeys(Keys.ENTER);
  • With chained locators, it is possible to traverse down the DOM with a single line of code, as below:
    driver.findElements(By.tagName("ul")).get(1).findElements(By.tagName("li")).get(2).click();
    As the index starts at zero, the line of code above would click on the third “li” of the second “ul” on the page.

A few more common page element interactions in detail…

How to handle frames?

Switching between frames is easy with WebDriver’s switchTo() method, which returns the TargetLocator interface.

  • Switch to the frame from main window
    • By index
      driver.switchTo().frame(0);
    • By frame name
      driver.switchTo().frame("frame1");
  • Switch back to the parent window
    driver.switchTo().defaultContent();

The following example illustrates simple switching between two frames on a web page. Let’s assume the page has two frames, frame1 and frame2, of which frame1 has a link and frame2 has a form.
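
A minimal sketch of the scenario just described, written as a function so that the order of the calls is easy to follow. The frame names come from the description above, while the link text "Next", the field name "email" and the typed value are placeholders invented for this example; By is passed in only to keep the snippet self-contained.

```javascript
// Sketch only: the locators below are hypothetical placeholders.
function clickLinkThenFillForm(driver, By) {
  // frame1 holds the link
  driver.switchTo().frame("frame1");
  driver.findElement(By.linkText("Next")).click();

  // return to the top-level page before entering a sibling frame
  driver.switchTo().defaultContent();

  // frame2 holds the form
  driver.switchTo().frame("frame2");
  var field = driver.findElement(By.name("email"));
  field.sendKeys("user@example.com");
  field.submit();
}
```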


Try running this snippet with “driver.switchTo().defaultContent();” commented out before the switch to frame2, and you will end up with an error that frame2 cannot be located. This is why you must switch back to the parent page from inside a frame before navigating to another frame on the same page.

How to handle JavaScript Alert and Confirmation boxes?

Confirmation boxes are handled the same way as Alerts.

  • Switch focus to the alert/confirmation box
    var alert = driver.switchTo().alert();
  • Dismiss: equivalent to hitting the “Cancel” button on the Confirmation box or ignoring an Alert by closing it.
    alert.dismiss();
  • Alternatively, accept: equivalent to hitting “OK” on the Alert or the Confirmation box
    alert.accept();
  • Handle Confirmation boxes the JavaScript way: override window.confirm before the box is triggered so that it always returns true (“OK”)
    driver.executeScript("window.confirm = function(msg){return true;};");
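
The steps in the list above can be combined into one small helper. This is a sketch that assumes a dialog is already open when it is called; answerDialog is a hypothetical name, while getText(), accept() and dismiss() are methods of WebDriver's Alert interface.

```javascript
// Hypothetical helper: answers the open alert/confirmation box and
// returns the message it displayed.
function answerDialog(driver, accept) {
  var dialog = driver.switchTo().alert();
  var message = dialog.getText();   // read the text before the dialog closes
  if (accept) {
    dialog.accept();                // same as clicking "OK"
  } else {
    dialog.dismiss();               // same as clicking "Cancel"
  }
  return message;
}
```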

How to handle Pop-up Windows?

browserMob.beginStep("Handle Popup Windows");
// Get the handle to the current window
var currentHandle = driver.getWindowHandle(); 
driver.manage().timeouts().implicitlyWait(10,  TimeUnit.SECONDS);
// Open Pop-up
driver.findElement(By.partialLinkText("Open the JavaScript Window")).click();
// Switch focus to Pop-up using its name, i.e. driver.switchTo().window(name)
// Alternate way - if the Pop-up window doesn't have a name, which is very common -
// get the handle to the last opened window
var handles = driver.getWindowHandles().toArray();
var id = handles.length - 1;
driver.switchTo().window(handles[id]);
//close popup
driver.executeScript("self.close();");
//switch focus to parent window
driver.switchTo().window(currentHandle);
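
getWindowHandles() returns a Set, so the API does not promise that the last entry of the array is the newest window. A more defensive approach is to record the handles before the click and then look for the one that was not there. The sketch below assumes both arguments are arrays of handle strings (for example from getWindowHandles().toArray()); findNewHandle is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the first handle present in `after`
// but not in `before`, or null if no new window appeared.
function findNewHandle(before, after) {
  for (var i = 0; i < after.length; i++) {
    var known = false;
    for (var j = 0; j < before.length; j++) {
      // compare as strings so Java and JavaScript strings both work
      if (String(before[j]) === String(after[i])) {
        known = true;
        break;
      }
    }
    if (!known) {
      return after[i];
    }
  }
  return null;
}
```

Typical use would be to call getWindowHandles().toArray() once before opening the pop-up and once after, and pass the result of findNewHandle to driver.switchTo().window().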

How to interact with SELECT drop-downs?

var lastItem = driver.findElement(By.xpath("//select[@class='dropDown']/option[last()]"));
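
The line above only locates the last option. To actually choose an option by its visible text, one approach is to walk the option elements and click the match. The sketch below assumes that findElements returns a Java List and that clicking an option selects it; selectByText is a hypothetical helper, and By is passed in to keep the snippet self-contained.

```javascript
// Hypothetical helper: clicks the first <option> whose visible text
// equals `text`. Returns true if one was found, false otherwise.
function selectByText(selectElement, By, text) {
  var options = selectElement.findElements(By.tagName("option"));
  for (var i = 0; i < options.size(); i++) {
    var option = options.get(i);
    // String(...) so Java and JavaScript strings compare the same way
    if (String(option.getText()) === text) {
      option.click();
      return true;
    }
  }
  return false;
}
```

For the drop-down in the example, the select element could be located with driver.findElement(By.xpath("//select[@class='dropDown']")) and passed in as selectElement.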

How to simulate advanced interaction such as Drag and Drop?

The Actions class in Selenium’s interactions package (org.openqa.selenium.interactions) makes emulating these complex user gestures easier.

  • Drag an element to drop on to another element
    var from = driver.findElement(By.xpath("//ul[@id='Fav']/li[3]"));
    var to = driver.findElement(By.xpath("//ul[@id='Tol']/li"));
    new Actions(driver).dragAndDrop(from, to).build().perform();
  • Drag an element to an offset
    var draggable = driver.findElement(By.id("draggable"));
    new Actions(driver).dragAndDropBy(draggable, 200, 10).build().perform();
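
dragAndDrop is a convenience for a press, move, release sequence. When a page needs the steps spelled out, the same builder can chain them individually. This is a sketch: the actions parameter is assumed to be an Actions instance created from the driver, and dragStepByStep is a hypothetical helper name.

```javascript
// Sketch: `actions` is assumed to be an Actions instance built from the driver.
function dragStepByStep(actions, from, to) {
  actions
    .clickAndHold(from)    // press the mouse button on the source element
    .moveToElement(to)     // move the pointer onto the target element
    .release()             // let go of the mouse button
    .build()
    .perform();
}
```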

I hope this article served as a comprehensive tutorial, covering most of the basics of WebDriver, if not all. See you soon with more useful WebDriver tips and tricks. Happy testing until then!!!