Automation Testing

Automation testing is a type of testing that involves automated test case execution using an automation tool.

The tester writes test scripts and then runs them either on demand or on a schedule for periodic runs. This reduces the overall test execution time, which helps in releasing the product faster.

  • Automation testing improves efficiency of testing.
  • Reduced testing efforts and costs.
  • Testing can be replicated across different platforms.
  • Gives accurate results.
  • Usually used for large applications with stringent deadlines.

Automation is preferred in the following cases-

  • Repetitive Tasks
  • Smoke and Sanity Tests
  • Tests with multiple data sets
  • Regression test cases

Selenium is a robust test automation suite that is used for automating web based applications. It supports multiple browsers, programming languages and platforms.

The steps involved in the automation process are-

  • Selecting the test tool
  • Defining the scope of automation
  • Planning, design and development
  • Test execution
  • Maintenance

Selenium comes in four forms-

  • Selenium WebDriver – Selenium WebDriver is used to automate web applications using browser’s native methods.
  • Selenium IDE – A Firefox plugin that works on the record and playback principle.
  • Selenium RC – Selenium Remote Control (RC) is officially deprecated. It automated web applications by injecting JavaScript into the browser.
  • Selenium Grid – Allows selenium tests to run in parallel across multiple machines.

Following are the advantages of Selenium-

  • Selenium is open source and free to use without any licensing cost.
  • It supports multiple languages like Java, Ruby, Python etc.
  • It supports multi-browser testing.
  • It has a large amount of resources and a helpful community on the internet.
  • Using the Selenium IDE component, non-programmers can also write automation scripts.
  • Using the Selenium Grid component, distributed testing can be carried out on remote machines.

The limitations of Selenium are-

  • We cannot test desktop applications using Selenium.
  • We cannot test web services using Selenium.
  • For creating robust scripts in Selenium WebDriver, programming language knowledge is required.
  • We have to rely on external libraries and tools for tasks like logging (log4j), test frameworks (TestNG, JUnit), reading from external files (POI for Excel files) etc.

Besides Selenium, other automation tools include LoadRunner, Sahi, SilkTest, QTP, JMeter, WinRunner, etc.

The different locators in Selenium are-

  • Id
  • XPath
  • cssSelector
  • className
  • tagName
  • name
  • linkText
  • partialLinkText

XPath (XML Path Language) is a query language for selecting nodes from XML documents. XPath is one of the locators supported by Selenium WebDriver.

An absolute XPath locates an element with an expression that begins at the root node, i.e. the html node in the case of web pages. Its main disadvantage is that even a slight change in the UI or in any element along the path breaks the whole absolute XPath.

Example - /html/body/div/div[2]/div/div/div/div[1]/div/input

A relative XPath locates an element with an expression that can begin anywhere in the HTML document. There are different ways of writing relative XPaths, and they are used for creating robust XPaths (unaffected by changes in other UI elements).

Example - //input[@id='username']

There are two ways of navigating to the nth element using XPath-

Using square brackets with index position-

Example - div[2] will find the second div element.

Using position()-

Example - div[position()=3] will find the third div element. 
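
The absolute, relative and indexed forms above can be tried out without a browser. The following is a minimal sketch that evaluates them with the JDK's built-in XPath engine against a small well-formed sample page. The markup, the class name XPathNavigationDemo and the helper idOfFirstMatch are assumptions made for this illustration; Selenium itself evaluates XPath inside the browser.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Sketch only: uses the JDK XPath engine on sample markup, not Selenium or a browser.
final class XPathNavigationDemo {

    // Hypothetical sample page used only for this illustration.
    static final String PAGE =
        "<html><body>"
      + "<div id='first'><input id='username'/></div>"
      + "<div id='second'/>"
      + "<div id='third'/>"
      + "</body></html>";

    // Returns the id attribute of the first element matched by the expression,
    // or null when nothing matches.
    static String idOfFirstMatch(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Element match = (Element) xpath.evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODE);
            return match == null ? null : match.getAttribute("id");
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Absolute path starting at the root html node
        System.out.println(idOfFirstMatch("/html/body/div/input"));
        // Relative path starting anywhere in the document
        System.out.println(idOfFirstMatch("//input[@id='username']"));
        // nth element using an index in square brackets
        System.out.println(idOfFirstMatch("/html/body/div[2]"));
        // nth element using position()
        System.out.println(idOfFirstMatch("/html/body/div[position()=3]"));
    }
}
```

Note that XPath indexes start at 1, so div[2] is the second div and div[position()=3] is the third.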

By .className we can select all the elements belonging to a particular class, e.g. ‘.red’ will select all elements having class ‘red’.

By #idValue we can select the element having a particular id, e.g. ‘#userId’ will select the element having id – userId.

The fundamental difference between XPath and CSS selectors is that using XPath we can traverse up the document, i.e. we can move to parent elements, whereas using a CSS selector we can only move downwards in the document.
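
Upward traversal in XPath can be sketched as follows, again using the JDK's XPath engine rather than a browser. The sample markup, the class name XPathParentDemo and the helper idOf are hypothetical; the point is only that ".." and the ancestor axis move from an element to the elements that contain it.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Sketch only: shows XPath moving from a child element up to its parents.
final class XPathParentDemo {

    // Hypothetical sample page used only for this illustration.
    static final String PAGE =
        "<html><body><form id='loginForm'><div id='row'>"
      + "<input id='username'/></div></form></body></html>";

    // Returns the id of the first element matched by the expression, or null.
    static String idOf(String expression) {
        try {
            Element match = (Element) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODE);
            return match == null ? null : match.getAttribute("id");
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // ".." selects the direct parent of the matched input
        System.out.println(idOf("//input[@id='username']/.."));
        // ancestor::form walks further up to the enclosing form
        System.out.println(idOf("//input[@id='username']/ancestor::form"));
    }
}
```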

By creating an instance of the driver of a particular browser-

WebDriver driver = new FirefoxDriver();

The Selenium WebDriver is used for automating tests for websites.

The following compares Selenium IDE, Selenium RC and WebDriver feature by feature.

Browser Compatibility
  • Selenium IDE – Comes as a Firefox plugin, thus it supports only Firefox.
  • Selenium RC – Supports a varied range of versions of Mozilla Firefox, Google Chrome, Internet Explorer and Opera.
  • WebDriver – Supports a varied range of versions of Mozilla Firefox, Google Chrome, Internet Explorer and Opera. Also supports HtmlUnitDriver, which is a GUI-less or headless browser.

Record and Playback
  • Selenium IDE – Supports the record and playback feature.
  • Selenium RC – Doesn’t support the record and playback feature.
  • WebDriver – Doesn’t support the record and playback feature.

Server Requirement
  • Selenium IDE – Doesn’t require any server to be started before executing the test scripts.
  • Selenium RC – Requires a server to be started before executing the test scripts.
  • WebDriver – Doesn’t require any server to be started before executing the test scripts.

Architecture
  • Selenium IDE – A JavaScript based framework.
  • Selenium RC – A JavaScript based framework.
  • WebDriver – Uses the browser’s native compatibility for automation.

Object Oriented
  • Selenium IDE – Not an object oriented tool.
  • Selenium RC – A semi object oriented tool.
  • WebDriver – A purely object oriented tool.

Dynamic Finders (for locating web elements on a webpage)
  • Selenium IDE – Doesn’t support dynamic finders.
  • Selenium RC – Doesn’t support dynamic finders.
  • WebDriver – Supports dynamic finders.

Handling Alerts, Navigations, Dropdowns
  • Selenium IDE – Doesn’t explicitly provide aids to handle alerts, navigations, dropdowns.
  • Selenium RC – Doesn’t explicitly provide aids to handle alerts, navigations, dropdowns.
  • WebDriver – Offers a wide range of utilities and classes that help in handling alerts, navigations and dropdowns efficiently and effectively.

WAP (iPhone/Android) Testing
  • Selenium IDE – Doesn’t support testing of iPhone/Android applications.
  • Selenium RC – Doesn’t support testing of iPhone/Android applications.
  • WebDriver – Designed to efficiently support testing of iPhone/Android applications. The tool comes with a large range of drivers for WAP based testing, for example AndroidDriver and iPhoneDriver.

Listener Support
  • Selenium IDE – Doesn’t support listeners.
  • Selenium RC – Doesn’t support listeners.
  • WebDriver – Supports the implementation of listeners.

Speed
  • Selenium IDE – Fast, as it is plugged into the web browser that launches the test. Thus, the IDE and the browser communicate directly.
  • Selenium RC – Slower than WebDriver, as it doesn’t communicate directly with the browser; rather it sends Selenese commands over to Selenium Core, which in turn communicates with the browser.
  • WebDriver – Communicates directly with the web browsers, which makes it much faster.

Selenium IDE is the simplest and easiest of all the tools within the Selenium package. Its record and playback feature makes it exceptionally easy to learn with minimal acquaintance with any programming language. Selenium IDE is an ideal tool for a novice user.

Selenese is the language which is used to write test scripts in Selenium IDE.

Single Slash “/” – A single slash is used to create an XPath with an absolute path, i.e. the XPath starts its selection from the document node/start node.

Double Slash “//” – A double slash is used to create an XPath with a relative path, i.e. the XPath starts its selection from anywhere within the document.

Selenium Grid can be used to execute same or different test scripts on multiple platforms and browsers concurrently so as to achieve distributed test execution, testing under different environments and saving execution time remarkably.

The following syntax can be used to launch a browser (use the line for the required browser):

WebDriver driver = new FirefoxDriver();
WebDriver driver = new ChromeDriver();
WebDriver driver = new InternetExplorerDriver();

The different drivers available in WebDriver are:

  • FirefoxDriver
  • InternetExplorerDriver
  • ChromeDriver
  • SafariDriver
  • OperaDriver
  • AndroidDriver
  • IPhoneDriver
  • HtmlUnitDriver

There are two types of waits available in WebDriver:

  1. Implicit Wait
  2. Explicit Wait

Implicit Wait: Implicit waits are used to provide a default maximum waiting time (say 30 seconds) for locating elements across the entire test script. If an element is not immediately available, WebDriver keeps polling for it until it is found or the 30 seconds have elapsed, after which it throws NoSuchElementException. The next test step executes as soon as the element is found; it does not wait for the full 30 seconds.

Explicit Wait: Explicit waits are used to halt the execution till the time a particular condition is met or the maximum time has elapsed. Unlike Implicit waits, explicit waits are applied for a particular instance only.

The user can use sendKeys("String to be entered") to enter the string in the textbox.

// "Email" is assumed to be the id of the textbox
WebElement username = drv.findElement(By.id("Email"));
// entering username
username.sendKeys("String to be entered");

The getText command is used to retrieve the inner text of the specified web element. The command doesn’t require any parameter and returns a string value. It is also one of the most extensively used commands for verifying messages, labels, errors etc. displayed on the web pages.

// "Text" is assumed to be the id of the element
String text = driver.findElement(By.id("Text")).getText();

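The value in the dropdown can be selected using WebDriver’s Select class. In the snippets below the dropdowns are assumed to be located by id, and the values passed to the select methods are placeholders.

Select selectByValue = new Select(driver.findElement(By.id("SelectID_One")));
selectByValue.selectByValue("optionValue");

Select selectByVisibleText = new Select(driver.findElement(By.id("SelectID_Two")));
selectByVisibleText.selectByVisibleText("Option text");

Select selectByIndex = new Select(driver.findElement(By.id("SelectID_Three")));
selectByIndex.selectByIndex(2);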


Following are the navigation commands:

navigate().back() – This command requires no parameters and takes the user back to the previous webpage in the web browser’s history.

Sample code:

driver.navigate().back();

navigate().forward() – This command lets the user navigate to the next web page with reference to the browser’s history.

Sample code:

driver.navigate().forward();

navigate().refresh() – This command lets the user refresh the current web page, thereby reloading all the web elements.

Sample code:

driver.navigate().refresh();

navigate().to() – This command lets the user navigate to the specified URL in the current browser window.

Sample code:

driver.navigate().to(url); // url holds the address to open


findElement(): findElement() is used to find the first element in the current web page matching the specified locator value. Note that only the first matching element is fetched.


WebElement element = driver.findElement(By.xpath("//div[@id='example']//ul//li"));

findElements(): findElements() is used to find all the elements in the current web page matching the specified locator value. Note that all the matching elements are fetched and stored in a list of WebElements.


List<WebElement> elementList = driver.findElements(By.xpath("//div[@id='example']//ul//li"));

close(): WebDriver’s close() method closes the web browser window that the user is currently working on or we can also say the window that is being currently accessed by the WebDriver. The command neither requires any parameter nor does it return any value.

quit(): Unlike the close() method, the quit() method closes all the windows that the program has opened. Like close(), the command neither requires any parameter nor does it return any value.

JUnit is an open source unit testing framework for Java.

Following are the JUnit Annotations:

  • @Test: Annotation lets the system know that the method annotated as @Test is a test method. There can be multiple test methods in a single test script.
  • @Before: Method annotated as @Before lets the system know that this method shall be executed before each of the test methods.
  • @After: Method annotated as @After lets the system know that this method shall be executed after each of the test methods.
  • @BeforeClass: Method annotated as @BeforeClass lets the system know that this method shall be executed once before any of the test methods.
  • @AfterClass: Method annotated as @AfterClass lets the system know that this method shall be executed once after all of the test methods.
  • @Ignore: Method annotated as @Ignore lets the system know that this method shall not be executed.

A fluent wait is a type of wait in which we can also specify polling interval(intervals after which driver will try to find the element) along with the maximum timeout value.
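
The polling idea behind a fluent wait can be sketched in plain Java. This is not Selenium's FluentWait class; the class name PollingWait and the method pollUntil are hypothetical names used only to show how a maximum timeout and a polling interval work together.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Sketch of the polling idea behind a fluent wait; not Selenium's implementation.
final class PollingWait {

    // Calls the condition every pollInterval until it returns a non-null value
    // or the timeout elapses; throws IllegalStateException on timeout.
    static <T> T pollUntil(Supplier<T> condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T result = condition.get();
            if (result != null) {
                return result;
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Condition not met within " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Hypothetical condition: the "element" becomes available on the third poll.
        String found = pollUntil(
            () -> ++attempts[0] >= 3 ? "element" : null,
            Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println(found + " found after " + attempts[0] + " polls");
    }
}
```

A short polling interval finds the element sooner at the cost of more lookups, while the timeout caps the total waiting time.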

The different keyboard operations that can be performed in selenium are-

  • .sendKeys(“sequence of characters”) – Used for passing character sequence to an input or textbox element.
  • .pressKey(“non-text keys”) – Used for keys like control, function keys etc that are non-text.
  • .releaseKey(“non-text keys”) – Used in conjunction with pressKey to simulate releasing a key on the keyboard.

The different mouse events supported in selenium are

  • click(WebElement element)
  • doubleClick(WebElement element)
  • contextClick(WebElement element)
  • mouseDown(WebElement element)
  • mouseUp(WebElement element)
  • mouseMove(WebElement element)
  • mouseMove(WebElement element, long xOffset, long yOffset)

Using driver.getTitle(); we can fetch the page title in selenium. This method returns a string containing the title of the webpage.

Using driver.getPageSource(); we can fetch the page source in selenium. This method returns a string containing the page source.

Some of the commonly seen exceptions in Selenium are-

  • NoSuchElementException – When no element could be located from the locator provided.
  • ElementNotVisibleException – When the element is present in the DOM but is not visible.
  • NoAlertPresentException – When we try to switch to an alert but the targeted alert is not present.
  • NoSuchFrameException – When we try to switch to a frame but the targeted frame is not present.
  • NoSuchWindowException – When we try to switch to a window but the targeted window is not present.
  • UnexpectedAlertPresentException – When an unexpected alert blocks normal interaction of the driver.
  • TimeoutException – When a command execution times out.
  • InvalidElementStateException – When the state of an element is not appropriate for the desired action.
  • NoSuchAttributeException – When we are trying to fetch an attribute’s value but the attribute is not correct.
  • WebDriverException – When there is some issue with the driver instance preventing it from getting launched.

Using Select class-

Select countriesDropDown = new Select(driver.findElement(By.id("contient"))); // "contient" is assumed to be the element id
countriesDropDown.selectByVisibleText("India"); // placeholder option text
//or using index of the option starting from 0
countriesDropDown.selectByIndex(1);
//or using its value attribute
countriesDropDown.selectByValue("Ind"); // placeholder value

The difference between driver.findElement() and driver.findElements() commands is-

findElement() returns a single WebElement (found first) based on the locator passed as parameter. Whereas findElements() returns a list of WebElements, all satisfying the locator value passed.

Syntax of findElement()-

// By.id is assumed here; any By locator can be passed
WebElement textbox = driver.findElement(By.id("textBoxLocator"));

Syntax of findElements()-

List<WebElement> elements = driver.findElements(By.id("value"));

Another difference between the two is- if no element is found then findElement() throws NoSuchElementException whereas findElements() returns a list of 0 elements.

An implicit wait, while finding an element waits for a specified time before throwing NoSuchElementException in case element is not found. The timeout value remains valid throughout the webDriver’s instance and for all the elements.

driver.manage().timeouts().implicitlyWait(180, TimeUnit.SECONDS);

Whereas an explicit wait is applied to a specified element only-

WebDriverWait wait = new WebDriverWait(driver, 5);
wait.until(ExpectedConditions.visibilityOfElementLocated(elementLocator)); // elementLocator: a By locator (placeholder)

It is advisable to use explicit waits over implicit waits because a higher implicit timeout, set because of one element that takes time to become visible, gets applied to all the elements and thus increases the overall execution time of the script. With explicit waits, on the other hand, we can apply different timeouts to different elements.

Robot API is used for handling keyboard or mouse events. It is generally used to upload files to the server in Selenium automation.

Robot robot = new Robot();
//Simulate enter key action
robot.keyPress(KeyEvent.VK_ENTER);
robot.keyRelease(KeyEvent.VK_ENTER);

File upload action can be performed in multiple ways-

  • Using element.sendKeys("path of file") on the WebElement of an input tag with type file, i.e. the element should be like <input type="file" name="fileUpload">
  • Using Robot API.
  • Using AutoIt API.

JavaScript can be executed in Selenium using JavascriptExecutor. Sample code for JavaScript execution-

WebDriver driver = new FirefoxDriver();
if (driver instanceof JavascriptExecutor) {
 ((JavascriptExecutor)driver).executeScript("{JavaScript Code}");
}

HtmlUnitDriver is the fastest WebDriver. Unlike other drivers (FirefoxDriver, ChromeDriver etc.), the HtmlUnitDriver is non-GUI; no browser gets launched while it runs.

Some of the automation testing tools available are-

  • Telerik Test Studio, developed by Telerik.
  • TestingWhiz
  • HPE Unified Functional Testing (HP – UFT, formerly QTP)
  • Tosca Testsuite
  • Watir
  • Quick Test Professional, provided by HP.
  • Rational Robot, provided by IBM.
  • Coded UI, provided by Microsoft.
  • Selenium, open source.
  • AutoIt, open source.
  • LoadRunner, provided by HP.
  • JMeter, provided by Apache.
  • Burp Suite, provided by PortSwigger.
  • Acunetix, provided by Acunetix.

The different types of testing that we can achieve through Selenium are-

  • Functional Testing
  • Regression Testing
  • Sanity Testing
  • Smoke Testing
  • Responsive Testing
  • Cross Browser Testing
  • UI testing (black box)
  • Integration Testing

The assertion is used as a verification point. It verifies that the state of the application conforms to what is expected. The types of assertion are “assert”, “verify” and “waitFor”.

JUnit annotations which can be used are:

  • Test
  • Before
  • After
  • Ignore
  • BeforeClass
  • AfterClass
  • RunWith

The “type” command is used to type keyboard key values into a text box of a web application. It can also be used for selecting values of a combo box. The “typeAndWait” command is used when typing triggers a page reload: it waits for the application page to reload before continuing. If typing does not trigger a page reload, use the simple “type” command.

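import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
public class FirefoxBrowserLaunchDemo {
    public static void main(String[] args) {
        //Creating a driver object referencing WebDriver interface
        WebDriver driver;
        //Setting webdriver.gecko.driver property
        //pathToGeckoDriver is a placeholder for the folder containing geckodriver.exe
        String pathToGeckoDriver = "C:\\drivers";
        System.setProperty("webdriver.gecko.driver", pathToGeckoDriver + "\\geckodriver.exe");
        //Instantiating driver object and launching browser
        driver = new FirefoxDriver();
        //Using get() method to open a webpage (placeholder URL)
        driver.get("https://example.com");
        //Closing the browser
        driver.quit();
    }
}

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
public class ChromeBrowserLaunchDemo {
    public static void main(String[] args) {
        //Creating a driver object referencing WebDriver interface
        WebDriver driver;
        //Setting the webdriver.chrome.driver property to its executable's location
        System.setProperty("webdriver.chrome.driver", "/lib/chromeDriver/chromedriver.exe");
        //Instantiating driver object
        driver = new ChromeDriver();
        //Using get() method to open a webpage (placeholder URL)
        driver.get("https://example.com");
        //Closing the browser
        driver.quit();
    }
}

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.ie.InternetExplorerDriver;
public class IEBrowserLaunchDemo {
    public static void main(String[] args) {
        //Creating a driver object referencing WebDriver interface
        WebDriver driver;
        //Setting the webdriver.ie.driver property to its executable's location
        System.setProperty("webdriver.ie.driver", "/lib/IEDriverServer/IEDriverServer.exe");
        //Instantiating driver object
        driver = new InternetExplorerDriver();
        //Using get() method to open a webpage (placeholder URL)
        driver.get("https://example.com");
        //Closing the browser
        driver.quit();
    }
}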

There are multiple ways of refreshing a page in WebDriver.

1. Using driver.navigate command –

driver.navigate().refresh();

2. Using driver.getCurrentUrl() with driver.get() command –

driver.get(driver.getCurrentUrl());

3. Using driver.getCurrentUrl() with driver.navigate() command –

driver.navigate().to(driver.getCurrentUrl());

4. Pressing the F5 key on any textbox using the sendKeys command (textboxLocator is a By locator for the textbox) –

driver.findElement(textboxLocator).sendKeys(Keys.F5);

5. Passing the Unicode value that WebDriver uses for the F5 key, i.e. “\uE035”, using the sendKeys command –

driver.findElement(textboxLocator).sendKeys("\uE035");

Captcha and barcode readers cannot be automated using Selenium.

Using the JavaScript executor we can handle hidden elements-

// ElementLocator stands for the class name of the hidden element
((JavascriptExecutor) driver).executeScript("document.getElementsByClassName('ElementLocator')[0].click();");

Start points indicate the point from where the execution should begin. They can be used to run a test script from a breakpoint or the middle of the code.

Breakpoints are used to stop the execution of code. They help you verify that your code is working as expected. 

Page Object Model (POM) is a design pattern in Selenium. A design pattern is a solution or a set of standards used for solving commonly occurring software problems.

POM helps to create a framework for maintaining Selenium scripts. In POM, a class is created for each page of the application, holding the web elements belonging to the page and the methods handling the events on that page. The test scripts are maintained in separate files, and the methods of the page object files are called from the test script files.

The advantages of POM are-

  • Using POM we can create an Object Repository, a set of web elements in separate files along with their associated functions, thereby keeping the code clean.
  • For any change in the UI (or web elements) only the page object files need to be updated, leaving the test files unchanged.
  • It makes code reusable and maintainable.
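
The separation between page objects and test scripts can be sketched without Selenium. In the sketch below, Browser is a hypothetical stand-in for a driver, RecordingBrowser is a fake that only records actions, and the locator names are made up; in a real project the page class would hold WebElements and call WebDriver.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal stand-in for a browser driver, so the sketch runs without Selenium.
interface Browser {
    void type(String locator, String text);
    void click(String locator);
}

// Page object: owns the locators of the login page and the actions available on it.
final class LoginPage {
    private static final String USERNAME = "username";   // made-up locators
    private static final String PASSWORD = "password";
    private static final String SUBMIT = "loginButton";

    private final Browser browser;

    LoginPage(Browser browser) {
        this.browser = browser;
    }

    void loginAs(String user, String password) {
        browser.type(USERNAME, user);
        browser.type(PASSWORD, password);
        browser.click(SUBMIT);
    }
}

// Fake browser that records each action instead of driving a real browser.
final class RecordingBrowser implements Browser {
    final List<String> actions = new ArrayList<>();

    public void type(String locator, String text) {
        actions.add("type " + locator + "=" + text);
    }

    public void click(String locator) {
        actions.add("click " + locator);
    }
}

// Test script: calls page object methods only and never touches locators.
final class LoginTest {
    public static void main(String[] args) {
        RecordingBrowser browser = new RecordingBrowser();
        new LoginPage(browser).loginAs("raj", "secret");
        System.out.println(browser.actions);
    }
}
```

If the login button's locator changes, only LoginPage needs editing; LoginTest stays as it is.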

An object repository is a centralized location of all the objects or WebElements of the test scripts. In Selenium we can create an object repository using the Page Object Model and Page Factory design patterns.

Page factory is an implementation of Page Object Model in selenium. It provides @FindBy annotation to find web elements and PageFactory.initElements() method to initialize all web elements defined with @FindBy annotation.

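public class SamplePage {
    WebDriver driver;
    @FindBy(id = "search") // placeholder locator
    WebElement searchTextBox;
    @FindBy(name = "searchBtn") // placeholder locator
    WebElement searchButton;
    public SamplePage(WebDriver driver){
        this.driver = driver;
        //initElements method to initialize all elements
        PageFactory.initElements(driver, this);
    }
    //Sample method
    public void search(String searchTerm) {
        searchTextBox.sendKeys(searchTerm);
        searchButton.click();
    }
}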

A keyword driven framework is one in which the actions are associated with keywords and kept in external files e.g. an action of launching a browser will be associated with keyword – launchBrowser(), action to write in a textbox with keyword – writeInTextBox(webElement, textToWrite) etc. The code to perform the action based on a keyword specified in external file is implemented in the framework itself.

In this way the test steps can be written in a file by even a person of non-programming background once all the identified actions are implemented.
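
A keyword driven runner can be sketched as a lookup from keyword to action. Everything below is hypothetical: the class name KeywordRunner, the "keyword|arg1|arg2" step format, and the actions, which only write to a log where a real framework would call WebDriver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Sketch of a keyword driven runner; actions are logged instead of driving a browser.
final class KeywordRunner {
    final List<String> log = new ArrayList<>();
    private final Map<String, Consumer<String[]>> keywords = new HashMap<>();

    KeywordRunner() {
        // The framework implements each keyword once.
        keywords.put("launchBrowser", a -> log.add("launch " + a[0]));
        keywords.put("writeInTextBox", a -> log.add("write '" + a[1] + "' into " + a[0]));
        keywords.put("click", a -> log.add("click " + a[0]));
    }

    // Each step has the assumed form keyword|arg1|arg2, as it might appear in an external file.
    void run(List<String> steps) {
        for (String step : steps) {
            String[] parts = step.split("\\|");
            Consumer<String[]> action = keywords.get(parts[0]);
            if (action == null) {
                throw new IllegalArgumentException("Unknown keyword: " + parts[0]);
            }
            action.accept(Arrays.copyOfRange(parts, 1, parts.length));
        }
    }

    public static void main(String[] args) {
        KeywordRunner runner = new KeywordRunner();
        runner.run(Arrays.asList(
            "launchBrowser|firefox",
            "writeInTextBox|username|raj",
            "click|loginButton"));
        runner.log.forEach(System.out::println);
    }
}
```

The steps passed to run() are plain text, so they could equally be read from a csv or Excel file maintained by a non-programmer.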

A data driven framework is one in which the test data is kept in external files like csv, Excel etc., separated from the test logic written in the test script files. The test data drives the test cases, i.e. the test methods run once for each set of test data values. TestNG provides inherent support for data driven testing using the @DataProvider annotation.
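
Reading external test data into the row-per-test shape can be sketched in plain Java. The class name CsvTestData, the column names and the simple comma format (no quoting or escaping) are assumptions for this illustration; the returned Object[][] has the same shape a TestNG data provider method returns.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: turns simple csv text into one Object[] row per test run.
final class CsvTestData {

    // Parses comma-separated text, skipping the header line and blank lines.
    // Quoted fields and escaped commas are not handled in this sketch.
    static Object[][] load(String csv) {
        List<Object[]> rows = new ArrayList<>();
        String[] lines = csv.split("\\R");
        for (int i = 1; i < lines.length; i++) {   // line 0 is the header
            if (!lines[i].trim().isEmpty()) {
                rows.add(lines[i].split(","));
            }
        }
        return rows.toArray(new Object[0][]);
    }

    public static void main(String[] args) {
        // In practice this text would be read from an external file.
        String csv = "firstName,lastName\nRaj,Mehta\nr1,m1\nr2,m2\n";
        for (Object[] row : load(csv)) {
            // The same test logic runs once per data row.
            System.out.println("Running test with " + row[0] + " and " + row[1]);
        }
    }
}
```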

Selenium Grid is a tool that helps in distributed running of test scripts in parallel across different machines having different browsers, browser versions, platforms etc. In Selenium Grid there is a hub, a central server that manages all the distributed machines, which are known as nodes.

A hub is a server or a central point in Selenium Grid that controls the test executions on the different machines.

Nodes are the machines which are attached to the selenium grid hub and have selenium instances running the test scripts. Unlike hub there can be multiple nodes in selenium grid.

The advantages of selenium grid are-

  • It allows running test cases in parallel thereby saving test execution time.
  • Multi browser testing is possible using selenium grid by running the test on machines having different browsers.
  • It allows multi-platform testing by configuring nodes having different operating systems.

TestNG(NG for Next Generation) is a testing framework that can be integrated with selenium or any other automation tool to provide multiple capabilities like assertions, reporting, parallel test execution etc.

testng.xml file is used for configuring the whole test suite. In testng.xml file we can create test suite, create test groups, mark tests for parallel execution, add listeners and pass parameters to test scripts. Later this testng.xml file can be used for triggering the test suite.

Following are the advantages of testNG-

  • TestNG provides different assertions that help in checking the expected and actual results.
  • It provides parallel execution of test methods.
  • We can define the dependency of one test method on another in TestNG.
  • We can assign priority to test methods in TestNG.
  • It allows grouping of test methods into test groups.
  • It allows data driven testing using @DataProvider annotation.
  • It has inherent support for reporting.
  • It has support for parameterizing test cases using @Parameters annotation.

Using @DataProvider we can create a data driven framework in which data is passed to the associated test method, and the test runs once for each set of test data values passed from the @DataProvider method. The method annotated with the @DataProvider annotation returns a 2D array of objects.

//Data provider returning 2D array of 3*2 matrix
@DataProvider(name = "dataProvider1")
public Object[][] dataProviderMethod(){
  return new Object[][]{{"Raj","Mehta"},{"r1","m1"},{"r2","m2"}};
}

//This method is bound to the above data provider returning 2D array of 3*2 matrix
// The test case will run 3 times with different set of values.
@Test(dataProvider = "dataProvider1")
public void sampleTest(String s1, String s2){
  System.out.println(s1 + " " + s2);
}

Using the @Parameters annotation and the ‘parameter’ tag in testng.xml we can pass parameters to the test script.

Sample testng.xml –

<suite name="sampleTestSuite">
   <test name="sampleTest">
      <parameter name="sampleParamName" value="sampleParamValue"/>
      <classes>
         <class name="SampleTestFile" />
      </classes>
   </test>
</suite>

Sample test script-

public class SampleTestFile {
   @Test @Parameters("sampleParamName")
   public void parameterTest(String paramValue) {
      System.out.println("Value of sampleParamName is - " + paramValue);
   }
}

The @Factory annotation helps in dynamic execution of test cases. Using the @Factory annotation we can pass parameters to the whole test class at run time. The parameters passed can be used by one or more test methods of that class.

Example – there are two classes, SampleTestClass and SampleTestFactory. Because of the @Factory annotation, the test methods in class SampleTestClass will run twice, with the data “k1” and “k2”.

public class SampleTestClass {
  private String str;

  public SampleTestClass(String str) {
    this.str = str;
  }

  @Test
  public void TestMethod() {
    System.out.println(str);
  }
}

public class SampleTestFactory {
  //The test methods in class SampleTestClass will run twice with data "k1" and "k2"
  @Factory
  public Object[] factoryMethod(){
    return new Object[] { new SampleTestClass("k1"), new SampleTestClass("k2") };
  }
}

TestNG provides us different kinds of listeners with which we can perform some action when an event is triggered. Usually TestNG listeners are used for configuring reports and logging. One of the most widely used listeners in TestNG is the ITestListener interface. It has methods like onTestSuccess, onTestFailure, onTestSkipped etc. We need to implement this interface by creating a listener class of our own. After that, using the @Listeners annotation, we can specify that our customized listener class should be used for a particular test class.

@Listeners(CustomListener.class) // CustomListener: hypothetical ITestListener implementation
public class SampleTestClass {
 WebDriver driver = new FirefoxDriver();
 @Test
 public void testMethod() { /* test logic */ }
}

A @Factory method creates instances of a test class and runs all the test methods in that class with different sets of data.

Whereas @DataProvider is bound to individual test methods and runs the specific methods multiple times.

Using the priority parameter in the @Test annotation in TestNG we can define the priority of test cases. The default priority of a test when not specified is the integer value 0. Example-

@Test(priority = 1)

Using the dependsOnMethods parameter inside the @Test annotation in TestNG we can make one test method run only after the successful execution of the test method it depends on.

@Test(dependsOnMethods = { "preTests" })

Some of the common assertions provided by testNG are-

  • assertEquals(String actual, String expected, String message) – (and other overloaded data type in parameters)
  • assertNotEquals(double data1, double data2, String message) – (and other overloaded data type in parameters)
  • assertFalse(boolean condition, String message)
  • assertTrue(boolean condition, String message)
  • assertNotNull(Object object)
  • fail(String message) – fails the test unconditionally with the given message

The commonly used TestNG annotations are-

  • @Test- @Test annotation marks a method as Test method.
  • @BeforeSuite- The annotated method will run only once before all tests in this suite have run.
  • @AfterSuite-The annotated method will run only once after all tests in this suite have run.
  • @BeforeClass-The annotated method will run only once before the first test method in the current class is invoked.
  • @AfterClass-The annotated method will run only once after all the test methods in the current class have been run.
  • @BeforeTest-The annotated method will run before any test method belonging to the classes inside the <test> tag is run.
  • @AfterTest-The annotated method will run after all the test methods belonging to the classes inside the <test> tag have run.

Log4j is an open source API widely used for logging in Java. It supports multiple levels of logging like – ALL, DEBUG, INFO, WARN, ERROR, TRACE and FATAL.

Apache POI API and JXL(Java Excel API) can be used for reading, writing and updating excel files.

In order to run the tests in parallel, just add these two key-value pairs to the suite tag-

  • parallel="{methods/tests/classes}"
  • thread-count="{number of threads you want to run simultaneously}"

<suite name="ArtOfTestingTestSuite" parallel="methods" thread-count="5">

Logging helps in debugging the tests when required and also provides a record of the test’s runtime behaviour.

Using clear() method we can delete the text written in a textbox.


Selenium has driver.getWindowHandles() and driver.switchTo().window("{windowHandleName}") commands to work with multiple windows. The getWindowHandles() command returns a set of ids, one for each window, and on passing a particular window handle to the driver.switchTo().window("{windowHandleName}") command we can switch control/focus to that particular window.

for (String windowHandle : driver.getWindowHandles()) {
   driver.switchTo().window(windowHandle);
}

The driver.switchTo() commands can be used for switching to frames.

driver.switchTo().frame("{frameIndex/frameId/frameName}");


For locating a frame we can either use the index (starting from 0), its name or Id.

Desired capabilities are a set of key-value pairs that are used for storing or configuring browser specific properties like its version, platform etc in the browser instances.

All the links are of anchor tag ‘a’. So by locating elements of tagName ‘a’ we can find all the links on a webpage.

List<WebElement> links = driver.findElements(By.tagName("a"));

Using profiles in Firefox we can accept the untrusted SSL connection certificate. Profiles are basically a set of user preferences stored in a file.

FirefoxProfile profile = new FirefoxProfile();
profile.setAcceptUntrustedCertificates(true);
WebDriver driver = new FirefoxDriver(profile);

Sikuli is a tool that uses “Visual Image Match” method to automate graphical user interface. All the web elements in Sikuli should be taken as an image and stored inside the project.

Sikuli is comprised of

  • Sikuli Script
  • Visual Scripting API for Jython
  • Sikuli IDE

Practical uses of Sikuli are-

  • It can be used to automate flash websites or objects
  • It can automate window based applications and anything you see on screen without using internal API support
  • It provides a simple API
  • It can be easily linked with tools like Selenium
  • Desktop applications can be automated
  • Sikuli offers extensive support to automate flash objects
  • To automate desktop and flash objects, it uses the powerful “Visual Match”
  • It can work on any technology, e.g. .NET, Java.

Sikuli compared with Selenium-

Sikuli
  • It provides extensive support to automate flash objects
  • It has a simple API
  • It uses a visual match to find elements on the screen, so we can automate anything we see on the screen
  • It can automate the web as well as windows applications

Selenium
  • It cannot automate flash objects like video players and audio players
  • It has a complicated API
  • It does not have visual match
  • It can automate only web applications

The navigation commands are as follows.

The navigate().back() command needs no parameters and takes the user back to the previous webpage.

driver.navigate().back();

The navigate().forward() command allows the user to navigate to the next web page with reference to the browser’s history.

driver.navigate().forward();

The navigate().refresh() command allows the user to refresh the current web page by reloading all the web elements.

driver.navigate().refresh();

The navigate().to() command allows the user to navigate to the specified URL in the current browser window.

driver.navigate().to("");

A screenshot can be captured using the TakesScreenshot interface-

import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
public class TakeScreenshot {
    WebDriver drv;
    @Before
    public void setUp() throws Exception {
        drv = new FirefoxDriver();
    }
    @After
    public void tearDown() throws Exception {
        drv.quit();
    }
    @Test
    public void test() throws IOException {
        //capture the screenshot
        File scrFile = ((TakesScreenshot) drv).getScreenshotAs(OutputType.FILE);
        // paste the screenshot in the desired location
        FileUtils.copyFile(scrFile, new File("C:\\Screenshot\\screen.png"));
    }
}

Test data can efficiently be read from Excel using the JXL or POI API. The POI API has many advantages over JXL.

1. Bitmap comparison is not possible using Selenium WebDriver

2. Automating Captcha is not possible using Selenium WebDriver

3. We cannot read bar codes using Selenium WebDriver