Wednesday, 17 December 2008

Selenium and HTMLUnit - the abstraction layer


I once ran an experiment to see how easily I could wrap HTMLUnit with Selenium for automated software testing.
In that experiment I created a wrapper by extending the DefaultSelenium class and using Eclipse to generate wrapper functions for all the methods. I then injected that HTMLUnitSelenium class into my abstraction layer and voila - my Selenium tests ran with HTMLUnit (*cough* well, in theory).
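To make the injection idea concrete, here is a minimal sketch that uses only the JDK. Everything in it is hypothetical: BrowserSession stands in for the Selenium interface, and the two session classes stand in for DefaultSelenium and the HTMLUnit-backed wrapper. The real classes have far more methods than this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, cut-down stand-in for the Selenium interface the tests rely on.
interface BrowserSession {
    void open(String url);
    String name();
}

// Stand-in for a session backed by a real browser (DefaultSelenium in the post).
class RealBrowserSession implements BrowserSession {
    final List<String> visited = new ArrayList<>();
    public void open(String url) { visited.add(url); }
    public String name() { return "real browser"; }
}

// Stand-in for a headless session (the HTMLUnit-backed wrapper in the post).
class HeadlessSession implements BrowserSession {
    final List<String> visited = new ArrayList<>();
    public void open(String url) { visited.add(url); }
    public String name() { return "headless"; }
}

// The abstraction layer: tests talk to this class, never to a concrete session.
class SiteUnderTest {
    private final BrowserSession session;

    SiteUnderTest(BrowserSession session) {
        this.session = session;
    }

    // A test-level action written once, whichever backend gets injected.
    String openHomePage() {
        session.open("/");
        return "opened / using " + session.name();
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        System.out.println(new SiteUnderTest(new RealBrowserSession()).openHomePage());
        System.out.println(new SiteUnderTest(new HeadlessSession()).openHomePage());
    }
}
```

With this shape, swapping the backend becomes a one-line change at the point where the session gets created, and the tests themselves stay untouched.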
Then I recently discovered that someone else had done the same thing. Frank Cohen, at a recent workshop, mentioned a wrapper that already existed in PushToTest.
So I hunted it out, and I tried to use it...


Please Explain HTMLUnit?

HTMLUnit acts as a 'headless' browser - a browser without a renderer - which lets you test the functionality of a website without necessarily testing the visual aspects. With a wrapper I could run my existing tests quickly, without amending them, just by creating an HTMLUnit browser instead of a real one.

Where Do I get the wrapper?

I did a Google search and found the original code created by Denali.
Denali passed this code over to PushToTest to productionize, and it now forms part of the PushToTest distribution. So you can either check out the whole code base from svn:
and build the selenium-tmbranch (the source provides an Ant build file).
Or install PushToTest and use the selenium-ptt.jar from the lib folder.
Then add the HtmlUnit jars and the selenium-ptt.jar to your project.

How do I use it?

import be.denali.test.SeleniumHTMLUnit;
// And instead of creating a normal Selenium object
// Do the following

public static SeleniumHTMLUnit selSession = null;
selSession = new SeleniumHTMLUnit();

// Then use selSession as normal
// If you built it yourself from the source then you also
// generated extensive JavaDocs during the build process
Easy!

Warning: Possible Fixes Required

Sadly just downloading this and incorporating it into your project doesn't guarantee a complete instant Selenium wrapper.
If you scan the source then you will see a few "Action not implemented" stubs, e.g. openWindow - but if the actions you want to use do exist then you can quickly run your Selenium tests against HTMLUnit. Contributing implementations for some of the incomplete methods would seem a really useful way to give back to the open source project.
public void answerOnNextPrompt(String answer) {
// TODO Auto-generated method stub
logger.warning("Action not implemented: answerOnNextPrompt");
}
But you can easily scan the source code to check whether the actions that you rely on have already undergone some implementation work.
I found that most of the methods that I really needed did have implementations. Those that did not have implementations did not really matter to me because HTMLUnit works slightly differently to Selenium, and I could amend my abstraction layer to compensate.
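That scan can be automated. The sketch below assumes that every stub logs the literal text "Action not implemented: <name>", as the answerOnNextPrompt snippet above does. The class name and the idea of passing the source file path on the command line are my own choices for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnimplementedActionScanner {
    // Matches the warning text used by the stub methods and captures the action name.
    private static final Pattern STUB =
            Pattern.compile("Action not implemented:\\s*(\\w+)");

    // Returns the names of all actions that the given source text marks as unimplemented.
    static List<String> findUnimplemented(String source) {
        List<String> names = new ArrayList<>();
        Matcher matcher = STUB.matcher(source);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java UnimplementedActionScanner <path to source file>");
            return;
        }
        String source = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String name : findUnimplemented(source)) {
            System.out.println(name);
        }
    }
}
```

Comparing the printed names against the Selenium calls your own tests make tells you quickly whether you need to patch anything before you start.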

Things I had to do to get it working that you may not need to

  • Resolve a Rhino conflict
  • Stop using DefaultSelenium
  • Stop using openWindow

Resolve a Rhino conflict

Every time I ran it I received a missing method exception from Rhino (the JavaScript interpreter that HTMLUnit uses). Upon typing the error message into Google I discovered that I had a Rhino conflict from one of the other things in my classpath. So I did what I should have done ages ago and cleared out a whole bunch of extraneous jars that we inherited from the development team.
Quite what caused the conflict I did not find out since I did not take a disciplined approach to removing the items from the classpath - I just wanted the darn thing to start working - I'll have to go back later and find that out.
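For a more disciplined approach, you can ask the class loader for every location that supplies a given class. The sketch below assumes the competing copies use Rhino's standard package, where the main entry point is org.mozilla.javascript.Context. The helper class name is made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

class ClasspathConflictFinder {
    // Lists every location visible to the class loader that supplies the named class.
    static List<URL> locationsOf(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        List<URL> locations = new ArrayList<>();
        try {
            Enumeration<URL> found = loader.getResources(resource);
            while (found.hasMoreElements()) {
                locations.add(found.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return locations;
    }

    public static void main(String[] args) {
        // Defaults to Rhino's Context class; pass another class name to check it instead.
        String className = args.length > 0 ? args[0] : "org.mozilla.javascript.Context";
        List<URL> locations = locationsOf(className);
        System.out.println(className + " found in " + locations.size() + " location(s)");
        for (URL url : locations) {
            System.out.println("  " + url);
        }
    }
}
```

Run it with the same classpath as the tests. More than one location in the output points at the jars that are competing to supply the class.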

Stop using DefaultSelenium

We used the DefaultSelenium in our tests at work, so I amended our abstraction layer to use:
seleniumSession = new SeleniumHTMLUnit() ;
seleniumSession.setBaseURL("...");

Stop using openWindow

SeleniumHTMLUnit has an incomplete implementation of openWindow.
Rather than add additional code to SeleniumHTMLUnit (because I wanted to get something up and running quickly) I changed my abstraction layer to use open instead of openWindow.

Good to Go

I had my tests running quickly, and using HTMLUnit.
...and failing.
HTMLUnit displays more warning messages about your HTML pages than Selenium does, so by reading the HTMLUnit logs you can find errors on your site that you would not find by using Selenium or reading the Selenium logs.

Not Quite The Same Behaviour Out of the Box

Note: see Frank's comment below for workarounds for my next point.
HTMLUnit as a direct plug-in replacement for Selenium didn't work straight out of the box in terms of behaviour.
After running my tests using HTMLUnit instead of Selenium, a few more failed than normal, for these reasons:
  • Selenium forgives (because the browser forgives) if your page has missing JavaScript files, but HTMLUnit throws an exception if your webpage has a missing .js file. So some of my tests failed on opening a page. In some ways this helped me, as it identified a misconfiguration between production and test. In other ways it does not help, because Selenium does not pick up this error, so a test that runs in Selenium does not run using HTMLUnit.
  • Calling selSession.check on a checkbox works in Selenium but not in HTMLUnit: "java.lang.ClassCastException: com.gargoylesoftware.htmlunit.html.HtmlRadioButtonInput cannot be cast to com.gargoylesoftware.htmlunit.html.HtmlCheckBoxInput
    at be.denali.test.SeleniumHTMLUnit.check(SeleniumHTMLUnit.java:727)". Since my usage of .check passes in Selenium I consider this a bug in SeleniumHTMLUnit, but since SeleniumHTMLUnit runs as an Open Source project I can raise this in their Trac system, or contribute a fix.
  • Other errors I found appear related to timing issues due to HTMLUnit working slightly differently than Selenium. I need to investigate this further.
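For the timing issues in the last point, one common mitigation is to poll for a condition instead of assuming the page is ready. I have not verified this against the failures described above, so treat it as a sketch of the general technique. The Waiter class and its parameters are hypothetical and use only the JDK.

```java
import java.util.function.BooleanSupplier;

class Waiter {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 50 ms after start.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 50, 1000, 10);
        System.out.println("condition met: " + ready);
    }
}
```

In an abstraction layer the condition would typically wrap a check such as "is this element present yet", so the same test tolerates both a slow real browser and a fast headless one.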

But Why would you want to do this?

Well, Selenium does run quite slowly. And not all tests really need to run in a 'real' browser to add some valuable feedback.
Ideally I would like a subset of basic tests that run on the Continuous Integration server without compromising the integrity of that box by spawning a slew of browsers. Currently we have a set of Hudson slaves and run the test on 'other' boxes, or just trigger the tests manually on our local desktops.
Having Selenium tests run quickly on the same server as part of the build seems like a really useful goal to aim for.

So what next?

Obviously I just spent a couple of hours hacking around, but the basic principle of a Selenium HTMLUnit abstraction seems sound - at least to me.
SeleniumHTMLUnit really impressed me, since all my tests ran and I didn't have to build this abstraction layer myself (I can score it off my todo list now).
I will definitely add this to my abstraction layer. I will immediately create a TestNG group for SeleniumHTMLUnit tests that cover basic page structure and link checking, since these run much faster in SeleniumHTMLUnit than in Selenium and don't really need to run cross-browser as much.
If I have to amend SeleniumHTMLUnit to get my tests working fully then I shall certainly contribute my code to the PushToTest team.
The PushToTest development team and the Denali team responded really quickly to some out-of-the-blue emails I sent, answered my questions, and pointed me to the up-to-date code base.
Hopefully this post achieves two goals:
  • makes it even easier for anyone else to investigate this really important wrapper.
  • makes more people aware of this hidden gem inside the PushToTest source code - I wonder what else they have hidden in there?
I have seen plenty of posts from people on forums trying to create an abstraction layer like this - perhaps by contributing to PushToTest as a community we may manage to contribute to the development of a well-maintained and full abstraction around Selenium.
Try it out for yourself and get some headless tests going.

2 comments:

  1. Great write-up.

    HTMLUnit 2.3 offers some utility APIs to handle the missing JavaScript files in the WebClient API. We find these all the time in our customer's sites. We added the following commands to TestMaker. These are in the 5.3 code base:


    loglevel = "off", "info", "debug" controls the Selenium HTMLUnit logging
    throwExceptionOnFailingStatusCode, when set to true HTMLUnit will ignore errors when the source of an external script cannot be loaded
    throwExceptionOnScriptError, when set to true HTMLUnit will ignore JavaScript execution errors


    -Frank

    Thanks Frank - now I have to do even less to improve the abstraction layer :)

  2. This looks promising. I'm at a project where we want to migrate lots of existing Selenium tests (in HTML format) to HTMLUnit.

    I'll use this code as a basis, and also add a HTML parser that can parse Selenium HTML tests and invoke the com.thoughtworks.selenium.Selenium API. In practice the instance will be be.denali.test.SeleniumHTMLUnit.

    For the HTML parsing I might base some of the code on http://code.google.com/p/selenium-htmlunit-adapter/

    In order to be able to commit my work, I might put this up on a Git repository - probably at GitHub. Send me an email if you want to discuss this further.

    Whoa - before doing that - consider that PushToTest already has a Selenese parser in it, you might be able to just use PushToTest out of the box - we don't write our Selenium tests in Selenese so I had to dig into the source to get the part I needed - you might find PushToTest an even easier solution.
