Link here

Link here

Using HtmlUnit on .NET for Headless Browser Automation

BDD, Testing, UI 15 Comments »

If you subscribe to this blog, you may have noticed that I’ve been writing about test automation methods a lot lately. You could even think of it as a series covering different technical approaches:

The reason I keep writing about this is that I still think it’s very much an unsolved problem. We all want to deliver more reliable software, we want better ways to design functionality and verify implementation… but we don’t want to get so caught up in the beaurocracy of test suite maintenance that it consumes all our time and obliterates productivity.

Yet another approach

Rails developers (and to a lesser extent Java web developers) commonly use yet another test automation technique: hosting the app in a real web server, and accessing it through a fast, invisible, simulated web browser rather than a real browser. This is known as headless browser automation.

How is this beneficial?

  • It’s faster than driving a real browser. The simulated browser is just a native code library, so it’s very quick to launch and shut down, doesn’t require interprocess communication with your test suite, and doesn’t waste time physically drawing things on your screen, opening and closing pop-up windows, etc. Nonetheless, the simulated browser is full-featured: it exposes the usual HTML DOM API, runs JavaScript, evaluates CSS rules, and so on, so it’s still an effective way to specify rich client-side behaviours.
  • It’s more reliable than a real browser. Real browsers are complicated. For example, if you launch and shut down Firefox many times in rapid succession, occasionally it will fail to launch because a previous instance is still locking some file. Simulated browsers are totally independent, so don’t suffer these kinds of weirdness.
  • You can get more low-level control if you want it. For example, the simulated browser can easily offer an API to alter the HTTP headers it sends, or let you get or set the contents of its cache. A real browser wouldn’t usually make this easy.

And what drawbacks might you expect?

  • It can be harder to debug. Because you can’t physically see the browser on your desktop, it’s not as obvious what’s happening if a test fails (or passes) unexpectedly. Your debugger’s “immediate mode” will let you call the browser’s API to figure things out, but it can be a longer investigative process.
  • There’s no absolute guarantee that it faithfully replicates a real-world browser. For example, HtmlUnit can simulate multiple versions of Firefox, Internet Explorer, and Netscape, but it may not simulate every single quirk.

About HtmlUnit

HtmlUnit is a headless browser automation library for Java. It’s very well-developed and mature, as you can see from its extensive API. (Need to configure whether the user has JavaScript on or off? No problem.) It’s also the same technology that underlies Celerity, a Ruby library that exposes the Watir API but runs faster.

Unfortunately, in .NET world, we don’t have any good headless browser automation libraries like HtmlUnit that I know of. But don’t let that stop you! We do have IKVM – a near-magic way of running Java code on .NET (either by hosting a JVM on the CLR at runtime, or by a one-time conversion process that generates a native .NET assembly directly from the Java bytecode). So, why not use HtmlUnit itself?

Converting HtmlUnit to .NET

It’s surprisingly easy to get HtmlUnit, a Java library, converted into a native .NET assembly (no Java Virtual Machine needed!) using IKVM.

First, download HtmlUnit (as a compiled JAR file) from SourceForge (I’m using version 2.7), and extract all its files from the ZIP archive.

Second, download IKVM binaries from ikvm.net/SourceForge (I’m using version 0.42.0.3), and again extract all its files from the ZIP archive.

Open a command prompt, ensure you’ve added IKVM’s /bin folder to your PATH variable, change directory to HtmlUnit’s /lib folder, and then run ikvmc to convert the Java bytecode of all of the HtmlUnit JAR files into .NET bytecode. I ran this command:

ikvmc -out:htmlunit-2.7.dll *.jar

This produced a lot of warnings and a large (~10Mb) .NET assembly called htmlunit-2.7.dll. If you don’t want to bother with this process, you can download my sample project (linked below) which contains the .NET assembly I generated.

Using HtmlUnit from .NET

Once you’ve got htmlunit-2.7.dll as a .NET assembly, you can use it with any .NET unit testing library such as NUnit or XUnit. You will also need to reference a few of the IKVM runtime assemblies (I narrowed it down to six additional IKVM assemblies that appear to be needed – these are all included in IKVM’s /bin folder).

As a trivial example, here’s how you can use HtmlUnit with NUnit to load the Google homepage:

[TestFixture]
public class GoogleTests
{
    private WebClient webClient;
    [SetUp] public void Setup() { webClient = new WebClient(); }
 
    [Test]
    public void Can_Load_Google_Homepage()
    {
        var page = (HtmlPage)webClient.getPage("http://www.google.com");
        Assert.AreEqual("Google", page.getTitleText());
    }
}

Slightly more interesting, here’s how you can fill out a form (again, Google Search), click a button, and inspect the results:

[Test]
public void Google_Search_For_AspNetMvc_Yields_Link_To_Codeplex()
{
    var searchPage = (HtmlPage)webClient.getPage("http://www.google.com");
    ((HtmlInput)searchPage.getElementByName("q")).setValueAttribute("asp.net mvc");
    var resultsPage = (HtmlPage)searchPage.getElementByName("btnG").click();
 
    var linksToCodeplex = from tag in resultsPage.getElementsByTagName("a").toArray().Cast<HtmlAnchor>()
                          let href = tag.getHrefAttribute()
                          where href.StartsWith("http://")
                          let uri = new Uri(href)
                          where uri.Host.ToLower().EndsWith("codeplex.com")
                          select uri;
    CollectionAssert.IsNotEmpty(linksToCodeplex);
}

As you can see, the API is full of Java idioms and feels a bit odd to a .NET developer. It would be great if someone decided to create a .NET wrapper library to expose a nicer API, .NET-style PascalCase, better use of IEnumerable and generics so LINQ queries were simpler, etc.

The other side of it is that HtmlUnit is very powerful. You can trivially scan the DOM with XPath, search for child elements, invoke events, much more easily than with WatiN.

To show that HtmlUnit has great JavaScript and Ajax support, here’s an example of automating a jQuery AutoComplete plugin to check its suggestions:

[Test]
public void jQuery_Autocomplete_Lon_Suggests_London()
{
    // Arrange: Load the demo page
    var autocompleteDemoPage = (HtmlPage)webClient.getPage("http://jquery.bassistance.de/autocomplete/demo/");
 
    // Act: Type "lon" into the input box
    autocompleteDemoPage.getElementById("suggest1").type("lon");
    webClient.waitForBackgroundJavaScript(1000);
 
    // Assert: Suggestions should include "London"
    var suggestions = autocompleteDemoPage.getByXPath("//div[@class='ac_results']/ul/li").toArray().Cast<HtmlElement>().Select(x => x.asText());
    CollectionAssert.Contains(suggestions, "London");
}

In case you want to download all this code as a demo project you can run without needing IKVM or Java or anything weird, I’ve put it on GitHub.

Summary

This is experimental! I haven’t used this on any real project, though it was pretty effortless so far, so I’d definitely consider it. I don’t know how much faster HtmlUnit would be than a real browser, but it does bypass the trickiness of real browsers, which may be worth it alone. If I was going to use it seriously, I’d definitely make some .NET wrapper classes to hide the Java naming and idioms. HtmlUnit on .NET feels robust, but I haven’t pushed it hard.

kick it on DotNetKicks.com

Deleporter: Cross-Process Code Injection for ASP.NET

ASP.NET, BDD, Testing 5 Comments »

Deleporter is a little .NET library that teleports arbitrary delegates into an ASP.NET application in some other process (e.g., hosted in IIS) and runs them there. At the moment, it’s still pretty experimental, but I’ve found it useful so far.

Why would you want this? It’s mainly useful when you’re integration testing. Normally, integration tests can’t directly interact with your application’s code because it’s in a separate process (and possibly running on another machine). So, unlike unit tests, integration tests often struggle to test different configurations and have no option to use mocks.

Deleporter gets around that limitation. It lets you delve into a remote ASP.NET application’s internals without any special cooperation from the remote app (it just needs to include an extra IHttpModule in its config), and then you can do any of the following:

  • Cross-process mocking, by combining it with any mocking tool. For example, you could inject a temporary mock database or simulate the passing of time (e.g., if your integration tests want to specify what happens after 30 days or whatever)
  • Test different configurations, by writing to static properties in the remote ASP.NET appdomain or using the ConfigurationManager API to edit its <appSettings> entries.
  • Run teardown or cleanup logic such as flushing caches. For example,  recently I needed to restore a SQL database to a known state after each test in the suite. The trouble was that ASP.NET connection pool was still holding open connections on the old database, causing connection errors. I resolved this easily by using Deleporter to issue a SqlConnection.ClearAllPools() command in the remote appdomain – the ASP.NET app under test didn’t need to know anything about it.

Rails developers can already do this either by hosting their app in-process (e.g., with WebRat) or using a cross-process mocking tool. I’ve already talked about hosting ASP.NET (MVC) apps in-process with integration tests (the starting point for this code); now I wanted to get the same benefits while hosting my app on a real web server.

This library is built on regular .NET remoting. But unlike regular remoting, it run arbitrary delegates, not just methods specifically exposed by the remoting service. If the code for your delegate isn’t already loaded into the ASP.NET appdomain, Deleporter will serialize the code and send it across. This is really essential for being useful from an integration test suite, because the test code wouldn’t normally be in the real app.

Enough talk! Code now.

To use it, first download the binary or compile the source to get Deleporter.dll. Reference it from both your integration test project and the ASP.NET MVC/WebForms app you’re testing.

Next, to get your ASP.NET app to listen for commands, edit its Web.config file to reference this extra HTTP module:

<configuration>
  <system.web>
    <httpModules>
      <!-- If you're using Visual Studio's built-in web server, or IIS 5 or 6, add this: -->
      <add name="DeleporterServerModule" type="DeleporterCore.Server.DeleporterServerModule, Deleporter" />
    </httpModules>
  </system.web>
  <system.webServer>
    <modules>
      <!-- If you're using IIS 7+, add this: -->
      <add name="DeleporterServerModule" type="DeleporterCore.Server.DeleporterServerModule, Deleporter" />
    </modules>
  </system.webServer>
</configuration>

Now, when you next start up your ASP.NET app, it will listen on TCP port 38473 - a randomly-chosen default - for commands from your integration test suite.

In your integration test code, you can now run arbitrary delegates in the remote ASP.NET app, sending both data and code into it, and optionally pulling data back. As a trivial example, you can pull back the server’s time local time:

[Test]
public void ServerClockHasCorrectYear()
{
    DateTime serverTime = Deleporter.Run(() => DateTime.Now);
    Assert.AreEqual(DateTime.Now.Year, serverTime.Year, "The server's clock is way off");
}

You can use multi-statement lambdas, too, even if they capture locals from the containing method. Deleporter will send the captured locals into the remote app (as long as they’re serializable), and if the remote app edits them, it will pull back the new values and update your locals. In other words, anonymous methods and multi-statement lambdas work with captured local variables just as if all the code was running locally. Example:

[Test]
public void SendingAndUpdatingCapturedLocals()
{
    DateTime serverTime = default(DateTime);
    string fileToLocate = "~/web.config";
    string mappedPhysicalPath = null;
 
    Deleporter.Run(() => {
        // This code, which both reads and writes locals from the outer scope,
        // runs in the remote ASP.NET application
        serverTime = DateTime.Now;
        mappedPhysicalPath = HostingEnvironment.MapPath(fileToLocate);
    });
 
    Assert.AreEqual(DateTime.Now.Year, serverTime.Year);
    Assert.IsTrue(File.Exists(mappedPhysicalPath)); // Of course, this will only be true if the ASP.NET app runs on the same machine as the integration test code
}

More realistic applications

So far I’ve only shown very artificial examples designed to illustrate the syntax. To give you a more realistic idea of how you could apply it, I can offer two sample applications.

  • A very simple ASP.NET MVC app called WhatTimeIsIt whose sole behaviour varies according to the date. Like all good ASP.NET MVC applications, it uses the Dependency Injection pattern (in this example, with Ninject) to get the current date & time from an IDateProvider – the default implementation of which just returns DateTime.Now.
    It has simple integration tests that use Deleporter and Moq to inject a mock IDateProvider across the process boundary while the app is running, and then verifies some behaviour that only happens during specific dates. It also demonstrates an easy way of tidying up after the tests run (restoring the original IDateProvider) automatically without cluttering the test code with teardown logic.
    Here’s how the cross-process mocking works:
    // Inject a mock IDateProvider, setting the clock back to 1975
    var dateToSimulate = new DateTime(1975, 1, 1);
    Deleporter.Run(() => {
        var mockDateProvider = new Mock<IDateProvider>();
        mockDateProvider.Setup(x => x.CurrentDate).Returns(dateToSimulate);
        NinjectControllerFactoryUtils.TemporarilyReplaceBinding(mockDateProvider.Object);
    });
  • An enhanced version of the SpecFlow BDD-style integration test suite from my previous blog post. It now uses Deleporter to apply a mock repository to verify certain specifications.
    For example, one of the Gherkin files now specifies this scenario:
    Scenario: Most recent entries are displayed first
    	Given we have the following existing entries
    		| Name      | Comment      | Posted date       |
    		| Mr. A     | I like A     | 2008-10-01 09:20  |
    		| Mrs. B    | I like B     | 2010-03-05 02:15  |
    		| Dr. C     | I like C     | 2010-02-20 12:21  |
          And I am on the guestbook page
        Then the guestbook entries includes the following, in this order
    		| Name      | Comment      | Posted date       |
    		| Mrs. B    | I like B     | 2010-03-05 02:15  |
    		| Dr. C     | I like C     | 2010-02-20 12:21  |
    		| Mr. A     | I like A     | 2008-10-01 09:20  |

Gotcha: What you must remember to avoid confusing problems

One behaviour might surprise you at first. It’s certainly caught me out plenty of times already. Each time you edit and recompile your integration test code, you must also recycle your ASP.NET appdomain (e.g., by resaving its Web.config file, or the nuclear option - running iisreset) otherwise the old code will still be loaded into it. That’s because, as far as I’m aware, there’s no way to unload an assembly from a .NET appdomain. Once your delegates have been transferred over, they’re staying there and can’t automatically be replaced by newer versions.

This isn’t a problem if you’re running an integration suite through your continuous integration (CI) system – the app code would be recompiled and hence reset between test suite runs. It’s only something to watch out for during local development. In the Deleporter.Test.Client project I’ve shown a crude but working way of automatically recycling the ASP.NET appdomain when the test suite starts running; you may be able to work out a better way of automating this if you need one.

Important security note

You should not enable DeleporterServerModule on a production web site. It’s only supposed to be used in dev and QA environments, because it’s literally an invitation to upload and execute arbitrary code.

To reduce the chance of a mistake, DeleporterServerModule will refuse to run if your web site was compiled in release mode – it will demand that you remove the IHttpModule from your Web.config file. Technically you should be OK if your firewall doesn’t accept inbound connections on port 38473 (or whatever port you configure it to listen on) anyway. Obviously, no warranties, you use it at your own risk, etc.

License: Ms-Pl

kick it on DotNetKicks.com

Site Meter