The Siren Call of Automated Browser Testing

Perhaps you’ve heard of the Greeks’ mythical Sirens: seductive, deadly creatures whose irresistible voice and song lured passing ships to certain doom on the rocky coasts. Something so appealing, yet fatal to pursue. Whenever I hear mention of automated browser tests (ABTs), I can’t help but think of these creatures. The reasons for using ABTs are so appealing and seem so obvious, but it is extremely easy to get burned.

Don’t get me wrong, I am all for testing and automation; both are indispensable to making good software. The issue is that ABTs are rarely done well, and the downsides aren’t understood until they’ve been discovered the hard way.

Music to Management’s Ears

The reasons to use ABTs are plentiful and to management they seem like no-brainers.

In many automated browser testing platforms, tests can be written once and then run against any number of browsers. The team can finish developing the code and writing ABTs and then move on to the next effort. All along the ABTs can run daily without supervision, detecting when the latest Firefox/IE/Safari update breaks your site’s features. Then you can sleep soundly at night without having to spend a large amount of QA resources running mind-numbingly repetitive, manual regression tests each time a new browser update is released—which in the case of Firefox is every six weeks!

Be careful: the first time management learns about the concept of ABTs, they’ll think it’s the greatest thing since sliced bread. They will want to introduce ABTs into the process as soon as possible, which for new efforts means the moment UI work begins. Managers will remember ruefully how previous products were plagued by browser changes and hard-to-reproduce JavaScript bugs that only occur on that one customer’s machine. “Not this time!” they’ll think. Unfortunately, the advantages I have just outlined are half-truths at best, and oftentimes downright wishful thinking.

Good at Detecting Change, Bad at Dealing With It

Automated browser tests are very good at detecting changes in your UI. In fact, they’re so good that sometimes they’ll even throw a fit when practically nothing has changed at all (more on this in a bit). What ABTs aren’t good at is dealing with changes to your UI.

The problem lies in the fact that ABTs are tightly coupled to the UI’s HTML/DOM, and they range from fairly brittle to extremely brittle. The less brittle ABTs typically use some form of DOM traversal, like XPath or CSS-style selectors, which can tolerate cosmetic changes such as colors and minor layout tweaks, as long as those changes are CSS-only and don’t dramatically alter the DOM. The most brittle tests are the recorded point-and-click kind, where even slight UI changes can render large portions of the tests obsolete, since everything is spatially dependent. How resilient a test is to change is also largely driven by the skill and foresight of the person writing it.
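To make the brittleness concrete, here is a small sketch of my own (using only the JDK’s XML/XPath classes as a stand-in for a real browser-testing platform) showing how an absolute path breaks the moment the page gains a wrapper element, while an id-anchored selector survives:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical illustration: evaluate an XPath selector against a tiny page.
// An empty result means the selector no longer matches anything.
class SelectorBrittleness {
    static String find(String html, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Run `find` against two versions of the same page, where the second wraps the button in an extra `<div>`: the absolute path `/html/body/form/button` matches only the first version, while the id-anchored `//*[@id='submit']` matches both. The more your selectors anchor on stable attributes rather than position, the less work each UI change creates.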

If the test author has foresight, they can write the tests in a generic, reusable manner. This, of course, assumes they’re using a scripting-based approach rather than recorded point-and-click. It requires a skill set closer to a developer’s, which only some QA people possess. All too often, less developer-esque QA engineers spend weeks or months writing tests that become nearly worthless after a few major UI changes, because the tests were not written with change in mind and have to be completely redone. It is important to stress flexibility and reusability early on with the test authors.

Since ABTs don’t cope well with change, they don’t mix well with Agile. Software development is by its nature full of change. Between constantly evolving requirements and capricious product owners and designers, it’s a pretty safe bet that the way your UI looks today will be quite different a few sprints from now. Introducing automated browser tests early in the development cycle doesn’t mean that test writing will finish earlier, in lockstep with development. Rather, QA will just be wasting time writing automated tests that have to be rewritten again and again until the UI settles down. This point is rarely appreciated by management and can become a source of frustration when progress is slowed by constant fire-fighting over broken tests.

Clearly there are issues when UI changes are made. But remember how I told you that ABTs can throw a fit when seemingly nothing has changed? This is particularly the case in single-page JavaScript applications, specifically ones written in frameworks that have an unholy mix of logic and DOM (think AngularJS, et al.). Having behavior coupled with the DOM means that subtly changing the behavior or appearance of something via your framework may unexpectedly add, remove, or modify the DOM surrounding the component in question. Added an extra Angular directive to your page? Guess what? Now everything has an extra <div> tag around it and all your tests’ selectors have to be updated. It gets worse. Sometimes changing the browser type or even version can mean that these JavaScript frameworks output a different DOM. So the idea of writing tests using Firefox and then expecting them to run against IE11 may not always pan out. Soon, your tests will have special conditions for the different browser types scattered everywhere.

Man Versus Machine

Another problem with automated browser tests is that they’re not as valuable as you might think. Yes, you may have a test for every single feature in your web application, but your app isn’t used by robots; it’s used by people. The funny thing about people is that they can be rather unpredictable. They don’t steadily move the mouse in straight lines, and they can type very slowly or very quickly. They can get impatient and click the back button if something takes more than three seconds to load. They can hit submit five times instead of once. Representing all of these behaviors through automated test cases is impossible; there just isn’t enough time in the world to write that many tests. But a living, breathing tester can do all of those things effortlessly in no time at all.

The fallacy is that decision makers think they can leave everything to ABTs to tell them when the software is broken. Doing this will lead to embarrassment when, time and time again, subtle bugs are discovered by irate users while the automated tests run cleanly. With today’s modern web apps that are single-page, event-driven, and chock-full of JavaScript, there can be countless subtle bugs that only show themselves when things are clicked within a certain time window. Automated tests either click everything as fast as possible, or at some very controlled time interval. Even if you introduce some sort of fuzzy timing and randomness into your automated tests, you would have to run them infinitely many times to cover all of the scenarios that occur when real people are using the system.

The bottom line is that ABTs are good for a general smoke test, but they should never be the sole pillar upon which you gauge your software quality. It is still a good idea to have ABTs and to run them regularly, but nothing will replace due diligence and some manual tests done by a good old-fashioned human.

Ugly Surprises

Even if you manage to write your automated browser tests proactively, in a flexible, reusable manner, you’re still likely to run into a few unpleasant surprises along the way. You’ll be surprised to find just how long ABTs take to run. Running the full suite can take hours for more complicated applications, and then you multiply that by the number of browsers you need to support. So it is not inconceivable that in order to have your test results every morning at 8 AM, you may have to kick off your tests around midnight.

The time it takes to run the tests is often exacerbated by explicit delay statements in the test cases. Particularly for single-page, JavaScript-heavy applications, it is difficult to write tests that wait for a page element to reach the desired state, so test writers often write something along the lines of “Click Button X, wait 2 seconds, verify element Y is red.” Although such a change may take less than a second in practice, when the application runs through a testing platform, everything in the browser can be slowed down significantly. This is likely due to the way the testing platform hooks into the browser in order to control it; that control usually comes at the price of performance.
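The alternative to fixed sleeps is to poll for the desired state with a timeout. The helper below is my own generic sketch of the idea, not any particular testing platform’s API (real platforms such as Selenium offer explicit-wait mechanisms of their own):

```java
import java.util.function.BooleanSupplier;

// Sketch of a "wait for the desired state" helper: poll a condition until it
// holds or a timeout expires, instead of always sleeping a fixed interval.
class WaitUntil {
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true; // condition met early; no need to burn the full timeout
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore flag and stop waiting
                break;
            }
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }
}
```

A test written as “click Button X, then `waitFor(() -> elementYIsRed(), 2000, 50)`” finishes as soon as the condition holds, so the suite only pays the full two seconds when something is genuinely wrong.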

Another unpleasant surprise is the amount of hardware you may need. Most testing platforms only allow you to run one browser at a time, so to speed up the lengthy test runs you’ll have to run multiple machines at once. Browser tests can also be unexpectedly hard on CPU and memory, likely due to inefficient testing-platform code controlling the browser. Another performance issue comes from heavily exercising your web app in the same browser window for extended periods of time. Once again, single-page JavaScript web apps built in the popular flavor-of-the-week framework can occasionally be leaky and cause browsers to eat up memory, so it is wise to tell the testing platform to close and reopen the browser every so many test cases.

By the way, if you are running your test suite on multiple machines, you probably need a license for each machine depending on what testing platform you’ve purchased.

The last wart you’ll find is that the list of browsers supported by ABT platforms is not as long as it seems. Different test platforms may claim to support testing on things like Safari, but in reality they support Safari for Windows, which is by no stretch the same thing as testing Safari on a Mac! Most test suites are limited to Windows only, although this is likely to become less common in the near future. Typically, these testing platforms use DLLs or plug-ins of some sort to manipulate the different browser types during test runs. This means that when a new browser version is released, you may have to wait for the testing platform vendor to release the necessary DLL/plug-in before you can start running automated tests on the new browser. Depending on your vendor, this may be a significant delay.

It’s Not All Bad

I’ve beaten up quite a bit on automated browser testing, but it’s not all bad. If tests are written in a more developer-esque manner—by that I mean tests are written to be more flexible and reusable—and if automated browser testing isn’t introduced until the UI has solidified quite a bit, then ABTs can be useful in identifying changes to your application.

Since they are automatic, they can and should be run ad nauseam. To get around potentially long test run times and to reduce the amount of tests that need to be updated when the UI changes, ABTs are best used in proper moderation. Rather than testing every single minutia with ABTs, it might be wiser to only cover the core functionality of the application. This way, you can quickly run and re-run tests and have a lower cost of ownership in maintaining your set of browser tests when they need to be updated to reflect changes or new features in your application.

ABTs can have a positive impact when they are taken with a grain of salt. They are not the end-all-be-all. They cannot and should not be the only means for detecting bugs or changes in your software. They can be a useful tool used in conjunction with manual testing. In fact, they should alleviate manual testing, not replace it. The most repetitive and time-consuming manual steps should be the ones that get automated.

Keep these thoughts in mind next time you’re asked whether or not your team should start using automated browser tests. Or perhaps you’ve already integrated automated testing and are wondering how you’ve arrived at your current state. Hopefully you can glean some insight from this advice and improve your lot. Whenever someone mentions automated browser testing, remember the Sirens.

Single Responsibility Principle promotes greater reuse

Speaking to the design around a piece of code he had written, an aspiring programmer asked me “but how do I make it more generic?”

He had the right desire to reuse code, but the approach to keep adding parameters to his methods was having the opposite effect of promoting their reuse.

At times like this, it’s helpful to focus on the Single Responsibility Principle — the ‘S’ in SOLID object oriented design.

Case in question

Imagine a class that takes a List of text and replaces all occurrences of a given word with another.  Something like:

class TextReplacer {
    private String replace;
    private String with;

    public TextReplacer(String replace, String with) {
        this.replace = replace;
        this.with = with;
    }

    public List<String> replace(List<String> text) {
        List<String> results = new ArrayList<>(text.size());
        for (String textToCheck : text) {
            String result = textToCheck.equals(replace) ? with : textToCheck;
            results.add(result);
        }
        return results;
    }
}

The use of this class is pretty straightforward:

    TextReplacer replacer = new TextReplacer("foo", "fu");
    List<String> results = replacer.replace(Arrays.asList("foo", "bar", "foo"));
    assertEquals(Arrays.asList("fu", "bar", "fu"), results);

Getting more complicated

The discussion at work focused on an added requirement: we wanted a UUID added to the end of each line. (The real situation was more complex; it’s easier to understand with a simpler example.)

In this case the programmer wanted to update the existing class with the new code. This made sense in his mind: this class was where the text was already being transformed. Go to this spot, add the work needed, and you’re done.

This is how the resulting class looked:

class TextReplacer {
    private String replace;
    private String with;
    private boolean addUuid;

    public TextReplacer(String replace, String with, boolean addUuid) {
        this.replace = replace;
        this.with = with;
        this.addUuid = addUuid;
    }

    public List<String> replace(List<String> text) {
        List<String> results = new ArrayList<>(text.size());
        for (String textToCheck : text) {
            String result = textToCheck.equals(replace) ? with : textToCheck;
            results.add(result);
        }
        if (addUuid) {
            results.add(UUID.randomUUID().toString());
        }
        return results;
    }
}

A small change, yet the complexity has just been magnified. Additionally there’s a fundamental change being made to the class.

It started off as TextReplacer. Now it would be more accurately labeled TextReplacerAndUuidAppender. Any time there’s an ‘and’ in a class’s name, it’s usually a code smell. Rather than focusing on one task, the class is now trying to do two things, which is a good sign it should be broken into two separate classes.

Reasons against updating this class in place

It hurts reuse in a couple of ways.

For starters, it discourages a later developer from using the class if they have no need for appending a UUID to their text… even though replacing text is the one true requirement this class was originally created to solve.

Even if the class were updated with an overloaded constructor so that the UUID is only appended when a flag is true, there’s still the issue of its single responsibility being violated. There’s no reason for a class whose only concern is replacing text to have any concerns outside of replacing text.

Even if the add-on functionality were made optional, updating this class would risk introducing a NullPointerException in existing code that has already been working.

Perhaps the biggest impact, although hard to accurately measure, is how much reuse would be limited by the added confusion.

The best APIs are intuitive and easy to use.  Any later developer to use this code would have to take a step back and wonder why he’s being asked about UUIDs if all he wants is to replace some text.

Even if the developer reads the documentation or source for the class and understands the ins and outs, it’s time he shouldn’t have had to invest.  And in a more complicated scenario, unless the developer is confident he knows the ins and outs, he may rightly choose not to use the class at all due to the risk of unknown consequences.

Lastly, keeping classes simple and focused aids unit testing.  Every additional conditional that gets added doubles the number of edge cases that should rightly be checked.

Best bet

For our real-life scenario, the best option was to leave the existing class alone, make a new one to handle the new requirement, and then pass the data through the two classes in sequence. Like so:

/** Use original TextReplacer and add this class */
class UuidAdder {
    public List<String> add(List<String> text) {
        List<String> results = new ArrayList<>(text.size() + 1);
        results.addAll(text);
        results.add(UUID.randomUUID().toString());
        return results;
    }
}

/** The control flow changes like so */
    TextReplacer replacer = new TextReplacer("foo", "fu");
    UuidAdder adder = new UuidAdder();

    List<String> results = replacer.replace(Arrays.asList("foo", "bar", "foo"));
    results = adder.add(results);

    assertEquals(4, results.size());
    assertEquals(Arrays.asList("fu", "bar", "fu"), results.subList(0, 3));
    UUID.fromString(results.get(3));

This is a better design because the ‘decision making’ of what should be done occurs at the level of the program’s control flow.

Moreover, the system has now gained another highly reusable class that can be applied anywhere a UUID is needed, alongside the easily tested text-replacing class we already had. And there is no risk of introducing bugs in existing code: we only need to test the simple class we’ve added and the small change we made to the control-flow logic.
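One way to see the payoff: both classes now share the same shape, List&lt;String&gt; in and List&lt;String&gt; out, so they compose mechanically. The Pipeline class below is my own generalization of that idea, not code from the original change:

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Sketch: because TextReplacer.replace and UuidAdder.add both map
// List<String> -> List<String>, each can be treated as one step in a
// sequence, and the control flow just runs the steps in order.
class Pipeline {
    static List<String> run(List<String> input, List<UnaryOperator<List<String>>> steps) {
        List<String> current = input;
        for (UnaryOperator<List<String>> step : steps) {
            current = step.apply(current);
        }
        return current;
    }
}
```

Adding a third transformation later means writing one more small, focused class and adding one more step at the control-flow level, with no changes to the classes already in place.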

Builder pattern – a handy option

One of my favorite design patterns in Java is using the Builder Pattern in my DTOs (Data Transfer Objects).

Here we’re talking about POJOs (Plain Old Java Objects) whose main purpose is to transfer data.  They’re essentially a group of fields and getter/setters, and in web programming, they come into play frequently.   The Builder Pattern is appropriately used in many scenarios beyond this, but this is where I use it the most since it’s so easy and I enjoy the difference it makes in using these guys later.

Typical Standard POJO

Say we need a class to represent some data points on a monthly basis:

class MonthlyData {

    private String date;
    private double revenue;
    private double costs;

    public String getDate() {
        return date;
    }
    public void setDate(String date) {
        this.date = date;
    }

    public double getRevenue() {
        return revenue;
    }
    public void setRevenue(double revenue) {
        this.revenue = revenue;
    }

    public double getCosts() {
        return costs;
    }
    public void setCosts(double costs) {
        this.costs = costs;
    }
}

This is standard Java: private fields with public accessors/mutators. To create and populate a new instance we do:

    MonthlyData data = new MonthlyData();
    data.setCosts(1.02);
    data.setRevenue(12.45);
    data.setDate("2014-04-01");

Introducing the Builder Pattern

Nothing wrong with this, but if we update our setters to return the class itself instead of void, we gain the option of using the class another way.

With the builder pattern, the only change is to the return type of the ‘setter methods’, which now return this.

class MonthlyData {

    private String date;
    private double revenue;
    private double costs;

    public String getDate() {
        return date;
    }
    public MonthlyData setDate(String date) {
        this.date = date;
        return this;
    }

    public double getRevenue() {
        return revenue;
    }
    public MonthlyData setRevenue(double revenue) {
        this.revenue = revenue;
        return this;
    }

    public double getCosts() {
        return costs;
    }
    public MonthlyData setCosts(double costs) {
        this.costs = costs;
        return this;
    }
}

Now with this one change in place, we can construct a new MonthlyData object using the same code as earlier, or we can chain the calls on the same line:

  new MonthlyData().setRevenue(12.7).setDate("2014-04-01");

While this features multiple method calls on the same line (which some people argue against), it’s effectively no different than creating a constructor for the class that takes these same parameters as arguments. However, unlike an overloaded constructor, we don’t have to define a bunch of constructors, deciding in advance which fields should be optional. We’re just using the same get/set methods we’d have to write anyway.

Like I said, this is especially handy for DTO-type classes, particularly those with optional fields, and to ease testing.

More Advanced Use

Here’s a Java class I use frequently to return a single value serialized to JSON (using Jackson) from a REST service:

public class ValueDto<T> implements Serializable {
    private T value;

    public T getValue() {
        return value;
    }

    public ValueDto<T> setValue(T value) {
        this.value = value;
        return this;
    }
}

Then in any REST service where I just want to return a single value as JSON, I can do:

    return new ValueDto<Long>().setValue(1234L);

which ends up getting serialized as:

{
  "value":1234
}

And thanks to the Java generics, can also be used with Strings:

    return new ValueDto<String>().setValue("Andrew");

If you want to get real fancy, the same Builder Pattern is also a basic building block for crafting fluent APIs.  Here we can look at how Mockito uses it for setting behavior on mocks:

    when(mock.getBar().getName()).thenReturn("deep");

It’s the same pattern in use: each of those methods returns the object being modified, in such a way as to make the code extremely readable.
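As a toy illustration of the same idea (all names here are hypothetical, not Mockito’s internals), a fluent API is just methods that return the object so calls chain into something that reads like a sentence:

```java
// Toy fluent API: each setter-like method returns the builder itself, and
// build() produces the final value. Purely illustrative names.
class Greeting {
    private String salutation = "Hello";
    private String name = "world";

    Greeting salutation(String s) { this.salutation = s; return this; }
    Greeting name(String n) { this.name = n; return this; }

    String build() { return salutation + ", " + name + "!"; }
}
```

With this in place, `new Greeting().salutation("Hi").name("Andrew").build()` produces "Hi, Andrew!", and the call site states its intent in one readable line.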

Takeaways

The great thing about this pattern is that it’s relatively free to implement: instead of returning void, return the appropriate object. For classes used frequently, especially those with optional fields and those reused in tests, it can make a big difference later.