The Fragility of Record/Replay Tests

The convenience of using record/replay web testing tools is often defeated by the brittleness of the tests they produce.  Mouna Hammoudi’s paper “Why Do Record/Replay Tests of Web Applications Break?” explores why these tests break so easily as the application under test changes.  Toffee overcomes many of these issues with record/replay testing.

Hammoudi used a record/replay tool called Selenium IDE, which allows a user to record themselves using the application under test and replay their actions at a later time to test functionality.  After creating a test for each of 300 releases across 5 web applications, Hammoudi found that roughly 73% of all breaks in her record/replay tests were due to locators.

Hammoudi explains that locator breaks occur when either an HTML element’s attributes or the HTML structure has changed.  In our experience, tests that use structure-based locators such as XPath are especially vulnerable: as web applications are updated, the structure changes regularly, and even small changes can invalidate structural locators.

Toffee offers two solutions that make Toffee tests more robust and less expensive to maintain than record/replay tests.  First, Toffee makes it easy to write test steps using attribute-based locators like id, name, label, and others.  These locators tend to be more stable because they are not affected by common structural changes; they break only when a developer intentionally changes the attribute value.
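The difference between the two kinds of locator can be sketched in a few lines of Python.  This is only an illustration using the standard library's limited XPath support, not Selenium or Toffee; the two page snippets and the toolbar wrapper are hypothetical, and the button id is borrowed from the alias example in this post.

```python
import xml.etree.ElementTree as ET

# Two hypothetical releases of the same page.
# Release 2 wraps the form in an extra <div>; nothing else changes.
RELEASE_1 = """
<html><body>
  <form>
    <button id="button-logout-user">Log out</button>
  </form>
</body></html>
"""

RELEASE_2 = """
<html><body>
  <div class="toolbar">
    <form>
      <button id="button-logout-user">Log out</button>
    </form>
  </div>
</body></html>
"""

STRUCTURAL = "./body/form/button"                  # depends on the exact nesting
ATTRIBUTE = ".//button[@id='button-logout-user']"  # depends only on the id


def locate(page_source, locator):
    """Return the first matching element, or None if the locator no longer resolves."""
    root = ET.fromstring(page_source.strip())
    return root.find(locator)


for name, page in (("release 1", RELEASE_1), ("release 2", RELEASE_2)):
    structural_ok = locate(page, STRUCTURAL) is not None
    attribute_ok = locate(page, ATTRIBUTE) is not None
    print(f"{name}: structural={structural_ok} attribute={attribute_ok}")
```

The structural locator stops resolving as soon as the wrapper appears, while the attribute locator keeps working until someone deliberately renames the id.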

Second, Toffee provides users with the ability to create aliases.  Aliases are custom commands that are defined by the tester and refer to other Toffee commands.  For example, the command “click button with id button-logout-user” can be aliased as “logout”.  Then the “logout” alias can be used throughout your tests instead of the original command.  This minimizes the necessary repair after the button’s id changes because only the “logout” alias must be updated.
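To see why a single alias limits the repair work, here is a minimal sketch of the idea in Python.  It is not how Toffee implements aliases; the alias table, the `expand_all` helper, the other step texts, and the new id `button-sign-out` are all hypothetical and chosen only for illustration.

```python
# Hypothetical alias table: alias name -> the command it stands for.
aliases = {
    "logout": "click button with id button-logout-user",
}

# Two hypothetical tests that both end by logging out.
# The non-alias steps are placeholders, not real Toffee commands.
smoke_test = ["open login page", "logout"]
checkout_test = ["add item to cart", "logout"]


def expand_all(steps, alias_table):
    """Replace each aliased step with its underlying command; leave other steps alone."""
    return [alias_table.get(step, step) for step in steps]


# A later release renames the button id.
# One edit to the alias repairs every test that uses it.
aliases["logout"] = "click button with id button-sign-out"

for test in (smoke_test, checkout_test):
    print(expand_all(test, aliases))
```

Neither test script had to be touched; without the alias, every occurrence of the original command would need the same edit.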

The drawback of record/replay tests is their brittleness.  Toffee tests are more robust and reusable without sacrificing the accessibility of record/replay tools.


For more information about Toffee, visit, or feel free to contact us with questions at

Manual Testing: Toffee is about Orchestration not only Automation

We created Toffee not simply to automate QA testing but to orchestrate both manual and automated tests.  Manual tests are an important part of QA testing because it is either impractical or impossible to automate everything.  See our prior post on the virtues of manual testing: Two Cheers for Manual Testing

Creating manual tests for Toffee is as straightforward as writing traditional manual tests.  Simply mark the test step as manual and then write the instructions for the tester to perform.  Toffee will know it is a manual step and will take care of the rest.

Manual tests are typically carried out using stacks of paper or lengthy text documents with instructions and a space to fill in the result.  Testers are stymied by going back and forth between the application under test and the test documentation, squandering time and becoming susceptible to mistakes every time they lose their place in the instructions.

Toffee eliminates these problems. Testers no longer have to deal with shuffling paper or scrolling through online documents.  Toffee provides test step instructions, and a place to record the results for only the current step, in a “heads up” display alongside the application under test.  This reduces human errors and is less taxing on testers.

As part of testing orchestration, Toffee allows manual steps to be integrated directly into otherwise automated tests.  Toffee executes all of the automated steps until it comes to a manual step; then it pauses and requests the manual action from the tester.  After the tester completes the manual step and records the result, Toffee continues with the automated steps.  This way, if the majority of a test can be automated, a whole separate test does not need to be made just for the manual steps.  In addition, the results from the manual and automated steps are all in the same report generated by Toffee.
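The control flow described above can be sketched as a small runner.  This is an assumption-laden illustration rather than Toffee's implementation: the `Step`, `Result`, `run_test`, and `console_prompt` names are hypothetical, and the demo at the bottom uses a stand-in tester that always passes so the example runs unattended.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    description: str
    action: Optional[Callable[[], bool]] = None  # None marks the step as manual


@dataclass
class Result:
    description: str
    kind: str      # "automated" or "manual"
    passed: bool


def run_test(steps: List[Step], ask_tester: Callable[[str], bool]) -> List[Result]:
    """Run steps in order, pausing for the tester whenever a manual step is reached."""
    report = []
    for step in steps:
        if step.action is None:
            # Manual step: execution blocks here until the tester records a result.
            passed = ask_tester(step.description)
            report.append(Result(step.description, "manual", passed))
        else:
            report.append(Result(step.description, "automated", step.action()))
    return report


def console_prompt(instructions: str) -> bool:
    """Show the instructions and wait for the tester to type a verdict."""
    answer = input(f"MANUAL STEP: {instructions}\nDid it pass? [y/n] ")
    return answer.strip().lower().startswith("y")


steps = [
    Step("log in", action=lambda: True),
    Step("confirm the printed receipt is legible"),  # manual step
    Step("log out", action=lambda: True),
]

# Stand-in tester for the demo; pass console_prompt instead to be asked interactively.
report = run_test(steps, ask_tester=lambda instructions: True)
for result in report:
    verdict = "PASS" if result.passed else "FAIL"
    print(f"{result.kind:9} {verdict}  {result.description}")
```

Because manual and automated results are appended to the same list, a single report covers the whole test.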

Manual tests are still a vital part of QA testing.  Toffee does not just accommodate manual testing; it makes manual testing more accurate and less of a headache for testers.


Automatic Screenshots: Evidence Gathering Made Easy

When working with clients and performing tests ourselves, we found that documenting tests with screenshot evidence is one of the most time-consuming and challenging aspects of testing, especially for manual tests.  With this in mind, we designed Toffee to make evidence gathering easy with automatic screenshot capture for both manual and automated tests.

After the simple command “enable automatic screenshot capture”, Toffee automatically takes a screenshot at every test step, regardless of whether the step is manual or automated.  Testers no longer have to remember to do it themselves or repeat steps if they forget.  By default the screenshot is taken within milliseconds of the test step being completed, but it can be delayed as needed to accommodate slower applications.  The screenshot delay can be modified at any point within the test, which gives testers the flexibility to ensure the appropriate evidence is captured with every screenshot.
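As a rough sketch of this behavior, the Python below captures evidence after each step with an adjustable delay and stores it next to the step it belongs to.  The `ScreenshotRecorder` class, its attribute names, and the step descriptions are hypothetical, and the `capture` callable is a placeholder for whatever actually grabs the screen; none of this is Toffee's API.

```python
import time
from typing import Callable, List, Tuple


class ScreenshotRecorder:
    """Captures a screenshot after each step and keeps it paired with that step."""

    def __init__(self, capture: Callable[[], bytes], delay_seconds: float = 0.0):
        self.capture = capture              # placeholder for a real screen grab
        self.delay_seconds = delay_seconds  # may be changed between steps
        self.enabled = False
        self.evidence: List[Tuple[str, bytes]] = []

    def after_step(self, description: str) -> None:
        """Call once a step has completed; does nothing until capture is enabled."""
        if not self.enabled:
            return
        time.sleep(self.delay_seconds)      # give slow pages time to finish rendering
        self.evidence.append((description, self.capture()))


recorder = ScreenshotRecorder(capture=lambda: b"<image bytes>")
recorder.after_step("open home page")        # not captured: capture is not enabled yet
recorder.enabled = True                      # plays the role of the enable command
recorder.after_step("log in")
recorder.delay_seconds = 0.05                # slower page ahead: wait before capturing
recorder.after_step("load quarterly report")

print([step for step, _ in recorder.evidence])
```

Keeping each screenshot in the same record as its step description is what removes the need to match images to steps by hand afterwards.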

One of the main frustrations Toffee’s screenshot capture solves is keeping screenshots connected to the correct test step results.  For example, without Toffee a tester often has to paste screenshots for manual tests into a separate word processing tool and annotate them so others know which screenshot corresponds to which test step.  This method is tedious and prone to error.  Toffee stores the results and screenshots together and can easily generate a single report containing all of the results and evidence.

Toffee makes gathering screenshot evidence easy by automatically capturing screenshots after test steps, whether manual or automated, and keeping them all in one place with the test results.


Announcing Toffee

Toffee: Test Orchestration for the Enterprise

Since my last post on functional testing, KSM has been hard at work transforming the ideas from that post into a new product and service offering. Toffee (“Test Orchestration for the Enterprise”) allows QA professionals to build and execute automated tests in an interactive, online test environment, without requiring programming expertise. For those test cases that automation cannot easily reach, Toffee lets you include manual test steps in your scripts alongside automated ones. Automated screenshots capture your entire desktop, providing evidence for steps whether executed within or outside of the browser. Test results for both automated and manual tests are presented in a familiar step/expected result/actual result format, along with screenshot evidence.

Toffee started as a command-line based solution, which allowed us to focus on the syntax and scope of the Toffee command set. We used this first incarnation to test solutions we developed in house. The largest test suite achieved 100% automation with over 28,000 test steps, and completely replaced the Selenium tests we had written in Java.

On the trade show floor of the Society of Quality Assurance Annual Meeting last week, KSM previewed the next generation of Toffee, called Toffee Composer. Composer provides the same level of functionality as the initial version, but in a user-friendly web interface. Build your scripts, execute them, and store your results online, either in our cloud-based environment or in your own data center.

For more information about Toffee, visit We would be happy to schedule an online demonstration for you; just send us an email at

We’re Hiring

KSM Technology Partners LLC is seeking solutions-focused professionals who enjoy using technology to solve complex, open-ended business problems. A KSM Consultant develops custom application and integration software for our clients in the pharmaceutical and utilities industries. The ideal candidate is a multi-disciplinary professional with 2+ years of experience writing great code in multiple languages, eliciting and documenting requirements, and designing and executing thorough unit and functional tests. Our consultants work in small, cross-functional, hyper-productive teams that deliver business value quickly and with a minimum of overhead.

Our offices and many of our clients are located in the greater Philadelphia area, particularly in the Route 202 corridor in the western suburbs. Telecommuting is occasionally available for some engagements. We prefer local candidates.

This is a full-time position; we are not a body shop looking for subcontractors. We offer a competitive salary, flexible working hours, paid time off, healthcare and long-term disability coverage, and a 401(k) with a generous match and no-wait vesting.

We will not consider candidates presented by agencies or other third parties. Applicants must be able to work unrestricted in the U.S.

If you are interested, email us at careers [/at/] ksmpartners [\dot\] com.

RPN Calc Part 10 – Macros and the Intent of the Code

One of the key attributes I look for when writing and reviewing code is that code should express the intent of the developer more than the mechanism used to achieve that intent. In other words, code should read as much as possible as if it were a description of the end goal to be achieved. The mechanism used to achieve that goal is secondary.

Over the years, I’ve found this emphasis improves the quality of a system by making it easier to write correct code. By removing the distraction of the mechanism underneath the code, it’s easier for the author of that code to stay in the mindset of the business process they’re implementing. To see what I mean, consider how hard it would be to query a SQL database if every query was forced to specify the details of each table scan, index lookup, sort, join, and filter. The power of SQL is that it eliminates the mechanism of the query from consideration and lets a developer focus on the logic itself. The computer handles the details. Compilers do the same sort of thing for high-level languages: coding in Java means not worrying about register allocation, machine instruction ordering, or the details of free memory reclamation. In the short term, these abstractions make it easier to think about the problem I’m being paid to solve. Over a longer time scale, the increased distance between the intent and mechanism makes it easier to improve the performance or reliability of a system. Adding an index can transparently change a SQL query plan, and Java seamlessly made the jump from an interpreter to a compiler.
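The point about indexes is easy to see with SQLite from Python's standard library. In this small sketch the `users` table, its contents, and the index name are made up for illustration; the query text never changes, yet the plan SQLite reports for it does once the index exists. The exact wording of the plan varies between SQLite versions, so the example prints it rather than assuming it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

# The intent: find a user's name by email. Nothing here says how to find it.
QUERY = "SELECT name FROM users WHERE email = ?"
ARGS = ("user42@example.com",)


def plan(connection):
    """Return SQLite's description of how it intends to execute QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ARGS).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column holds the plan detail


plan_before = plan(conn)
result_before = conn.execute(QUERY, ARGS).fetchall()

# Change the mechanism only: the query itself is untouched.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

plan_after = plan(conn)
result_after = conn.execute(QUERY, ARGS).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("same result:", result_before == result_after)
```

The code that expresses the intent stays the same while the database is free to pick a better mechanism underneath it.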

One of the unique sources of power in the Lisp family of languages is a combination of features that makes it easier to build the abstractions necessary to elevate code from mechanism to intent. The combination of dynamic typing, higher order functions, good data structures, and macros can make it possible to develop abstractions that allow developers to focus more on what matters, the intent of the paying customer, and less on what doesn’t. In this article, I’ll talk about what that looks like for the calculator example and how Clojure brings the tools needed to focus on the intent of the code.

Continue Reading…

RPN Calc Part 9 – State and Commands in Clojure

In my last post, I started porting the RPN calculator example from Java to Clojure, moving a functional program into a functional language. In this post, I finish the work and show how the Clojure calculator models both state and calculator commands.

Continue Reading…

Two Cheers for Manual Testing (Functional Test Automation, part 3)

If you’ve never played it before, a hand of manual testing misery poker plays out something like this:

“It took six of us eight weeks to plow through a three and a half foot stack of system test scripts”

“That’s nothing.  Our site acceptance testing alone took fifteen of us three months for a six-foot stack.”

“But were yours double sided?”

“Erm, no”

“Then what took you so long?”

“Screenshots every step”

“Oh.  I fold.”

We automate functional testing for a reason: the alternative is tedious, resource-intensive, and expensive.  So why do test suites still comprise so many manual tests? Continue Reading…

RPN Calc Part 8 – Moving to Clojure

So far in this series, I’ve taken a basic calculator written in Java and transformed it from a command-oriented procedural design into a more functional style. In some ways, this has made for simpler code: calculator state is better encapsulated in value objects, and explicit control flow structures have been replaced with domain-specific higher order functions. Unfortunately, Java wasn’t designed to be a functional language, so the notation has become progressively more cumbersome and lengthy. 151 lines of idiomatic Java is now 327 lines of inner classes, custom iterators, and inverted control flow patterns. It should be difficult to get this kind of code through a serious Java code review.

Despite this difficulty, there is value in the functional design approach; what we need is a new notation. To show what I mean, this article switches gears and ports the latest version of the calculator from Java to Clojure. This reduces the size of the code from 327 lines down to a more reasonable-for-the-functionality 82. More importantly, the new notation opens up new opportunities for better expressiveness and further optimization. Building on the Clojure port, I’ll ultimately build out a version of the calculator that uses eval for legitimate purposes and compiles calculator macros so that they run almost as fast as code written directly in Java.

Continue Reading…

Functional Test Automation, Part 2: The Subject, the Standard, and the Evidence

In my last post I wrote that the reality of automated functional testing has so far failed to live up to my expectations. In this post I’ll define what I mean by functional testing. What follows might not be the definition you’re familiar with, and I don’t mean to suggest that this is the only valid definition. It is certainly influenced by the industries I work with, where:

  • The subject of functional testing is a black box
  • The standard of functional testing is the set of functional requirements
  • The evidence of functional testing formally links test cases to those requirements they test

Continue Reading…
