Category Archive: Test Automation Strategies

Accept their flaky nature

Automated UI tests are flaky by nature. They are not unit tests, which are stable by nature. If you try to fix a UI test that fails 2 times out of 10 runs, you'll find yourself in Heisenbug hell. Even if you throw man-years of debugging at those flaky test cases, you can only reduce the flakiness of your test suite, you can't eliminate it. If even Google's 'world class engineers' can't beat that beast, why do we think we can? You can keep playing that game, but from this day on, I am out!
What does one FAILed UI test tell us? Nothing! If a UI test PASSed only once out of a hundred runs (given the same version of the AUT), the test case is PASSed! Because it practically can't PASS even once if there is a functional bug in its way. We don't know why the other 99 runs FAILed. Perhaps the server of the test environment was under varying load (which, BTW, also lies in a test server's nature), perhaps UI element x is sometimes faster than UI element y, perhaps the test driver itself is flaky, or perhaps the devil's in the code. Even if the root cause really is a race condition in the AUT's code: you won't convince the dev to spend time on it with your test runs telling him 'sometimes it works, sometimes not', as that is no reproducible basis for his debugging.
CI tools have to adapt: re-run is the magic word. If an automated UI test case FAILs in the nightly test run, rerun it. Repeat that till dawn. If it doesn't PASS even once in that night, then, and only then, it has really FAILed and you have to analyze it in the morning.
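As a minimal sketch of that rerun loop in Groovy (the language of the Jenkins snippets further down), assuming the test case can be invoked as a closure that throws on failure; the rerunUntilPass helper is my own, not an existing API:

[sourcecode language="groovy"]
// Rerun a test until it PASSes once or the attempts are exhausted.
// 'test' stands for any callable test case that throws on failure.
boolean rerunUntilPass(int maxAttempts, Closure test) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            test()
            println "PASSed on attempt ${attempt}"
            return true
        } catch (Throwable t) {
            println "FAILed attempt ${attempt}: ${t.message}"
        }
    }
    return false // really FAILed: analyze it in the morning
}

// Usage sketch with a deliberately flaky "test case" (fails 9 of 10 runs):
def rnd = new Random()
assert rerunUntilPass(50) {
    if (rnd.nextInt(10) != 0) throw new RuntimeException("flaky environment")
}
[/sourcecode]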
These thoughts are nothing less than a paradigm shift in UI test automation. In the aseptic world of quality assurance you were taught that tests have to be stable, accurate and repeatable to be reliable. Now you have to convince yourself that a FAIL is not an alert. It's like the change of mind you have to make when coming from classical logic to fuzzy logic. Reality is fuzzy, UI tests are flaky.
Honor to my great test automation colleagues @ GfK.

German Testing Day 2013

Some crumbs of that day:
– Great keynote by Michael Palotas, Head of Productivity & Test Engineering Europe.
– Facebook has no manual testers at all. This is what the testing world may look like in 10 years.
– A tester in a scrum team may be called 'embedded tester'.
– Testers probably have the best overview of the product compared to the rest of the company.
– eBay (Francois Reynaud) has developed ios-driver, Selendroid (which is to some extent a base of Appium) and Selenium Grid -> Selenium isn't just a project of individual geeks, it's also driven by the big internet companies (Google, eBay, Facebook, …).
– eBay Europe uses almost no test specifications, test plans, test concepts (= secondary test artifacts) or closed-source tools.
– Selenium IDE has chased people away (due to its maintenance nightmare!).
– There are more than 2000 different Android devices – happy testing!
– The main challenge with crowdtesting is the aggregation of the feedback/findings.
– Keeping the specification up to date is harder in JBehave (et al.) than in FitNesse.
– Selenium is also used in game testing – think of browser games!

Catching Heisenbugs in Test Automation

“Ah, but I may as well try and catch the wind.” (Donovan)
GUI-based test automation (hopefully done with Selenium WebDriver) is programming. No, it's even harder than common programming, because you have to cope with insane effects. Even if your test suite works fine locally, and even if it works for a while on the server, that's no guarantee that it works reliably. That's because the AUT is a living thing. Everyone who has tried to automate tests for a few weeks knows what I mean. Sometimes the performance of the AUT's server is bad and fails your beautiful test cases for no reason. Sometimes the AUT's developers feel that they have to change the IDs, and bang! Sometimes heaven decides to change something in the layout and your f*** test case waits for a f*** link that isn't visible anymore (without scrolling) because the f*** floating menu decided to shift 50 pixels. Sometimes this, sometimes that. OK, one could live with it: being called a bungler, analyzing, fixing, and getting more stable with every new release. But things aren't that "easy": you get a ticket from the defect manager saying that your test case works incorrectly (proven with a screenshot). You run the test suite again and again without being able to even reproduce the bug. This type of bug is called a Heisenbug, and unfortunately that's not a rare case in test automation: a bug in your test suite that appears in, let's say, 7 of 100 runs and happens non-deterministically. Without being able to reproduce it, you of course can't verify whether any of your fixes works. Dead end!
But there is a way to "reproduce" the bug: use the bulk! Run your test suite 100 times and count the number of appearances of that bug (in the above case: 7). Try a fix in your code and rerun the test suite 100 times. If the bug appears 0 times, you've most probably managed to fix it: with a 7% failure rate, the chance of 100 clean runs purely by luck is about 0.93^100 ≈ 0.07%.
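A minimal local sketch of that bulk idea, assuming your suite can be invoked as a closure that throws on failure (the countFailures helper and the placeholder invocation are mine):

[sourcecode language="groovy"]
// Count how often the Heisenbug (or any other failure) strikes in N runs.
int countFailures(int runs, Closure runSuite) {
    int failures = 0
    runs.times {
        try {
            runSuite()
        } catch (Throwable t) {
            failures++ // one more FAILed run
        }
    }
    return failures
}

def before = countFailures(100) { /* invoke your "RottenTest" suite here */ }
println "Bug appeared ${before} of 100 times"
// Apply your fix, count again: 0 of 100 means you've caught the wind.
[/sourcecode]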
You can implement the bulk in various ways – my favorite is Jenkins:
0.) Given you're already running your nice test suite in a Jenkins job and called it "RottenTest"
1.) Create 100 jobs and run them in parallel.
Execute in the Jenkins script console:

[sourcecode language="groovy"]
def jobName = "RottenTest"
def job = Jenkins.instance.getItem(jobName) // get a reference to the job containing the Heisenbug
def i = 1
while (i < 101) {
    def newJobName = "CatchingTheWind" + jobName + i
    def newJob = Hudson.instance.copy(job, newJobName) // create copies of the job to execute in parallel for the shortest possible total execution time
    newJob.scheduleBuild(new hudson.model.Cause.UserIdCause()) // start the new job
    i++
}
[/sourcecode]

2.) Create a list view to directly compare the results of your 100 runs. Filter it with the following regex:

[sourcecode language="text"]
CatchingTheWind.*
[/sourcecode]

3.) Delete the 100 jobs after a successful bugfix:

[sourcecode language="groovy"]
def jobName = "RottenTest"
def i = 1
while (i < 101) {
    def newJobName = "CatchingTheWind" + jobName + i
    def newJob = Jenkins.instance.getItem(newJobName)
    newJob.delete() // remove the temporary copy
    i++
}
[/sourcecode]

Thanks to Karen, the unknown hero, for helping me with the quirks of the Jenkins API.
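If you'd rather read off the count in the script console than in the list view, a small sketch (assuming the job names from step 1 and that each copy has finished at least one build) can tally the FAILs:

[sourcecode language="groovy"]
// Count how many of the 100 copies FAILed their last completed build.
def jobName = "RottenTest"
def failed = 0
for (i in 1..100) {
    def job = Jenkins.instance.getItem("CatchingTheWind" + jobName + i)
    def build = job?.lastCompletedBuild // null if the copy hasn't run yet
    if (build?.result == hudson.model.Result.FAILURE) {
        failed++
    }
}
println "Bug appeared in ${failed} of 100 runs"
[/sourcecode]

If your job marks test failures as UNSTABLE rather than FAILURE, count hudson.model.Result.UNSTABLE the same way.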
BTW:
Run the complete test suite, because you can't anticipate all dependencies of your buggy test case.
The night is your friend.