Warning: This post highlights the reality of mobile test automation. It isn’t all that pretty, but there are a lot of smart people working to make it better and there is room for hope. At least you will know you are not alone.
How many in the room have experience with mobile test automation? 60% of folks raise their hands. If you have had a successful mobile automation experience, keep your hands up. Everyone drops their hands. Just another poll of people attending a mobile development and testing conference. These are people whose companies have dedicated testers, with the budget to send them to conferences to dig deep into the world of testing. They have collectively tried every open source and commercial option, and they still aren’t seeing success.
I spoke with a leading testing vendor over coffee a few days ago. His company will automate your mobile app for a price. The company’s biggest problem is that customers quickly become unhappy with the results, cost and time. Mobile tests are more difficult to update and maintain, but clients don’t understand this, or don’t want to believe the costs are real. He often sells test maintenance at cost just to keep customers from leaving.
I just geeked out with someone who owns automation for a popular app. The results are typical: 4 engineers over one year. The resulting test automation provided false failures 70% of the time, and most importantly no product bugs were actually found by the automation. The team decided to abandon their test automation, with all the sunk costs, and fall back to crowd-sourced manual testing last month.
Everyone in mobile is somewhere along the same mobile automation hype curve. Some are even on their third iteration of the wave. What is going on? Why is this so much more painful in mobile than automating tests for the Web? Let’s walk along the hype curve…
As teams dive into mobile apps, they quickly discover testing is a good idea. With the web, when you found a bug, you could quickly roll back to the old version, or the team could push out a new version with a fix. You owned and controlled the web servers. For mobile apps, deploying through the App Store means it might take a week or two, sometimes longer, to get your fixes deployed. It also often takes days or months for most users to actually install the updated build. Every good app team starts testing before they release. They write up dozens, hundreds, even thousands of tests and execute them manually before each build is released to users. They do all this because they can’t risk their users being on a broken build for days on end.
All this manual testing costs a lot of time and money. Often test passes aren’t completed before a new build appears and needs testing. Testers become bleary-eyed the 100th time they try logging into the app and lose their motivation and mental creativity for more critical testing such as exploratory, negative and boundary testing.
Most teams eventually say: “We should just automate this app testing like we automated our web testing”. Seems reasonable enough, even smart. Have the machines just test everything and test it super quickly. At this moment there is a mix of anticipation of the ideal world where all tests are automated, but in the mobile world this is moderated by fear of the unknowns: new programming languages, new mobile test frameworks, and a lab full of devices.
The question of whether to automate becomes most acute after the app team releases a build or two with a bad bug in it and realizes they have to spend a day or more manually testing the build before they can deploy the new version to the App Store. Then, wait up to an additional week for approval.
It is at this moment that app teams have a mix of optimism and dread. About half those teams take the plunge and attempt to build out automation and enter the execution phase.
Engineers love to start from scratch and build a ‘version one’ of anything. Developers and testers love the low pressure and technical fun of planning, designing and building out a mobile test automation solution. Just the thought of watching a bunch of mobile devices being automatically controlled in unison can bring some nerds to the verge of euphoria.
Even better, most mobile automation build-outs don’t have the pressure of product managers or the business folks looking over their shoulder. Mobile automation efforts are developer and tester playgrounds.
This is the high point. All promise and no pain. The hype curve reaches its peak at the beginning of the execution phase, then quickly plummets as reality sets in.
Frameworks: There aren’t any great or obvious choices. There are the familiar and cross-platform solutions such as Appium, aka ‘Selenium for mobile’, but it is a complicated cross-platform stack which soon has your team debugging timeouts across several different software layers, missing platform-specific functionality, and frustratingly slow bug fixes when new mobile OS versions come out. The Appium team should be applauded, and it is likely the best solution for cross-platform automation, but we are still in the early days of test frameworks.
Other teams choose platform-specific test frameworks such as KIF and XCTest for iOS or Espresso for Android. These frameworks give you near-full command of the app for testing, but engineers are quickly frustrated when they realize that, although their iOS and Android apps are pretty similar, they end up writing each test case twice — once for each platform. Worse, the tests are often written in different languages. This means testing quickly slows to a crawl, costs go up, and hype reaches a new low.
It is generous to say that mobile test frameworks are a bit glitchy. “Who tests the tests?” is a common meta question, and in mobile it seems no one is at the wheel. Test teams soon get fed up with having to execute every test multiple times to ignore random and intermittent failures stemming from framework bugs. These mobile frameworks also break at the worst time — just when a new version of the mobile OS or development tools are ready, the very time you want to catch regressions in your app. Mobile test frameworks just aren’t updated as quickly as the product and developer tools. App teams often turn off tests that no longer run correctly and rarely revisit or fix them.
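The retry-until-green workaround described above can be sketched as a simple wrapper. This is a hedged illustration only, not any real framework’s API; the names are hypothetical, and the comment notes the trade-off teams are making:

```python
import functools


def retry_flaky(attempts=3):
    """Re-run a test up to `attempts` times, passing if any run passes.

    A blunt instrument: retrying hides real intermittent product bugs
    along with framework flakiness, which is exactly the trade-off teams
    make when they re-execute tests to filter out random failures.
    """
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(attempts):
                try:
                    return test_fn(*args, **kwargs)
                except AssertionError as exc:
                    last_error = exc  # remember the failure, try again
            raise last_error  # every attempt failed: a "real" failure
        return wrapper
    return decorator
```

A test wrapped this way only reports failure when it fails on every attempt, which is why teams that adopt it stop noticing genuinely intermittent product bugs.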
Test Execution: All these tests need to be executed. Again, all options have advantages and disadvantages, with no clear winner. App teams have to pick their poison.
Simulators and emulators are great. You don’t need to buy physical devices and these test platforms are built into every developer’s development environment. Problems occur when you realize you can only control one simulator per machine, the emulators crash and corrupt the image files, and the emulators are so slow that tests simply timeout and fail. App teams have the non-trivial task of creating many different emulators to represent the different device families — DPI, API-level, screen dimensions, storage, etc. Before you know it, the team realizes the emulators are failing to launch 5% of the time, sometimes corrupting the emulator image itself. Worse is the revelation that these emulators can’t run at speed on the cheap virtualized servers in the cloud. An app running in the iOS simulator also hides crashing issues because it has access to over 4GB of RAM, where real devices might have only 1GB. Virtual devices promise the best of all worlds, but they are almost as cumbersome and expensive as physical device labs.
Physical Device Labs: So the smartest, most well-funded app teams inevitably think it is a great idea to build out a device lab. A bunch of devices right down the hall sounds like a great idea, looks impressive, and all the test infrastructure is under the team’s direct control. But road bumps like actually buying the devices, plugging them in only to realize a test run discharges the device faster than it can charge, and soon wiping toxic ooze leaking out of the battery panel add up to a lot of overhead.
Device labs are a pain, I know. My team built one cool enough to end up on TechCrunch, but we also quickly shut it down because of these issues. I’ve been on this hype curve myself.
Other teams aren’t so quick to realize the folly of building their own test lab. One billion-dollar app company has had three successive test managers. Each one built a new device lab, only to shut it down later. I wonder what the fourth test manager will do. And I wonder why the previous test managers were let go.
Cloud Device Labs: An awesome thing has appeared in the past several years. Amazon, Google and Microsoft, along with several startups like TestDroid and Sauce Labs, now offer shared device labs — in the cloud. You don’t have to build or maintain these devices, and even better, the cost is shared by all users and is even subsidized or free. But with these cloud device labs, test engineers can’t control their device selection, must watch patiently as their tests wait in queues for execution, and suffer debugging timeouts as each test step now spans the entire internet. Most cloud devices are only available via API — you can’t do manual testing on the devices. The few options that allow for manual interaction with cloud devices are almost too expensive to mention. Cloud devices are likely the best option, but by no means optimal.
The frustration phase comes after the first tests are running. The good news is that tests are running. The bad news is that tests are running. These tests need maintenance, and every couple steps forward seems to be met by one step backwards. Despite the frustration, progress is happening, albeit at a disappointing pace.
No matter the test framework or infrastructure chosen, frustration comes more quickly than folks experienced in web test automation expect. There are several points of frustration common to almost every mobile automation project that is off the ground and running: test coverage, maintenance, new features, and cost. Mobile app teams simply get far less return on investment (ROI) than web projects.
Test coverage: How fast can we write these new mobile app test cases? Slower than we could write them for the web. Browsers are mature, almost boringly stable platforms, as are the web test APIs and tools. Everything you need to write test code on mobile is far less stable, less-complete, and slows down test development.
Frugal and experienced web test developers often choose a cross-platform automation solution such as Appium, which promises code that runs on both iOS and Android (and the Web!). But the reality is that only about 60% of the code is truly reusable across platforms, and there are platform-specific things you just can’t handle in your tests.
The richest test teams often build platform-specific test automation suites to have more reliable, clean and higher-fidelity test code, but the price they pay is that they get to write much of the test code twice — once for each platform, and often in different programming languages. With the web, you can simply write your tests once and run on all browsers.
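One common mitigation for this duplicated effort is a shared “page object” layer, where the test steps are written once and only the locators differ per platform. A minimal sketch, assuming a generic driver with `type` and `tap` methods; the screen, selector strings, and app identifiers are all hypothetical, not from any real app or framework:

```python
# Per-platform locators for one hypothetical login screen.
LOGIN_LOCATORS = {
    "ios": {
        "username": "login_username_field",
        "password": "login_password_field",
        "submit": "login_button",
    },
    "android": {
        "username": "com.example:id/username",
        "password": "com.example:id/password",
        "submit": "com.example:id/login",
    },
}


class LoginScreen:
    """Shared test logic; only the locator table varies by platform."""

    def __init__(self, driver, platform):
        self.driver = driver
        self.locators = LOGIN_LOCATORS[platform]

    def log_in(self, user, password):
        # Identical steps on both platforms; locators supply the difference.
        self.driver.type(self.locators["username"], user)
        self.driver.type(self.locators["password"], password)
        self.driver.tap(self.locators["submit"])
```

This recovers some of the lost reuse, but only for flows where the two apps genuinely behave the same; platform-specific screens still need platform-specific tests.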
The result of all this is that mobile test teams are getting half the test coverage per unit of effort thanks to all this duplicate test logic. Adding test coverage for mobile is just plain slower than adding coverage for similar cross-browser web apps.
Crying Wolf and Maintenance: Tests fail, teams jump in to investigate, only to discover that the tests just need to be updated for the fifth time this month. Over time, false failures due to test maintenance issues often mean teams just ignore the test results. Most mobile app test failures are false alarms.
New app features and redesigns to look fresh and modern test the patience of most mobile test automation engineers. Even well-crafted test automation suites will often break when a new introductory screen is added to the application — up to 30% of automated tests may need to be updated for a single app change. Browsers rarely change in significant ways, but mobile operating systems change every 6 to 12 months. These OS-level changes mean new technology stacks for testing, new emulators, new APIs and an ever-growing matrix of devices and OS versions. Meanwhile, each new false alarm from the automated tests causes more angst in the team. With so many false alarms, developers and testers stop looking at the test results. Teams should plan for the maintenance of mobile tests to be upwards of twice what they would expect for web tests with the same level of coverage.
An open secret of test automation is that it doesn’t actually test new features — it only tests what was already known to need testing. A few folks out there practice Test Driven Development (TDD), but those shops are rare, and they often miss cross-cycle, cross-feature test coverage because they are so focused on the current milestone. Agile software teams are often adding new functionality each week, but testers have to decide whether to test the new features as much as they can in that weekly cycle, or instead focus testing cycles on the highest-risk, most automatable parts of the app that were written three or even eighteen months ago. Automation efforts at many top app teams don’t have the time or priority to test the latest features as they come in, but these are the very features that are most likely to fail or not work correctly. Most of the app has already been exercised by real-world users, so automation often just verifies old, user-tested features and rarely finds new bugs.
Ignoring that mobile app test engineers are more expensive than their web test engineer counterparts, ignoring the cost of mobile infrastructure, and even ignoring test maintenance costs, mobile test automation simply provides less app test coverage per dollar. Mobile testers may be plodding happily along adding test cases, but management is thinking that their mobile test automation efforts are a large cost center with little value. Overall, mobile is perhaps double the cost, for less coverage, and more chaos.
What do teams that reach this breaking point do? They almost always remove tests to focus only on the basic features of the app such as launch, login and a few other tests for critical features that involve money.
After a mix of slow progress and frustration, teams realize their dream of a fully automated testing system is much further away than they thought, and far too expensive to be practical today. The time and money are often best spent in other ways.
In this hope phase, app teams often either abandon their efforts or reduce coverage to the bare essentials. Automation efforts are abandoned due to low automation ROI, and the engineers end up moving to other parts of the company in an effort to escape the mobile automation quagmire before anyone realizes what has happened under their watch. Those that keep their automation efforts alive, almost always cut back on the promise and coverage of their automation test suites. Whether the automation effort is completely mothballed, or left running with only a fraction of the original dream’s worth of test coverage, everyone still hopes of restarting the effort when the frameworks, languages, and infrastructure improve. Someday.
As teams reach out to other app teams to commiserate, they soon realize that many other engineers have experienced the same mobile app hype cycle. Knowing that so many others share the same predicament gives them even more hope. The problem wasn’t them, their app, their team, or their technical expertise — it is simply a shared problem. Common problems often mean that someone else might eventually fix the problem, or it will slowly fix itself over time. This is the hope that most mobile app automation teams hold on to. The dream and promise of full automation is too tempting and too valuable a dream to just let go.
All this hope (and hype) for mobile test automation is warranted. Walk down the street and everyone is staring into their phones. These people are using apps to entertain, navigate, communicate, trade stocks, close sales, and share experiences. These mobile apps are powering more and more of our personal lives and running businesses. These apps need to be great, which means there is demand for a solution, so the problem will be solved — eventually.
Test engineers at some of the largest, and most tech-savvy app companies in the San Francisco area are quietly self-organizing to create and share mobile automation infrastructure, best practices and new test frameworks. In the last year, Apple, Google, Amazon and Microsoft have all bought companies that deliver mobile test automation and infrastructure, and they continue to invest in making these services better and often free. More venture capital is moving into mobile testing startups to fund new and more creative approaches to the problem. Existing test infrastructure and services companies are scrambling to build and improve mobile automation offerings. A lot of energy and capital is being invested in the mobile app automation space.
There is more pain and investment coming in the next couple years — the mobile automation problem will eventually be solved, but we all need to have some patience and look for ways we can collaborate. No one is alone in the mobile automation hype cycle.
— Jason Arbon, CEO Appdiff.com
Yeah, call me biased because I’m in the mobile app testing business. But it is these very stories, and my own frustrations with mobile automation, that led me to vote with my feet, take the risk, and found a company focused on a creatively different approach to solving this problem. At Appdiff, we have a team of ex-Google, ex-Microsoft, ex-Intel engineers training AI bot brains to automatically test *all* mobile apps at scale. The bots automatically generate thousands of UI and performance regression tests for every app, and automatically run them on every new app build. The best thing is that these bots don’t need test code: they test every new app and every new feature automagically. The bots are already testing thousands of apps. Sign up at Appdiff.com if you are curious to see how the Appdiff bots test your app, so you can focus on the harder testing problems.