AI for Mobile Guideline Testing

jason arbon
5 min read · Dec 22, 2019


There is a painful corner of mobile app testing: Human Interface Guidelines (HIGs). HIGs define how applications should look and behave on mobile, which helps make sure that apps are designed well and consistently for a great user experience. But these guidelines come at the cost of unpredictable ship and release schedules. HIGs are the Sword of Damocles of mobile app development. It is time for a bit of AI to come to the rescue!

Unpredictable Release Schedules

Both Android and iOS have Human Interface Guidelines, but where Google Play has simple test bots, iOS has human reviewers who check for HIG compliance. When a developer pushes a new version of their application to the Google Play Store, they are supposed to understand and comply with the Android Design Guidelines, but the Play Store only runs simple automated checks that the app is ‘stable’ and ‘virus-free’, and almost immediately lets the new version deploy to all its users. On iOS, developers are expected to take the same care in adhering to the iOS Human Interface Guidelines, but with the additional check of humans who actually verify compliance before the application is released to customers.

The problem for app teams is that when the human reviewers reject an app for HIG violations, it can take a day or so to find out about these violations, and the app developers and designers have to scramble to redesign and reimplement the user interface to comply, resubmit the application for review, and wait further to find out if the new version of the app is compliant. Worse, the app team has to cross their fingers hoping no new issues will be found in the new HIG review cycle.

Most app teams want and need to keep feature parity between both Android and iOS apps — meaning the Android (and Web!) app team finishes their build, but then waits an indefinite period of time until they know their iOS app can release, before releasing their Android app. Not only does this create some ‘sprint’ complexity, but companies also have to have new features ready far in advance of any pre-scheduled marketing announcements of new features. HIGs cause project management and testing headaches for app teams.

Worse, these HIG reviews are often done by different human reviewers when submitting to the iOS app store. Since the HIG criteria are numerous and somewhat open to fuzzy interpretation, the same build or same old features may be flagged as non-compliant anytime you release a new build. App teams cringe every time they release, never knowing what might get them rejected from the app store based on HIG reviews.

To help minimize the risk of HIG review rejections, some large companies with ten or more apps have small specialized teams that review all new app releases for possible violations before the company submits the new version for review. These teams can be expensive in money and time but are a necessity when teams want a predictable release cadence. Most smaller app teams just don't have the resources, and their schedules are at the whim of these HIG reviews.

What are these guidelines anyway?

These guidelines are written with good intentions to protect the user experience, but they are numerous, and sometimes vague and open to interpretation. Some examples:

  • When possible, present choices. Make data entry as efficient as possible. Consider using a picker or table instead of a text field, for example, because it’s easier to choose from a list of predefined options than to type a response.
  • Enable the Clear button. Most search bars include a Clear button that erases the contents of the field. Here is a simple example of a search box that doesn’t have a Clear button in a major app (found by AI bots):
Example of App missing ‘clear’ button in search box
  • Give text-titled buttons enough room. If your toolbar includes multiple buttons, the text of those buttons may appear to run together, making the buttons indistinguishable. Add separation by inserting fixed space between the buttons.
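To make the last guideline concrete, here is a minimal sketch of how a "buttons need enough room" rule could be expressed as a geometric check. It is only an illustration: the `Button` type, the `crowded_pairs` function, and the 8-point threshold are all hypothetical, and the guideline itself does not state a number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Button:
    """A text-titled toolbar button described by its horizontal extent."""
    title: str
    left: float   # x coordinate of the left edge, in points
    right: float  # x coordinate of the right edge, in points


# Hypothetical threshold chosen for illustration only.
MIN_GAP_POINTS = 8.0


def crowded_pairs(buttons, min_gap=MIN_GAP_POINTS):
    """Return (left_title, right_title, gap) for neighbours closer than min_gap."""
    ordered = sorted(buttons, key=lambda b: b.left)
    findings = []
    # Compare each button with the one immediately to its right.
    for a, b in zip(ordered, ordered[1:]):
        gap = b.left - a.right
        if gap < min_gap:
            findings.append((a.title, b.title, gap))
    return findings


toolbar = [
    Button("Edit", 10, 50),
    Button("Share", 52, 100),    # starts 2 points after "Edit" ends
    Button("Delete", 120, 180),  # starts 20 points after "Share" ends
]

for left, right, gap in crowded_pairs(toolbar):
    print(f"'{left}' and '{right}' are only {gap:g} points apart")
```

A real reviewer judges this visually rather than with a fixed threshold, which is part of why these rules are open to interpretation.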

There are almost 200 HIG rules for each platform, and these rules apply to every screen/state in your application. Doing the math: if an app has only 20 major ‘screens’, that is an implied 200 × 20 = roughly 4,000 checks per platform that need to be performed every release.

To add to the problem, Android and iOS have *different* and sometimes conflicting HIG rules! Ouch. This is a thorny issue in today’s app release world. If you look, many top apps in production have HIG violations today — violations that can stall any of their future releases when noticed during HIG review.

AI to the rescue

So what can we do about all this HIG drama? At https://www.test.ai we have been training AI bots to automatically walk through an application and catch many of the common HIG violations. Today we can catch about 30 of the most common issues, and we are adding more every sprint. AI is a great way to catch these issues because it has been trained to see the screen much as a customer or reviewer does. The bots don't look at code or rely on app-specific checks. They are trained specifically to look at the application UI and recognize toolbars, shopping carts, and login buttons, not cats, dogs, and beaches. Much like human HIG testers, our AI bots:

  1. Explore the application as a real user or reviewer would
  2. Visually examine the elements and groups of elements on the ‘screen’
  3. Check all these visual elements against an AI trained on examples of HIG violations
  4. Flag any issues that are found for human review
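The four steps above can be sketched as a small loop. This is not the test.ai implementation: the real bots work from trained visual models, while this sketch assumes the screens have already been reduced to labeled elements and uses one hand-written rule as a stand-in for the trained classifier. Every name here (`Screen`, `Element`, `Finding`, `audit`, `search_bar_has_clear_button`) is hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Element:
    kind: str  # e.g. "search_bar", "clear_button", "toolbar"
    children: List["Element"] = field(default_factory=list)


@dataclass
class Screen:
    name: str
    elements: List[Element]
    links: List[str] = field(default_factory=list)  # screens reachable from here


@dataclass
class Finding:
    screen: str
    rule: str
    detail: str


# A check inspects one screen and returns a problem description, or None.
Check = Callable[[Screen], Optional[str]]


def search_bar_has_clear_button(screen: Screen) -> Optional[str]:
    """Stand-in for a trained model: flag search bars with no Clear button."""
    for el in screen.elements:
        if el.kind == "search_bar" and not any(
            child.kind == "clear_button" for child in el.children
        ):
            return "search bar has no Clear button"
    return None


def audit(app: Dict[str, Screen], start: str, checks: Dict[str, Check]) -> List[Finding]:
    findings: List[Finding] = []
    seen = set()
    queue = deque([start])
    while queue:                      # step 1: explore screen by screen
        name = queue.popleft()
        if name in seen or name not in app:
            continue
        seen.add(name)
        screen = app[name]            # step 2: examine the elements on the screen
        for rule, check in checks.items():
            detail = check(screen)    # step 3: run each check against the screen
            if detail is not None:    # step 4: flag the issue for human review
                findings.append(Finding(name, rule, detail))
        queue.extend(screen.links)
    return findings


app = {
    "home": Screen("home", [Element("toolbar")], links=["search", "cart"]),
    "search": Screen("search", [Element("search_bar")], links=["home"]),
    "cart": Screen("cart", [Element("search_bar", [Element("clear_button")])]),
}
checks = {"Enable the Clear button": search_bar_has_clear_button}

for f in audit(app, "home", checks):
    print(f"[needs human review] {f.screen}: {f.rule}: {f.detail}")
```

Keeping each rule as a separate function over a screen means new checks can be added without touching the exploration loop, and the same build always produces the same findings.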

The bots are finding HIG violations in many of the top applications in production today. Importantly, they find these HIG violations almost instantly. More importantly, their results are repeatable and avoid the errors of human memory and interpretation.

Future of HIG testing and AI

The team at test.ai has cataloged all the HIG rules and believes almost all of them will be automatable with our AI bots ‘soon’. With AI enabling automated validation of HIGs, there really is little reason for humans to look for the issues that the machines can now identify, and little excuse for automatable HIG violations to disrupt app team release schedules. Someday soon, these HIG bots will just be built into the IDEs and build systems of app developers, and into the review tools at the app stores, and will automatically perform these checks on each new build. This will free humans of the HIG pain, provide for more predictable app reviews, and ensure better user experiences for users of all platforms.

Ad: We are letting folks sign up for access to these new HIG AI bots, just ping me: jason@test.ai

— Jason Arbon, CEO / tester @ test.ai
