All RPA automations require a consistent way to “target” a given UI element. In a perfect world, each UI element would have a guaranteed unique ID that could be used to query that element at runtime. But alas, no such ID exists. Without it, we need a technique to locate the element at runtime whenever our automation wants to read or write its data. Several techniques exist, and which ones are available depends on the technology underlying the target application. They include text recognition/mapping, HTML parsing, window enumeration, pixel X/Y coordinates, and many others. But the best and most popular choice is to use the application’s “Accessibility Tree”.
Although end users see text boxes, labels, drop-downs and other field types on a 2D surface, the operating system sees much more. Since the late 1990s, when Microsoft Active Accessibility was introduced, the Windows operating system has been able to access a hidden object model that represents the UI elements as a hierarchy called the “Accessibility Tree”. This capability helps software meet accessibility mandates such as Section 508 of the Rehabilitation Act, and it enabled third-party developers to build applications that communicate with other desktop windows to read field data, write field data and push buttons. It allowed a new class of products called “screen readers” to emerge, enabling visually impaired computer users to access a wider array of Windows applications.
Most RPA automations rely heavily on the accessibility tree to find specific fields, read/write data and press buttons. A typical RPA automation locates a UI element at runtime by following a series of steps, starting at the root window and traversing from branch to branch until the desired element is found. This is very similar to the way a path works in the Windows file system (see Figure 1). The accessible path shown here uses numeric values to represent the element’s order within its parent container (ordinal position). So TextBox[Order=3] means the third text box within the Name group box.
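To make the traversal concrete, here is a minimal sketch of ordinal path resolution. It runs against a toy in-memory tree, not a real accessibility API, and everything in it is an assumption made for illustration: the Element class, the resolve_ordinal function, and the “First:” and “Last:” field names are hypothetical. Only the path syntax and the “SSN:” text box come from the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for one node of an accessibility tree."""
    role: str
    name: str = ""
    children: list["Element"] = field(default_factory=list)


# One path step, e.g. "TextBox[Order=3]"
STEP = re.compile(r"(\w+)\[Order=(\d+)\]")


def resolve_ordinal(root, path):
    """Walk a path such as 'Root/Tabset[Order=1]/TextBox[Order=3]'.

    Returns the matching Element, or None if any step cannot be satisfied.
    """
    steps = path.split("/")
    if steps[0] != root.role:
        return None
    current = root
    for step in steps[1:]:
        m = STEP.fullmatch(step)
        if m is None:
            raise ValueError(f"unsupported step: {step!r}")
        role, order = m.group(1), int(m.group(2))
        # Order is 1-based and counts only children with the same role.
        same_role = [c for c in current.children if c.role == role]
        if order < 1 or order > len(same_role):
            return None
        current = same_role[order - 1]
    return current


# Toy layout: one tab set, one "Name" group box, three text boxes.
ROOT = Element("Root", children=[
    Element("Tabset", children=[
        Element("GroupBox", "Name", children=[
            Element("TextBox", "First:"),
            Element("TextBox", "Last:"),
            Element("TextBox", "SSN:"),
        ]),
    ]),
])

target = resolve_ordinal(
    ROOT, "Root/Tabset[Order=1]/GroupBox[Order=1]/TextBox[Order=3]"
)
print(target.name)  # SSN:
```

Each step only indexes into the children of the current element, which is why this style of lookup is cheap: nothing outside the path is ever visited.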
This ordinal path technique works great, and it’s fast, but it relies on the UI layout remaining constant for the lifespan of your automation. The target application’s author, however, has made no such promise. Software vendors rarely even consider existing automations as they evolve their product. In fact, most don’t even know such automations are possible. So with each new version of the target application comes the risk of a structural change (a new field inserted, existing fields rearranged, etc.) that could break your automation. This situation cannot be completely avoided, but it can be mitigated.
If you switch from “Ordinal” targeting to “Property” targeting, you can trade some run-time efficiency for durability. With property targeting, you don’t rely on the order of fields (e.g. 3rd text box), but instead, do a wildcard search and query each UI element’s accessible properties looking for a match to some constant value. The most useful accessible properties are Name and Value. In the example above, the target text box element has the name property of “SSN:”. If we assume that UI element’s name property is more likely to stay constant than its order within the group box, then we gain a degree of durability by targeting that property in our path. Methods for path definition vary by vendor but, using the simple example above:
“Root/Tabset[Order=1]/GroupBox[Order=1]/TextBox[Order=3]” is replaced by “Root/Tabset[Order=1]//TextBox[Name=’SSN:’]”.
Notice the double slash in the path. In this example, the double slash starts a wildcard search that checks all group boxes (or other containers) inside that tab set, exhaustively looking for a text box with the accessible name property of “SSN:”. This technique is less efficient because it must visit each UI element and evaluate its name property instead of simply addressing the element by its ordinal position inside its container. A path that would have taken 20 milliseconds to evaluate might go up to 50 or 60 milliseconds, but you have gained a degree of durability: in future versions the text box can change position within its group box, or move to another group box, without breaking the automation. In short, techniques like this can be useful if you want your automation to be as durable as possible.
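The trade-off can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor’s path engine: the Element class, the resolve function, the two toy layouts (V1 and V2), and the “First:”, “Middle:”, “Last:” and “Identity” names are all hypothetical. The code also uses straight quotes around the name value. V2 imagines a new release that adds a field and moves the SSN text box into a different group box.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for one node of an accessibility tree."""
    role: str
    name: str = ""
    children: list["Element"] = field(default_factory=list)


# One path step: "TextBox[Order=3]" or "TextBox[Name='SSN:']"
STEP = re.compile(r"(\w+)\[(Order|Name)=(.+)\]")


def descendants(node):
    """Yield every element below node, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def match_step(candidates, step):
    """Pick the candidate selected by one step, or None."""
    m = STEP.fullmatch(step)
    if m is None:
        raise ValueError(f"unsupported step: {step!r}")
    role, key, value = m.groups()
    same_role = [c for c in candidates if c.role == role]
    if key == "Order":
        index = int(value) - 1  # Order is 1-based
        return same_role[index] if 0 <= index < len(same_role) else None
    wanted = value.strip("'\"")
    # Property targeting: inspect every candidate's name.
    return next((c for c in same_role if c.name == wanted), None)


def resolve(root, path):
    """Resolve a path; an empty segment (from '//') widens the next
    step from direct children to all descendants."""
    tokens = path.split("/")
    if tokens[0] != root.role:
        return None
    current, deep = root, False
    for token in tokens[1:]:
        if token == "":
            deep = True
            continue
        pool = list(descendants(current)) if deep else current.children
        current, deep = match_step(pool, token), False
        if current is None:
            return None
    return current


def text_boxes(*names):
    return [Element("TextBox", n) for n in names]


# Version 1 layout, and a hypothetical version 2 that inserts a field
# and moves the SSN text box into a new group box.
V1 = Element("Root", children=[Element("Tabset", children=[
    Element("GroupBox", "Name", text_boxes("First:", "Last:", "SSN:")),
])])
V2 = Element("Root", children=[Element("Tabset", children=[
    Element("GroupBox", "Name", text_boxes("First:", "Middle:", "Last:")),
    Element("GroupBox", "Identity", text_boxes("SSN:")),
])])

ORDINAL = "Root/Tabset[Order=1]/GroupBox[Order=1]/TextBox[Order=3]"
PROPERTY = "Root/Tabset[Order=1]//TextBox[Name='SSN:']"

print(resolve(V1, ORDINAL).name)   # SSN:
print(resolve(V2, ORDINAL).name)   # Last:
print(resolve(V2, PROPERTY).name)  # SSN:
```

Note that the ordinal path does not fail against the second layout. It quietly returns the wrong field, which is arguably worse than an error. The property path still finds the right text box, at the cost of visiting every descendant of the tab set.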
After 10 years and 100+ RPA projects, I can tell you that (IMHO) this effort is rarely worthwhile. Trying to make your automation future-proof consumes lots of labor upfront, adds runtime performance delays and often produces a disappointing result. This is because you are trying to plan for the unknown. Version updates may change anything, including the process name, the field names, the layout of fields, or the introduction or removal of hidden containers, which have no visual impact but change the accessibility tree. The list goes on.
About 3 years ago, I gave up on trying to avoid the inevitable breaks that can occur with version upgrades. I write my automations to be as fast as they can be, and I use wildcard/exhaustive searching only when it is necessary for other reasons (such as variability in a UI element’s position). Instead, our attention has focused more on just-in-time “rewiring”. We have educated our customers to understand the impact of a new release and to implement a testing plan that uncovers possible problems prior to version updates. We have a methodology for rewiring the automation (changing paths, etc.) that usually takes a day and can be rolled out alongside the target application’s update. With customers’ expectations set appropriately, we have limited the friction caused by these breaks and kept our customers feeling confident about their automation.