Most successful RPA projects emanate from a good design. Regardless of one’s preferred method for arriving at that design (e.g. Agile, Waterfall, etc.), the process should include the development of a formalized specification document (spec) that details the road ahead and builds project team consensus. While this might seem obvious to the experienced developer, RPA’s new-found celebrity has attracted a lot of new and eager practitioners whose desire to produce quick wins might also encourage some short-cutting. The purpose of this article is to articulate some of the best practices our professional services team has identified over the years when it comes to building an RPA specification document.

Just the facts. The most important aspect of constructing the spec document is that it precisely captures just the process being automated. While “color commentary” can be helpful to the Process Designer (PD) when trying to understand the process and assess potential improvements, such commentary should not be included in the spec. The goal of the document is to focus the Automation Architect (AA) only on the aspects of the process that are required to complete the transaction. Information beyond what is needed to complete the process is noise. In other words, the AA cares about the “who, what, which and where”, and not much about the “why”. While the spec must contain as many details as possible regarding the transaction, the details should focus on:

  1. Where to start?
  2. Which screens to navigate?
  3. Which user interface controls to manipulate?
  4. What data should be extracted and/or pasted?
  5. How should data be manipulated?
  6. What application and screen states to look for?
  7. How to handle error conditions and exceptions?
  8. What state to leave the application in when the transaction completes?
  9. How should logging be performed?

Minimize jargon. The AA is usually neither intimately familiar with the user’s business nor conversant in its specific language. Therefore, keeping jargon to a minimum is important. The best method is to relate jargon to standardized business terms most AAs do understand, such as invoices, purchase orders, inventory items, prices, etc. If jargon must be included to help the customer team understand the spec, make sure you include a glossary of terms early in the spec.

Make each step as atomic as reasonably possible. Each individual function the automation must perform is called a “step”. In the spec, each step is uniquely numbered so it can be cross-referenced, tested individually, and linked back to when viewing logs. For this reason, it is important to define each discrete function the automation performs (e.g. the pasting of data to one field, or the clicking of a button) as its own step, and not lump multiple functions together. Break each process down to its most reasonable atomic function. When I say “reasonable”, I mean a step that represents a logical unit of work that you would want to link back to from a log. Let us review a couple of simple examples.

Example 1: If a step calls for pressing the key combination, “Alt+K”, this should be expressed as one step, not two.

Example 2: What if you want to select a sub-navigation menu item that requires two sets of key presses (e.g. “Alt+F” and “O” to pop a File Open dialog)? Should this be represented as one step or two? In this case, it makes sense to combine the key presses into one step, since the logical unit of work is the popping of the File Open dialog. However, it would not be wrong to break the process down into two steps. It ultimately comes down to style, and it is probably more important to represent these steps consistently within a given spec than to definitively handle them one way or another.

Steps should be numbered and sub-numbered. Well-designed automations can cross reference their functions back to the steps defined in a spec when a user is viewing an action execution log. This allows the user to easily figure out, using the language of the user, what the action was supposed to be doing at a specific point during the execution. However, this cross-referencing can only happen if the steps defined in the spec are uniquely numbered. 
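To make the cross-referencing idea concrete, here is a minimal sketch in Python. It is not taken from any particular RPA tool; the names SpecStep, build_step_index and describe_log_entry are hypothetical, as is the log line format. It shows why unique numbers matter: the log can only be translated back into the user's language if each number resolves to exactly one step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecStep:
    """One uniquely numbered step from the spec (hypothetical structure)."""
    number: str       # e.g. "4", or "4.1" for a sub-step
    description: str  # the step text, written in the user's language


def build_step_index(steps):
    """Map step numbers to steps, rejecting duplicate numbers."""
    index = {}
    for step in steps:
        if step.number in index:
            raise ValueError(f"duplicate step number: {step.number}")
        index[step.number] = step
    return index


def describe_log_entry(index, step_number, outcome):
    """Render a log line that links back to the spec step by its number."""
    step = index.get(step_number)
    text = step.description if step else "(step not found in spec)"
    return f"[step {step_number}] {outcome}: {text}"


steps = [
    SpecStep("4", "Select part number from list."),
    SpecStep("4.1", "Paste part number into the search field."),
]
index = build_step_index(steps)
print(describe_log_entry(index, "4.1", "OK"))
```

If two steps shared the number “4.1”, build_step_index would raise an error rather than let the log point at an ambiguous step.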


A picture is worth a thousand words. While narrative descriptions are important, nothing informs the AA more about the task at hand than a picture (see Figure 1).

Figure 1

A spec should make heavy use of screen shots to communicate information such as:

  • What should the state of the screen look like at this step?
  • Via highlights, which user interface (UI) elements are to be manipulated in this step (or steps)?

Other points to consider when working with screen shots:

  • Screen highlights should use colors that are not contained within the screen upon which they are overlaid and those colors should be used consistently throughout the spec.
  • It is a common practice to include the step number in a screen shot highlight for each UI element highlighted.
  • Screen shots usually embody references to multiple steps (i.e. multiple screen shots of the same screen should be avoided). 

Spec the negative condition. Documenting a process where everything goes according to plan is easy. However, accounting for errors or unexpected conditions can be more of a challenge. For example, consider a step that states: “4. Select part number from list.” What should the automation do if the part number is not in the list? This is the kind of “negative” condition that should be accounted for in your spec. The more negative conditions you can capture in the spec (again, within reason), the less back and forth the AA will have with the user team for clarifications.
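As an illustration only, the part number example might translate into something like the Python sketch below. The function names, the PartNotFound exception and the “reject and log” response are all assumptions made for this example; the real response to the negative condition is whatever the spec says it should be.

```python
class PartNotFound(Exception):
    """Raised when the negative condition for step 4 occurs."""


def select_part_number(available_parts, part_number):
    """Step 4: select the part number, or raise if it is not in the list."""
    if part_number not in available_parts:
        raise PartNotFound(f"step 4: part {part_number!r} not in list")
    return part_number


def run_step_4(available_parts, part_number):
    """Return an (outcome, detail) pair suitable for the execution log."""
    try:
        selected = select_part_number(available_parts, part_number)
    except PartNotFound as exc:
        # Assumed spec instruction: reject the transaction and log the reason.
        return ("REJECTED", str(exc))
    return ("OK", selected)


print(run_step_4(["A-100", "B-200"], "A-100"))
print(run_step_4(["A-100", "B-200"], "C-300"))
```

Without the negative condition in the spec, the AA would have to guess what belongs in the except branch, or go back to the user team to ask.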

Seek out keyboard shortcuts and mnemonics where possible. While most RPA tools support drag/drop and icon clicking, an automation is generally faster and less subject to error when it uses a keyboard shortcut or menu mnemonic. Although the AA will ultimately decide the best method for automating the user interface, calling out such shortcuts is always helpful.

Demarcate commentary via notes. Even though sticking to the facts is paramount when building a spec, there are times when some commentary is required (e.g. how something is calculated or the conditions under which certain states arise). In these cases, it is a best practice to not include the commentary in a step, but rather, break it out as its own “section note”.

Include an automation start state & preparation section if applicable (usually applicable to attended bot automations). If access to the development environment is proctored, then in most cases the proctor should be able to navigate the AA to the application screens from which the automation is initiated. However, if access is not proctored, it is important to include in the spec, prior to the step definitions, an automation start state section that includes the following:

  • Application load methods.
  • Login credentials for the applications.
  • Navigation path to get to the automation start screen.
  • Any data required to support the navigation path.

Include incomplete transaction rollback instructions. Some transactions commit data at specific steps prior to the completion of the process. If this is the case, the spec should include the process for rolling back the transaction to its pre-processing state. If the rollback conditions are only applicable to the testing of the action, it should be demarcated as a note. If the rollback must happen in production as well, it should be documented as its own set of steps.
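One common way an AA might implement documented rollback steps is to pair each committing step with its undo action, then run the undo actions in reverse order if a later step fails. The Python sketch below assumes that approach; run_with_rollback and the sample step names are hypothetical and do not come from any specific RPA product.

```python
def run_with_rollback(steps):
    """Run (do, undo) pairs in order.

    If any step raises, undo the steps that already completed, most
    recent first, and return False. Return True if every step succeeds.
    """
    undo_stack = []
    try:
        for do, undo in steps:
            do()
            undo_stack.append(undo)  # only completed steps are undone
    except Exception:
        while undo_stack:
            undo_stack.pop()()
        return False
    return True


log = []


def fail():
    raise RuntimeError("expected screen not found")


ok = run_with_rollback([
    (lambda: log.append("commit header"), lambda: log.append("undo header")),
    (lambda: log.append("commit line"), lambda: log.append("undo line")),
    (fail, lambda: log.append("undo never runs")),
])
print(ok, log)
```

The spec supplies the content of each undo action; the code only guarantees that they run in the right order.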

Define the skills required to handle a prompt. This point applies exclusively to unattended bot projects. When an unattended bot encounters a processing condition that requires assistance from a human, it can raise a “prompt” and send a notification to one or more authorized users. When the prompt notification is received, the user can either provide the information requested by the prompt or click through the notification and take control of the unattended bot’s desktop. This is a powerful RPA feature that helps reduce job rejections and speeds up transaction processing. However, not all users may be able to handle all raised prompts. Most RPA tools that support prompting allow you to associate a “skill” with a prompt so that only users possessing that skill will be sent that prompt. This being the case, it is important that the spec define where prompts can be raised and which specific skills are required to address them.
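The skill-to-prompt association can be pictured with a small Python sketch. The data layout and the function recipients_for_prompt are hypothetical, as are the user and skill names; actual RPA tools expose this through their own configuration rather than code like this.

```python
def recipients_for_prompt(required_skill, user_skills):
    """Return the users (sorted by name) who hold the skill a prompt requires."""
    return sorted(
        user for user, skills in user_skills.items() if required_skill in skills
    )


# Hypothetical skill assignments that the spec would ask the user team to define.
user_skills = {
    "alice": {"invoice-approval", "vendor-setup"},
    "bob": {"vendor-setup"},
}

print(recipients_for_prompt("invoice-approval", user_skills))
print(recipients_for_prompt("vendor-setup", user_skills))
```

A prompt whose required skill nobody holds returns an empty list, which is exactly the gap a well-written spec should surface before the bot goes live.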

Building and getting the spec approved is an iterative process. Though things seem clear upon the first pass of a design, there are always clarifications and modifications that take place as people give the process more scrutiny. All changes should be incorporated into a new version of the spec. The initial draft of the spec should be versioned “v1.0”, with subsequent versions incrementing the minor number (e.g. v1.1, v1.2, v1.3, etc.), assuming the modifications and clarifications do not change the scope of the project. Most specs do not require the major number to be incremented, but it does happen, usually when a project has major functional changes added to it during the design phase.
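The versioning rule is simple enough to express as a short Python sketch. The function name next_spec_version is hypothetical, and resetting the minor number to 0 when the major number increments is my assumption, since the rule above does not say what happens to it.

```python
def next_spec_version(current, scope_changed=False):
    """Return the next spec version string for a 'vMAJOR.MINOR' label.

    A change in project scope bumps the major number; any other
    clarification or modification bumps the minor number.
    """
    major, minor = (int(part) for part in current.lstrip("v").split("."))
    if scope_changed:
        # Assumption: the minor number resets when the major number changes.
        return f"v{major + 1}.0"
    return f"v{major}.{minor + 1}"


print(next_spec_version("v1.0"))
print(next_spec_version("v1.3", scope_changed=True))
```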

Once the specification is approved by the user, that version is considered the “build draft”. I call it a build draft because, undoubtedly, the development process will uncover issues that were not captured properly in the original spec, thus requiring final modifications. This is a normal part of the process.

Finally, one of the most important aspects of the spec document is that it be kept current during the development and testing phases. It is critical that any modifications made to the process to accommodate variances uncovered downstream of the build draft get incorporated back into the spec. Otherwise, the spec will have little use when users try to use it to understand exceptions or use it as the basis for a phase II project.