This article appears in the Summer 2019 digital issue of DOCUMENT Strategy.


Is the idea behind "big data" really just hype? While there is an enormous amount of data available, it needs to be understood before it can be useful. Data doesn't necessarily translate into information, or for that matter knowledge, without a lot of additional work. The reality is that data needs to be classified, extracted, and analyzed. Then, it must be transformed into information and made available to stakeholders, giving them knowledge. This kind of insight, which varies according to the process application, is necessary for true digital transformation.

Lately, robotic process automation (RPA) has often been cited as a way to help understand, move, and process such data. Although RPA has been around in a primitive form for more than a decade, we are still in the early days of automation. To advance the state of automation strategies within enterprises, we believe the application of Capture 2.0 services will be necessary to interpret and understand incoming data (whether through paper or electronic channels), thus transforming it into usable information.

When considering a work process, it's necessary to contemplate a few things:
  • Are there tasks with limited value?
  • Could the workflow be changed to make the process run more smoothly?
  • What new technologies are available to make the process more efficient?
Questions like these led to the First Industrial Revolution more than two centuries ago, when skilled craftsmen were brought out of their shops and into factories to perform specialized tasks rather than producing one product from beginning to end.

Then came the Second Industrial Revolution, ushered in with the help of Henry Ford and his first moving assembly line. Because each assembly worker focused on a single task, the work was repetitive and tedious but more efficient. Designs for industrial robotics cropped up in the 1930s to relieve workers of these tedious processes while still improving efficiency and quality. However, it wasn't until 1961 that General Motors deployed an actual robotic application on the manufacturing floor to transport die castings from an assembly line and then weld these parts onto auto bodies. These robots provided General Motors with a competitive advantage, and worldwide adoption quickly followed.

Fast-forward to the 1990s and the arrival of the Third Industrial Revolution, when the task at hand was no longer about moving material but moving data. The term "workflow automation" came to refer to transferring data automatically from a legacy system to one or more modern solutions. What had once been associated with the assembly line is now tied to the concept of business process automation. While this kind of automation helped to identify data in structured formats, it still had its limitations.

To this day, a vast amount of the data that we create remains unstructured, and more is generated every minute of the day. This means that organizations will need to decipher, understand, classify, manipulate, and place that data into the processes where it's needed. To do that effectively, companies will need intelligence. Although the idea of artificial intelligence (AI) is not new, the development of practical applications leveraging this particular toolset has marked a new era for the business world. The Third Industrial Revolution has led to computer-based systems identifying and understanding data that was once decipherable only by humans. Today, organizations are looking at a sub-segment of AI, called machine learning, to automatically understand the massive amounts of data that are available to them.

As a result, many companies are beginning to look at improving processes that involve mundane, manual data transfer and simple manipulation. Enter RPA. In our view, there are three levels of maturity required for the advancement of RPA solutions.

First, Level One RPA is applied to processes with structured data input, standardized workflows, manipulation of data through mouse clicks and keystrokes, and simple workflow logic. This is predominantly where the RPA market stands today, but it is transitioning.

In Level Two, advanced capture technologies, such as document classification with self-learning algorithms, field identification, and data extraction, are applied to minimize human intervention and further automate the process. Intelligent capture is key to classifying, understanding, and extracting relevant information from semi-structured and unstructured data, which has led many RPA companies to form strategic partnerships with traditional capture software companies. For example, Automation Anywhere paired up with IBM to take advantage of the document classification capabilities in IBM Datacap. ABBYY is working with a variety of RPA vendors to improve document understanding, and KnowledgeLake recently acquired RPA software firm RatchetSoft for an integrated approach to capture and RPA. Some RPA vendors are even developing their own capture and classification technology in-house. Automation Anywhere has developed a capture product called IQ Bot, which provides context to unstructured documents and extracts data.
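
To make the Level Two idea concrete, the following is a minimal sketch of the classify-then-extract flow described above. It is an illustration only and is not how IBM Datacap, ABBYY, or IQ Bot work: the keyword table, the field patterns, and the function names `classify` and `extract_fields` are all hypothetical, and a real intelligent capture product would learn its classes from sample documents rather than rely on a hand-written table.

```python
import re

# Hypothetical keyword sets per document class. Real capture products
# learn these from training samples; this table is an assumption.
CLASS_KEYWORDS = {
    "invoice": {"invoice", "amount", "due", "total"},
    "purchase_order": {"purchase", "order", "ship", "quantity"},
}

# Hypothetical field patterns for each document class.
FIELD_PATTERNS = {
    "invoice": {
        "invoice_number": re.compile(
            r"invoice\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9-]+)", re.I
        ),
        "total": re.compile(
            r"total\s*(?:due)?\s*:?\s*\$?([\d,]+\.\d{2})", re.I
        ),
    },
    "purchase_order": {
        "po_number": re.compile(
            r"po\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9-]+)", re.I
        ),
    },
}


def classify(text):
    """Return the class whose keywords overlap most with the text, or None."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {name: len(words & kws) for name, kws in CLASS_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_fields(text, doc_class):
    """Apply the class's field patterns and keep the first match of each."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.get(doc_class, {}).items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
    return fields


sample = "INVOICE No: INV-2019-044\nAmount due by 30 June\nTotal: $1,250.00"
doc_class = classify(sample)
print(doc_class, extract_fields(sample, doc_class))
```

A document that matches no class returns `None` from `classify`, which is the point where a Level Two process would typically route the item to a person for review.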

However, the ability to merely extract data from documents is certainly not the ultimate destination for RPA's evolution.

We see this market moving closer to achieving genuine reasoning capabilities. Level Three is characterized by a high level of machine understanding across a broader set of data inputs. Data is no longer bound up in documents alone; it also enters the organization in the form of voice recordings, videos, and still images. At this stage, Capture 2.0 solutions, like natural language processing (NLP), sentiment analysis, translation algorithms for both voice and text, and object recognition, are essential tools in the RPA ecosystem to meet customer needs. In addition, predictive algorithms are used to aid in automated decision-making, impacting business process workflows. It is only with an emphasis on user-centric design, proper architecture, Capture 2.0 technology, and systems integration that the RPA market can meet the expectations of both customers and investors.
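
One way to apply the three levels is to ask which level a given process would need, based on the kinds of data that feed it. The sketch below assumes a simple rule: the most demanding input sets the level. The input-type labels and the `required_level` helper are hypothetical names chosen for illustration, not part of any vendor's product.

```python
from enum import IntEnum


class RpaLevel(IntEnum):
    ONE = 1    # structured data, standardized workflows, simple logic
    TWO = 2    # semi-structured and unstructured documents via capture
    THREE = 3  # voice, video, and images, plus predictive decision support


# Hypothetical input-type labels mapped to the lowest level able to handle them.
REQUIRED_LEVEL = {
    "structured_record": RpaLevel.ONE,
    "semi_structured_document": RpaLevel.TWO,
    "unstructured_document": RpaLevel.TWO,
    "voice_recording": RpaLevel.THREE,
    "video": RpaLevel.THREE,
    "still_image": RpaLevel.THREE,
}


def required_level(input_types):
    """Return the highest level demanded by any input feeding the process."""
    if not input_types:
        raise ValueError("at least one input type is required")
    return max(REQUIRED_LEVEL[t] for t in input_types)


# A claims process fed by database records, scanned letters, and call audio.
claims_inputs = ["structured_record", "unstructured_document", "voice_recording"]
print(required_level(claims_inputs).name)
```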

RPA is a technology market that seems to be on a rapid growth trajectory, given the market analyst consensus and reported revenues. For example, take a look at one of the major RPA companies: Blue Prism closed its fiscal year with revenues of £55.2 million, up 125% year over year. While its losses were also 150% greater, its stock price rose by over 50%, giving it a market capitalization of £1.4 billion, more than 25 times its revenue.
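
The figures above can be checked with a few lines of arithmetic. This sketch uses only the reported revenue, growth rate, and market capitalization; the implied prior-year revenue is derived from them and is not a figure stated in this article.

```python
revenue_m = 55.2       # reported revenue, in millions of pounds
growth = 1.25          # a 125% year-over-year increase
market_cap_m = 1400.0  # 1.4 billion pounds, expressed in millions

# Revenue that rose 125% is 2.25 times the prior year's figure.
prior_revenue_m = revenue_m / (1 + growth)
revenue_multiple = market_cap_m / revenue_m

print(f"Implied prior-year revenue: £{prior_revenue_m:.1f} million")
# prints "Implied prior-year revenue: £24.5 million"
print(f"Market cap as a multiple of revenue: {revenue_multiple:.1f}x")
# prints "Market cap as a multiple of revenue: 25.4x"
```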

Harvey Spencer Associates (HSA) predicts an average annual growth rate of 40% for RPA. However, this optimistic outlook is contingent on the premise that RPA systems will move beyond basic automation of data transfer and employ more advanced Capture 2.0 technologies to enable a better understanding of data.
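
To give a sense of what a 40% average annual growth rate implies, the sketch below compounds it over a few horizons. The horizons of one, three, and five years are assumptions for illustration; the forecast period is not stated here, and `compound_multiple` is a hypothetical helper name.

```python
def compound_multiple(annual_rate, years):
    """Return how many times larger a market is after compounding growth."""
    return (1 + annual_rate) ** years


for years in (1, 3, 5):
    multiple = compound_multiple(0.40, years)
    print(f"After {years} year(s): {multiple:.2f}x the starting size")
```

At that rate, a market would be roughly 5.4 times its starting size after five years, which shows how much of the outlook depends on the growth rate being sustained.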

For those considering revamping and automating a business process workflow, it is important to:
  • Understand the process holistically
  • Question all aspects of the process
  • Examine the process in a broader sense, including all possible intersections of data, systems, and stakeholders
  • Evaluate all possible alternative solutions
  • Determine to what extent a data-driven process is definable, repeatable, and adheres to a set of rules
  • Consider available Capture 2.0 tools to advance automation process efficiency
  • Make sure the revised work process provides flexibility for changes in needs, system integration, and visibility to all stakeholders
RPA, coupled with intelligent Capture 2.0, will provide improvements in business process efficiencies. Moreover, Capture 2.0 services will also be employed to bring about customized solutions, rapid deployment, and advanced data understanding in an agile architecture. This will lead to positive advances in customer/stakeholder experience through more effective business process automation.

Mike Spang is the Vice President of Research at Harvey Spencer Associates, where he focuses on capture software and content services market analysis.