AI Hits the Document and People Barrier: Digitization and Digitalization

Source: Forbes    2020-06-28 10:08:34

Keywords: digitization, artificial intelligence

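The use of AI has reached an astonishing level. AI offers powerful capabilities in recognition, pattern and anomaly detection (https://www.cognilytica.com/2019/07/25/infographic-the-identification-pattern-of-ai/), predictive analytics (https://www.cognilytica.com/2019/08/01/infographic-the-predictive-analytics-decision-support-pattern-of-ai/), autonomous systems (https://www.cognilytica.com/2019/10/23/ai-today-podcast-112-patterns--ai-autonomous-systems/), hyperpersonalization, and goal-driven systems. But an AI system that cannot access data cannot train machine learning models, and so cannot do anything at all. And most data is locked away in documents, whether in paper form, electronic form, or human-controlled form.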

Often, a necessary first step to making any AI project happen is simply getting those documents and processes out of paper and human-based forms and into digital forms that a machine can understand. The notion of converting these analog assets into digital forms is known as digitization in the context of documents and information, and digitalization in the context of processes and human-based activities. [Not surprisingly, digitization and digitalization efforts are seeing some of their most robust activity in the context of AI-enabling systems, according to a report from analyst firm Cognilytica.](https://www.cognilytica.com/2020/06/23/digitization-and-digitalization-june-2020/) _(Disclosure: I’m a principal analyst with Cognilytica)_

**Digitization**

The general idea of digitization is the process of converting information into a digital, computer-readable format. To yield real insights, data and information need to make their way into a digital format rather than exist on paper in a physical filing cabinet. Data is the foundational layer upon which information, understanding, and insights can be gathered. Document digitization is the idea of getting information that computers can’t process into a format they can handle.

By digitizing data, organizations and agencies can extract more value from assets that are otherwise literally gathering dust and occupying space. To gain higher-level understanding from their data, including performing analyses, automating various tasks, and incorporating more intelligent and cognitive processes, they need to convert information from its non-digital form into one that computers can understand.

Examples of digitizing information include:

* Converting printed and handwritten text to digital format
* Converting audio recordings in analog formats to digital format
* Converting archival documents to digital format
* Converting video and film content to digital format

For information that relates to documents, the concept of document digitization is also known as **document capture**. The goal of document capture and document digitization is to take non-digital information and represent it in a digital manner for further processing. Many document capture systems take a digital image or sample of a print document, video, film, or other non-digital asset. The resulting digital format can then be electronically stored for further processing and analysis. Below is an example of digitization of text.

![Document capture and extraction](https://specials-images.forbesimg.com/imageserve/5ef2a029c708900006e39e4e/960x0.jpg?fit=scale)

Document capture and extraction (Image: Cognilytica)

Just as documents can be digitized, so too can audio and video assets. Analog video or audio must be transferred to digital format for organizations to be able to use it in a meaningful way such as posting to the internet or a website or transferring to someone via email or digital file shares.

Examples of audio and video digitization include:

* Converting film and magnetic video to digital format
* Converting music and magnetic audio to digital format
* Converting analog audio and video production to digital formats
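To make the idea of converting an analog recording to digital format concrete, here is a minimal sketch of the two steps such a conversion generally involves: sampling the signal at fixed intervals and quantizing each sample to an integer code. The function name `digitize_signal`, the 440 Hz test tone, and the sample rate and bit depth are illustrative assumptions, not details from the Cognilytica report.

```python
import math


def digitize_signal(analog, duration_s, sample_rate_hz, bit_depth):
    """Sample a continuous signal and quantize each sample to an integer code.

    `analog` is any function of time (in seconds) returning a value that is
    nominally within [-1.0, 1.0]; it stands in for the analog source.
    """
    levels = 2 ** bit_depth
    n_samples = round(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # sampling instant
        value = max(-1.0, min(1.0, analog(t)))      # clip to the valid range
        # Map [-1, 1] onto the integer codes 0 .. levels - 1.
        codes.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return codes


def tone(t):
    """Hypothetical analog source: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440.0 * t)


# 10 ms of audio at 8 kHz with 8 bits per sample (assumed settings).
samples = digitize_signal(tone, duration_s=0.01, sample_rate_hz=8000, bit_depth=8)
```

Higher sample rates and bit depths preserve more of the original signal at the cost of more storage, which is the basic trade-off in any audio or video digitization effort.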

Once a document has been captured, it can then be further processed and analyzed to extract more value. Post-processing activities involve content analysis and document processing beyond simple scanning and storage, including the following:

* **_Optical character recognition (OCR)_** recognizes printed text and converts it to a machine text representation.
* **_Intelligent character recognition (ICR)_** handles handwriting, hand marks (such as initials), cross-outs, and free-form information filled out by hand.
* **_Optical mark recognition (OMR)_** identifies meaningful text or handwritten indications such as ticked checkboxes, filled-in bubbles, and other indicator marks, as is useful in automated grading of exams and handling of surveys, election ballots, and the like.
* **_Optical barcode recognition (OBR)_** identifies barcodes, indexing, and other marks for high-rate data collection.
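As a rough illustration of the mark-recognition step listed above, the sketch below decides whether a bubble on a scanned form is filled by measuring how much of its region is dark. It is a toy under stated assumptions, not how any particular OMR product works: the scan is a plain grid of grayscale values (0 is black, 255 is white), and the bubble positions, the names `dark_fraction` and `read_marks`, and both thresholds are hypothetical.

```python
def dark_fraction(image, box, dark_below=128):
    """Return the share of pixels inside `box` that are darker than `dark_below`.

    `box` is (top, left, bottom, right) with bottom and right exclusive.
    """
    top, left, bottom, right = box
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    dark = sum(1 for p in pixels if p < dark_below)
    return dark / len(pixels)


def read_marks(image, boxes, fill_threshold=0.5):
    """Map each named bubble to True if enough of its area is dark."""
    return {
        name: dark_fraction(image, box) >= fill_threshold
        for name, box in boxes.items()
    }


WHITE, BLACK = 255, 0
# Toy 4x8 "scan": the left bubble is filled in, the right one has a stray speck.
scan = [
    [WHITE, BLACK, BLACK, WHITE, WHITE, WHITE, WHITE, WHITE],
    [WHITE, BLACK, BLACK, WHITE, WHITE, WHITE, BLACK, WHITE],
    [WHITE, BLACK, BLACK, WHITE, WHITE, WHITE, WHITE, WHITE],
    [WHITE, WHITE, WHITE, WHITE, WHITE, WHITE, WHITE, WHITE],
]
boxes = {"A": (0, 1, 3, 3), "B": (0, 5, 3, 7)}  # assumed bubble locations
marks = read_marks(scan, boxes)
```

Using a fill threshold rather than requiring any dark pixel keeps a stray speck, like the one in bubble B, from being read as a deliberate mark.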

**Digitization vs. Digitalization**

Digitalization expands upon the idea of digitization by addressing processes that have previously been dependent on non-digital information. Digitalization is focused on capturing those processes and encoding them in a digital-centric manner. The chart below shows the differences between digitization, digitalization, and digital transformation.

![Digitization vs. Digitalization vs. Digital Transformation](https://specials-images.forbesimg.com/imageserve/5ef2a04ac708900006e39e51/960x0.jpg?fit=scale)

Digitization vs. Digitalization vs. Digital Transformation (Image: Cognilytica)

Digitalizing processes allows companies and governments alike to enhance services, save money, and improve citizens’ quality of life. The move to digital signatures in the banking, mortgage, and insurance industries provides a good example of digitalization of processes. The movement to e-filing of tax documents and check-scanning for digital and mobile banking are additional examples of processes that have been digitalized through the exchange of digital documents.

Examples of digitalization include:

* “Capture” of existing human and document-based workflows into computer-based representations of those workflows for later automation or analysis
* Automation of existing human-based processes
* Process analysis and process management tools that can provide visibility into effectiveness and efficiency of workflow steps
* Applying advanced analytics and value-add technologies to multi-step document-based interactions
* Improvement of processes that have previously been manual to be centered on digital exchange of information (e.g., digital signatures)

One way to handle the movement of paper and human-based processes to digital ones is to capture and automate existing processes. Robotic Process Automation (RPA) technology provides benefits here by taking existing processes that have previously required manual activity via computer interfaces and transitioning them to automated software-based processes that accomplish repetitive tasks. While RPA solutions don’t aim to modify existing workflows as a primary benefit, they do help to remove the human element from the equation, making those processes more efficient and effective.

In addition to process automation, companies looking to digitalize processes can also use process mining and discovery software to analyze existing workflows, provide insights into opportunities to improve those workflows and make them more efficient, and add more monitoring and management to human-based workflows as they exist. These “**Process Capture**” tools are capable of recording and documenting existing human-based workflows into a machine-understandable format for later automation or analysis.
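One way to picture what a machine-understandable record of a workflow might look like is an event log: one row per step, tagged with the case it belongs to. The sketch below counts how often one activity directly follows another within the same case, a basic summary used in process mining. The loan-processing activities, case IDs, and the function name `directly_follows` are invented for illustration and are not taken from the article or from any specific tool.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity A is immediately followed by activity B.

    `event_log` is a list of (case_id, activity) pairs in chronological order.
    """
    traces = defaultdict(list)
    for case_id, activity in event_log:
        traces[case_id].append(activity)   # rebuild each case's sequence of steps

    counts = Counter()
    for steps in traces.values():
        for current, following in zip(steps, steps[1:]):
            counts[(current, following)] += 1
    return counts


# Hypothetical captured workflow: two loan applications handled by staff.
log = [
    ("loan-1", "receive application"),
    ("loan-2", "receive application"),
    ("loan-1", "check documents"),
    ("loan-2", "check documents"),
    ("loan-1", "sign contract"),
    ("loan-2", "request missing page"),
    ("loan-2", "check documents"),
    ("loan-2", "sign contract"),
]
follows = directly_follows(log)
```

In this toy log the counts expose a rework loop ("check documents" to "request missing page" and back), the kind of inefficiency that process analysis aims to surface before a workflow is automated.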

**The Relationship between Digitalization and Digital Transformation**

In addition to the concepts of digitization and digitalization, there’s another term that often gets wrapped into and confused with those terms: digital transformation. **Digital transformation** is a broad idea that has been around for several decades. The concept of digital transformation is the strategic and fundamental change to an organization’s operations such that they are driven by digital processes, technologies, and methods to enable high rates of efficiency and operation. Forward-thinking organizations are taking advantage of the tremendous advancements in computing, storage, and software technology to digitally enable their workforce, and in the process are achieving substantial productivity gains, time savings, and improved citizen or customer satisfaction.

Digital transformation rests on a foundation of digital information (digitization) and digital processes (digitalization). It builds on these to change the very nature of the operation. Rather than simply storing more data and automating existing systems and processes, organizations add intelligence to their strategies and put cognitive technology to work on the more complicated challenges in their work environment that simple automation won’t solve. Organizations that have successfully digitally transformed their operations have reduced the friction between customer and stakeholder needs and the organization’s ability to satisfy those needs efficiently.

**Digitization as a necessary first step for many AI projects**

At first glance, it may seem that digitization has nothing to do with AI. However, digitization is a necessary first step to extracting value from data that is locked in non-digital assets or human-based processes. By first digitizing documents and then digitalizing processes, organizations can extract greater value and tackle harder business problems of greater strategic value. Without the foundational layer of digitization, organizations can’t apply higher-level technology such as AI and ML to extract additional value. After all, data is the foundational layer upon which information, understanding, and insights can be gathered.
