


Often, a necessary first step to making any AI project happen is simply getting those documents and processes out of paper and human-based forms and into digital forms that a machine can understand. Converting these analog assets into digital form is known as digitization in the context of documents and information, and as digitalization in the context of processes and human-based activities. [Not surprisingly, digitization and digitalization efforts are seeing some of their most robust activity in the context of AI-enabling systems, according to a report from analyst firm Cognilytica.](https://www.cognilytica.com/2020/06/23/digitization-and-digitalization-june-2020/) _(Disclosure: I’m a principal analyst with Cognilytica.)_




Digitization is the process of converting information into a digital, computer-readable format. To yield real insights, your data and information need to make their way into a digital format rather than sit on paper in a physical filing cabinet. Data is the foundational layer upon which information, understanding, and insights are built. Document digitization means getting information that computers can’t process into a format they can handle.


By digitizing data, organizations and agencies can extract more value from assets that are otherwise literally gathering dust and occupying space. To gain higher-level understanding from their data, including performing analyses, automating various tasks, and incorporating more intelligent and cognitive processes, they need to convert information from its non-digital form into one that computers can understand.


Examples of digitizing information include:


* Converting printed and handwritten text to digital format

* Converting audio recordings in analog formats to digital format

* Converting archival documents to digital format

* Converting video and film content to digital format

For information that relates to documents, document digitization is also known as **document capture**. The goal of document capture and document digitization is to take non-digital information and represent it digitally for further processing. Many document capture systems take a digital image or sample of a print document, video, film, or other non-digital asset. The resulting digital file can then be stored electronically for further processing and analysis. Below is an example of digitization of text.


![Document capture and extraction](https://specials-images.forbesimg.com/imageserve/5ef2a029c708900006e39e4e/960x0.jpg?fit=scale)

Document capture and extraction (Source: Cognilytica)


Just as documents can be digitized, so too can audio and video assets. Analog video or audio must be transferred to a digital format before organizations can use it in a meaningful way, such as posting it to a website or sharing it via email or digital file shares.


Examples of audio and video digitization include:


* Converting film and magnetic video to digital format

* Converting music and magnetic audio to digital format

* Converting analog audio and video production to digital formats
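
As a rough illustration of what converting analog audio to a digital format involves, the sketch below samples a continuous signal at fixed intervals and rounds each sample to an integer level. It is a minimal sketch, not a description of any particular capture product: the function names `digitize` and `tone`, the 440 Hz test tone, and the sample rate and bit depth are all assumptions chosen for illustration.

```python
import math
from typing import Callable, List


def digitize(signal: Callable[[float], float], duration_s: float,
             sample_rate_hz: int, bit_depth: int) -> List[int]:
    """Sample signal(t), expected in [-1.0, 1.0], and quantize to signed integers."""
    max_level = 2 ** (bit_depth - 1) - 1          # e.g. 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # time of the n-th sample
        value = max(-1.0, min(1.0, signal(t)))    # clip out-of-range input
        samples.append(round(value * max_level))  # quantize to an integer level
    return samples


def tone(t: float) -> float:
    """Hypothetical stand-in for an analog source: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440.0 * t)


# 10 ms of audio at 8 kHz, 16 bits per sample (illustrative values).
pcm = digitize(tone, duration_s=0.01, sample_rate_hz=8000, bit_depth=16)
print(len(pcm), pcm[:5])
```

A higher sample rate and bit depth preserve more of the original signal at the cost of larger files, which is the basic trade-off in any audio or video digitization effort.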

Once a document has been captured, it can then be further processed and analyzed to extract more value. Post-processing activities involve content analysis and document processing beyond simple scanning and storage, including the following:


* **_Optical character recognition (OCR)_** recognizes printed text and converts it to a machine-readable text representation.

* **_Intelligent character recognition (ICR)_** handles handwriting, hand marks (such as initials), cross-outs, and free-form information filled out by hand.

* **_Optical mark recognition (OMR)_** identifies meaningful marks or handwritten indications such as ticked checkboxes, filled-in bubbles, and other indicator marks, which is useful in automated grading of exams and in handling surveys, election ballots, and the like.

* **_Optical barcode recognition (OBR)_** identifies barcodes, indexing, and other marks for high-rate data collection.
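
To make one of these post-processing steps concrete, here is a minimal sketch of the idea behind optical mark recognition: decide whether a bubble is filled by measuring how much of its image region is dark. It assumes the bubble regions have already been located and cropped from the scan; the names `fill_ratio`, `is_marked`, and `read_answer` and the threshold values are hypothetical, and real OMR systems are considerably more involved.

```python
from typing import Dict, List, Optional

# Hypothetical grayscale patch of one bubble: 0 = black ink, 255 = white paper.
Patch = List[List[int]]


def fill_ratio(patch: Patch, dark_threshold: int = 128) -> float:
    """Return the fraction of pixels in the patch darker than the threshold."""
    pixels = [value for row in patch for value in row]
    if not pixels:
        return 0.0
    dark = sum(1 for value in pixels if value < dark_threshold)
    return dark / len(pixels)


def is_marked(patch: Patch, min_fill: float = 0.5) -> bool:
    """Treat a bubble as marked when at least min_fill of its pixels are dark."""
    return fill_ratio(patch) >= min_fill


def read_answer(bubbles: Dict[str, Patch]) -> Optional[str]:
    """Return the single marked choice, or None if zero or several are marked."""
    marked = [label for label, patch in bubbles.items() if is_marked(patch)]
    return marked[0] if len(marked) == 1 else None


# Tiny 2x2 patches stand in for cropped regions of a scanned exam sheet.
filled = [[20, 30], [10, 40]]
empty = [[250, 240], [255, 245]]
print(read_answer({"A": empty, "B": filled, "C": empty}))
```

Returning `None` for ambiguous rows, rather than guessing, reflects the usual practice of routing unclear marks to a human reviewer.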

**Digitization vs. Digitalization**


Digitalization expands upon the idea of digitization by addressing processes that have previously depended on non-digital information. It focuses on capturing those processes and encoding them in a digital-centric manner. The chart below shows the differences between digitization, digitalization, and digital transformation.


![Digitization vs. Digitalization vs. Digital Transformation](https://specials-images.forbesimg.com/imageserve/5ef2a04ac708900006e39e51/960x0.jpg?fit=scale)

Digitization vs. Digitalization vs. Digital Transformation (Source: Cognilytica)


Digitalizing processes allows companies and governments alike to enhance services, save money, and improve citizens’ quality of life. The move to digital signatures in the banking, mortgage, and insurance industries is a good example of the digitalization of processes. E-filing of tax documents and check scanning for digital and mobile banking are additional examples of processes that have been digitalized through the exchange of digital documents.


Examples of digitalization include:


* “Capture” of existing human and document-based workflows into computer-based representations of those workflows for later automation or analysis

* Automation of existing human-based processes

* Process analysis and process management tools that can provide visibility into the effectiveness and efficiency of workflow steps

* Applying advanced analytics and value-add technologies to multi-step document-based interactions

* Improvement of processes that have previously been manual so that they center on digital exchange of information (e.g., digital signatures)

One way to handle the movement of paper and human-based processes to digital ones is to capture and automate existing processes. Robotic Process Automation (RPA) technology provides benefits here by taking existing processes that previously required manual activity via computer interfaces and transitioning them to automated, software-based processes that accomplish repetitive tasks. While RPA solutions don’t aim to modify existing workflows as a primary benefit, they do help to remove the human element from the equation, making those processes more efficient and effective.


In addition to process automation, companies looking to digitalize processes can also use process mining and discovery software to analyze existing workflows, identify opportunities to make those workflows more efficient, and add more monitoring and management to human-based workflows as they exist. These “**Process Capture**” tools are capable of recording and documenting existing human-based workflows in a machine-understandable format for later automation or analysis.
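
One simple way to picture a machine-understandable representation of a workflow is an event log reduced to counts of which step directly follows which. The sketch below is an illustration only, not how any specific process mining product works; the log format `(case_id, timestamp, activity)`, the function name `directly_follows`, and the loan-processing activities are all assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical event log entry: (case_id, timestamp, activity).
Event = Tuple[str, int, str]


def directly_follows(events: List[Event]) -> Counter:
    """Count how often one activity is immediately followed by another within a case."""
    by_case: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for case_id, timestamp, activity in events:
        by_case[case_id].append((timestamp, activity))

    counts: Counter = Counter()
    for steps in by_case.values():
        # Order each case's steps by timestamp before pairing neighbors.
        ordered = [activity for _, activity in sorted(steps)]
        for current, following in zip(ordered, ordered[1:]):
            counts[(current, following)] += 1
    return counts


log = [
    ("loan-1", 1, "receive application"),
    ("loan-1", 2, "verify signature"),
    ("loan-1", 3, "approve"),
    ("loan-2", 1, "receive application"),
    ("loan-2", 2, "verify signature"),
    ("loan-2", 3, "request correction"),
    ("loan-2", 4, "verify signature"),
    ("loan-2", 5, "approve"),
]

for (current, following), count in directly_follows(log).most_common():
    print(f"{current} -> {following}: {count}")
```

Pairs that loop back, such as a correction request returning to signature verification, are the kind of rework that this type of analysis can surface as a candidate for improvement or automation.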


**The Relationship between Digitalization and Digital Transformation**


In addition to the concepts of digitization and digitalization, there’s another term that often gets wrapped into and confused with those terms: digital transformation. **Digital transformation** is a broad idea that has been around for several decades. It refers to the strategic and fundamental change to an organization’s operations such that they are driven by digital processes, technologies, and methods that enable highly efficient operation. Forward-thinking organizations are taking advantage of the tremendous advancements in computing, storage, and software technology to digitally enable their workforce, and in the process are achieving substantial productivity gains, time savings, and improved citizen or customer satisfaction.

Digital transformation rests on a foundation of digital information (digitization) and digital processes (digitalization). It builds upon these to change the very nature of the operation, moving beyond simply storing more data and automating existing systems and processes. Organizations add intelligence to their strategies and put the power of cognitive technology to work on the more complicated challenges in their work environment that simple automation won’t solve. Organizations that have successfully digitally transformed their operations have reduced the friction between customer and stakeholder needs and the organization’s ability to satisfy those needs efficiently.


**Digitization as a necessary first step for many AI projects**


At first glance, it may seem that digitization has nothing to do with AI. However, digitization is a necessary first step to extracting value from data that is locked in non-digital assets or human-based processes. By first digitizing and then digitalizing processes and documents, organizations can extract greater value and tackle increasingly harder business problems of increasingly greater strategic value. Without the foundational layer of digitization, organizations can’t apply higher-level technology such as AI and ML to extract additional value. After all, data is the foundational layer upon which information, understanding, and insights are built.









