Digitization of government records is an age-old problem for governments. While most departments at both the central and state levels have spent thousands of crores on digitization, a substantial number of government records are yet to be digitized. Now the government wants to crowdsource the entire process of digitization. It has launched the ‘Digitize India Platform’ as part of the Digital India initiative.
Digitization of government records plays an important role in effective governance. For many years, governments have mandated that departments digitize their documents. Despite these efforts, digitization of legacy government documents has not progressed at the expected pace. Even where digitization took place, the documents were in most cases simply scanned, which leaves them non-searchable. Such mere scanning in fact goes against the principles of open data. Most departments have already spent thousands of crores on digitizing records. Now the government wants to crowdsource the entire process by seeking citizens’ help in digitizing government records. As part of the Digital India initiative, it has launched the ‘Digitize India Platform’.
What is it all about?
According to the description on its website, the Digitize India Platform (DIP) will provide digitization services for scanned document images or physical documents for any organization. The aim is to digitize and make usable all existing content across different formats, media and languages, and to create data extracts for document management, IT applications and records management. DIP is described as an innovative solution that combines machine intelligence with a cost-effective crowdsourcing model. It features a secure, automated platform for processing and extracting relevant data from document images in a format usable for metadata tagging, IT application processing and analysis.
The Three Stakeholders
The government has identified three important stakeholders: the User Organization, the Digital Contributor and the Platform Operator.
Government departments, public sector organizations and autonomous bodies can become User Organizations and use the platform. A User Organization can submit its records for digitization to the Platform Operator. The records should preferably be in a scanned image format; organizations that wish to submit physical records will have to pay for scanning separately.
The scanned documents are then processed as follows:
- All scanned images are shredded into snippets containing meaningful data
- Shredding is done as per the organization’s requirements for data digitization
- Each document’s metadata is maintained throughout the life cycle of the document
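The shredding step above can be sketched roughly as follows. This is only an illustration, not the platform’s actual implementation: the `Snippet` class, the `shred` function, the template layout (a field name mapped to a pixel box) and the list-of-rows image representation are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    """One cropped region of a page, carrying its document metadata."""
    doc_id: str    # hypothetical document identifier
    page: int      # page number within the document
    field: str     # which template field this region holds
    pixels: tuple  # cropped rows of pixel values


def shred(doc_id, page, image, template):
    """Cut one page image into snippets, one per template field.

    image    -- list of rows, each row a list of pixel values (assumed format)
    template -- dict mapping field name to (top, left, bottom, right),
                standing in for the organization's requirements
    """
    snippets = []
    for field, (top, left, bottom, right) in template.items():
        region = tuple(tuple(row[left:right]) for row in image[top:bottom])
        # The metadata travels with every snippet so that the document
        # can be put back together later.
        snippets.append(Snippet(doc_id, page, field, region))
    return snippets


# Usage: a tiny 4x6 "page" with two fields defined by the template.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
template = {"name": (0, 0, 2, 3), "date": (2, 3, 4, 6)}
snippets = shred("DOC-1", 1, image, template)
```

Keeping the document identifier, page and field name on each snippet is one simple way to preserve metadata through the life cycle of the document, even when snippets are handled out of order.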
Any Indian citizen with an Aadhaar number can become a Digital Contributor (DC) and perform simple data entry tasks on the DIP. For every verified and correct task, the contributor earns reward points, which can be redeemed for money or donated to the Digital India initiative. The platform serves random snippets to the contributor.
The Platform Operator helps with onboarding User Organizations, pre-processing the scanned document images, creating templates for the pages being digitized and delivering the digitized data to the User Organization. The Platform Operator also remunerates Digital Contributors for the reward points they earn. Broadly, the platform works as follows:
- The platform randomly serves snippets to contributors
- The converted data for each snippet is matched in the match engine
- Correct entries earn reward points for each word digitized correctly
- The platform organizes the snippet text digitized by contributors
- Documents are reassembled and returned to the organizations
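The workflow above can be sketched in a few lines of Python. The source does not describe how the match engine decides that an entry is correct, so this sketch assumes that the same snippet is served to several contributors and that an entry is accepted once at least two of them agree. The function names, the `min_agree` threshold and the `POINTS_PER_WORD` rate are hypothetical.

```python
import random
from collections import Counter, defaultdict

POINTS_PER_WORD = 1  # hypothetical reward rate, not stated by the platform


def serve_random(snippet_ids, rng):
    """Pick a random snippet to show to a contributor."""
    return rng.choice(snippet_ids)


def match(entries, min_agree=2):
    """Compare the text typed by several contributors for one snippet.

    entries -- dict mapping contributor id to typed text
    Returns (accepted_text, rewards), or (None, {}) when too few agree.
    On a tie, the text seen first wins (a simplification).
    """
    if not entries:
        return None, {}
    # Collapse runs of whitespace so spacing differences do not count.
    normalized = {c: " ".join(t.split()) for c, t in entries.items()}
    text, votes = Counter(normalized.values()).most_common(1)[0]
    if votes < min_agree:
        return None, {}
    points = POINTS_PER_WORD * len(text.split())
    rewards = {c: points for c, t in normalized.items() if t == text}
    return text, rewards


def reassemble(accepted):
    """Group accepted (doc_id, field, text) triples back into documents."""
    docs = defaultdict(dict)
    for doc_id, field, text in accepted:
        docs[doc_id][field] = text
    return dict(docs)


# Usage: three contributors type the same snippet; two of them agree.
text, rewards = match({"a": "New  Delhi", "b": "New Delhi", "c": "New Dehli"})
documents = reassemble([("D1", "city", text), ("D1", "name", "Asha")])
```

Agreement between independent contributors is only one possible matching rule; the real match engine may also compare entries against machine-recognized text, which the description of DIP as combining machine intelligence with crowdsourcing suggests but does not spell out.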
How does it work?
Any Indian citizen who wishes to become a contributor can sign up with the platform. A valid Aadhaar number and a bank account are mandatory prerequisites for signing up.
Once registered, the user can take an assessment of data entry and language proficiency, and can also practice to hone these skills.
Once the assessment is complete, the data to be digitized is sent to the contributor as images. The contributor is expected to enter the data exactly as it appears in the image and submit it.
The platform then verifies the submitted work, and the contributor earns reward points for each approved submission. The reward points are then converted to earnings and transferred to the contributor’s bank account.
The platform also has a mobile application. Those with a smartphone can download the app and continue data entry on the move. All one needs is internet access.
The PaaS solution for digitization of records
The government believes that this is an innovative cloud-based PaaS (Platform as a Service) solution for digitization and hopes that more and more organizations will use the platform to digitize their records. As of now, there are more than 4000 registered contributors. More than 1.7 lakh documents have been scanned and more than 16 lakh snippets have been served by the platform. The government is calling it the ‘Pixel to Data’ transformation.
The entire process is captured in this video.