In this guide you’ll learn how to automatically OCR PDFs into searchable PDFs using Power Automate. Using Muhimbi’s OCR technology, text in scanned PDFs is automatically recognized and overlaid onto your PDF document. Once the flow has run, you’ll be able to search and copy text in your PDF document.
Steps to OCR your PDF using Power Automate:
- Create a flow in Power Automate
- Define your action (Convert to OCRed PDF)
- Create File (define where your file will be created)
- Publish your workflow
Note that this example uses Power Automate to automatically OCR a file stored in SharePoint. You can just as easily save the OCRed PDF to another destination such as Dropbox, Google Drive, OneDrive, or any platform supported by Power Automate. You can also extend your flow with additional automation, such as sending the searchable PDF in an email.
Using Power Automate to Convert to PDF
This example takes you through converting an image (*.png) file to a searchable PDF and saving it to the MS SharePoint library. At a high level, the Flow consists of a SharePoint trigger followed by three actions: ‘Get file content’, ‘Convert to OCRed PDF’, and ‘Create file’.
The steps to create the Flow are as follows:
Create a new Flow and use the SharePoint Online trigger ‘When a file is created (properties only)’. Fill out the URL for the site collection and select the relevant SharePoint Site Address, Library Name, and Folder from the dropdown menu.
Insert MS SharePoint's ‘Get file content’ action and fill it out as per the screenshot displayed below. You will need to substitute the Site Address with a suitable value and the File Identifier with the output value of the ‘When a file is created (properties only)’ trigger.
Insert Muhimbi's ‘Convert to OCRed PDF’ action and fill it out as per the screenshot displayed below.
Source file name: Name of the source file including extension.
Source file content: Content of the file to OCR. Select ‘File Content’, which is the output value of the ‘Get file content’ action.
Language: Select the language of the text in the file to OCR. In our case, we select ‘English’.
Insert an MS SharePoint ‘Create file’ action and fill it out as per the screenshot displayed below. This will write the OCRed PDF back to the document library.
Site Address: Select the address of the site that contains the MS SharePoint library to which the OCRed PDF needs to be written.
Folder Path: Select the MS SharePoint library to which the OCRed PDF must be written.
File Name: This is the name to be given to the new PDF. In our case, we keep the original file name and add a PDF extension: select ‘Base file name’, which is an output of the ‘Convert to OCRed PDF’ action, and append ‘.pdf’ to it.
File Content: Select ‘Processed file content’, which is an output of the ‘Convert to OCRed PDF’ action.
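To make the File Name rule concrete, here is a minimal Python sketch of the same naming logic. It is not part of the Flow and does not call any Muhimbi or Power Automate API. The function name `ocr_output_name` is hypothetical, and it assumes that ‘Base file name’ is simply the source file name with its last extension removed.

```python
from pathlib import PurePosixPath


def ocr_output_name(source_file_name: str) -> str:
    """Return the source file name with its last extension replaced by .pdf."""
    # Assumption: this mirrors what 'Base file name' holds in the Flow.
    base = PurePosixPath(source_file_name).stem
    return f"{base}.pdf"


print(ocr_output_name("scan.png"))
print(ocr_output_name("invoice.2024.png"))
```

Only the last extension is dropped, so a name that contains other dots keeps them.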
Publish the workflow and upload a *.png file to the specified document library. After a few seconds, the Flow will trigger and the OCRed PDF will be created in the MS SharePoint library.
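The data flow through the finished Flow can be summarized with a small, self-contained Python model. This is only a sketch under stated assumptions: a dictionary stands in for the SharePoint library, and `fake_ocr`, `OcrResult`, and `run_flow` are hypothetical names invented for illustration. Nothing here calls the real Muhimbi or SharePoint connectors.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable, Dict


@dataclass
class OcrResult:
    base_file_name: str            # stands in for the 'Base file name' output
    processed_file_content: bytes  # stands in for 'Processed file content'


OcrAction = Callable[[str, bytes, str], OcrResult]


def run_flow(file_name: str, library: Dict[str, bytes],
             ocr: OcrAction, language: str = "English") -> str:
    """Model the three actions that run after the trigger fires."""
    content = library[file_name]                # 'Get file content'
    result = ocr(file_name, content, language)  # 'Convert to OCRed PDF'
    target = f"{result.base_file_name}.pdf"     # File Name with PDF extension
    library[target] = result.processed_file_content  # 'Create file'
    return target


def fake_ocr(name: str, content: bytes, language: str) -> OcrResult:
    """Hypothetical stand-in for the OCR action; it performs no real OCR."""
    return OcrResult(PurePosixPath(name).stem, b"%PDF-" + content)


library = {"scan.png": b"image-bytes"}
created = run_flow("scan.png", library, fake_ocr)
print(created, sorted(library))
```

The source image stays in the library and the new PDF is written alongside it, which matches what you should see in the document library after the Flow has run.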