In this guide, you’ll learn how to automatically OCR PDFs into searchable PDFs using Power Automate and Muhimbi PDF Connector. Using Muhimbi’s OCR technology, text from scanned PDFs is automatically recognized and overlaid into your PDF document. Once completed, you’ll be able to search and copy text in your PDF document.
Steps to OCR your PDF using Power Automate:
- Create a flow in Power Automate
- Define your action
- Create a file
- Publish your workflow
Note that this example uses Power Automate to OCR a file stored in SharePoint. However, you can just as easily save the OCRed PDF files to another destination, such as Dropbox, Google Drive, OneDrive, or any other platform supported by Power Automate. You can also extend your flow with further automation, such as sending the searchable PDF in an email.
Using Power Automate to Convert to PDF
This example takes you through converting an image (.png) file to PDF and updating it in the MS SharePoint library. From a high-level perspective, the flow looks like what's shown in the image below.
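The flow is a linear pipeline: a trigger fires, the file content is fetched, the content is OCRed, and the result is written back. The shape of that pipeline can be sketched in Python as below. This is only an illustrative model, not Power Automate or Muhimbi code: `CreatedFile`, `OcrResult`, `run_flow`, and the three callables passed in are hypothetical stand-ins for the trigger output and the actions described in the steps that follow.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CreatedFile:
    # Hypothetical stand-in for the trigger output (properties only, no content).
    identifier: str
    name_with_extension: str


@dataclass
class OcrResult:
    # Hypothetical stand-in for the two OCR action outputs used in this guide.
    base_file_name: str
    processed_file_content: bytes


def run_flow(
    created: CreatedFile,
    get_file_content: Callable[[str], bytes],
    convert_to_ocred_pdf: Callable[[str, bytes, str], OcrResult],
    create_file: Callable[[str, bytes], None],
    language: str = "English",
) -> str:
    """Mirror the flow: trigger -> get file content -> OCR -> create file."""
    # Step 2: the trigger only carries properties, so fetch the content by identifier.
    content = get_file_content(created.identifier)
    # Step 3: OCR the content; the source name must include its extension.
    result = convert_to_ocred_pdf(created.name_with_extension, content, language)
    # Step 4: write the processed content back under the base name plus ".pdf".
    target_name = result.base_file_name + ".pdf"
    create_file(target_name, result.processed_file_content)
    return target_name
```

Passing the three actions in as callables keeps the sketch independent of any real service, so it can be exercised with simple fake functions.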
1. Creating a New Flow
Create a new flow and use the When a file is created (properties only) SharePoint Online trigger. Fill out the URL for the site collection and select the relevant SharePoint Site Address, Library Name, and Folder from the dropdown menu.
2. Getting the File Content
Insert MS SharePoint's Get file content action and fill it out as shown in the screenshot displayed below. Substitute the Site Address field with a suitable value and the File Identifier field with the output value of the When a file is created (properties only) action.
3. Converting to OCRed PDF
Insert Muhimbi's Convert to OCRed PDF action and fill it out as shown in the screenshot below.
- Source file name — Name of the source file, including the extension.
- Source file content — Content of the file to OCR. Select Body, which is the output value of the Get file content action.
- Language — Select the language of the file. This example uses English.
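As a rough illustration of the three inputs listed above, the following Python sketch models them as a small data structure and checks the two requirements the list states: the source file name includes an extension, and there is content to OCR. The `OcrInputs` class and its field names are hypothetical and chosen only to mirror the action's labels; they are not part of the Muhimbi connector or its API.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrInputs:
    # Hypothetical model of the action's inputs; not a Muhimbi API type.
    source_file_name: str        # name of the source file, including the extension
    source_file_content: bytes   # body returned by the Get file content action
    language: str = "English"    # this example uses English

    def validate(self) -> None:
        """Raise ValueError if the inputs do not match the requirements above."""
        _, extension = os.path.splitext(self.source_file_name)
        if extension in ("", "."):
            raise ValueError("source file name must include the extension")
        if not self.source_file_content:
            raise ValueError("source file content is empty")
```

Calling `OcrInputs("scan.png", content).validate()` passes for a non-empty `content`, while a name such as `"scan"` is rejected because it has no extension.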
4. Creating a File
Insert an MS SharePoint Create file action and fill it out as shown in the screenshot below. This will write the OCRed PDF back to the document library.
- Site Address — Select the site address of the MS SharePoint library to which the OCRed PDF needs to be written.
- Folder Path — Select the folder of the MS SharePoint library to which the OCRed PDF must be written.
- File Name — This is the name to be given to the new PDF. Select the original file name and add a PDF extension. In this scenario, it's Base file name, which is the output of the Convert to OCRed PDF action. Ensure you add the .pdf extension.
- File Content — Select Processed file content, which is the output of the Convert to OCRed PDF action.
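The File Name rule above (base file name plus a .pdf extension) can be expressed as a small helper. This is a minimal sketch in Python, assuming the base file name arrives without an extension, as the guide describes. The function name `output_file_name` is hypothetical, and the guard against a name that already ends in .pdf is an added precaution rather than something the connector is known to require.

```python
def output_file_name(base_file_name: str) -> str:
    """Build the name for the new PDF from the OCR action's base file name."""
    name = base_file_name.strip()
    if not name:
        raise ValueError("base file name is empty")
    # Precaution (an assumption): avoid "scan.pdf.pdf" if the name already has the extension.
    if name.lower().endswith(".pdf"):
        return name
    return name + ".pdf"
```

For a base file name of `scan`, the helper yields `scan.pdf`, which is the value the File Name field should end up with.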
5. Publishing the Workflow
Publish the workflow and upload a .png file in the specified document library. After a few seconds, the flow will trigger and the OCRed PDF will be created in your MS SharePoint library.