OCR PDFs & Images with Nintex Workflows

We recently released the Muhimbi PDF Converter Xtension for Nintex Automation Cloud. You can download it here or learn more about available Muhimbi deployments for Nintex on our product page.

We have worked with Nintex Workflow ever since we first integrated it with the PDF Converter for SharePoint On-Premises, going all the way back to MS SharePoint 2007.

One workaround that we have recommended to our customers over the years is to create an MS SharePoint Designer Workflow using our workflow actions and invoke it from Nintex Workflow for Office 365. However, this does not leverage the full power of Nintex Workflow for Office 365 and the Muhimbi PDF Converter for SharePoint Online.

A better way to combine the two is to integrate the functionality exposed by the PDF Converter for SharePoint Online directly into a Nintex Workflow by invoking our comprehensive REST API.

When we first released the Muhimbi PDF Converter for SharePoint Online in early 2015, we were very much aware that, due to technical limitations in the MS SharePoint Online platform, it would not be possible to bring the full power of our existing on-premises (SP2007-SP2016) products to the cloud. The first release focused on the most important elements (the PDF Conversion user interface and Workflow Actions for MS SharePoint Designer), but one of the key features of our on-premises products was missing: an API to allow integration with third-party solutions and software partners.

Although our on-premises products expose a comprehensive Web Services (SOAP) interface, it is less suitable for online, subscription-based services. We therefore now offer a much simpler REST-based interface, as that is how modern systems, especially cloud-based products, talk to each other. This REST-based service was launched earlier this year as part of the PDF Converter Services product. It is a separate product that has no dependencies on MS SharePoint and can be used from platforms and languages such as Microsoft Flow, Azure Logic Apps, C#, Java, PHP, JavaScript, Python, Ruby and many others, including Nintex Workflow for Office 365. Although available as a stand-alone subscription, the service is automatically included in each PDF Converter for SharePoint Online subscription at no additional charge.

Using Nintex Workflow to Convert Scans and Images to PDF

Prerequisites

Before you begin, please make sure the following prerequisites are in place:

Please note that this article is for the MS SharePoint Online version of Nintex Workflow.

Building the Workflow

It is strongly recommended to follow the tutorial below, but the workflow is also available for download. Import it into Nintex Workflow for Office 365, set the API key, publish it, and you are ready to go.

  1. Navigate to a site collection and document library of your choice, and choose the option to create a new Nintex Workflow. In this example, we use the standard Document Library that is available on most site collections.

  2. Create the following workflow variables as we need them later:

  • JSON (Text): Contains the JSON (JavaScript Object Notation) command that will be sent to the conversion service.

  • API_KEY (Text): A unique ID that will be used to look up your Muhimbi subscription details.

  • ResponseText (Text): The status message returned by the Conversion Service.

  • ResponseCode (Integer): The status code returned by the Conversion Service.


  1. Insert a Set Workflow Status action and set it to ‘Started’. MS SharePoint Online does not show a separate status, so this action confirms that the workflow has actually been triggered and gives us something to click on to inspect the current status of the workflow.


  2. You can then add a Build String action and set the Output to the JSON workflow variable. In the String field enter the following:

You need to pay attention to the following:

  • JSON Notation: We have replaced the curly braces { } with square brackets [ ] due to a bug in Nintex Workflow for Office 365. If you are concerned about using square brackets (they are also used for JSON array types), you can use any other placeholder characters instead, as long as you search for those same characters in the follow-up step that restores the curly braces.
  • Copy & Paste: When copying the JSON code, paste it into Notepad first and copy it back from there to strip out non-standard characters and formatting.
  • References: The text displayed in red represents Nintex Workflow references. After pasting the code fragment, replace each one with an actual Nintex reference using the Advanced Lookup facility located below the field.
  • Output file name: In this basic example, we simply add '.pdf' to the end of the output path and file name. This is not particularly pretty, but to keep things simple we are not including the Nintex Workflow actions needed to strip off the old extension and add the new one. You can use whatever you like here as long as it is a valid output path and file name.
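To illustrate the output file name note above, here is a minimal Python sketch of the two naming approaches. It only demonstrates the string logic; inside Nintex you would use workflow actions instead. The function names and the sample path are hypothetical and not part of the Muhimbi or Nintex products.

```python
import posixpath


def basic_output_path(source_path: str) -> str:
    """Approach used in this basic example: append '.pdf' to the full name."""
    return source_path + ".pdf"


def clean_output_path(source_path: str) -> str:
    """Alternative: strip the old extension first, then add '.pdf'."""
    stem, _old_extension = posixpath.splitext(source_path)
    return stem + ".pdf"


# Hypothetical server relative URL, used purely for illustration.
source = "/sites/demo/Shared Documents/scan.tiff"
print(basic_output_path(source))  # /sites/demo/Shared Documents/scan.tiff.pdf
print(clean_output_path(source))  # /sites/demo/Shared Documents/scan.pdf
```

Note that with the second approach a source file that is already a PDF maps onto its own path, so the output would target the same file name as the input.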
  1. Because we used square brackets in the JSON in the earlier step, we now need to replace them with curly braces again. Use a Replace Substring in String action and configure it as follows:
  • Search String: Enter the opening square bracket [.
  • Replace String: Enter the opening curly brace {.
  • String: Insert a reference to the workflow variable named JSON.
  • Output: Pick the JSON workflow variable to store the results in.

Click the Save button.

  1. Copy the workflow action using the action's menu and paste it as the next action. Configure the newly pasted action to handle the closing characters: change the Search String from the opening square bracket [ to the closing square bracket ], and change the Replace String from the opening curly brace { to the closing curly brace }.

Click the Save button to save the action. You now have valid JSON that you can send to the Conversion Service.
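The effect of the two replace actions can be sketched in Python as follows. This is only an illustration of the string substitution, not the Nintex implementation. The field names and paths in the template are hypothetical placeholders; use the JSON supplied for the Build String step in your own workflow.

```python
import json

# Hypothetical template: field names and paths are illustrative assumptions.
# Square brackets stand in for curly braces, as described in the steps above.
template = """[
  "sharepoint_file": [
    "source_file_url": "/sites/demo/Shared Documents/scan.tiff",
    "destination_file_url": "/sites/demo/Shared Documents/scan.tiff.pdf"
  ],
  "fail_on_error": true
]"""


def restore_braces(text: str) -> str:
    """Mirror the two Replace Substring in String actions."""
    text = text.replace("[", "{")  # first action: opening bracket
    return text.replace("]", "}")  # second action: closing bracket


payload = restore_braces(template)
parsed = json.loads(payload)  # raises ValueError if the result is not valid JSON
print(parsed["sharepoint_file"]["destination_file_url"])
```

Because every square bracket is replaced, a file name or array that legitimately contains [ or ] would be altered as well, which is why a different placeholder character may be preferable in such cases.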

  1. Next, set the API key. Insert a Set Workflow Variable action and configure it to set the API_KEY workflow variable to the API key you received by email when signing up for the Muhimbi PDF Converter Services Online, for example:

decafbad-baad-baad-baad-decafbaaaaad

Do not try to use this particular key, as it will not work. Make sure you do not put curly braces around the key. Click the Save button to save the action.

  1. Next, insert a Web Request action and configure it as follows:

URL: https://api.muhimbi.com/api/v1/operations/ocr_pdf

Method: POST

Content type: application/json

Add header: Click Add header, specify API_KEY as the Header name and insert a reference to the API_KEY workflow variable for the Header value.

Body: Select the Content option, add a reference to the JSON workflow variable in the Data field.

Store response content in: ResponseText.

Click the Save button to save the action.
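For testing outside Nintex, the same request can be approximated with Python's standard library. This is a rough sketch that mirrors the Web Request settings above (URL, POST method, content type and the API_KEY header). The helper names are hypothetical, and the request body is a placeholder rather than a complete OCR command.

```python
import urllib.error
import urllib.request

OCR_URL = "https://api.muhimbi.com/api/v1/operations/ocr_pdf"


def build_ocr_request(api_key: str, json_body: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the Web Request action."""
    return urllib.request.Request(
        OCR_URL,
        data=json_body.encode("utf-8"),
        headers={"API_KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> tuple[int, str]:
    """Send the request and return (status code, response text)."""
    try:
        with urllib.request.urlopen(request, timeout=120) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions; the body is still readable.
        return error.code, error.read().decode("utf-8")


# The sample key from this article does not work; substitute your own key.
request = build_ocr_request("decafbad-baad-baad-baad-decafbaaaaad", "{}")
print(request.get_method(), request.full_url)
```

Calling send(request) with a valid key and a complete JSON body would return the equivalent of the ResponseCode and ResponseText workflow variables.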

  1. Finally, insert another Set Workflow Status action and configure it with the text ‘Completed’. This should indicate when the workflow instance has completed its run. Your workflow should look something like the following:
    Nintex-O365-ConvertBasic-Part1
    Nintex-O365-ConvertBasic-Part2

  2. Give the workflow a suitable name, set the Start Options to a value of your choice, then save and publish it.

  3. Once published, open the document library the workflow is associated with, make sure a file of a supported type is present, and manually start the workflow. After a few seconds, the PDF file will show up next to the file the workflow was started on.

Troubleshooting

Although Nintex Workflow for Office 365 and the Muhimbi PDF Converter work very well together, the workflow has a lot of moving parts, such as custom-generated JSON, customer-specific API keys and paths to document libraries, so you may run into issues when deploying it. Some common issues and troubleshooting tips are listed below:

  • Check prerequisites: Double-check that the prerequisites listed in the beginning of this section are in place.
  • Log to History List: If it is not clear what is going wrong, log critical parts such as the JSON workflow variable (after the replace operation) as well as the ResponseText workflow variable (after the web request) using the Log To History List workflow action. You can see the contents of this list by clicking on the Workflow Status column for the List Item the workflow is running on.
  • Send email: The amount of text that can be logged to the History List is limited (roughly 250 characters). For larger messages, use the Send an Email action instead to send an email with debug content in the body of the email to yourself.
  • Copy & Paste: When copying the JSON fragment into your workflow, paste it into Notepad first to clean it, and then copy it from Notepad and paste it into your workflow. This is because browsers tend to insert hidden characters that are not filtered out by the Nintex Workflow editor.
  • Nintex References: Make sure that the Nintex Workflow references in the JSON provided are replaced by actual Nintex Workflow references. You can double-check if the references are active by logging the JSON workflow variable to the History List. You should see the actual paths and not {Current Item:Server Relative URL}.
  • Muhimbi Support: If you are still stuck after double-checking all prerequisites and going over all troubleshooting steps in this section, please contact our friendly support desk, who are here to help.
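Several of the checks above can be automated once you have copied the logged JSON out of the History List or a debug email. The following Python sketch is one possible approach, not a Muhimbi or Nintex tool. All function names are hypothetical, and the reference pattern is an assumption modelled on the {Current Item:Server Relative URL} example.

```python
import json
import re

# Assumed shape of an unresolved Nintex reference, e.g. {Current Item:Server Relative URL}.
UNRESOLVED_REFERENCE = re.compile(r"\{[A-Za-z ]+:[^{}]*\}")
HISTORY_LIMIT = 250  # approximate History List limit mentioned above


def diagnose(payload: str) -> list[str]:
    """Return a list of likely problems found in the JSON sent to the service."""
    problems = []
    for match in UNRESOLVED_REFERENCE.finditer(payload):
        problems.append(f"unresolved reference: {match.group(0)}")
    # Control characters and non-ASCII characters (such as smart quotes) often
    # sneak in when copying from a browser; non-ASCII file names may be legitimate.
    suspicious = sorted(
        {ch for ch in payload if ord(ch) > 126 or (ord(ch) < 32 and ch not in "\n\r\t")}
    )
    for ch in suspicious:
        problems.append(f"suspicious character: U+{ord(ch):04X}")
    try:
        json.loads(payload)
    except json.JSONDecodeError as error:
        problems.append(f"invalid JSON: {error.msg} at position {error.pos}")
    return problems


def split_for_history(message: str, limit: int = HISTORY_LIMIT) -> list[str]:
    """Split a long message into pieces small enough to log one by one."""
    return [message[i:i + limit] for i in range(0, len(message), limit)]


for problem in diagnose('{"source_file_url": "{Current Item:Server Relative URL}"}'):
    print(problem)
```

An empty list from diagnose does not guarantee that the service will accept the request; it only rules out the most common copy-and-paste and reference problems.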

Fine-tuning

The workflow created in the previous section was intended to give a quick idea of how to use the Converter. However, it would benefit from error handling and from a solution to a possible recursion problem, where the workflow is triggered again for the PDF files that it has created itself.
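As a rough sketch of what these two safeguards could look like, the Python fragment below skips files that the workflow produced itself and treats any non-2xx status code as a failure. The '.ocr.pdf' suffix is a hypothetical naming convention chosen for this illustration; the downloadable workflow may well solve the problem differently.

```python
# Hypothetical naming convention: files written by the workflow end in '.ocr.pdf'.
OUTPUT_SUFFIX = ".ocr.pdf"


def should_process(file_name: str) -> bool:
    """Recursion guard: ignore files that the workflow created itself."""
    return not file_name.lower().endswith(OUTPUT_SUFFIX)


def is_success(status_code: int) -> bool:
    """Error handling: only 2xx responses count as a successful operation."""
    return 200 <= status_code < 300


print(should_process("scan.tiff"))     # True
print(should_process("scan.ocr.pdf"))  # False
print(is_success(200), is_success(401))  # True False
```

In a workflow, the equivalent would be a condition at the start that ends the run for generated files, and a condition after the Web Request that sets a failure status when ResponseCode indicates an error.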

We have created a version of the workflow that is more production ready. Full details are beyond the scope of this article, but you can download the full workflow here and customize it to your requirements.

After customization, import it into Nintex Workflow for Office 365, set the API key, and publish it.

Other Operations

This section demonstrated how to invoke the OCR operation on Muhimbi's REST interface. Full examples of the other operations are beyond the scope of this article, but you can find examples in the SharePoint section of our GitHub repository.

© Muhimbi Ltd. 2008 - 2024