Automatically convert millions of emails, including all attachments, to PDF

Jonathan D. Rhyne

I was talking to one of our customers the other day about an interesting use case that turns out to be more common than I anticipated.

During our discussion it came to light that their regulatory body requires all communication exchanged with customers to be stored in a format suitable for long-term archiving, in their case PDF/A. Doing this by hand is an impossible amount of work and difficult to enforce, and it is made even harder by the fact that attachments need to be converted to PDF as well.

Guess what: they need to do this for 100,000 emails per month! Manual conversion at that volume is simply not an option, which is why they went looking for a third-party solution.

Only a small number of solutions are available on the market. A number of service providers and vendors of development libraries claim to be able to convert EML and MSG files to PDF, but few do this in a way that:

  • generates perfect-looking PDFs;
  • supports emails written in a multitude of languages and character sets;
  • converts all attachments and merges them into a single PDF;
  • provides many ways to filter and configure these attachments;
  • takes care of rendering delivery receipts;
  • includes calendar entries and contact cards;
  • outputs PDFs in PDF/A-1b, PDF/A-2b and PDF/A-3b formats;
  • allows the process to be fully automated via workflow platforms or an API.

We are generally a modest bunch, but we truly believe we have the best email to PDF converter in the world. We know this because we searched for third-party libraries when we first implemented this facility. Nothing existed that was half decent, so we decided to build our own. Our team has spent an enormous amount of time on this facility, more than on any of our other converters, including our popular and comprehensive InfoPath converter. The results are clearly visible: it works very well.

PDF renditions of regular emails.

So, this customer was set a very difficult task. How did they end up solving it? Their in-house team built a simple solution using Java code in combination with the REST API exposed by our online service. It just sits quietly in the background, beavering away 24x7 to generate PDFs out of emails.
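To give a feel for what such a background job can look like, here is a minimal sketch. It is written in Python for brevity (the customer's own code is Java and is not shown here), and every name in it is an assumption made for illustration: the `inbox` and `archive` folders, the `.eml.pdf` naming scheme, and the `convert` callable, which stands in for whatever code actually calls the conversion service. The sketch deliberately leaves the REST call itself out.

```python
from pathlib import Path
from typing import Callable

# File types treated as emails (assumption: EML and MSG, as discussed above).
EMAIL_SUFFIXES = {".eml", ".msg"}


def pending_emails(inbox: Path, archive: Path) -> list[Path]:
    """Return email files in `inbox` that have no PDF in `archive` yet."""
    return sorted(
        p
        for p in inbox.iterdir()
        if p.is_file()
        and p.suffix.lower() in EMAIL_SUFFIXES
        and not (archive / (p.name + ".pdf")).exists()
    )


def convert_batch(
    inbox: Path,
    archive: Path,
    convert: Callable[[bytes, str], bytes],
) -> tuple[list[Path], list[Path]]:
    """Convert every pending email and return (written PDFs, failed emails).

    `convert` is a hypothetical hook: it receives the raw email bytes and the
    file name, and returns the PDF bytes. In a real job it would wrap the
    call to the conversion service.
    """
    archive.mkdir(parents=True, exist_ok=True)
    converted: list[Path] = []
    failed: list[Path] = []
    for email in pending_emails(inbox, archive):
        try:
            pdf_bytes = convert(email.read_bytes(), email.name)
        except Exception:
            # Leave the email in place so the next run retries it.
            failed.append(email)
            continue
        target = archive / (email.name + ".pdf")
        # Write to a temporary name first, then rename, so an interrupted
        # run never leaves a half-written PDF that looks finished.
        partial = target.with_name(target.name + ".part")
        partial.write_bytes(pdf_bytes)
        partial.replace(target)
        converted.append(target)
    return converted, failed
```

Because an email only counts as done once its PDF exists in the archive folder, the job can be run repeatedly from a scheduler: emails that failed on one run are simply picked up again on the next.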

The REST API approach works well for them. We also support a SOAP API for those who host our software on their own servers, as well as SharePoint Online, SharePoint on-premises, Power Automate (Microsoft Flow), Azure Logic Apps, UiPath, Nintex Workflow, K2, C#, JavaScript, Python, PHP and anything else that is remotely modern.

PDF Rendition of a calendar entry, including embedded content

We could make up fancy ROI figures for this use case, but the fact is that the requirement was nearly impossible to meet by hand. Whatever figure we come up with is bound to be wrong by an order of magnitude. Let's just say it is working out very well for everyone involved.

Many of our customers are sitting on gigabytes of emails that need to be archived for eDiscovery, Freedom of Information requests and SOX, SEC, FTS, FCC, EPA, NLRB, IRS, EEOC, OSH, OFCOM retention regulations. Being able to access these emails 10, 20 or even 40 years down the line, in a universally accepted format such as PDF (including PDF/A), is absolutely essential. Muhimbi’s range of PDF Conversion products makes this possible for all common file formats as well as some uncommon ones such as MSG, EML and even InfoPath.

If you have any questions or comments, leave a message below or contact our support desk. We love to help.

Labels: Articles, EML, MSG, pdf

Author

Jonathan D. Rhyne

Co-Founder and CEO

Jonathan joined PSPDFKit in 2014. As CEO, Jonathan defines the company’s vision and strategic goals, bolsters the team culture, and steers product direction. When he’s not working, he enjoys being a dad, photography, and soccer.

© Muhimbi Ltd. 2008 - 2024