PDF Converter API and Server Platform 7.2 - Extract text using OCR, MSG Improvements

We are happy to announce version 7.2 of the popular Muhimbi PDF Converter API and Server Platform. This release extends the OCR facility and the MSG improvements introduced in the previous version, adding support for extracting text from bitmap-based content and for rendering MSG-based calendar entries.

A quick introduction for those not familiar with the product: the Muhimbi PDF Converter API and Server Platform is an on-premises, server-based SDK that allows software developers to convert typical Office files to PDF using a robust, scalable and friendly Web Services interface from Java, .NET, Ruby and PHP based solutions. It supports a large number of file types, including MS-Office and ODF formats as well as HTML, MSG (email), EML, AutoCAD and image-based files, and is used by some of the largest organisations in the world for mission-critical document conversions. In addition to converting documents, the product ships with a sophisticated watermarking engine, PDF splitting and merging facilities, an OCR facility and the ability to secure PDF files. A separate SharePoint-specific version is available as well.

  Example of a converted Calendar entry with an (OLE) embedded Excel sheet

In addition to the changes listed above, some of the main changes and additions in the new version are as follows:

ID | Area | Type | Description
2100 | Excel | New | Optionally scale Excel to page width & height
2059 | HTML | Fix | System.ArgumentException: uri - string can not be empty
1996 | HTML | Improvement | Reduce white space causing occasional extra empty PDF pages at end of file.
1802 | Merging | Fix | Bookmark targets bottom of page
2093 | Merging | Fix | "Unexpected token Unknown before 107448" while merging file
2078 | Merging | Fix | Kernel Error while loading PDF
2073 | Merging | Fix | System.IndexOutOfRangeException while merging
2074 | Merging | Fix | System.NullReferenceException while merging
2075 | Merging | Fix | System.NullReferenceException while merging
2076 | Merging | Fix | Some HTML Converted files cannot be saved in Acrobat Pro after merging
2126 | MSG | Fix | "System.InvalidOperationException: Stack empty" during conversion of 3rd party generated MSG files
2133 | MSG | Fix | "Parameter is not valid" during conversion of 3rd party generated MSG files
2136 | MSG | Fix | Content missing from converted MSG file
2106 | MSG | Fix | Fixed MSG body for 3rd party generated MSG files
2116 | MSG | Fix | Conversion of MSG files with an attached MSG that is signed
2124 | MSG | Fix | "System.IndexOutOfRangeException" Converting German email
2125 | MSG | Fix | Conversion of email never finishes
2105 | MSG | Fix | "Invalid Compressed RTF header" during conversion of 3rd party generated emails
2090 | MSG | Fix | Extra '}' in body text
2058 | MSG | Fix | No bookmark generated for certain attachments
2056 | MSG | Fix | 'Sent date' not correct on some 3rd party generated emails
2057 | MSG | Fix | Unicode converter issue (also with EML)
2088 | MSG | Improvement | Add support for attendees to meeting invitations
2086 | MSG | Improvement | Optionally throw error if embedded content is encountered that cannot be converted
2013 | MSG | Improvement | From address shows LDAP path
2046 | MSG | Improvement | Web Service support for MSGConverterFullFidelity.EmailAddressDisplayMode and FromEmailAddressDisplayMode
2087 | MSG | New | Convert the visual representation of embedded objects
2068 | MSG | New | Add support for the conversion of Calendar Entries
2050 | MSG | New | Add config value to allow MSG attachments list to be displayed, even when attachments are disabled
2113 | MSG/HTML | Fix | Rendering error in very long emails / HTML pages
2066 | MSG/HTML | Fix | Sometimes content is truncated on systems running IE9, IE10 or IE11
2005 | MSG/HTML | Fix | Fonts look weird in some emails
1786 | OCR | Fix | Handle leak during OCR
2054 | OCR | Fix | Some Mixed content (MS-Word files with scanned images) does not always OCR
1999 | OCR | Fix | Arabic training data causes exception
1788 | OCR | Improvement | Increase OCR Performance
2089 | OCR | Improvement | Update Diagnostics tool to display OCRed text
2081 | OCR | Improvement | In-line images are recognised but text is not placed on it correctly
1998 | OCR | Improvement | Add support for Hebrew
2048 | OCR | New | Support for extracting text from bitmap based content using OCR
2072 | Other | New | Allow timeouts to be specified on web service call
2102 | Watermarking | Fix | Chinese & Japanese fonts are not displayed in watermarks
2103 | Watermarking | Fix | Watermarking some documents causes problem in Adobe Reader 9

As always, feel free to contact us via Twitter, our blog or regular email, or subscribe to our newsletter.

Download your free trial here (39MB).

Labels: News, OCR, PDF Converter Professional, PDF Converter Services
