Muhimbi's range of PDF Conversion products has been architected with scalability in mind (for details and results of load tests, see this Knowledge Base Article). However, from time to time we receive questions about how the PDF Converter schedules requests internally, how it parallelises them, and how the settings can be optimised for a specific scenario.
Concurrency
The Conversion Service automatically keeps track of all requests: how many are currently executing, how long they have been running, whether they are responsive, and what type of request they are. Each file format (Word, Excel etc.) is configured and tracked separately, and the number of parallel requests for each file type will never exceed a configured value (2 for most file types by default, see Tweaking settings below).
Things to know / take into account:
- Each incoming request / operation is executed sequentially. What this means is that the steps of an operation are executed in sequence (e.g. a document is converted, then it is optionally watermarked followed by an optional encryption step).
- As a result, Merge operations are also executed sequentially. Each file in the merge operation is converted in sequence, not in parallel. Therefore, if you don't expect many concurrent Merge operations, there is no need for more than 2 CPU cores in your conversion server.
- However, concurrent incoming requests (conversion operations, merge operations, watermarking operations etc) are executed in parallel to make optimal use of the available CPU cores and system resources.
- The total number of concurrent requests will never exceed the value configured in maxConcurrentCalls (see Tweaking settings below).
Any requests that come in while the maximum number of requests for a file type (or overall) is already executing are queued internally and executed automatically as soon as resources become available.
This simple, but elegant, model makes it possible for the software to scale exceptionally well, while still being easy to understand and troubleshoot.
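The model described above can be illustrated with a small Python sketch. This is not Muhimbi's implementation: the `Scheduler` class, its method names, and the use of semaphores are assumptions chosen purely for illustration. It shows one overall limit (the role played by maxConcurrentCalls), one limit per file type, sequential steps within a request, and blocked callers acting as the internal queue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Scheduler:
    """Illustrative model only; all names here are hypothetical."""

    def __init__(self, per_type_limits, max_concurrent_calls):
        # One overall limit (the role played by maxConcurrentCalls).
        self._overall = threading.BoundedSemaphore(max_concurrent_calls)
        # One limit per file type (the role of the per-converter maximums).
        self._per_type = {
            name: threading.BoundedSemaphore(limit)
            for name, limit in per_type_limits.items()
        }
        self._lock = threading.Lock()
        self._running = dict.fromkeys(per_type_limits, 0)
        self.peak = dict.fromkeys(per_type_limits, 0)

    def run(self, file_type, steps):
        # Callers that cannot get a slot block here, which acts as the queue.
        with self._overall, self._per_type[file_type]:
            with self._lock:
                self._running[file_type] += 1
                self.peak[file_type] = max(
                    self.peak[file_type], self._running[file_type]
                )
            try:
                # The steps of a single request run one after another.
                return [step() for step in steps]
            finally:
                with self._lock:
                    self._running[file_type] -= 1


def demo():
    scheduler = Scheduler({"WinWord": 2, "Excel": 2}, max_concurrent_calls=15)

    def step():
        time.sleep(0.02)  # stand-in for a conversion or watermarking step
        return "done"

    # Six simultaneous Word requests, each consisting of two sequential steps.
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [
            pool.submit(scheduler.run, "WinWord", [step, step]) for _ in range(6)
        ]
        results = [future.result() for future in futures]
    return scheduler.peak, results
```

Calling `demo()` submits six Word requests at once; in this sketch no more than two of them ever run at the same time, and the rest wait until a slot is released.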
Tweaking settings
The default settings cover the most common scenarios, but if you have a very specific scenario (e.g. you are only converting MS-Word files, or you have more than 2 CPU cores in your system) then you may want to consider tweaking the following settings in the service's configuration file. Please note that these settings apply to releases older than 8.4; details for newer versions can be found further down.
- Concurrency.MaximumInstances.WinWord: Maximum number of parallel conversions carried out by the Word Processing converter (e.g. DOC, DOCX, RTF, TXT etc). Set to 2 by default.
- Concurrency.MaximumInstances.Excel: Maximum number of parallel conversions carried out by the Spreadsheet converter (e.g. XLS, XLSX, CSV etc). Set to 2 by default.
- Concurrency.MaximumInstances.MSPub: Maximum number of parallel conversions carried out by the MS-Publisher converter. Set to 2 by default.
- Concurrency.MaximumInstances.Visio: Maximum number of parallel conversions carried out by the Vector converter (e.g. VSD, VDX, SVG). Set to 2 by default.
- Concurrency.MaximumInstances.CAD: Maximum number of parallel conversions carried out by the CAD converter (e.g. DWG, DXF). Set to 2 by default.
- Concurrency.MaximumInstances.TIFF: Maximum number of parallel conversions carried out by the TIFF converter. Set to 2 by default.
- Concurrency.MaximumInstances.MSG: Maximum number of parallel conversions carried out by the MSG (email) converter. Set to 6 by default.
- Concurrency.MaximumInstances.PowerPNT: Maximum number of parallel conversions carried out by the Presentations converter (e.g. PPT, PPTX, ODP etc). Set to 1, do not change!
- Concurrency.MaximumInstances.InfoPath: Maximum number of parallel conversions carried out by the InfoPath converter. Set to 1, do not change!
- Concurrency.MaximumInstances.HTML: Maximum number of parallel conversions carried out by the HTML & Image converter (e.g. HTML, JPG, GIF, PNG etc). Set to 2 by default.
- Concurrency.MaximumInstances.CommandLineConverter: Maximum number of parallel conversions carried out by the Command Line Converter. Set to 2 by default. Please note that this value is shared between all configured Command Line Converters.
- Concurrency.MaximumInstances.OCR: Maximum number of parallel operations carried out by the OCR Facility. Set to 2 by default.
- serviceThrottling maxConcurrentCalls: The maximum number of overall concurrent requests, including conversion, watermarking and security requests. Set to 15 by default.
With the release of version 8.4 of the Muhimbi PDF Converter, these settings have been moved to the <MuhimbiDocumentConverters> config section. Each document type has a maxInstances property matching the figures described previously, with the exception of the setting for the OCR Processor, which can be found in the <MuhimbiOCRProcessors> config section.
NEVER change the value for the Presentations (PowerPoint) and InfoPath based converters. These converters do not allow concurrent instances to be tracked separately due to architectural limitations in the underlying Microsoft Office applications.
Do not set the value for any file format to a number larger than 16. The various underlying Office applications do not perform well under such high load.
Please do not set maxConcurrentCalls too low (below 5) as that may result in deadlocks when converting file formats for which attachments are converted as well (MSG and InfoPath).
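The three rules above can be checked mechanically before a configuration change is rolled out. The following Python sketch is only an illustration: the function `check_settings` and the dictionary format it accepts are hypothetical and not part of the product. The thresholds are taken directly from the guidance in this article.

```python
# Converters that must stay at a single instance, per the guidance above.
SINGLE_INSTANCE_ONLY = {"PowerPNT", "InfoPath"}
MAX_INSTANCES_PER_TYPE = 16   # upper bound recommended above
MIN_CONCURRENT_CALLS = 5      # lower values risk deadlocks (MSG, InfoPath)


def check_settings(max_instances, max_concurrent_calls):
    """Return a list of human-readable problems; an empty list means OK.

    max_instances is a hypothetical mapping such as {"WinWord": 4}.
    """
    problems = []
    for name, value in sorted(max_instances.items()):
        if name in SINGLE_INSTANCE_ONLY:
            if value != 1:
                problems.append(f"{name}: must remain 1, found {value}")
        elif value > MAX_INSTANCES_PER_TYPE:
            problems.append(
                f"{name}: {value} exceeds the recommended maximum of "
                f"{MAX_INSTANCES_PER_TYPE}"
            )
    if max_concurrent_calls < MIN_CONCURRENT_CALLS:
        problems.append(
            f"maxConcurrentCalls: {max_concurrent_calls} is below "
            f"{MIN_CONCURRENT_CALLS}"
        )
    return problems
```

For example, `check_settings({"WinWord": 4, "PowerPNT": 1}, 15)` returns an empty list, whereas raising PowerPNT to 2 or lowering maxConcurrentCalls to 3 produces one problem each.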
The configuration file can be edited as follows:
- Open Muhimbi.DocumentConverter.Service.exe.config in your favourite text editor (Notepad works as well). A handy shortcut to the configuration / installation folder that holds this config file can be found in the Muhimbi Document Converter Windows Start Menu group.
- Search for the configuration keys you wish to modify and change their values.
- Save the configuration file.
- Restart the Muhimbi Document Converter Service using the Services Management Console or using the command prompt:
Net stop "Muhimbi Document Converter Service"
Net start "Muhimbi Document Converter Service"
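If the restart needs to be scripted, for example as part of an automated deployment, the same two commands can be issued from Python. This is a minimal sketch under stated assumptions: the helper names `restart_commands` and `restart_service` are hypothetical, and the script is assumed to run on the conversion server in a session with sufficient rights to control Windows services.

```python
import subprocess

SERVICE_NAME = "Muhimbi Document Converter Service"


def restart_commands(service_name=SERVICE_NAME):
    """Build the stop and start commands shown above as argument lists."""
    return [
        ["net", "stop", service_name],
        ["net", "start", service_name],
    ]


def restart_service(service_name=SERVICE_NAME):
    """Run the stop and start commands in order.

    check=True raises CalledProcessError if either command fails, so a
    failed stop is not silently followed by a start.
    """
    for command in restart_commands(service_name):
        subprocess.run(command, check=True)
```

Passing the service name as a single list element lets `subprocess` handle the quoting of the spaces in the name.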
For more details see: