HTML to PDF with POS receipt printers using CUPS

How do you convert HTML into a long, narrow PDF that can be used with POS receipt printers? Issues and solutions.

The problem

Some time ago I was playing with a POS printer connected to a Linux-based Intel NUC device, trying to print receipts based on various templates. I quickly realized that none of the languages I know had a good library that gave me any real freedom over the receipt's format. There were some popular ESC/POS libraries (for example ESC-POS.NET), but most of them allowed only a full-width logo at the top followed by rows of text. Some even suggested drawing lines by printing a row of dashes. Some articles suggested rasterizing the HTML and printing a BMP or PNG file, yet at least the printer I had for testing (Custom Engineering PLUS2) then applied some blur, and the background was not truly white, no matter whether it was white in the BMP or transparent in the PNG. The quality was much lower than when printing plain text.

Solution

The idea of converting HTML made me think: since I use CUPS, maybe I could use PostScript, and if PostScript, then maybe even PDF. I started experimenting and the results were quite promising, yet all the conversion tools I knew had one big issue: they required the exact page size as input, while I could not calculate the exact page size before the actual conversion. Obviously, if I dropped the HTML part, I could create PDF page objects with popular libraries in almost any programming environment and then add text lines, draw objects and so on, but the idea was to use a popular markup language so that receipt templates stay easily interchangeable. The solution was a two-pass rendering process: the first pass only establishes the size, and the second generates the final PDF.

Tools

For the conversion from HTML I selected htmldoc. Why not wkhtmltopdf? Although htmldoc supports only basic HTML 4 markup, without JavaScript or CSS, it is much more predictable and lightweight: there is no need to install a big set of browser-engine or X server libraries to use it. Its only drawback is that it is released under the GPL, which means I cannot easily and freely distribute it together with an MIT-licensed library or project.

For ease of development I used C# with .NET 6.0, which runs on all popular platforms (Windows, Debian- and CentOS-based Linux distributions, macOS) and can be compiled to run without a separately installed .NET runtime.

Library

The library that resulted from this work can be found here: IDCT POS PDF Assistant C#.

How to use it?

First create a new object:

using IDCT;
using IDCT.Type;

PosPdfAssistant pdfAssistant = new();

then provide the HTML 4 markup that is meant to be printed:

string html = "" +
    "<html>" +
    "<body>" +
    "<center>" + DateTime.Now.ToString() + "</center>" +
    "<p><img src='C:\\path\\to\\file\\logo.png' width='100%'></p>" +
    "<table>" +
    "<tr><td width='100%'><b>Koku-Kola</b></td><td>3.14</td></tr>" +
    "<tr><td width='100%'><b>Pefsi</b></td><td>6.14</td></tr>" +
    "<tr><td width='100%'><b>Lymbark</b></td><td>2.30</td></tr>" +
    "</table>" +
    "<hr>" +
    "<table>" +
    "<tr><td width='100%'><b>Sum</b></td><td>11.58</td></tr>" +
    "</table>" +
    "</body>" +
    "</html>";

and execute:

var size = pdfAssistant.HtmlToReceipt(html, "out.pdf", new HtmldocOptions() { Gray = true, PdfSupportedFont = PdfSupportedFont.Monospace });

Using the HtmldocOptions object you can set multiple options, like width or additional bottom margin for the cut-off:

    public class HtmldocOptions
    {
        /// <summary>
        /// Width in millimeters. Defaults to 48mm which is a popular width of a paper roll.
        /// </summary>
        public int Width { get; set; } = 48;

        /// <summary>
        /// Height of a page in mm. Best if long enough to cover the whole potential receipt, but if the output file is split into pages the library will combine them.
        /// </summary>
        public int Height { get; set; } = 1000;

        /// <summary>
        /// Bottom margin in pixels.
        /// </summary>
        public int BottomMargin { get; set; } = 0;

        /// <summary>
        /// Default font size in points.
        /// </summary>
        public int FontSize { get; set; } = 10;

        /// <summary>
        /// Default font.
        /// </summary>
        public PdfSupportedFont PdfSupportedFont { get; set; } = PdfSupportedFont.Arial;

        /// <summary>
        /// Should the output file be generated in grayscale?
        /// </summary>
        public bool Gray { get; set; } = true;
    }

Apart from saving the final PDF, the method also returns its size in multiple units:

Console.WriteLine(String.Format("Size in points: {0}x{1}pt", size.Width.Point, size.Height.Point));
Console.WriteLine(String.Format("Size in inches: {0}x{1}in", size.Width.Inch, size.Height.Inch));
Console.WriteLine(String.Format("Size in millimeters: {0}x{1}mm", size.Width.Millimeter, size.Height.Millimeter));

which is especially useful if you need to pass the exact dimensions to the driver of your POS printer, as was the case with the Custom Engineering PLUS2 when printing through CUPS:

// 136 pt corresponds to the 48 mm roll width; the height comes from the generated PDF.
string strCmdText = "-H [cups host] -o PrintDensity=7Density+37 -o PageSize=Custom.136x" + size.Height.Point.ToString() + "pt -o orientation-requested=3 -P myprinter out.pdf";
var process = System.Diagnostics.Process.Start("lpr", strCmdText);
process.WaitForExit();
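
The 136 in the PageSize option is simply the 48 mm roll width expressed in points (1 inch = 25.4 mm = 72 pt). Below is a minimal sketch of that arithmetic, independent of the library. The helper names MmToPoints and CustomPageSize are made up for illustration, and the height of 412.5 pt is just a sample value. The sketch formats numbers with the invariant culture so that a decimal comma from the system locale cannot end up in the option string:

```csharp
using System;
using System.Globalization;

// 1 inch = 25.4 mm = 72 points.
static double MmToPoints(double mm) => mm / 25.4 * 72.0;

// Hypothetical helper: builds a PageSize value in the same
// "Custom.<width>x<height>pt" shape as the lpr call above.
static string CustomPageSize(double widthMm, double heightPt) =>
    string.Format(CultureInfo.InvariantCulture,
        "Custom.{0:0}x{1:0.##}pt", MmToPoints(widthMm), heightPt);

// 48 mm is the default roll width mentioned earlier.
Console.WriteLine(MmToPoints(48).ToString("0.00", CultureInfo.InvariantCulture)); // 136.06
Console.WriteLine(CustomPageSize(48, 412.5)); // Custom.136x412.5pt
```

If your roll has a different width, the same conversion gives the number to put in place of 136.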

How does it work?

In the first step the library executes htmldoc, which generates a long, paginated PDF with pages of the configured height (the Height option, 1000 by default). It is best if everything fits on one page that is cropped later rather than being split across pages, but the process still works if a split is needed. This happens in the HtmlToPdf method.
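
To give an idea of what such a first pass can look like, here is a sketch that builds an htmldoc command line for a narrow, tall page. This is my illustration and not necessarily the exact set of flags the library passes: the function name BuildHtmldocArgs and the file names are made up, while the flags themselves (--webpage, -t, --size, the margin options, --header, --footer, --fontsize, --gray, -f) are standard htmldoc options. The sketch only prints the command instead of running it:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects htmldoc arguments for a receipt-shaped page.
static List<string> BuildHtmldocArgs(string inputHtml, string outputPdf,
    int widthMm, int heightMm, int fontSize, bool gray)
{
    var list = new List<string>
    {
        "--webpage",                           // plain page, no book structure
        "-t", "pdf14",                         // output format
        "--size", $"{widthMm}x{heightMm}mm",   // narrow and very tall page
        "--left", "0mm", "--right", "0mm",     // no side margins
        "--top", "0mm", "--bottom", "0mm",     // no top or bottom margins
        "--header", "...", "--footer", "...",  // "." means an empty header/footer slot
        "--fontsize", $"{fontSize}",
        "-f", outputPdf,
    };
    if (gray) list.Add("--gray");              // grayscale output
    list.Add(inputHtml);                       // input file goes last
    return list;
}

List<string> receiptArgs = BuildHtmldocArgs("receipt.html", "pass1.pdf", 48, 1000, 10, true);
Console.WriteLine("htmldoc " + string.Join(" ", receiptArgs));
```

To actually run it, the list can be copied into ProcessStartInfo.ArgumentList, which avoids manual quoting of paths.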

In the second step it combines the long, paginated PDF into a single, very long page, which removes all page breaks. This happens in CombineLongPdf.

The last step trims all the unwanted white space from the end of the PDF. It can also keep a short margin, defined in the settings, for the printer's cut-off. See TrimPdf.
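
The geometry behind combining and trimming can be sketched with plain numbers. The code below is not taken from the library; PageOffsets and TrimmedHeight are hypothetical names, and the values (three pages of 1000 points, 240 points used on the last page, 20 points of margin) are made up. It assumes every page before the last one is treated as fully used:

```csharp
using System;
using System.Linq;

// Vertical offset of each source page on the combined (not yet trimmed) page.
// PDF coordinates grow upwards, so the first page gets the highest offset.
static double[] PageOffsets(int pageCount, double pageHeight) =>
    Enumerable.Range(0, pageCount)
              .Select(i => (pageCount - 1 - i) * pageHeight)
              .ToArray();

// Final receipt height: all full pages before the last one, the used part
// of the last page, plus an optional margin for the cutter.
static double TrimmedHeight(int pageCount, double pageHeight,
    double usedOnLastPage, double bottomMargin) =>
    (pageCount - 1) * pageHeight + usedOnLastPage + bottomMargin;

double[] offsets = PageOffsets(3, 1000);
double finalHeight = TrimmedHeight(3, 1000, 240, 20);

Console.WriteLine(string.Join(", ", offsets)); // 2000, 1000, 0
Console.WriteLine(finalHeight);                // 2260
```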

Known limitations

You still need to plan your receipts a bit: a page break is still inserted at every configured page height, for example every 1000 points (which should be enough for most cases). Text breaks across pages without any problem, but an image that does not fit is moved to the next page. This could be solved at some point in the future by removing the white space at the end of each page before the pages are combined. As the library is open source, you are more than welcome to contribute that! There is also a theoretical possibility of setting the initial page height lower than the height of an image, which may cause problems; this could be solved by first checking all the images that are to be added to the PDF.
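
A check of the images, as suggested above, could look roughly like the sketch below. It is not part of the library. It reads the pixel size from a PNG header (8-byte signature, 4-byte chunk length, the "IHDR" tag, then big-endian width and height) and estimates the printed height under the assumption that the image is stretched to the full page width, as with width='100%' in the sample receipt. The function names and the numbers are made up for illustration:

```csharp
using System;
using System.Buffers.Binary;

// Reads pixel width and height from the first 24 bytes of a PNG file.
static (int Width, int Height) ReadPngSize(ReadOnlySpan<byte> png)
{
    if (png.Length < 24) throw new ArgumentException("Not enough data for a PNG header.");
    int width = BinaryPrimitives.ReadInt32BigEndian(png.Slice(16, 4));
    int height = BinaryPrimitives.ReadInt32BigEndian(png.Slice(20, 4));
    return (width, height);
}

// Height in points of an image scaled to the full page width (an assumption).
static double RenderedHeightPt(int widthPx, int heightPx, double pageWidthPt) =>
    pageWidthPt * heightPx / widthPx;

// Fake header for the demo: only the size fields are filled in,
// a real file would start with the PNG signature.
byte[] header = new byte[24];
BinaryPrimitives.WriteInt32BigEndian(header.AsSpan(16, 4), 400);
BinaryPrimitives.WriteInt32BigEndian(header.AsSpan(20, 4), 200);

var (w, h) = ReadPngSize(header);
double rendered = RenderedHeightPt(w, h, 136); // 136 pt page width
Console.WriteLine($"{w}x{h}px -> {rendered}pt, fits on a 1000pt page: {rendered <= 1000}");
```

With real files you would pass the first 24 bytes of each PNG and compare the result against the configured page height before running the conversion.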

Result

Exactly trimmed PDFs:

sample-print

As mentioned, the library is provided for free and is also available on GitHub: IDCT POS Printer PDF Assistant

It can be installed from NuGet:

dotnet add package IDCT.PosPdfAssistant --version 1.0.1

IDCT POS PDF Assistant on NuGet

For the moment I have finished my experiments with POS printers, so further development may not be very active, but please share any requests in the Issues on GitHub. Pull requests with bug fixes or features are also much appreciated.