2024-10-03 Web Development

Converting Markdown to PDF with Puppeteer and Markdown-it

By O Wolfson

This guide shows how to convert a Markdown file to a PDF using Node.js. We'll use the puppeteer-core library to generate the PDF and markdown-it to parse the Markdown content. We'll also add custom styling, including margins and page breaks.

Prerequisites

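Before we begin, ensure you have Node.js installed. Because puppeteer-core does not download a browser of its own, you also need a local Chrome or Chromium installation. Then install the following packages: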

  • puppeteer-core
  • markdown-it
  • markdown-it-container

You can install these packages using npm:

bash
npm install puppeteer-core markdown-it markdown-it-container

Step-by-Step Guide

1. Initializing the Project

First, create a new file called convertMarkdownToPdf.js and add the following code:

javascript
const fs = require("fs");
const path = require("path");
const puppeteer = require("puppeteer-core");
const markdownIt = require("markdown-it");
const markdownItContainer = require("markdown-it-container");

This code imports the necessary modules:

  • fs and path for file system operations.
  • puppeteer-core to generate the PDF.
  • markdown-it and markdown-it-container to parse the Markdown content and handle custom HTML containers.

2. Configuring Markdown-it

Next, we configure markdown-it to use the markdown-it-container plugin for custom HTML containers:

javascript
const md = markdownIt().use(markdownItContainer, "pagebreak", {
  render(tokens, idx) {
    if (tokens[idx].nesting === 1) {
      return '<div style="page-break-after: always;"></div>';
    } else {
      return "";
    }
  },
});

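This configuration allows us to write ::: pagebreak ::: on its own line in our Markdown file to insert a page break in the PDF. The render function emits the page-break element for the container's opening token and an empty string for its closing token.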
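To see what the render callback produces, it can be called by hand with stand-in token objects. This is only a sketch: real markdown-it tokens carry many more fields, and the two-element array below is a hypothetical simplification that keeps just the nesting property (1 for an opening token, -1 for a closing one). The name renderPagebreak is also made up for illustration.

```javascript
// Same render logic as in the configuration above, extracted as a standalone
// function (renderPagebreak is an illustrative name, not part of any library).
function renderPagebreak(tokens, idx) {
  // Emit the page-break element on the opening token only.
  return tokens[idx].nesting === 1
    ? '<div style="page-break-after: always;"></div>'
    : "";
}

// Stand-in tokens: simplified objects, not real markdown-it Token instances.
const fakeTokens = [{ nesting: 1 }, { nesting: -1 }];

const opening = renderPagebreak(fakeTokens, 0);
const closing = renderPagebreak(fakeTokens, 1);

console.log(JSON.stringify(opening));
console.log(JSON.stringify(closing));
```

Because the closing token renders nothing, the container adds exactly one empty div to the output, and the CSS on that div is what forces the break when Chrome prints the page.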

3. Converting Markdown to HTML

We create a function to convert Markdown content to HTML and include custom styling:

javascript
function convertMarkdownToHtml(markdown) {
  const htmlContent = md.render(markdown);
  return `
    <html>
      <head>
        <style>
          body {
            padding: 20px;
            font-family: Arial, sans-serif;
          }
          h1, h2, h3, h4, h5, h6 {
            margin-top: 20px;
          }
          hr {
            margin-top: 20px;
            margin-bottom: 20px;
          }
        </style>
      </head>
      <body>
        ${htmlContent}
      </body>
    </html>
  `;
}

This function takes the Markdown content, converts it to HTML, and wraps it with additional styling for padding and margins.
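If the stylesheet grows, it may be easier to keep the CSS separate from the template. The sketch below is one possible variation, not the code used in this guide: wrapHtml and escapeHtml are hypothetical helpers, and the title and css options are assumptions. It also adds a doctype and a charset declaration, which the template above omits.

```javascript
// Hypothetical helper: wraps already-rendered HTML in a full document.
// The option names (title, css) are assumptions made for this sketch.
function wrapHtml(bodyHtml, { title = "Document", css = "" } = {}) {
  return [
    "<!DOCTYPE html>",
    "<html>",
    "  <head>",
    '    <meta charset="utf-8">',
    `    <title>${escapeHtml(title)}</title>`,
    `    <style>${css}</style>`,
    "  </head>",
    "  <body>",
    bodyHtml,
    "  </body>",
    "</html>",
  ].join("\n");
}

// Escape the characters that are significant in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

const page = wrapHtml("<h1>Hello</h1>", {
  title: "Notes & Drafts",
  css: "body { padding: 20px; font-family: Arial, sans-serif; }",
});
console.log(page);
```

The body is inserted unescaped on purpose, since it is expected to be HTML that md.render has already produced; only the title is treated as plain text.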

4. Converting HTML to PDF

We then create a function to convert the HTML content to a PDF using Puppeteer:

javascript
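async function convertHtmlToPdf(html, outputPath) {
  // puppeteer-core does not bundle a browser, so point it at a local Chrome.
  const browser = await puppeteer.launch({
    executablePath:
      "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome", // Update this path
  });
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so remote images and fonts can load.
    await page.setContent(html, { waitUntil: "networkidle0" });
    await page.pdf({
      path: outputPath,
      format: "A4",
      margin: {
        top: "20mm",
        right: "20mm",
        bottom: "20mm",
        left: "20mm",
      },
    });
    console.log(`Converted to PDF: ${outputPath}`);
  } finally {
    // Close the browser even if rendering fails, so no Chrome process is left running.
    await browser.close();
  }
}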
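The options object passed to page.pdf can also be built by a small helper, which keeps the four margins in sync. This is a minimal sketch under stated assumptions: buildPdfOptions and its marginMm parameter are hypothetical names, while path, format, margin, and printBackground are existing page.pdf options.

```javascript
// Hypothetical helper: builds the options object passed to page.pdf().
// The marginMm parameter and both defaults are assumptions for this sketch.
function buildPdfOptions(outputPath, { marginMm = 20, printBackground = false } = {}) {
  const margin = `${marginMm}mm`;
  return {
    path: outputPath,
    format: "A4",
    printBackground, // when true, CSS backgrounds are included in the PDF
    margin: { top: margin, right: margin, bottom: margin, left: margin },
  };
}

const options = buildPdfOptions("example.pdf", { marginMm: 15 });
console.log(options);
```

The returned object could be passed directly as the argument to page.pdf in the function above.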

Ensure that executablePath points to your Chrome executable. Typical locations are:

  • macOS: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
  • Linux: run which google-chrome to print the path
  • Windows: use the full path, e.g., C:\Program Files\Google\Chrome\Application\chrome.exe
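Rather than hard-coding one path, the script could probe a few common locations. The sketch below is an assumption-laden example, not part of the original script: findChromeExecutable is a hypothetical helper, CHROME_PATH is an environment variable name invented for this example, and the candidate paths are typical defaults that may not match your system.

```javascript
const fs = require("fs");

// Common install locations per platform. These are typical defaults, not an
// exhaustive list; adjust them for your system.
const CANDIDATES = {
  darwin: ["/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"],
  linux: ["/usr/bin/google-chrome", "/usr/bin/chromium", "/usr/bin/chromium-browser"],
  win32: ["C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"],
};

// Hypothetical helper: returns the first path that exists, or null.
// CHROME_PATH is an assumed environment variable name for this sketch.
// The parameters are injectable so the lookup can be tested without Chrome.
function findChromeExecutable({
  env = process.env,
  platform = process.platform,
  exists = fs.existsSync,
} = {}) {
  const candidates = [env.CHROME_PATH, ...(CANDIDATES[platform] || [])];
  return candidates.find((p) => p && exists(p)) || null;
}

console.log(findChromeExecutable());
```

The result could be passed as executablePath to puppeteer.launch, with a clear error message when it is null.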

5. Main Function to Convert Markdown to PDF

Finally, we create the main function to read the Markdown file, convert it to HTML, and then to a PDF:

javascript
async function convertMarkdownToPdf(inputPath, outputPath) {
  const markdown = fs.readFileSync(inputPath, "utf-8");
  const html = convertMarkdownToHtml(markdown);
  await convertHtmlToPdf(html, outputPath);
}

// Example usage
const inputMarkdownPath = path.join(__dirname, "example.md");
const outputPdfPath = path.join(__dirname, "example.pdf");

// Ensure the markdown file exists
if (fs.existsSync(inputMarkdownPath)) {
  convertMarkdownToPdf(inputMarkdownPath, outputPdfPath).catch(console.error);
} else {
  console.error(`Markdown file not found: ${inputMarkdownPath}`);
}

This code reads example.md, converts its content to HTML, and then writes the result to example.pdf.
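The example paths are hard-coded, but they could just as well come from the command line. Here is a minimal sketch of that idea, assuming a hypothetical resolvePaths helper that is not part of the script above: the first argument is the Markdown file, and the optional second argument overrides the output path.

```javascript
const path = require("path");

// Hypothetical helper: derives input and output paths from CLI arguments.
// With one argument, the output path is the input path with its extension
// swapped for .pdf; a second argument overrides it.
function resolvePaths(args) {
  const [input, output] = args;
  if (!input) {
    throw new Error("Usage: node convertMarkdownToPdf.js <input.md> [output.pdf]");
  }
  const inputPath = path.resolve(input);
  const { dir, name } = path.parse(inputPath);
  const outputPath = output ? path.resolve(output) : path.join(dir, `${name}.pdf`);
  return { inputPath, outputPath };
}

// In the real script the argument would be process.argv.slice(2).
const paths = resolvePaths(["docs/guide.md"]);
console.log(paths);
```

The two resulting paths can be handed to convertMarkdownToPdf in place of inputMarkdownPath and outputPdfPath.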

Running the Script

Ensure your example.md file is in the same directory as the script. The Markdown file can contain the following content for testing:

markdown
# Example Markdown

This is a sample markdown file.

## Section 1

Here is some content for section 1.

::: pagebreak :::

## Section 2

Here is some content for section 2.

---

Some more content after a horizontal rule.

Run the script using Node.js:

bash
node convertMarkdownToPdf.js

This will generate example.pdf with the specified formatting and page breaks.
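A quick way to confirm that the output really is a PDF is to check its first bytes, since PDF files begin with the signature %PDF-. The helper below is a hypothetical sketch (looksLikePdf is not part of the script above) and checks only the header, not the rest of the file. The demo writes temporary stand-in files so that it runs without a real conversion.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: returns true when the file begins with the "%PDF-"
// signature. It does not validate anything beyond those five bytes.
function looksLikePdf(filePath) {
  const fd = fs.openSync(filePath, "r");
  try {
    const header = Buffer.alloc(5);
    const bytesRead = fs.readSync(fd, header, 0, 5, 0);
    return bytesRead === 5 && header.toString("latin1") === "%PDF-";
  } finally {
    fs.closeSync(fd);
  }
}

// Self-contained demo using temporary files instead of a real conversion.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "pdf-check-"));
const fakePdf = path.join(tmpDir, "fake.pdf");
const notPdf = path.join(tmpDir, "notes.txt");
fs.writeFileSync(fakePdf, "%PDF-1.7\n");
fs.writeFileSync(notPdf, "plain text");

console.log(looksLikePdf(fakePdf)); // true
console.log(looksLikePdf(notPdf)); // false
```

After running the conversion script, the same check can be pointed at example.pdf.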

Conclusion

By following this guide, you can convert Markdown files to PDFs with custom styling and page breaks using Node.js, Puppeteer, and Markdown-it. This method provides a flexible way to generate professional-looking PDFs from Markdown content.