[Awesome Ruby Gem] Use the pdfkit gem to create PDFs from plain old HTML+CSS, rendered on the backend by wkhtmltopdf using WebKit

PDFKit

Create PDFs using plain old HTML+CSS. PDFKit uses wkhtmltopdf (http://wkhtmltopdf.org/downloads.html) on the back end, which renders HTML using WebKit.

Installation

PDFKit

You can install it as a gem:

$ gem install pdfkit

or add it into a Gemfile (Bundler):

# Gemfile

# pdfkit/pdfkit: A Ruby gem to transform HTML + CSS into PDFs using the command-line utility wkhtmltopdf
# https://github.com/pdfkit/PDFKit
gem 'pdfkit', '0.8.5'

Then, run bundle install.

$ bundle install

wkhtmltopdf

Usage

# PDFKit.new takes the HTML and any options for wkhtmltopdf
# run `wkhtmltopdf --extended-help` for a full list of options
kit = PDFKit.new(html, :page_size => 'Letter')
kit.stylesheets << '/path/to/css/file'

# Get an inline PDF
pdf = kit.to_pdf

# Save the PDF to a file
file = kit.to_file('/path/to/save/pdf')

# PDFKit.new can optionally accept a URL or a File.
# Stylesheets cannot be added when the source is provided as a URL or File.
kit = PDFKit.new('http://google.com')
kit = PDFKit.new(File.new('/path/to/html'))

# Add any kind of option through meta tags
PDFKit.new('<html><head><meta name="pdfkit-page_size" content="Letter">')
PDFKit.new('<html><head><meta name="pdfkit-cookie cookie_name1" content="cookie_value1">')
PDFKit.new('<html><head><meta name="pdfkit-cookie cookie_name2" content="cookie_value2">')
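As a rough illustration of the meta-tag approach, the sketch below builds the tags from a plain hash before the HTML is handed to PDFKit.new. The helper name pdfkit_meta_tags is hypothetical and not part of the gem; it only assumes the "pdfkit-" name prefix shown above and that option values are plain strings that need no HTML escaping.

```ruby
# Hypothetical helper (not part of the pdfkit gem): turns an options hash
# into <meta> tags that use the "pdfkit-" name prefix.
# Assumption: names and values are simple strings with no quotes in them.
def pdfkit_meta_tags(options)
  options.map do |name, value|
    %(<meta name="pdfkit-#{name}" content="#{value}">)
  end.join("\n")
end

tags = pdfkit_meta_tags(page_size: 'Letter', orientation: 'Landscape')
html = "<html><head>#{tags}</head><body><h1>Report</h1></body></html>"
puts html
```

The resulting string can then be passed as the html argument in the usage examples above.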

Resolving relative URLs and protocols

If the source HTML has relative URLs (/images/cat.png) or protocols (//example.com/site.css) that need to be resolved, you can pass :root_url and :protocol options to PDFKit:

PDFKit.new(html, root_url: 'http://mysite.com/').to_file('/path/to/save/pdf')
# or:
PDFKit.new(html, protocol: 'https').to_file('/path/to/save/pdf')
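To make the effect of these two options concrete, here is a simplified sketch of the kind of rewriting they ask for. It is not PDFKit's own code: resolve_urls is a hypothetical function, and it only handles href and src attributes with quoted values.

```ruby
# Illustrative only: rewrites relative URLs and protocol-relative URLs in
# href/src attributes. This is NOT the gem's implementation.
def resolve_urls(html, root_url: nil, protocol: nil)
  result = html.dup
  if protocol
    # //example.com/site.css -> https://example.com/site.css
    result = result.gsub(%r{(href|src)=(["'])//}) { "#{$1}=#{$2}#{protocol}://" }
  end
  if root_url
    base = root_url.chomp('/')
    # /images/cat.png -> http://mysite.com/images/cat.png
    # The negative lookahead leaves protocol-relative URLs (//...) alone.
    result = result.gsub(%r{(href|src)=(["'])/(?!/)}) { "#{$1}=#{$2}#{base}/" }
  end
  result
end

html = %(<img src="/images/cat.png"><link href="//example.com/site.css">)
puts resolve_urls(html, root_url: 'http://mysite.com/', protocol: 'https')
```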

Using cookies in scraping

If you want to pass a cookie to PDFKit when scraping a website, you can pass it in a hash:

kit = PDFKit.new(url, cookie: {cookie_name: :cookie_value})
kit = PDFKit.new(url, [:cookie, :cookie_name1] => :cookie_val1, [:cookie, :cookie_name2] => :cookie_val2)
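Both forms end up as repeated flags on the wkhtmltopdf command line. The sketch below shows one plausible way such an options hash could be flattened into arguments. The to_cli_args and dasherize helpers are hypothetical and are not the gem's internals; the only assumption about wkhtmltopdf is that it accepts --cookie followed by a name and a value, and --page-size followed by a size.

```ruby
# Illustrative sketch (not PDFKit's internals): flattens an options hash into
# wkhtmltopdf-style arguments. Nested hashes and array keys both expand to a
# flag followed by a name and a value.
def to_cli_args(options)
  options.flat_map do |key, value|
    if key.is_a?(Array)
      flag, name = key
      ["--#{dasherize(flag)}", name.to_s, value.to_s]
    elsif value.is_a?(Hash)
      value.flat_map { |name, v| ["--#{dasherize(key)}", name.to_s, v.to_s] }
    elsif value == true
      ["--#{dasherize(key)}"]        # boolean switch, no value
    elsif value == false || value.nil?
      []                             # switched off, emit nothing
    else
      ["--#{dasherize(key)}", value.to_s]
    end
  end
end

# page_size -> page-size
def dasherize(name)
  name.to_s.tr('_', '-')
end

p to_cli_args(cookie: { cookie_name: :cookie_value })
p to_cli_args([:cookie, :cookie_name1] => :cookie_val1, page_size: 'Letter')
```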

Configuration

PDFKit tries to locate wkhtmltopdf by running the command which wkhtmltopdf. If you are on Windows, want to use a specific wkhtmltopdf binary you installed, or are having trouble getting PDFKit to find your binary, configure the wkhtmltopdf location manually. You can configure PDFKit like so:

# config/initializers/pdfkit.rb
PDFKit.configure do |config|
  config.wkhtmltopdf = '/path/to/wkhtmltopdf'
  config.default_options = {
    :page_size => 'Legal',
    :print_media_type => true
  }
  # Use only if your external hostname is unavailable on the server.
  config.root_url = "http://localhost"
  config.verbose = false
end
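If you are unsure whether the binary is discoverable, a small check like the following approximates what which does by walking the PATH entries. The find_on_path helper is hypothetical and not part of the gem, and on Windows the executable name would typically need its .exe suffix.

```ruby
# Approximates `which <name>`: returns the first executable file with the
# given name found on PATH, or nil if there is none.
# Hypothetical helper; not part of the pdfkit gem.
def find_on_path(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

binary = find_on_path('wkhtmltopdf')
if binary
  puts "Found wkhtmltopdf at #{binary}"
else
  puts 'wkhtmltopdf not found on PATH; set config.wkhtmltopdf manually'
end
```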

Middleware

PDFKit comes with a middleware that allows users to get a PDF view of any page on your site by appending .pdf to the URL.

Rails apps

# in application.rb(Rails3) or environment.rb(Rails2)
require 'pdfkit'
config.middleware.use PDFKit::Middleware

With PDFKit options

# options will be passed to PDFKit.new
config.middleware.use PDFKit::Middleware, :print_media_type => true

With conditions to limit routes that can be generated as PDF

# conditions can be regexps (either one or an array)
config.middleware.use PDFKit::Middleware, {}, :only => %r[^/public]
config.middleware.use PDFKit::Middleware, {}, :only => [%r[^/invoice], %r[^/public]]

# conditions can be strings (either one or an array)
config.middleware.use PDFKit::Middleware, {}, :only => '/public'
config.middleware.use PDFKit::Middleware, {}, :only => ['/invoice', '/public']

# conditions can be regexps (either one or an array)
config.middleware.use PDFKit::Middleware, {}, :except => [%r[^/prawn], %r[^/secret]]

# conditions can be strings (either one or an array)
config.middleware.use PDFKit::Middleware, {}, :except => ['/secret']
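The following sketch shows how :only and :except conditions of this kind could be evaluated against a request path. It is a simplified illustration, not the middleware's own code: render_as_pdf? and matches_any? are hypothetical names here, and treating string conditions as path prefixes is an assumption.

```ruby
# Simplified illustration of route filtering; not the middleware's own code.
# Assumption: string conditions are treated as path prefixes.
def matches_any?(conditions, path)
  Array(conditions).any? do |condition|
    condition.is_a?(Regexp) ? condition.match?(path) : path.start_with?(condition)
  end
end

# Only paths ending in .pdf are candidates; :only is checked before :except.
def render_as_pdf?(path, only: nil, except: nil)
  return false unless path.end_with?('.pdf')
  return matches_any?(only, path) if only
  return !matches_any?(except, path) if except
  true
end

puts render_as_pdf?('/public/report.pdf', only: %r[^/public])
puts render_as_pdf?('/secret/plan.pdf', except: ['/secret'])
```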

With conditions to force download

# force download with attachment disposition
config.middleware.use PDFKit::Middleware, {}, :disposition => 'attachment'
# conditions can force a filename
config.middleware.use PDFKit::Middleware, {}, :disposition => 'attachment; filename=report.pdf'

Saving the generated .pdf to disk

Setting the PDFKit-save-pdf header will cause PDFKit to write the generated .pdf to the file indicated by the value of the header.

For example:

1
headers['PDFKit-save-pdf'] = 'path/to/saved.pdf'

This causes the PDF to be saved to path/to/saved.pdf in addition to being sent back to the client. If the path is not writable or does not exist, the write fails silently. The PDFKit-save-pdf header is never sent back to the client.
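Because the write fails silently, it can be worth knowing what that looks like in practice. The sketch below is a hypothetical helper, not the gem's code, that attempts a write and swallows I/O errors in the same spirit, returning false instead of raising.

```ruby
require 'tmpdir' # only needed for the demo at the bottom

# Illustration of a "fail silently" write: I/O errors are rescued so the
# caller can carry on. Hypothetical helper; not part of the pdfkit gem.
def save_pdf_quietly(path, pdf_bytes)
  File.binwrite(path, pdf_bytes)
  true
rescue SystemCallError, IOError
  false
end

Dir.mktmpdir do |dir|
  # Directory exists, so the write succeeds.
  puts save_pdf_quietly(File.join(dir, 'saved.pdf'), '%PDF-1.4 demo')
  # Parent directory is missing, so the write fails without raising.
  puts save_pdf_quietly(File.join(dir, 'missing', 'saved.pdf'), '%PDF-1.4 demo')
end
```

If the saved copy matters to you, checking afterwards that the file exists is a cheap safeguard.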
