- Wkhtmltopdf: A Smart Tool for Converting Web Pages to PDF in Linux
- Wkhtmltopdf Features
- Installing Evince (a PDF Viewer)
- Downloading the Wkhtmltopdf Source
- Installing Wkhtmltopdf in Linux
- How to Use Wkhtmltopdf?
- Converting an HTML Page to PDF
- Checking the Generated PDF File
- Viewing Information About the Generated PDF File
- Viewing the Generated PDF File
- Creating a Table of Contents (TOC) for the PDF File
- Convert HTML Page To a PDF Using Open Source Tool [ Linux / OS X / Windows ]
- Software features
- A note about Debian / Ubuntu Linux users
- Install wkhtmltopdf on macOS
- Wkhtmltopdf – A Smart Tool to Convert Website HTML Page to PDF in Linux
- Wkhtmltopdf Features
- Install Evince (PDF Viewer)
- Download Wkhtmltopdf Source File
- Install Wkhtmltopdf in Linux
- How to Use Wkhtmltopdf?
- Convert Website HTML Page to PDF File
- View Generated PDF File
- View Information of Generated PDF File
- View Created PDF File
- Create TOC (Table Of Content) of a Page to PDF
- Wkhtmltopdf Options and Usage
- Convert html to pdf with Linux
- First step, download the web page in html
- Second step, convert the html file to pdf file
- Stable
- Archive
- Why do you have static builds with patched Qt?
- Why are there no “generic” Linux builds (which were provided earlier)?
- I don’t see an appropriate download for my platform!
- How do I use it with FaaS setups?
- How do I use it in AWS Lambda?
- Symantec reports a virus WS.Reputation.1 for the Windows builds
Wkhtmltopdf: A Smart Tool for Converting Web Pages to PDF in Linux
Original: Wkhtmltopdf – A Smart Tool to Convert Website HTML Page to PDF in Linux
Author: Ravi Saive
Published: January 28, 2017
Translation: A. Krivoshey
Translated: February 2018
Wkhtmltopdf is a simple and effective open source command-line utility that converts any web page to a PDF document or an image (jpg, png, etc.).
Wkhtmltopdf is written in C++ and distributed under the GNU LGPLv3 (Lesser General Public License). It uses the WebKit rendering engine to convert web pages to PDF without loss of quality. It is a useful and dependable way to create and store snapshots of web pages in real time.
Wkhtmltopdf Features
- open source and cross-platform;
- converts any web page to a PDF file using the WebKit engine;
- options for adding headers and footers;
- an option for generating a table of contents (TOC);
- batch mode conversion;
- support for PHP or Python via bindings to libwkhtmltox.
In this article we show how to install Wkhtmltopdf in Linux from the source tarball.
Installing Evince (a PDF Viewer)
First, let's install evince, a program for viewing PDF files in Linux.
Downloading the Wkhtmltopdf Source
Download the wkhtmltopdf tarball for your Linux architecture with the wget command; you can also get the latest version from the wkhtmltopdf download page.
For 64-bit systems
For 32-bit systems
Installing Wkhtmltopdf in Linux
Extract the files into the current working directory with the tar command.
On a 64-bit system
On a 32-bit system
Install wkhtmltopdf into the /usr/bin directory so that it can be run from anywhere.
How to Use Wkhtmltopdf?
Here we look at how to convert HTML pages to PDF, verify the result, and view the created files with evince.
Converting an HTML Page to PDF
To convert a web page to PDF, run the command below. It converts the given web page to the file 10-Sudo-Configurations.pdf in the current working directory.
Checking the Generated PDF File
To check that the file was created correctly, enter the command:
Viewing Information About the Generated PDF File
To view information about the created file, use the following command:
Viewing the Generated PDF File
Let's look at the contents of the file we created using evince:
It looks great on my Linux Mint 17.
Creating a Table of Contents (TOC) for the PDF File
To create a table of contents for the PDF file, use the toc option.
To check the TOC of the created file, use evince again:
More information about Wkhtmltopdf usage and options is available from the help command:
Convert HTML Page To a PDF Using Open Source Tool [ Linux / OS X / Windows ]
Do you need a simple open source cross-platform command line tool that converts web pages and HTML to a PDF file? Look no further, try wkhtmltopdf.
From the project home page:
Simple shell utility to convert html to pdf using the webkit rendering engine, and qt. Searching the web, I have found several command line tools that allow you to convert a HTML-document to a PDF-document, however they all seem to use their own, and rather incomplete rendering engine, resulting in poor quality. Recently QT 4.4 was released with a WebKit widget (WebKit is the engine of Apples Safari, which is a fork of the KDE KHtml), and making a good tool became very easy.
Software features
- Cross platform.
- Open source.
- Convert any web pages into PDF documents using webkit.
- You can add headers and footers.
- TOC generation.
- Batch mode conversions.
- Can run on Linux server with an XServer (the X11 client libs must be installed).
- Can be directly used by PHP or Python via bindings to libwkhtmltox.
A note about Debian / Ubuntu Linux users
You can install wkhtmltopdf using apt-get command:
$ sudo apt-get install wkhtmltopdf
$ sudo ln -s /usr/bin/wkhtmltopdf /usr/local/bin/html2pdf
Install wkhtmltopdf on macOS
First, install Homebrew on macOS and then type the following brew command:
$ brew install wkhtmltopdf
OR
$ brew install --cask wkhtmltopdf
Wkhtmltopdf – A Smart Tool to Convert Website HTML Page to PDF in Linux
Wkhtmltopdf is a simple and effective open source command-line utility that converts any given HTML (web page) to a PDF document or an image (jpg, png, etc.).
Wkhtmltopdf is written in C++ and distributed under the GNU LGPLv3 (Lesser General Public License). It uses the WebKit rendering engine to convert HTML pages to PDF documents without losing the quality of the pages. It is a useful and dependable solution for creating and storing snapshots of web pages in real time.
Wkhtmltopdf Features
- Open source and cross-platform.
- Converts any HTML web page to a PDF file using the WebKit engine.
- Options to add headers and footers.
- Table of contents (TOC) generation option.
- Provides batch mode conversions.
- Support for PHP or Python via bindings to libwkhtmltox.
In this article we will show you how to install the Wkhtmltopdf program on Linux systems using source tarball files.
Install Evince (PDF Viewer)
Let’s first install evince, a PDF reader, for viewing PDF files on Linux systems.
Download Wkhtmltopdf Source File
Download the wkhtmltopdf source files for your Linux architecture using the wget command, or download the latest version (the current stable series is 0.12.4) from the wkhtmltopdf download page.
On 64-bit Linux OS
On 32-bit Linux OS
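A minimal sketch of picking the right download for the machine at hand. The URL pattern and the amd64/i386 file name suffixes are assumptions based on how the 0.12.4 generic Linux tarballs are named on the GitHub releases page; check them against the download page before relying on them. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: build the download URL for the 0.12.4 generic Linux tarball.
# Assumption: the release asset naming pattern below is correct.
version="0.12.4"

# Map the machine name reported by "uname -m" to the file name suffix.
arch_suffix() {
    case "$1" in
        x86_64)    echo "amd64" ;;
        i386|i686) echo "i386" ;;
        *)         return 1 ;;
    esac
}

# Print the tarball URL for the given machine name.
tarball_url() {
    local arch
    arch="$(arch_suffix "$1")" || { echo "unsupported architecture: $1" >&2; return 1; }
    echo "https://github.com/wkhtmltopdf/wkhtmltopdf/releases/download/${version}/wkhtmltox-${version}_linux-generic-${arch}.tar.xz"
}

# Show the URL for this machine (prints an error on other architectures).
tarball_url "$(uname -m)" || true
```

To fetch the archive, pass the printed URL to wget, for example `wget "$(tarball_url "$(uname -m)")"`.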
Install Wkhtmltopdf in Linux
Extract the files into the current working directory using the following tar command.
Install wkhtmltopdf under the /usr/bin directory so that it can be run from any path.
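The two steps can be sketched as one function. It assumes the archive unpacks to wkhtmltox/bin/wkhtmltopdf, which matches the generic tarballs but should be verified for the file you downloaded; the function name and the optional destination argument are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: unpack a wkhtmltox tarball and copy the binary into a bin directory.
# Assumption: the archive contains wkhtmltox/bin/wkhtmltopdf.
install_wkhtmltopdf() {
    local archive="$1" dest="${2:-/usr/bin}" workdir status
    workdir="$(mktemp -d)" || return 1
    # tar detects the compression (for example .tar.xz) on its own when extracting.
    tar -xf "$archive" -C "$workdir" || { rm -rf "$workdir"; return 1; }
    # install copies the file and sets the executable permission bits.
    install -m 0755 "$workdir/wkhtmltox/bin/wkhtmltopdf" "$dest/wkhtmltopdf"
    status=$?
    rm -rf "$workdir"
    return "$status"
}
```

Writing to /usr/bin needs root, so run the script that calls `install_wkhtmltopdf <archive>` as root, or pass a directory you own (and that is on PATH) as the second argument.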
How to Use Wkhtmltopdf?
Here we will see how to convert remote HTML pages to PDF files, verify their information, and view the created files using the evince program from the GNOME desktop.
Convert Website HTML Page to PDF File
To convert any website HTML page to PDF, run the following example command. It converts the given web page to 10-Sudo-Configurations.pdf in the current working directory.
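A minimal sketch of such a conversion. The basic invocation is `wkhtmltopdf <input> <output.pdf>`; the wrapper function name is illustrative, and the URL in the usage note is a placeholder rather than the page used in the article.

```shell
#!/usr/bin/env bash
# Sketch: convert one page (URL or local HTML file) to a PDF file.
html_to_pdf() {
    # Usage: html_to_pdf <url-or-html-file> <output.pdf>
    if ! command -v wkhtmltopdf >/dev/null 2>&1; then
        echo "wkhtmltopdf is not installed or not on PATH" >&2
        return 127
    fi
    wkhtmltopdf "$1" "$2"
}
```

For example, `html_to_pdf "https://example.com/" 10-Sudo-Configurations.pdf` writes the PDF into the current working directory.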
View Generated PDF File
To verify that the file was created, use the following command.
View Information of Generated PDF File
To view information about the generated file, issue the following command.
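Both checks can be sketched together. This assumes pdfinfo (shipped with poppler-utils on many distributions) as the metadata tool; the function name is illustrative, and the header test is only a quick sanity check, not a full validation of the file.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a file exists, is non-empty and starts with the PDF
# header, then list it and print its metadata if pdfinfo is available.
check_pdf() {
    local pdf="$1"
    [ -s "$pdf" ] || { echo "missing or empty: $pdf" >&2; return 1; }
    # A PDF file is expected to begin with the bytes "%PDF-".
    [ "$(head -c 5 "$pdf")" = "%PDF-" ] || { echo "not a PDF: $pdf" >&2; return 1; }
    ls -l "$pdf"
    if command -v pdfinfo >/dev/null 2>&1; then
        pdfinfo "$pdf" || echo "pdfinfo could not read $pdf" >&2
    fi
    return 0
}
```

Calling `check_pdf 10-Sudo-Configurations.pdf` after a conversion returns a non-zero status if the file is missing or is not a PDF.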
View Created PDF File
Take a look at the newly created PDF file using the evince program from the desktop.
It looks pretty nice on my Linux Mint 17 box.
Create TOC (Table Of Content) of a Page to PDF
To create a table of contents for a PDF file, use the toc option.
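A minimal sketch of the toc form. Here toc is given as an object ahead of the page, and wkhtmltopdf places objects in the output in the order they appear on the command line, so the table of contents comes first. The wrapper function name is illustrative.

```shell
#!/usr/bin/env bash
# Sketch: convert a page to PDF with a generated table of contents in front.
html_to_pdf_with_toc() {
    # Usage: html_to_pdf_with_toc <url-or-html-file> <output.pdf>
    if ! command -v wkhtmltopdf >/dev/null 2>&1; then
        echo "wkhtmltopdf is not installed or not on PATH" >&2
        return 127
    fi
    # "toc" is an object name, not a dashed option.
    wkhtmltopdf toc "$1" "$2"
}
```

The entries in the table of contents come from the headings found in the page, so a page without headings yields an empty table.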
To check the TOC of the created file, use the evince program again.
Wkhtmltopdf Options and Usage
For more Wkhtmltopdf usage and options, use the following help command. It displays a list of all available options.
Convert html to pdf with Linux
Written by Guillermo Garron
Date: 2010-10-07 10:36:30 00:00
When you need to convert a complete HTML web page to a PDF file, Linux can help you.
We will need two tools:
- wget — To download the complete page, including css, and others
- wkhtmltopdf — To make the real conversion from html to pdf
You should be able to install both of them using your package manager.
To be able to convert the html to pdf, we will follow a two stage process.
First step, download the web page in html
To do that, I will first create a folder to store the page.
Then download the web page:
That will create a structure like this:
There you will find the file, new-branch-on-debian-1.html
Second step, convert the html file to pdf file
Change into the folder where the html file is.
Convert the file, using this format:
wkhtmltopdf [html file] [pdf file]
That is it: you have now converted a complete html file, including formatting, css, etc., to a pdf file that you can send by email, archive, or use however you want.
Note: If the page you are downloading does not have an .html extension, you may get errors. To solve that, just rename the file with mv so that it has an .html extension. Nowadays, most pages do not have .html extensions.
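The two stages, plus the rename from the note, can be sketched as follows. The wget options shown are a reasonable choice for saving a page with its assets, not necessarily the ones the author used, and all function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch of the two-stage process: download with wget, convert with wkhtmltopdf.

# Stage 1: download the page with its CSS and images into a working directory.
download_page() {
    local url="$1" dir="$2"
    mkdir -p "$dir" || return 1
    # --page-requisites:  also fetch the CSS, images and scripts the page needs
    # --convert-links:    rewrite links so the local copy works offline
    # --adjust-extension: save HTML documents with an .html suffix
    ( cd "$dir" && wget --page-requisites --convert-links --adjust-extension "$url" )
}

# If a saved page still lacks an .html suffix, rename it and print the new name.
ensure_html_extension() {
    local file="$1"
    case "$file" in
        *.html|*.htm) echo "$file" ;;
        *) mv -- "$file" "$file.html" && echo "$file.html" ;;
    esac
}

# Stage 2: convert the local html file to a pdf file.
convert_page() {
    wkhtmltopdf "$(ensure_html_extension "$1")" "$2"
}
```

With `--adjust-extension`, wget usually adds the suffix itself, so the rename step only matters for files saved some other way.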
All downloads are currently hosted via GitHub releases, so you can browse for a specific download or use the links below.
Do not use wkhtmltopdf with any untrusted HTML – be sure to sanitize any user-supplied HTML/JS, otherwise it can lead to complete takeover of the server it is running on! Please read the project status for the gory details.
Stable
The current stable series is 0.12.6, which was released on June 11, 2020 – see changes since 0.12.5.
OS/Distribution | Supported on | Architectures | | | |
---|---|---|---|---|---|---|
Windows | Installer (Vista or later) | 64-bit | 32-bit | | |
Windows | 7z Archive (XP/2003 or later) | 64-bit | 32-bit | | |
macOS | Installer (10.7 or later) | 64-bit | | | |
Debian | 10 (buster) | amd64 | i386 | arm64 | ppc64le | raspberrypi
Debian | 9 (stretch) | amd64 | i386 | arm64 | raspberrypi |
Ubuntu | 20.04 (focal) | amd64 | arm64 | ppc64le | |
Ubuntu | 18.04 (bionic) | amd64 | i386 | arm64 | ppc64le |
Ubuntu | 16.04 (xenial) | amd64 | i386 | arm64 | |
CentOS | 8 | x86_64 | aarch64 | ppc64le | |
CentOS | 7 | x86_64 | i686 | aarch64 | ppc64le |
CentOS | 6 | x86_64 | i686 | | |
Amazon Linux | 2 | x86_64 | aarch64 | lambda zip | |
openSUSE Leap | 15 | x86_64 | aarch64 | ppc64le | |
Arch Linux | 20200705 | x86_64 | | | |
All of the above packages were produced automatically via Azure Pipelines and were built on the latest OS/distribution patch release at the time of the release.
Archive
Please note that bug reports will not be accepted against the following, which are considered obsolete. It is recommended to use the latest stable release instead, and report an issue if there is a regression from a previous release.
Date | Release |
---|---|
2018-06-11 | 0.12.5 |
2019-04-30 | 0.12.1.4 (linux-only) |
2016-11-22 | 0.12.4 |
2016-03-02 | 0.12.3.2 (windows-only) |
2016-01-30 | 0.12.3.1 (windows-only) |
2016-01-20 | 0.12.3 |
2015-07-12 | 0.12.2.4 (windows-only) |
2015-06-20 | 0.12.2.3 (windows-only) |
2015-04-06 | 0.12.2.2 (windows-only) |
2015-01-19 | 0.12.2.1 |
2015-01-09 | 0.12.2 |
2014-06-26 | 0.12.1 |
2014-02-06 | 0.12.0 |
If you need versions older than 0.12.0, you can look at the obsolete downloads.
Why do you have static builds with patched Qt?
Good question. Some features require you to use a patched Qt, because those aren’t yet upstream – please read the project status for a longer explanation.
Most Linux distributions (quite understandably) would prefer that this project upstreamed the patches, and choose to compile without those features. This leads to quite different behavior – you get a later web engine, but behavior can vary from distribution to distribution.
Why are there no “generic” Linux builds (which were provided earlier)?
Although the builds are static, it is very important to understand what it means in the context of Qt – on which wkhtmltopdf is built. A static build means that only Qt is linked in this manner – the remaining system packages still need to be installed. Over a period of time, major areas of divergence between distributions were found by trial and error:
- different library versions: not every distribution provides the same versions. This was especially the case for libpng and libjpeg, with a lot of distributions choosing between the 1.2, 1.5 and 1.6 series for the former and multiple versions of libjpeg and/or its fork libjpeg-turbo. While this could be addressed easily by linking them statically (and was actually done so for previous releases) – it broke down when it came to the next point.
- different OpenSSL versions: due to OpenSSL having a bad track record then (it’s better now), distributions started aggressively upgrading their OpenSSL version and disabling unused parts of the library. This led to a situation where there was effectively zero backward compatibility and things started breaking randomly – see #3001 for a very long read of the problems faced. This was the direct motivation to create a separate packaging repository.
- incompatible libc: not every distribution has the same glibc version. If you compile with a later version, it won’t work on a distribution which uses an older version. This was worked around earlier by using CentOS 6 (which had an old enough glibc version). But due to the rise of Docker, the alpine image became very popular. This doesn’t use glibc at all, but the musl libc. So the generic binaries never really worked on Alpine.
While Python has also tried to do this using manylinux – it doesn’t always work well (e.g. alpine is not recommended with binary wheels if you google for it), and requires you to statically link everything. This may work for them, but wkhtmltopdf also depends on the runtime configuration on actual fonts installed (i.e. fontconfig and freetype2). It’s not possible to abstract everything out and test/fix everything for every OS/distribution with the limited resources this project has – it makes more sense to make distribution-specific versions which are almost guaranteed to work, as they use the specific versions that the distribution has packaged.
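To illustrate the libc point, here is a small sketch for telling a glibc system from a musl one before picking a package. It classifies the first line printed by `ldd --version`; the strings it matches are assumptions based on typical output (musl's ldd prints "musl libc" on stderr, glibc's prints "GLIBC" or "GNU libc"), and the function name is illustrative.

```shell
#!/usr/bin/env bash
# Sketch: classify the C library from the first line of "ldd --version".
libc_flavor() {
    case "$1" in
        *musl*)               echo "musl" ;;
        *GLIBC*|*"GNU libc"*) echo "glibc" ;;
        *)                    echo "unknown" ;;
    esac
}

# musl's ldd writes to stderr, so merge both streams before reading line one.
libc_flavor "$(ldd --version 2>&1 | head -n 1)"
```

A result of "musl" (as on Alpine) means a package built against glibc is not expected to run there.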
I don’t see an appropriate download for my platform!
If the distribution you are using is listed:
- but not the specific patch release – try it, as it’s very likely to work regardless.
- the major release isn’t listed – we only support LTS versions, so try an LTS version older than your release.
- cannot install package – you can always extract it (google for extract from ), but you’ll need to have the dependencies installed.
Head over to the packaging repository and start a discussion if your platform isn’t listed.
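For the "cannot install package" case, extraction can be sketched as below. It assumes dpkg-deb is available for .deb files and rpm2cpio plus cpio for .rpm files; the function name is illustrative, and the dependencies still have to be installed separately.

```shell
#!/usr/bin/env bash
# Sketch: unpack a .deb or .rpm package into a directory without installing it.
extract_package() {
    local pkg="$1" dest="$2" pkg_abs
    # Resolve the package path first, because the rpm branch changes directory.
    pkg_abs="$(cd "$(dirname "$pkg")" && pwd)/$(basename "$pkg")"
    mkdir -p "$dest" || return 1
    case "$pkg" in
        *.deb) dpkg-deb -x "$pkg_abs" "$dest" ;;
        # cpio: -i extract, -d create directories, -m keep modification times
        *.rpm) ( cd "$dest" && rpm2cpio "$pkg_abs" | cpio -idm ) ;;
        *) echo "unsupported package type: $pkg" >&2; return 2 ;;
    esac
}
```

After extraction the binaries sit under the destination directory in the same layout the package would have installed them in.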
How do I use it with FaaS setups?
You’ll need to extract the distribution-specific package, bundle it with necessary libraries, configuration and/or fonts and then upload it. See this StackOverflow question for Google Cloud Functions. PRs are welcome to expand this section, if you have more information about this – this is not a setup that the maintainer uses 😄
How do I use it in AWS Lambda?
All files required for a lambda layer are packed in one zip archive (Amazon Linux 2 / lambda zip). You may test it locally by unpacking the archive into the layer directory and running the following commands:
After that, you may find a pdf file generated from the google home page in your layer directory.
To use wkhtmltox in your lambda function you may put the content of the archive together with your lambda function or create a layer. Don’t forget to provide an environment variable for fontconfig (FONTCONFIG_PATH=/opt/fonts).
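A minimal sketch of running the binary from an unpacked layer directory with the fontconfig variable set. The bin/, lib/ and fonts/ subdirectories are assumptions based on the usual Lambda layer layout and on the /opt/fonts path above; check them against the contents of the zip. The function name is illustrative.

```shell
#!/usr/bin/env bash
# Sketch: run wkhtmltopdf from an unpacked layer directory.
# Assumption: the layer contains bin/wkhtmltopdf, lib/ and fonts/.
run_from_layer() {
    local layer_dir="$1"; shift
    # Set the variables for this one command only, leaving the shell untouched.
    FONTCONFIG_PATH="$layer_dir/fonts" \
    LD_LIBRARY_PATH="$layer_dir/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \
    "$layer_dir/bin/wkhtmltopdf" "$@"
}
```

Inside Lambda the layer is mounted at /opt, so the equivalent call there would be `run_from_layer /opt <input> <output.pdf>`.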
If you use the Serverless framework, you may add the following lines to your serverless.yml file:
Symantec reports a virus WS.Reputation.1 for the Windows builds
This is a false positive reported because Symantec has not seen this file before – see this clarification for details.