Web to pdf linux

Contents
  1. Wkhtmltopdf: A Smart Tool to Convert Web Pages to PDF in Linux
  2. Wkhtmltopdf Features
  3. Installing Evince (PDF Viewer)
  4. Downloading the Wkhtmltopdf Source Files
  5. Installing Wkhtmltopdf in Linux
  6. How to Use Wkhtmltopdf?
  7. Converting an HTML Page to PDF
  8. Checking the Generated PDF File
  9. Viewing Information About the Generated PDF File
  10. Viewing the Generated PDF File
  11. Creating a Table of Contents (TOC) for a PDF File
  12. Convert HTML Page To a PDF Using Open Source Tool [ Linux / OS X / Windows ]
  13. Software features
  14. A note about Debian / Ubuntu Linux user
  15. Install wkhtmltopdf on MacOS unix
  16. Wkhtmltopdf – A Smart Tool to Convert Website HTML Page to PDF in Linux
  17. Wkhtmltopdf Features
  18. Install Evince (PDF Viewer)
  19. Download Wkhtmltopdf Source File
  20. Install Wkhtmltopdf in Linux
  21. How to Use Wkhtmltopdf?
  22. Convert Website HTML Page to PDF File
  23. View Generated PDF File
  24. View Information of Generated PDF File
  25. View Created PDF File
  26. Create TOC (Table Of Content) of a Page to PDF
  27. Wkhtmltopdf Options and Usage
  28. If You Appreciate What We Do Here On TecMint, You Should Consider:
  29. Convert html to pdf with Linux
  30. First step, download the web page in html
  31. Second step, convert the html file to pdf file
  32. Stable
  33. Archive
  34. Why do you have static builds with patched Qt?
  35. Why are there no “generic” Linux builds (which were provided earlier)?
  36. I don’t see an appropriate download for my platform!
  37. How do I use it with FaaS setups?
  38. How do I use it in AWS Lambda?
  39. Symantec reports a virus WS.Reputation.1 for the Windows builds

Wkhtmltopdf: A Smart Tool to Convert Web Pages to PDF in Linux

Original: Wkhtmltopdf – A Smart Tool to Convert Website HTML Page to PDF in Linux
Author: Ravi Saive
Published: January 28, 2017
Translation: A. Krivoshey
Translated: February 2018

Wkhtmltopdf is a simple and effective open source command-line utility that converts any web page to a PDF document or an image (jpg, png, and so on).

Wkhtmltopdf is written in C++ and distributed under the GNU LGPLv3 (Lesser General Public License). It uses the WebKit rendering engine to convert web pages to PDF without loss of quality. It is a useful and dependable way to create and store snapshots of web pages in real time.
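The image output mentioned above is handled by wkhtmltoimage, the companion binary shipped in the same wkhtmltox package. A minimal sketch follows; the function name page_to_png, the URL, and the output file name are placeholders chosen for illustration.

```shell
# Sketch: render a web page to a PNG image with wkhtmltoimage.
# page_to_png is a hypothetical helper name, not part of wkhtmltopdf.
# $1 = page URL, $2 = output image file
page_to_png() {
    wkhtmltoimage --format png "$1" "$2"
}

# Usage (placeholder URL):
#   page_to_png https://example.com/ example.png
```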

Wkhtmltopdf Features

- open source and cross-platform;
- converts any web page to a PDF file using the WebKit engine;
- options for adding headers and footers;
- option for generating a table of contents (TOC);
- batch-mode conversion;
- PHP and Python support via bindings to libwkhtmltox.
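Two of the features listed above, headers and footers and batch mode, can be sketched as follows. The helper names, the header text, and the URLs are placeholders. Header and footer options generally need a wkhtmltopdf build that uses the patched Qt, so they may be unavailable in some distribution packages.

```shell
# Sketch: add a centered header and a page-numbered footer.
# [page] and [topage] are substitution variables expanded by wkhtmltopdf.
# $1 = page URL, $2 = output PDF
convert_with_footer() {
    wkhtmltopdf \
        --header-center "Archived copy" \
        --footer-center "Page [page] of [topage]" \
        "$1" "$2"
}

# Sketch: batch mode. Each line on stdin is treated as one set of
# command-line arguments, so several pages are converted in one process.
convert_batch() {
    wkhtmltopdf --read-args-from-stdin
}

# Usage (placeholder URLs):
#   convert_with_footer https://example.com/ example.pdf
#   printf '%s\n' 'https://example.com/a a.pdf' 'https://example.com/b b.pdf' | convert_batch
```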

This article shows how to install Wkhtmltopdf on Linux from the source tarball.

Installing Evince (PDF Viewer)

First, install Evince, a PDF viewer for Linux.

Downloading the Wkhtmltopdf Source Files

Download the wkhtmltopdf source files for your Linux architecture with the wget command, or get the latest version from the wkhtmltopdf download page.

For 64-bit systems

For 32-bit systems

Installing Wkhtmltopdf in Linux

Extract the files into the current working directory with the tar command.

On a 64-bit system

On a 32-bit system

Install wkhtmltopdf into the /usr/bin directory so that it can be run from anywhere.

How to Use Wkhtmltopdf?

This section shows how to convert HTML pages to PDF, check the file information, and view the resulting files with Evince.

Converting an HTML Page to PDF

To convert a web page to PDF, run wkhtmltopdf with the page URL and an output file name. In the article's example, the page is saved as 10-Sudo-Configurations.pdf in the current working directory.

Checking the Generated PDF File

To check that the file was created correctly, enter the command:

Viewing Information About the Generated PDF File

To view information about the generated file, use the following command:

Viewing the Generated PDF File

Let's look at the contents of the file we created, using Evince:

Example of the generated file

It looks great on my Linux Mint 17.

Creating a Table of Contents (TOC) for a PDF File

To generate a table of contents for the PDF file, use the toc option.

To check the TOC of the generated file, open it in Evince again:

More information about Wkhtmltopdf usage and options is available from the help command:


Convert HTML Page To a PDF Using Open Source Tool [ Linux / OS X / Windows ]

Do you need a simple open source cross-platform command line tool that converts web pages and HTML to a PDF file? Look no further, try wkhtmltopdf.

From the project home page:

Simple shell utility to convert html to pdf using the webkit rendering engine, and qt. Searching the web, I have found several command line tools that allow you to convert a HTML-document to a PDF-document, however they all seem to use their own, and rather incomplete rendering engine, resulting in poor quality. Recently QT 4.4 was released with a WebKit widget (WebKit is the engine of Apples Safari, which is a fork of the KDE KHtml), and making a good tool became very easy.

Software features

  1. Cross platform.
  2. Open source.
  3. Convert any web pages into PDF documents using webkit.
  4. You can add headers and footers.
  5. TOC generation.
  6. Batch mode conversions.
  7. Can run on Linux server with an XServer (the X11 client libs must be installed).
  8. Can be directly used by PHP or Python via bindings to libwkhtmltox.

A note about Debian / Ubuntu Linux user

You can install wkhtmltopdf using apt-get command:
$ sudo apt-get install wkhtmltopdf
$ sudo ln -s /usr/bin/wkhtmltopdf /usr/local/bin/html2pdf

Install wkhtmltopdf on MacOS unix

First, install Homebrew on macOS and then type the following brew command:
$ brew install wkhtmltopdf
OR
$ brew install --cask wkhtmltopdf


Wkhtmltopdf – A Smart Tool to Convert Website HTML Page to PDF in Linux

Wkhtmltopdf is a simple and effective open source command-line utility that converts any given HTML (web page) to a PDF document or an image (jpg, png, etc.).

Wkhtmltopdf is written in C++ and distributed under the GNU LGPLv3 (Lesser General Public License). It uses the WebKit rendering engine to convert HTML pages to PDF documents without losing the quality of the pages. It is a useful and dependable solution for creating and storing snapshots of web pages in real time.

Wkhtmltopdf Features

  1. Open source and cross platform.
  2. Convert any HTML web pages to PDF files using WebKit engine.
  3. Options to add headers and footers
  4. Table of Content (TOC) generation option.
  5. Provides batch mode conversions.
  6. Support for PHP or Python via bindings to libwkhtmltox.

In this article we will show you how to install the Wkhtmltopdf program on Linux systems using source tarball files.

Install Evince (PDF Viewer)

Let's install Evince, a PDF reader, for viewing PDF files on Linux systems.
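The install command depends on the distribution's package manager. The sketch below only prints a suitable command rather than running it; the helper name evince_install_cmd is made up for this example, and the package is assumed to be called evince, as it is on Debian, Ubuntu, Fedora, and RHEL-style systems.

```shell
# Sketch: print the Evince install command for the package manager found.
# evince_install_cmd is a hypothetical helper; it does not install anything.
evince_install_cmd() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install evince"
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install evince"
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install evince"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}

# Usage:
#   evince_install_cmd        # prints the command to run
```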

Download Wkhtmltopdf Source File

Download the wkhtmltopdf source files for your Linux architecture using the wget command, or download the latest version from the wkhtmltopdf download page (0.12.4 was the stable series when this article was written).

On 64-bit Linux OS
On 32-bit Linux OS
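A hedged sketch of the download step for both architectures. The asset naming pattern below (wkhtmltox-VERSION_linux-generic-ARCH.tar.xz under the project's GitHub releases) is an assumption based on the generic 0.12.4 builds, and wkhtmltox_url is a hypothetical helper; check the download page for the exact file names before relying on it.

```shell
# Sketch: build the download URL for a generic Linux tarball.
# ASSUMPTION: assets are named wkhtmltox-<version>_linux-generic-<arch>.tar.xz
# $1 = version (for example 0.12.4), $2 = arch (amd64 for 64-bit, i386 for 32-bit)
wkhtmltox_url() {
    echo "https://github.com/wkhtmltopdf/wkhtmltopdf/releases/download/$1/wkhtmltox-$1_linux-generic-$2.tar.xz"
}

# Usage:
#   wget "$(wkhtmltox_url 0.12.4 amd64)"    # 64-bit
#   wget "$(wkhtmltox_url 0.12.4 i386)"     # 32-bit
```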

Install Wkhtmltopdf in Linux

Extract the files into the current working directory using the following tar command.

Install wkhtmltopdf under the /usr/bin directory so that the program can be run from any path.
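The extract and install steps might look like the sketch below. It assumes the archive unpacks into a wkhtmltox/ directory with the binary at wkhtmltox/bin/wkhtmltopdf; verify the layout after extracting. The helper name install_wkhtmltopdf is hypothetical.

```shell
# Sketch: unpack a downloaded tarball and copy the binary into /usr/bin.
# ASSUMPTION: the archive extracts to wkhtmltox/bin/wkhtmltopdf.
# $1 = path to the downloaded .tar.xz file
install_wkhtmltopdf() {
    tar -xvf "$1" &&
    sudo cp wkhtmltox/bin/wkhtmltopdf /usr/bin/
}

# Usage:
#   install_wkhtmltopdf wkhtmltox-0.12.4_linux-generic-amd64.tar.xz
```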

How to Use Wkhtmltopdf?

Here we will see how to convert remote HTML pages to PDF files, verify their information, and view the created files using the Evince program from the GNOME desktop.

Convert Website HTML Page to PDF File

To convert a website's HTML page to PDF, run the following example command. It converts the given web page to 10-Sudo-Configurations.pdf in the current working directory.
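A minimal sketch of such a conversion, wrapped in a hypothetical helper that also checks that a non-empty file was written. The URL in the usage line is a placeholder, not the page used in the article.

```shell
# Sketch: convert a remote page to PDF and fail if no output was produced.
# page_to_pdf is a hypothetical helper name.
# $1 = page URL, $2 = output PDF
page_to_pdf() {
    wkhtmltopdf "$1" "$2" && [ -s "$2" ]
}

# Usage (placeholder URL):
#   page_to_pdf https://example.com/ 10-Sudo-Configurations.pdf
```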

View Generated PDF File

To verify that the file is created, use the following command.

View Information of Generated PDF File

To view information about the generated file, issue the following command.
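Checks like these are commonly done with the file and pdfinfo utilities. The sketch below is one way to combine them, with hypothetical helper names; pdfinfo is assumed to come from a package such as poppler-utils and is skipped when it is not installed.

```shell
# Sketch: confirm a file starts with the PDF magic bytes "%PDF-".
is_pdf() {
    [ "$(head -c 5 "$1")" = "%PDF-" ]
}

# Sketch: print the file type and, when pdfinfo is available, the PDF metadata.
describe_pdf() {
    is_pdf "$1" || { echo "$1 is not a PDF" >&2; return 1; }
    file "$1"
    if command -v pdfinfo >/dev/null 2>&1; then
        pdfinfo "$1"
    fi
}

# Usage:
#   describe_pdf 10-Sudo-Configurations.pdf
```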

View Created PDF File

Take a look at the newly created PDF file using the Evince program from the desktop.

It looks pretty good on my Linux Mint 17 box.

Screenshot: View Website Page in PDF

Create TOC (Table Of Content) of a Page to PDF

To create a table of contents for a PDF file, use the toc option.
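In wkhtmltopdf, toc is given as an object on the command line, before the page it applies to. A minimal sketch with a hypothetical helper name and placeholder arguments:

```shell
# Sketch: convert a page and prepend a generated table of contents.
# "toc" is passed as its own argument ahead of the page URL.
# $1 = page URL, $2 = output PDF
page_to_pdf_with_toc() {
    wkhtmltopdf toc "$1" "$2"
}

# Usage (placeholder URL and file name):
#   page_to_pdf_with_toc https://example.com/ example-toc.pdf
```

The table of contents is built from the page's headings, so a page without heading elements is likely to yield an empty TOC.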

To check the TOC of the created file, open it in Evince again.

The result looks even better than the previous one.

Screenshot: Create Website Page to Table of Contents in PDF

Wkhtmltopdf Options and Usage

For more Wkhtmltopdf usage and options, use the following help command. It displays a list of all available options.
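The basic option list is printed by wkhtmltopdf --help, and the longer one by --extended-help. Because the extended list is long, a small filter can help; find_option below is a hypothetical helper written for this example.

```shell
# Sketch: search the extended help text for options matching a keyword.
# $1 = keyword, matched case-insensitively
find_option() {
    wkhtmltopdf --extended-help | grep -i -- "$1"
}

# Usage:
#   wkhtmltopdf --help        # short option list
#   find_option footer        # only the footer-related lines
```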


Convert html to pdf with Linux

Written by Guillermo Garron
Date: 2010-10-07 10:36:30 00:00

When you need to convert a complete HTML web page to a PDF file, Linux can help.

We will need two tools:

  • wget — To download the complete page, including css, and others
  • wkhtmltopdf — To make the real conversion from html to pdf

You should be able to install both of them using your package manager.

To convert the HTML to PDF, we will follow a two-stage process.

First step, download the web page in html

First, create a folder to store the page.

Then download the web page into it.

That creates a directory structure in which you will find the file new-branch-on-debian-1.html.
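The download step can be sketched with standard wget options. The exact flags used in the original article are not shown, so the set below is an assumption: it fetches the page together with its CSS and images, rewrites links to point at the local copies, and adds an .html extension where one is missing. The helper name fetch_page and the URL are placeholders.

```shell
# Sketch: download a page and everything needed to render it offline.
# $1 = page URL, $2 = folder to store the page in
fetch_page() {
    mkdir -p "$2" &&
    wget --page-requisites --convert-links --adjust-extension \
         --directory-prefix="$2" "$1"
}

# Usage (placeholder URL):
#   fetch_page https://example.com/new-branch-on-debian-1 page
```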

Second step, convert the html file to pdf file

Change into the folder that contains the HTML file, then convert the file using this format:

wkhtmltopdf [html file] [pdf file]

That is it: you have now converted a complete HTML file, including formatting and CSS, to a PDF file that you can send by email, archive, or use however you want.

Note: If the page you downloaded does not have an .html extension, you may get errors. To solve that, rename the file with mv so that it has an .html extension. Nowadays, most pages do not have .html extensions.
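The rename-then-convert workaround from the note can be sketched as a small helper. The name convert_local_html is made up for this example; it writes the PDF next to the HTML file using the same base name.

```shell
# Sketch: make sure a downloaded page has an .html extension, then convert it.
# $1 = path to the downloaded HTML file
convert_local_html() {
    src=$1
    case "$src" in
        *.html|*.htm) ;;                      # extension is already suitable
        *) mv -- "$src" "$src.html" || return 1
           src="$src.html" ;;
    esac
    # Strip the extension and write <name>.pdf beside the source file
    wkhtmltopdf "$src" "${src%.*}.pdf"
}

# Usage:
#   convert_local_html new-branch-on-debian-1
```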


All downloads are currently hosted via GitHub releases, so you can browse for a specific download or use the links below.

Do not use wkhtmltopdf with any untrusted HTML – be sure to sanitize any user-supplied HTML/JS, otherwise it can lead to complete takeover of the server it is running on! Please read the project status for the gory details.

Stable

The current stable series is 0.12.6, which was released on June 11, 2020 – see changes since 0.12.5.

OS/Distribution   Supported on                    Architectures
Windows           Installer (Vista or later)      64-bit, 32-bit
Windows           7z Archive (XP/2003 or later)   64-bit, 32-bit
macOS             Installer (10.7 or later)       64-bit
Debian            10 (buster)                     amd64, i386, arm64, ppc64le, raspberrypi
Debian            9 (stretch)                     amd64, i386, arm64, raspberrypi
Ubuntu            20.04 (focal)                   amd64, arm64, ppc64le
Ubuntu            18.04 (bionic)                  amd64, i386, arm64, ppc64le
Ubuntu            16.04 (xenial)                  amd64, i386, arm64
CentOS            8                               x86_64, aarch64, ppc64le
CentOS            7                               x86_64, i686, aarch64, ppc64le
CentOS            6                               x86_64, i686
Amazon Linux      2                               x86_64, aarch64, lambda zip
openSUSE Leap     15                              x86_64, aarch64, ppc64le
Arch Linux        20200705                        x86_64

All of the above packages were produced automatically via Azure Pipelines and were built on the latest OS/distribution patch release at the time of the release.

Archive

Please note that bug reports will not be accepted against the following, which are considered obsolete. It is recommended to use the latest stable release instead, and report an issue if there is a regression from a previous release.

Date Release
2018-06-11 0.12.5
2019-04-30 0.12.1.4 (linux-only)
2016-11-22 0.12.4
2016-03-02 0.12.3.2 (windows-only)
2016-01-30 0.12.3.1 (windows-only)
2016-01-20 0.12.3
2015-07-12 0.12.2.4 (windows-only)
2015-06-20 0.12.2.3 (windows-only)
2015-04-06 0.12.2.2 (windows-only)
2015-01-19 0.12.2.1
2015-01-09 0.12.2
2014-06-26 0.12.1
2014-02-06 0.12.0

If you need versions older than 0.12.0, you can look at the obsolete downloads.

Why do you have static builds with patched Qt?

Good question. Some features require you to use a patched Qt, because those aren’t yet upstream – please read the project status for a longer explanation.

Most Linux distributions (quite understandably) would prefer that this project upstreamed the patches, and choose to compile without those features. This leads to quite different behavior – you get a later web engine, but behavior can vary from distribution to distribution.

Why are there no “generic” Linux builds (which were provided earlier)?

Although the builds are static, it is very important to understand what it means in the context of Qt – on which wkhtmltopdf is built. A static build means that only Qt is linked in this manner – the remaining system packages still need to be installed. Over a period of time, major areas of divergence between distributions were found by trial and error:

  • different library versions: not every distribution provides the same versions. This was especially the case for libpng and libjpeg, with a lot of distributions choosing between the 1.2, 1.5 and 1.6 series for the former and multiple versions of libjpeg and/or its fork libjpeg-turbo. While this could be addressed easily by linking them statically (and was actually done so for previous releases), it broke down when it came to the next point.
  • different OpenSSL versions: due to OpenSSL having a bad track record then (it’s better now), distributions started aggressively upgrading their OpenSSL version and disabling unused parts of the library. This led to a situation where there was effectively zero backward compatibility and things started breaking randomly – see #3001 for a very long read of the problems faced. This was the direct motivation to create a separate packaging repository.
  • incompatible libc: not every distribution has the same glibc version. If you compile with a later version, it won’t work on a distribution which uses an older version. This was worked around earlier by using CentOS 6 (which had an old enough glibc version). But due to the rise of Docker, the alpine image became very popular. This doesn’t use glibc at all, but the musl libc. So the generic binaries never really worked on Alpine.

While Python has also tried to do this using manylinux, it doesn't always work well (e.g. alpine is not recommended with binary wheels if you google for it), and requires you to statically link everything. This may work for them, but wkhtmltopdf also depends on the runtime configuration of the fonts actually installed (i.e. fontconfig and freetype2). It's not possible to abstract everything out and test/fix everything for every OS/distribution with the limited resources this project has. It makes more sense to make distribution-specific versions which are almost guaranteed to work, as they use the specific versions that the distribution has packaged.

I don’t see an appropriate download for my platform!

If the distribution you are using is listed:

  • but not the specific patch release – try it, as it’s very likely to work regardless.
  • the major release isn’t listed – we only support LTS versions, so try an LTS version older than your release.
  • cannot install package – you can always extract it (google for extract from ), but you’ll need to have the dependencies installed.
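For the last case above, extracting a package without installing it can be sketched with dpkg-deb for .deb files and rpm2cpio plus cpio for .rpm files. The helper name extract_package is hypothetical, and the sketch assumes those tools are installed. The extracted binaries still need their library dependencies present on the system.

```shell
# Sketch: unpack a .deb or .rpm into a directory without installing it.
# $1 = package file, $2 = destination directory
extract_package() {
    # Resolve to an absolute path so it stays valid after cd
    pkg=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    dest=$2
    mkdir -p "$dest" || return 1
    case "$pkg" in
        *.deb) dpkg-deb -x "$pkg" "$dest" ;;
        *.rpm) (cd "$dest" && rpm2cpio "$pkg" | cpio -idm) ;;
        *) echo "unsupported package type: $pkg" >&2; return 1 ;;
    esac
}

# Usage (placeholder file name):
#   extract_package ./wkhtmltox.deb ./wkhtmltox-extracted
```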

Head over to the packaging repository and start a discussion if your platform isn’t listed.

How do I use it with FaaS setups?

You’ll need to extract the distribution-specific package, bundle it with necessary libraries, configuration and/or fonts and then upload it. See this StackOverflow question for Google Cloud Functions. PRs are welcome to expand this section, if you have more information about this – this is not a setup that the maintainer uses 😄

How do I use it in AWS Lambda?

All files required for the lambda layer are packed in one zip archive (Amazon Linux 2 / lambda zip). You may test it locally by unpacking the archive into the layer directory and running the following commands:

After that, you should find a PDF file generated from the Google home page in your layer directory.

To use wkhtmltox in your lambda function, you may put the content of the archive together with your lambda function or create a layer. Don’t forget to provide an environment variable for fontconfig (FONTCONFIG_PATH=/opt/fonts).
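The fontconfig setting can be illustrated with a rough sketch that runs the binary out of an unpacked layer directory. It assumes the archive contains bin/wkhtmltopdf and fonts/ at its top level, which matches the /opt/fonts path above (layers are mounted under /opt) but should be checked against the actual archive. run_from_layer is a hypothetical helper, and the binary is built for Amazon Linux 2, so it may not run on other systems.

```shell
# Sketch: run wkhtmltopdf from an unpacked layer with fontconfig pointed
# at the bundled fonts directory.
# ASSUMPTION: the layer contains bin/wkhtmltopdf and fonts/.
# $1 = layer directory, remaining arguments are passed to wkhtmltopdf
run_from_layer() {
    layer_dir=$1
    shift
    FONTCONFIG_PATH="$layer_dir/fonts" "$layer_dir/bin/wkhtmltopdf" "$@"
}

# Usage (inside a Lambda-like environment where the layer is at /opt):
#   run_from_layer /opt https://example.com/ /tmp/example.pdf
```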

If you use the Serverless framework, you may add the following lines to your serverless.yml file:

Symantec reports a virus WS.Reputation.1 for the Windows builds

This is a false positive reported because Symantec has not seen this file before – see this clarification for details.
