Building Linux with Clang/LLVM
This document covers how to build the Linux kernel with Clang and LLVM utilities.
About
The Linux kernel has traditionally been compiled with GNU toolchains such as GCC and binutils. Ongoing work has allowed Clang and LLVM utilities to be used as viable substitutes. Distributions such as Android, ChromeOS, and OpenMandriva use Clang-built kernels. LLVM is a collection of toolchain components implemented in terms of C++ objects. Clang is a front end to LLVM that supports C and the GNU C extensions required by the kernel, and is pronounced “klang,” not “see-lang.”
Clang
The compiler can be swapped out via the CC= command-line argument to make. CC= should be set both when selecting a config and during the build.
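As a minimal sketch of the paragraph above: the wrapper below is hypothetical (its name is not part of Kbuild) and assumes it is called from the top of a kernel source tree. It uses `defconfig` only as an example configuration target.

```shell
# Hypothetical wrapper, not part of Kbuild.
# Run from the top of a kernel source tree.
build_with_clang() {
    # CC= is set while selecting a config ...
    make CC=clang defconfig || return
    # ... and again for the build itself; extra arguments (e.g. -j8) pass through.
    make CC=clang "$@"
}
```

Calling `build_with_clang -j8` would run the two `make` invocations in order and stop if the configuration step fails.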
Cross Compiling
A single Clang compiler binary will typically contain all supported backends, which can help simplify cross compiling.
CROSS_COMPILE is not used to prefix the Clang compiler binary; instead, CROSS_COMPILE is used to set a command-line flag: --target=<triple>.
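The sketch below illustrates this with an arm64 cross build. Both function names are hypothetical, and `clang_target_flag` only approximates how the flag is derived from CROSS_COMPILE (directory part and trailing dash dropped); the authoritative logic lives in the kernel's own makefiles.

```shell
# Hypothetical helper: show the flag that corresponds to a CROSS_COMPILE value.
# Assumption: the triple is CROSS_COMPILE without its directory part and trailing dash.
clang_target_flag() {
    triple="${1%-}"
    printf '%s\n' "--target=${triple##*/}"
}

# Hypothetical wrapper: cross-compile an arm64 kernel with Clang.
# The compiler binary invoked is still plain "clang", not a prefixed one.
cross_build_arm64_clang() {
    make ARCH=arm64 CC=clang CROSS_COMPILE=aarch64-linux-gnu- "$@"
}
```

For example, `clang_target_flag aarch64-linux-gnu-` prints `--target=aarch64-linux-gnu`.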
LLVM Utilities
LLVM has substitutes for GNU binutils utilities. Kbuild supports LLVM=1 to enable them.
They can be enabled individually. The full list of the parameters:
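A sketch of both forms follows. The wrapper names are hypothetical, and the per-tool list reflects the variables documented for kernels of this period; newer kernel versions may differ.

```shell
# Hypothetical wrappers, not part of Kbuild.

# Enable all LLVM utilities at once.
make_llvm() {
    make LLVM=1 "$@"
}

# The same selection spelled out tool by tool.
make_llvm_explicit() {
    make CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm STRIP=llvm-strip \
        OBJCOPY=llvm-objcopy OBJDUMP=llvm-objdump READELF=llvm-readelf \
        HOSTCC=clang HOSTCXX=clang++ HOSTAR=llvm-ar HOSTLD=ld.lld "$@"
}
```

Spelling the tools out individually is mainly useful when only some of them should come from LLVM, for instance keeping the GNU linker while using Clang as the compiler.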
The integrated assembler is enabled by default. You can pass LLVM_IAS=0 to disable it.
Omitting CROSS_COMPILE
As explained above, CROSS_COMPILE is used to set --target=<triple>.
If CROSS_COMPILE is not specified, the --target=<triple> is inferred from ARCH.
That means if you use only LLVM tools, CROSS_COMPILE becomes unnecessary.
For example, to cross-compile the arm64 kernel:
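A minimal sketch of that invocation, wrapped in a hypothetical function so that extra arguments can be passed through:

```shell
# Hypothetical wrapper, not part of Kbuild.
cross_build_arm64_llvm() {
    # No CROSS_COMPILE here: only LLVM tools are used, so ARCH alone selects the target.
    make ARCH=arm64 LLVM=1 "$@"
}
```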
If LLVM_IAS=0 is specified, CROSS_COMPILE is also used to derive --prefix=<path> to search for the GNU assembler and linker.
Supported Architectures
LLVM does not target all of the architectures that Linux supports, and just because a target is supported in LLVM does not mean that the kernel will build or work without issues. Below is a general summary of architectures that currently work with CC=clang or LLVM=1. The level of support corresponds to the “S” values in the MAINTAINERS files. If an architecture is not present, either LLVM does not target it or there are known issues. Using the latest stable version of LLVM, or even the development tree, will generally yield the best results. An architecture’s defconfig is generally expected to work well; certain configurations may have problems that have not been uncovered yet. Bug reports are always welcome at the issue tracker below!
Clang. Part 1: Introduction
What is Clang?
I have spent the last few months working with Clang, the LLVM front end. Clang can parse and analyze any source code in the C family of languages (C, C++, Objective-C, etc.) and has a wonderfully modular structure that makes it easy to use.
If you are looking for a static code analyzer, I strongly recommend Clang; it substantially outperforms other static analyzers (such as CIL) and is well documented. The Clang mailing list is also very active and helpful if you get stuck on something.
Personally, I use Clang for static analysis of Linux kernel I/O drivers, including camera drivers and DRM graphics card drivers. Kernel code, especially driver code, can be very complex and difficult to analyze, but Clang lets us support it easily. Let's see what can be done with it.
How does Clang work?
In most cases, Clang runs the preprocessor (which expands all macros) and parses the source, turning it into an abstract syntax tree (AST). The AST is much easier to work with than the source code, but you can always get references back to the source. In fact, every structure Clang uses to represent code (AST, CFG, etc.) always carries a reference to the original source, which is useful for analysis, refactoring, and so on.
If you need to analyze and modify code at the source level, Clang is a better choice than LLVM. Analysis with LLVM means working with LLVM's internal representation language, which resembles assembly.
Clang AST
Nearly every compiler and static analyzer uses an AST to represent source code. The AST used in Clang is very detailed and complex, but you will enjoy learning the various classes of Clang AST elements. A brief introduction to the Clang AST follows, but the easiest way to learn it is simply to dump the AST for simple sources and see which AST corresponds to them.
In general, the Clang AST is built from two very flexible classes: Decl and Stmt. Both have many subclasses; here are a few examples:
FunctionDecl: a function prototype or declaration
BinaryOperator: a binary operator, for example (a + b)
CallExpr: a function call, for example foo(x);
Most classes have self-explanatory names, for example ForStmt, IfStmt, and ReturnStmt. You will get the gist of the AST after playing with it for a few minutes. You can find documentation for the AST classes by searching for something like “Clang FunctionDecl.”
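One way to produce such an AST dump is sketched below. The helper name is hypothetical; the flags are standard Clang options, and `clang` is assumed to be on PATH.

```shell
# Hypothetical helper: print the AST that Clang builds for one source file.
dump_ast() {
    # -fsyntax-only stops after parsing and semantic analysis (no object file);
    # -Xclang -ast-dump asks the front end to print the resulting AST.
    clang -Xclang -ast-dump -fsyntax-only "$1"
}
```

Running `dump_ast test.c` on a small file containing a function should list nodes such as FunctionDecl together with the source locations they came from.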
How to use Clang?
Clang can be used as a drop-in replacement for gcc and offers several cool static analysis tools. As a programmer (rather than an ordinary user!), you can access the full power of Clang by using it as a library in one of three ways, depending on your needs.
To begin, read the description of Clang's interfaces. In addition to what that description says, I will highlight other significant differences between the various Clang interfaces.
Clang Plugin
Your code is a plugin and is run afresh for every source file, which means you cannot keep global state or other context between two different source files (although you can run the plugin over many files in sequence). A plugin is run by passing the appropriate options to the build system (Clang, Make, etc.) as command-line arguments. This is similar to how you enable optimization in GCC (e.g. “-O1”). You cannot run any task of your own before or after the source file is analyzed.
LibTooling (Clang Tool)
Your code is an ordinary C++ program with a normal main() function. LibTooling is used to run some analysis over source code (across many files, if desired) without running the normal compilation process. A new instance of the analysis code (and a new AST) is created for each new source file (as with a Clang Plugin), but you can keep context between source files in your own global variables. Because you have a main() function, you can run tasks before or after Clang finishes analyzing your source files.
LibClang
LibClang is good because it is a stable API. Clang changes periodically, and if you use Plugin or LibTooling you will need to update your code to keep up with those changes (but it is not that hard!). If you need access to the Clang API from languages other than C++ (for example, Python), you must use LibClang.
Note: LibClang does not give full access to the AST (only high-level access), whereas the other two options do. As a rule, we need full access to the AST.
If you cannot decide which to use, I would recommend starting with the LibTooling interface. It is simpler and works the way you expect. It offers the flexibility and full AST access of a Plugin without losing global context between source files. LibTooling is no harder to use than a Plugin.
Getting started with Clang
Now that you know the basics, let's get started! These instructions should work on any version of Linux (and possibly OS X), but were tested on Ubuntu. You can obtain LLVM and Clang with the following steps (taken from the official Clang instructions):
Download and install (for example, with apt-get) all the required packages.
(A typical Linux distribution comes with everything needed except subversion.)
Change to the directory where you want to install LLVM (for example, /static_analysis/). We will call this the top-level directory. Run the following commands in a terminal:
Compiling LLVM and Clang will take some time.
To verify, run:
You can test Clang by running the classic Hello World example:
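A minimal sketch of such a check follows. The helper name is hypothetical, the program is written in C rather than C++ for brevity, and `clang` and `mktemp` are assumed to be available on PATH.

```shell
# Hypothetical helper: compile and run a Hello World program with Clang.
clang_hello() {
    dir="$(mktemp -d)" || return
    # Write the test program into a scratch directory.
    cat > "$dir/hello.c" <<'EOF'
#include <stdio.h>

int main(void) {
    printf("Hello, World!\n");
    return 0;
}
EOF
    # Compile, then run the result only if compilation succeeded.
    clang "$dir/hello.c" -o "$dir/hello" && "$dir/hello"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

If the toolchain is installed correctly, `clang_hello` prints `Hello, World!` and returns 0.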
In this tutorial I use Clang 3.4 on Ubuntu 13.04, but you can use other versions of either.
Now let's move on to programming with Clang.
Using Clang on Linux
This script installs self-contained standalone 9.0 versions of clang, LLVM, libc++, compiler-rt, libc++abi, lldb, and lld, on macOS and Linux, including linking clang and LLVM against libc++ themselves as well. The script keeps all of the installation within a given target prefix (e.g., /opt/clang), and hence separate from any already installed compilers, libraries, and include files. In particular, you can later uninstall everything easily by just deleting, e.g., /opt/clang. Furthermore, as long as the prefix path is writable, the installation doesn’t need root privileges.
If you have used an older version of this script before, see News below for changes.
To see the available options, use -h:
For example, to build Clang on a machine with multiple cores and install it in /opt/clang, you can use:
Once finished, just prefix your PATH with
By default, install-clang currently installs the 9.0 release branch of https://github.com/llvm (the "mono repository"). Adding -m on the command line instructs the script to use the current git master version instead. The script downloads the source code from GitHub and compiles the pieces as needed. Operating systems other than macOS and Linux are not currently supported.
The script also has an update option -u that allows for catching up with upstream repository changes without doing the complete compile/install-from-scratch cycle again. Note, however, that unless coupled with -m, this flag has no immediate effect since the git versions to use are hardcoded to the Clang/LLVM release version.
Doing a self-contained Clang installation is a bit messier than one would hope because the projects make assumptions about specific system-wide installation paths to use. The install-clang script captures some trial-and-error I (and others) went through to get an independent setup working. It compiles Clang up to three times, bootstrapping things with the system compiler as it goes. It also patches some of the Clang/LLVM projects to incorporate the installation prefix into configuration and search paths, and fixes or tweaks a few other things.
install-clang comes with a Dockerfile to build a Docker image, based on Ubuntu, with Clang then in /opt/clang:
Clang
Clang is an "LLVM native" C/C++/Objective-C compiler using LLVM as a backend and optimizer. It aims to be GCC compatible yet stricter, offers fast compile times with low memory usage, and has useful error and warning messages for easier compile troubleshooting.
Installation
Prerequisites
One of the goals of the Clang project is to be compatible with GCC, both in dialect and on the command line (see Differences from GCC). Occasionally some packages will fail to build correctly with it, and some may build successfully but segfault when executed. Some packages also contain GCC-specific code and will fail to compile. In these cases, GCC will need to be used as a fallback.
USE flags
Some packages are aware of the clang USE flag.
Emerge
Configuration
GCC fallback environments
Create a configuration file with a set of environment variables using Portage’s built-in /etc/portage/env directory. This will override the defaults for any packages that fail to compile with Clang. The name used below is just an example, so feel free to choose whatever name you like for the fallback environment. Be sure to substitute the chosen name for the one used in the examples in this article.
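A sketch of such a fallback file is shown here as a small helper that writes it. The function name is hypothetical and `compiler-gcc` is only an example file name; the first argument is the Portage configuration directory, normally /etc/portage.

```shell
# Hypothetical helper: write a minimal GCC fallback environment file.
# $1 = Portage configuration directory (normally /etc/portage).
write_gcc_fallback_env() {
    mkdir -p "$1/env" || return
    # The file only needs to point CC and CXX back at GCC.
    cat > "$1/env/compiler-gcc" <<'EOF'
CC="gcc"
CXX="g++"
EOF
}
```

Taking the directory as an argument lets the file be generated in a scratch location and inspected before it is copied into /etc/portage.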
The above is the most basic set of environment variables needed. You can change it to suit your needs, for example by enabling or disabling link-time optimization or choosing an alternative AR, NM, RANLIB, and so on. Here are two examples:
Basically, copy your current working GCC configuration from make.conf so that it is available as a fallback.
If you choose to use LLVM’s implementation of AR, NM, and RANLIB as detailed later in the article, be sure to set them back to the GNU versions in your GCC fallback environments, as shown in the example above.
If you choose not to, you can ignore the AR, NM, and RANLIB variables. If you want to use link-time optimization, it’s a good idea to have two separate environments like the examples above.
If you have to use the GCC fallback environment(s), assign them to the affected packages in the /etc/portage/package.env file.
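Each line of package.env pairs a package with the name of an environment file. The helper below is a hypothetical sketch that appends such a line; the package atoms in the usage note are placeholders, and package.env is assumed to be a plain file rather than a directory.

```shell
# Hypothetical helper: assign an environment file to one package.
# $1 = Portage configuration directory, $2 = package atom, $3 = environment file name.
use_env_for_package() {
    printf '%s %s\n' "$2" "$3" >> "$1/package.env"
}
```

For example, `use_env_for_package /etc/portage app-foo/bar compiler-gcc` would append the line `app-foo/bar compiler-gcc`, so that the placeholder package app-foo/bar is built with the GCC fallback.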
Clang environments
Now that we’ve set up a safe fallback, we can proceed to enable Clang in Gentoo. There are two ways to do this: system-wide using /etc/portage/make.conf, or via environment files like the one(s) we created for the GCC fallback.
We’ll use the same process as we did earlier in the article for setting up GCC fallbacks.
You can now use Clang on a per-package basis by assigning the compiler-clang environment you created.
The setup of a clang + LTO environment is described later in the article.
Global configuration via make.conf
When attempting to use Clang system-wide, the system absolutely must have a GCC fallback! This cannot be stressed enough, as the system cannot currently compile everything with Clang, GCC itself being one example. Gentoo maintains a bug tracker for packages that fail to build with Clang. Configuring Gentoo to use Clang system-wide is simple: change the CC and CXX variables in /etc/portage/make.conf to reference the Clang equivalents. No further configuration is necessary.
Packages that must use GCC for compiling can be handled with one of the fallback environments created earlier.
Usage
Bootstrapping the Clang toolchain
Mixing clang and its toolchain / libraries with the gcc toolchain / libraries (especially the linker) will often lead to issues like linker errors during emerge. To prevent this, the clang toolchain is built first with gcc and then with itself to get a self-providing compiler.
Prepare the environment for the Clang toolchain (see above), e.g.
This example replaces not only the compiler but also the GNU linker ld.bfd with the llvm linker lld. It is a drop-in replacement, but significantly faster than the bfd linker.
Set USE flags default-compiler-rt default-lld llvm-libunwind for clang. Then emerge clang llvm compiler-rt llvm-libunwind lld with the default gcc environment:
You can also add the default-libcxx USE flag to use LLVM’s C++ STL with clang; however, this is heavily discouraged because libstdc++ and libc++ are not ABI compatible. That is, a program built against libstdc++ will likely break when using a library built against libc++, and vice versa.
Note that sys-libs/llvm-libunwind deals with linking issues that sys-libs/libunwind has, so it is preferred to use and replace the non-llvm libunwind package if installed (it builds with -lgcc_s to resolve issues with __register_frame / __deregister_frame undefined symbols).
Enable the Clang environment for these packages now:
Repeat the emerge step with the new environment. The toolchain will now be rebuilt with itself instead of gcc.
You are now free to use clang with other packages.
Link-time optimizations with Clang
The link-time optimization feature defers optimizing the resulting executables to the linking phase. This can result in better optimization of packages but isn’t standard behavior in Gentoo yet. Clang uses lld for LTO.
Note: Clang can also do LTO via the gold linker; however, this is discouraged by LLVM since gold is effectively dead upstream. To use gold with clang + LTO, you must first emerge llvm with the gold USE flag, and then set -fuse-ld=gold in the following examples.
Environment
Clang supports two types of link time optimization:
- Full LTO, which is the traditional approach also used by gcc where the whole link unit is analyzed at once. Using it is no longer recommended.
- ThinLTO, where the link unit is scanned and split up into multiple parts. [1] With ThinLTO, the final compilation units only contain the code that is relevant to the current scope, thus speeding up compilation, lowering footprint and allowing for more parallelism at (mostly) no cost. ThinLTO is the recommended LTO mode when using Clang.
If you need to use full LTO for some reason, replace -flto=thin with -flto in the following examples. There should be no compatibility differences between Full LTO and ThinLTO. Additionally, if you did not build Clang with the default-lld USE flag, you will have to add -fuse-ld=lld to the following LDFLAGS.
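A sketch of a ThinLTO environment file follows, again as a helper that writes it. The function name and the file name `compiler-clang-lto` are hypothetical, and the exact flag set is an assumption rather than a recommended configuration: it simply appends -flto=thin to the compile and link flags.

```shell
# Hypothetical helper: write a Clang + ThinLTO environment file.
# $1 = Portage configuration directory (normally /etc/portage).
write_clang_thinlto_env() {
    mkdir -p "$1/env" || return
    cat > "$1/env/compiler-clang-lto" <<'EOF'
CC="clang"
CXX="clang++"
# Use -flto instead of -flto=thin for full LTO.
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
# Append -fuse-ld=lld here if Clang was built without the default-lld USE flag.
LDFLAGS="${LDFLAGS} -flto=thin"
EOF
}
```

Appending to the existing CFLAGS, CXXFLAGS, and LDFLAGS keeps whatever optimization flags are already set in make.conf.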
As an alternative, LLVM provides its own ar, nm, and ranlib. You’re free to use them and may or may not get more mileage over using the standard ar, nm, and ranlib since they’re intended to handle LLVM bitcode which Clang produces when using the -flto flag.
Now you can set /etc/portage/package.env overrides using Clang with LTO enabled.
Global configuration
Similar to what we covered earlier in the article, we can do a system wide Clang with LTO enabled setup by changing our /etc/portage/make.conf file.
Again, it’s up to you if you want to set the AR, NM, and RANLIB to the LLVM implementations. Since earlier in the article we set up compiler environments using Clang without LTO, GCC without LTO, and GCC with LTO, we can pick and choose which is best on a per package basis. Since the goal is to compile packages system wide with Clang using LTO and not every package will successfully compile using it, we’ll have to fall back to Clang with LTO disabled or GCC. Your /etc/portage/package.env may look like this:
distcc
In order to use Clang on a distcc client, additional symlinks have to be created in /usr/lib*/distcc/bin:
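A sketch of creating those symlinks is shown below. The helper name is hypothetical, and the default location of the distcc binary (/usr/bin/distcc) is an assumption that may need adjusting for your system.

```shell
# Hypothetical helper: add Clang entries to a distcc masquerade directory.
# $1 = masquerade directory (e.g. /usr/lib/distcc/bin)
# $2 = path to the distcc binary (assumed to be /usr/bin/distcc if omitted)
link_clang_for_distcc() {
    for name in clang clang++; do
        ln -s "${2:-/usr/bin/distcc}" "$1/$name" || return
    done
}
```

For example, `link_clang_for_distcc /usr/lib/distcc/bin` run as root would create the two links; the call fails if a link already exists.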
ccache
Automatic with `>=ccache-3.9-r3` when Clang is emerged.
Troubleshooting
The main place for looking up known failures with Clang is the tracker bug #408963. If you hit an issue not reported on Gentoo’s Bugzilla already, please open a new bug report and make it block the linked tracker.
Compile errors when using Clang with -flto
If the packages being installed are failing, check the logs. Often, packages with errors like the following will need to disable LTO by invoking the compiler-clang environment.
You will also most likely see this error in every LTO failure case:
Simply add the failing package to /etc/portage/package.env. In this case, it’s sys-apps/less, so we’ll apply the proper override.
Sometimes a package will fail to compile even when disabling LTO because it requires another package which was compiled using -flto and works incorrectly. You may see an error like this:
In this case libatomic_ops is causing boehm-gc to fail compiling. Recompile the program causing the failure using your non-LTO environment and then recompile the new program. In this case, boehm-gc fails when using LTO, so we’ll add both of them to our /etc/portage/package.env file to build them without LTO.
Use of GNU extensions without proper -std=
Some packages tend to use GNU extensions in their code without specifying -std= appropriately. GCC allows that usage, yet Clang disables some of the more specific GNU extensions by default.
If a particular package relies on such extensions being available, you will need to append the correct -std= flag to it:
- -std=gnu89 for C89/C90 with GNU extensions,
- -std=gnu99 for C99 with GNU extensions,
- -std=gnu++98 for C++98 with GNU extensions.
A common symptom of this problem is multiple definitions of inline functions like this:
This is because Clang uses C99 inline rules by default which do not work with gnu89 code. To work around it, you most likely have to pass -std=gnu89 or set one of your environmental overrides to use GCC to compile the failing package if passing the right -std= flag doesn’t work.
Since both current (2020) GCC and Clang default to -std=gnu17 with C99 inline rules, chances are the problems have already been spotted by a GCC user.
Differences from GCC
Clang’s optimizer is different from GCC’s. As a result, the command-line semantics are different.
- The -O flags will work, but mean slightly different things.
- Clang also vectorizes on -O2 and -Os, albeit more conservatively in terms of code size than -O3.
- Instead of being the same as -O3, Clang’s -O4 is an alias of -O3 -flto.
- The compatibility of -f flags is limited, as some of them are simply meaningless to Clang.
- The -m and related flags are supposed to work identically, but Clang may not know about certain options. There are also Clang-only options not known by GCC.
- PGO in Clang is a bit different, as it requires post-processing the sample with llvm-profdata.
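The last point can be sketched as a sequence of steps. The helper names and file names are hypothetical; the flags are Clang's instrumentation-based PGO options, and -O2 is only an example optimization level.

```shell
# Hypothetical helpers illustrating instrumentation-based PGO with Clang.

# 1. Build an instrumented binary ($1 = source file, $2 = output binary).
pgo_instrument() {
    clang -O2 -fprofile-instr-generate "$1" -o "$2"
}

# 2. Run the instrumented binary on a representative workload, e.g.
#    LLVM_PROFILE_FILE=app.profraw ./app.instr

# 3. Post-process the raw profile ($1 = raw profile, $2 = merged profile).
pgo_merge() {
    llvm-profdata merge -output="$2" "$1"
}

# 4. Rebuild using the merged profile ($1 = source, $2 = merged profile, $3 = output).
pgo_optimize() {
    clang -O2 -fprofile-instr-use="$2" "$1" -o "$3"
}
```

Step 3 is the part GCC users will not expect: the raw profile written by the instrumented binary has to be merged with llvm-profdata before Clang can consume it.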
The differences in language are documented by the project itself. [2]