Boot the Linux kernel

Contents

Linux Kernel EFI Boot Stub, or "Be Your Own Boot Loader"
Introduction
So, what is the Linux Kernel EFI Boot Stub?
Basic requirements
Getting started
Dual boot without a boot loader
Summary
Conclusion
1. The Linux/x86 Boot Protocol
1.1. Memory Layout
1.2. The Real-Mode Kernel Header
1.3. Details of Header Fields
1.4. The Image Checksum
1.5. The Kernel Command Line
1.6. Memory Layout of The Real-Mode Code
1.7. Sample Boot Configuration
1.8. Loading The Rest of The Kernel
1.9. Special Command Line Options
1.10. Running the Kernel
1.11. Advanced Boot Loader Hooks
1.12. 32-bit Boot Protocol
1.13. 64-bit Boot Protocol
1.14. EFI Handover Protocol

Linux Kernel EFI Boot Stub, or "Be Your Own Boot Loader"

Introduction

After reading the recent article "Booting Linux without a boot loader", I realized two things: many people are interested in a "novelty" that dates all the way back to 2011, and the author left out the most essential part, without which nothing will work at all in some cases. There was also another article, but either it is already outdated or it, too, manages to be both padded and incomplete.

So, what is the Linux Kernel EFI Boot Stub?

General information

It is nothing other than... an "exe file"! Yes, a Windows-style PE/COFF. More precisely, it only imitates one, with small modifications to satisfy the UEFI loader. You can verify this by reading the first 2 bytes of the kernel:
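A minimal sketch of that check, assuming the kernel image has already been read into a bytes object. The function name and the sample data are hypothetical and not from the original article.

```python
def has_mz_signature(image: bytes) -> bool:
    """Return True if the image starts with the two-byte DOS signature 'MZ'."""
    return image[:2] == b"MZ"


# Synthetic stand-in for the first 64 bytes of a kernel image (hypothetical data).
sample = b"MZ" + bytes(62)
print(has_mz_signature(sample))
```

On a real system the same function could be fed the first bytes of the kernel file; its path depends on the distribution.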

The bytes are 'MZ'. Familiar, isn't it? At least to anyone who has ever opened an MS-DOS or Windows executable "out of curiosity" in Notepad, a hex editor, or something more serious. These are the initials of Mark Zbikowski, who designed this file format for MS-DOS. The signature of this stub still lingers as a vestige in modern Windows executables, eating up, together with its header, a whole 64 bytes in every file!

The DOS header overlaps legacy code that runs when the kernel is booted as a boot sector and complains, much as MS-DOS does when asked to run a PE file: "Direct floppy boot is not supported. Use a boot loader program instead. Remove disk and press any key to reboot...". So the information in this header is garbage here, apart from the 'MZ' signature itself and the offset of the next header.

Moving on.
The PE/COFF specification tells us that offset 0x3c holds the 32-bit offset of the second header, which carries the signature "PE\0\0":

So the offset is 0xb8, which holds for the current stable kernel on the x86_64 architecture; on x86 it is 0xa8. Reading at that offset:

And there is the signature of the second header! As you may have guessed, it abbreviates Portable Executable, and it marks the start of the useful payload in executable files.
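The two lookups can be sketched as follows, assuming the image is available as bytes. The helper names and the synthetic image are hypothetical; only the offsets 0x3c, 0xb8 and 0xa8 and the two signatures come from the text.

```python
import struct


def pe_header_offset(image: bytes) -> int:
    """Follow the 32-bit little-endian pointer at 0x3c and verify the PE signature."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[offset:offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return offset


def make_stub(offset: int = 0xB8) -> bytes:
    """Build a synthetic image with just the two signatures (test data only)."""
    buf = bytearray(offset + 4)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, offset)
    buf[offset:offset + 4] = b"PE\0\0"
    return bytes(buf)


print(hex(pe_header_offset(make_stub())))
```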

Even the Windows loader ignores half of the fields in this header, and UEFI needs them even less, so some are hard-coded while the important ones are filled in when the kernel is built. Many "unneeded" fields, such as timestamps and checksums, simply stay zero. What gets filled in is mostly sizes, offsets, the entry point and so on. It is therefore a stretch to call this PE file fully valid. Still, classic tools such as LordPE or PETools are content with the signatures and report everything they know about the file.

The main difference from "real" Windows executables is the Subsystem field of the optional header, which is set to IMAGE_SUBSYSTEM_EFI_APPLICATION rather than to IMAGE_SUBSYSTEM_WINDOWS_GUI (graphical applications) or IMAGE_SUBSYSTEM_WINDOWS_CUI (character-mode, i.e. console, applications).
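A rough sketch of reading that field. The numeric values (2, 3, 10) and the position of Subsystem at byte 68 of the optional header are taken from the PE/COFF specification rather than from this article, and the function name is hypothetical.

```python
import struct

# Subsystem values as defined by the PE/COFF specification.
SUBSYSTEMS = {
    2: "IMAGE_SUBSYSTEM_WINDOWS_GUI",
    3: "IMAGE_SUBSYSTEM_WINDOWS_CUI",
    10: "IMAGE_SUBSYSTEM_EFI_APPLICATION",
}


def pe_subsystem(image: bytes) -> str:
    """Return the symbolic Subsystem of a PE image held in memory."""
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)
    optional_header = pe_off + 4 + 20  # skip "PE\0\0" and the 20-byte COFF header
    (value,) = struct.unpack_from("<H", image, optional_header + 68)
    return SUBSYSTEMS.get(value, f"unknown ({value})")
```

For a kernel built with the EFI stub, the expected answer is IMAGE_SUBSYSTEM_EFI_APPLICATION.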

Structure

Overall, everything is as in an ordinary PE file. As of kernel 3.11.4, the current stable version in the Arch Linux repositories, it contains 3 sections: '.setup', '.reloc' and '.text'.

The .setup section contains mostly legacy code for initialization when booting in compatibility mode. When booting in UEFI mode, all processor mode switches and initial setup are performed by the firmware.
The .reloc section is strictly required by the loader, so an empty placeholder is created at build time "just so it exists".
The most interesting section, .text, holds the EntryPoint and the main code of the rest of the kernel. Once the EFI application has been found, the loader calls the LoadImage boot service, which loads the whole image into memory. How long the image stays resident depends on the Subsystem field: an EFI_APPLICATION is unloaded once it finishes, whereas an EFI_DRIVER may be Unloadable and is unloaded only on a critical error. Control is then passed to the entry point, usually the function efi_main(), the analogue of main() in C.

To be honest, I was bending the truth a little when I called the kernel an exe file at the beginning. In essence it is a plain EFI application that uses the PE32+ format.

Basic requirements

First of all, the EFI boot mode must be enabled. The setting can be called whatever the vendor fancies and usually lives on the Boot Options tab. If you see something like Legacy Mode or CSM (Compatibility Support Module), or simply BIOS Mode, change it to something like (U)EFI Mode, Enhanced Mode or Advanced Mode.

If the motherboard carries a "Windows 8 Ready!" logo, EFI boot mode is most likely already enabled by default.

In most cases, the Secure Boot option has to be turned off to boot the Linux kernel in EFI mode.

Disk partitioning

Many sources claim that a GPT partition table is mandatory and MBR will not do, but that is not so. UEFI handles MBR perfectly well. It is another matter that Windows, for example, forces you to partition the disk the new way in order to boot in EFI mode and complains that the Master Boot Record is ancient. And rightly so! By partitioning the disk the "modern" way we lose nothing and only gain.
First, there is no more trouble with Primary/Logical partitions, "you can go here but not there", and other vestiges.
Second, although solid-state drives are being pushed to the mass market and their capacities are not impressive yet, an ordinary spinning disk of several terabytes no longer surprises anyone. Under MBR, however, a partition can be at most about 2 TB (32-bit sector numbers with 512-byte sectors). GPT can address so much that the number is hardly worth quoting: disks of that size will not appear any time soon.
On top of that, bonuses such as the duplicate copies of the GPT at the start and end of the disk and integrity checksums make GPT an easy choice.

There is a huge number of articles on how to partition a disk with the various GNU/Linux utilities.

A separate partition

Partition type

After umpteen years of developing standards, engineers finally decided that hard-coding is a bad idea. It no longer matters where our boot partition is; the UEFI loader keeps it very simple: it walks through all disks and partitions looking for one special partition. What makes it special is that, with an MBR layout, it has type code 0xEF (from EFI, as you might guess). With a GPT layout, it is the partition whose type GUID equals C12A7328-F81F-11D2-BA4B-00A0C93EC93B.
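A small sketch of how that GUID can be recognized in a raw GPT partition entry. It assumes the entry's first 16 bytes hold the type GUID in the mixed-endian layout that Python exposes as bytes_le; the function and constant names are hypothetical.

```python
import uuid

ESP_TYPE_GUID = uuid.UUID("C12A7328-F81F-11D2-BA4B-00A0C93EC93B")
MBR_EFI_TYPE = 0xEF  # partition type code used with an MBR layout


def is_esp_gpt_entry(entry: bytes) -> bool:
    """entry: a raw GPT partition entry whose first 16 bytes are the type GUID."""
    return uuid.UUID(bytes_le=bytes(entry[:16])) == ESP_TYPE_GUID


# The first three GUID fields are byte-swapped in this layout, the rest is stored as written.
print(ESP_TYPE_GUID.bytes_le.hex())
```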

There is a subtlety here. All partitioning utilities, parted for example, can set and display a "boot" flag that applies to a partition. With MBR such a thing really exists: there is an actual byte that tells the BIOS the partition is "bootable". The flag can be set on any partition whose boot record we want to feed to the BIOS for booting. But with GPT there is in fact no such flag! By this flag parted means exactly the GUID given above. In other words, GPT boot flag = GPT EFI Partition!

Conclusion: if our EFI partition is on MBR, set the partition type to EFI Partition and set the boot flag. If it is on GPT, set either the EFI Partition type or the boot flag, since they are one and the same.

There are other things as well, such as the GPT legacy boot flag, which is set in the Protective MBR, but they are all crutches used only in compatibility mode. In GPT UEFI boot mode they should be ignored.

File system

Sources differ. Some say FAT16 can be used, some even recommend FAT12. But isn't it better to follow the official specification? It says the system partition must be FAT32. For removable media (USB HDD, USB flash), FAT12 and FAT16 are allowed in addition to FAT32.
Nothing is said about the partition size. However, because early boot loader and firmware implementations were hacky and buggy, people have found by experience that a size of at least 520 MiB (about 545 MB) is recommended to avoid "surprises". Your mileage may vary; there may be no problems even with a 32-megabyte partition.

Directory structure

Once the loader has found its "marked" partition and made sure it supports the file system, it resolves all paths relative to the root of that partition. All files on this partition must live in the \EFI\ directory, which in turn is the only directory in the partition root. By convention each vendor is advised to reserve a uniquely named folder inside \EFI\, for example \EFI\redhat\, \EFI\microsoft\, \EFI\archlinux\. The vendor directory holds the executable EFI applications themselves. One file per architecture is recommended. Files must have the .efi extension.

The \EFI\BOOT\ directory is meant for removable devices. Here, too, no more than one file per architecture is recommended. In addition, the file must be named boot<arch>.efi, for example \EFI\BOOT\bootx64.efi. Available architectures: ia32, x64, ia64, arm, aa64.

NVRAM access

By default, if nothing is stored in the UEFI non-volatile memory, \EFI\BOOT\bootx64.efi is booted. To write the path of the desired application into NVRAM, you can use the efibootmgr utility. Its current entries can be listed with efibootmgr -v.

On some distributions this utility needs the kernel option CONFIG_EFI_VARS to be enabled.

Getting started

To check whether EFI stub support was enabled when the kernel was built, search the kernel configuration for CONFIG_EFI_STUB.

CONFIG_EFI_STUB=y means the option is active.
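A minimal sketch of such a check, assuming the kernel configuration is already available as text. Where that text comes from (for example /proc/config.gz, which exists only on kernels built with CONFIG_IKCONFIG_PROC, or a config file under /boot) depends on the distribution, and the function name is hypothetical.

```python
def config_option(config_text: str, name: str):
    """Return the value of a kernel config option, or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
    return None


# Hypothetical excerpt of a kernel configuration.
sample_config = "CONFIG_EFI=y\nCONFIG_EFI_STUB=y\n# CONFIG_EFI_VARS is not set\n"
print(config_option(sample_config, "CONFIG_EFI_STUB"))
```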

One option is to copy the kernel and the RAM disk to ESP\\ every time the kernel is updated.

Alternatively, install an EFI driver that can read the /boot file system and boot directly, merely adding the '.efi' extension to the kernel (or, better, making a hard link).

Now the boot entry has to be added to the UEFI NVRAM somehow. Again there are many options:

If we are already booted in EFI mode (efibootmgr -v does not complain) via GRUB2, rEFInd or a similar loader, all is well:

Use efibootmgr, which can pass kernel parameters.
If efibootmgr misbehaves, you can use the UEFI Shell, which, like our kernel, is an EFI application. Its bcfg command can edit boot entries.
Another possibility: efibootmgr complains when adding parameters, which means the firmware does not support storing them (or, more likely, is simply broken). In the comments to the previous article, the kernel parameter efi_no_storage_paranoia was mentioned as something that may help. But use it only if you are sure that your firmware fully conforms to the specification! The developers warn that if the vendor added hacks and improvisations of its own, there is a very real chance of turning the motherboard into a brick.
You can also boot through the UEFI Shell. A startup.nsh script is created for it, containing the command that boots the kernel with the desired command line, and the Shell itself is added as a boot entry.
There is one more problem: an entry can be added only for a single kernel path, and the RAM disk is then not found. Most articles recommend rebuilding the kernel with a built-in initrd in that case. I do not know for certain whether this is a kernel problem or a loader problem. At present, though, everything is supported in 90% of cases and the kernel does not need to be rebuilt.

Dual boot without a boot loader

If you have two systems installed side by side and still do not want a third-party boot loader, you can add both to the UEFI boot entries and adjust the preferred boot order. The Windows loader is usually located at \EFI\Microsoft\BOOT\bootmgfw.efi.

Summary

If everything was done correctly, reboot, bring up the Boot Menu, select the entry we added and watch an almost instant boot. With an SSD, FastBoot, Readahead and Arch Linux it takes about 3-4 seconds. My home server has been booting for a year without any third-party loaders, using the EFI Boot Stub.
Of course, the speed gain is minimal, but, as knowledgeable people such as Roderick Smith write, hardware initialization in EFI boot mode is sometimes "more adequate" than in compatibility modes.

Conclusion

Because UEFI firmware is still relatively immature and implementations vary wildly, I have not given code examples. Each case may have its own problem. I hope that what I have described helps you understand the general principle and apply it to your situation.
I also recommend flashing the latest UEFI version from the motherboard manufacturer's website.


1. The Linux/x86 Boot Protocol

On the x86 platform, the Linux kernel uses a rather complicated boot convention. This has evolved partially due to historical aspects, as well as the desire in the early days to have the kernel itself be a bootable image, the complicated PC memory model and due to changed expectations in the PC industry caused by the effective demise of real-mode DOS as a mainstream operating system.

Currently, the following versions of the Linux/x86 boot protocol exist.

Old kernels	zImage/Image support only. Some very early kernels may not even support a command line.
Protocol 2.00	(Kernel 1.3.73) Added bzImage and initrd support, as well as a formalized way to communicate between the boot loader and the kernel. setup.S made relocatable, although the traditional setup area still assumed writable.
Protocol 2.01	(Kernel 1.3.76) Added a heap overrun warning.
Protocol 2.02	(Kernel 2.4.0-test3-pre3) New command line protocol. Lower the conventional memory ceiling. No overwrite of the traditional setup area, thus making booting safe for systems which use the EBDA from SMM or 32-bit BIOS entry points. zImage deprecated but still supported.
Protocol 2.03	(Kernel 2.4.18-pre1) Explicitly makes the highest possible initrd address available to the bootloader.
Protocol 2.04	(Kernel 2.6.14) Extend the syssize field to four bytes.
Protocol 2.05	(Kernel 2.6.20) Make protected mode kernel relocatable. Introduce relocatable_kernel and kernel_alignment fields.
Protocol 2.06	(Kernel 2.6.22) Added a field that contains the size of the boot command line.
Protocol 2.07	(Kernel 2.6.24) Added paravirtualised boot protocol. Introduced hardware_subarch and hardware_subarch_data and KEEP_SEGMENTS flag in load_flags.
Protocol 2.08	(Kernel 2.6.26) Added crc32 checksum and ELF format payload. Introduced payload_offset and payload_length fields to aid in locating the payload.
Protocol 2.09	(Kernel 2.6.26) Added a field of 64-bit physical pointer to single linked list of struct setup_data.
Protocol 2.10	(Kernel 2.6.31) Added a protocol for relaxed alignment beyond the kernel_alignment added, new init_size and pref_address fields. Added extended boot loader IDs.
Protocol 2.11	(Kernel 3.6) Added a field for offset of EFI handover protocol entry point.
Protocol 2.12	(Kernel 3.8) Added the xloadflags field and extension fields to struct boot_params for loading bzImage and ramdisk above 4G in 64bit.
Protocol 2.13	(Kernel 3.14) Support 32- and 64-bit flags being set in xloadflags to support booting a 64-bit kernel from 32-bit EFI

1.1. Memory Layout

The traditional memory map for the kernel loader, used for Image or zImage kernels, typically looks like:

When using bzImage, the protected-mode kernel was relocated to 0x100000 ("high memory"), and the kernel real-mode block (boot sector, setup, and stack/heap) was made relocatable to any address between 0x10000 and end of low memory. Unfortunately, in protocols 2.00 and 2.01 the 0x90000+ memory range is still used internally by the kernel; the 2.02 protocol resolves that problem.

It is desirable to keep the "memory ceiling" (the highest point in low memory touched by the boot loader) as low as possible, since some newer BIOSes have begun to allocate some rather large amounts of memory, called the Extended BIOS Data Area, near the top of low memory. The boot loader should use the "INT 12h" BIOS call to verify how much low memory is available.

Unfortunately, if INT 12h reports that the amount of memory is too low, there is usually nothing the boot loader can do but to report an error to the user. The boot loader should therefore be designed to take up as little space in low memory as it reasonably can. For zImage or old bzImage kernels, which need data written into the 0x90000 segment, the boot loader should make sure not to use memory above the 0x9A000 point; too many BIOSes will break above that point.

For a modern bzImage kernel with boot protocol version >= 2.02, a memory layout like the following is suggested:

1.2. The Real-Mode Kernel Header

In the following text, and anywhere in the kernel boot sequence, "a sector" refers to 512 bytes. It is independent of the actual sector size of the underlying medium.

The first step in loading a Linux kernel should be to load the real-mode code (boot sector and setup code) and then examine the following header at offset 0x01f1. The real-mode code can total up to 32K, although the boot loader may choose to load only the first two sectors (1K) and then examine the bootup sector size.

The header looks like:

Offset/Size	Proto	Name	Meaning
01F1/1	ALL(1)	setup_sects	The size of the setup in sectors
01F2/2	ALL	root_flags	If set, the root is mounted readonly
01F4/4	2.04+(2)	syssize	The size of the 32-bit code in 16-byte paras
01F8/2	ALL	ram_size	DO NOT USE — for bootsect.S use only
01FA/2	ALL	vid_mode	Video mode control
01FC/2	ALL	root_dev	Default root device number
01FE/2	ALL	boot_flag	0xAA55 magic number
0200/2	2.00+	jump	Jump instruction
0202/4	2.00+	header	Magic signature "HdrS"
0206/2	2.00+	version	Boot protocol version supported
0208/4	2.00+	realmode_swtch	Boot loader hook (see below)
020C/2	2.00+	start_sys_seg	The load-low segment (0x1000) (obsolete)
020E/2	2.00+	kernel_version	Pointer to kernel version string
0210/1	2.00+	type_of_loader	Boot loader identifier
0211/1	2.00+	loadflags	Boot protocol option flags
0212/2	2.00+	setup_move_size	Move to high memory size (used with hooks)
0214/4	2.00+	code32_start	Boot loader hook (see below)
0218/4	2.00+	ramdisk_image	initrd load address (set by boot loader)
021C/4	2.00+	ramdisk_size	initrd size (set by boot loader)
0220/4	2.00+	bootsect_kludge	DO NOT USE — for bootsect.S use only
0224/2	2.01+	heap_end_ptr	Free memory after setup end
0226/1	2.02+(3)	ext_loader_ver	Extended boot loader version
0227/1	2.02+(3)	ext_loader_type	Extended boot loader ID
0228/4	2.02+	cmd_line_ptr	32-bit pointer to the kernel command line
022C/4	2.03+	initrd_addr_max	Highest legal initrd address
0230/4	2.05+	kernel_alignment	Physical addr alignment required for kernel
0234/1	2.05+	relocatable_kernel	Whether kernel is relocatable or not
0235/1	2.10+	min_alignment	Minimum alignment, as a power of two
0236/2	2.12+	xloadflags	Boot protocol option flags
0238/4	2.06+	cmdline_size	Maximum size of the kernel command line
023C/4	2.07+	hardware_subarch	Hardware subarchitecture
0240/8	2.07+	hardware_subarch_data	Subarchitecture-specific data
0248/4	2.08+	payload_offset	Offset of kernel payload
024C/4	2.08+	payload_length	Length of kernel payload
0250/8	2.09+	setup_data	64-bit physical pointer to linked list of struct setup_data
0258/8	2.10+	pref_address	Preferred loading address
0260/4	2.10+	init_size	Linear memory required during initialization
0264/4	2.11+	handover_offset	Offset of handover entry point

(1) For backwards compatibility, if the setup_sects field contains 0, the real value is 4.
(2) For boot protocol prior to 2.04, the upper two bytes of the syssize field are unusable, which means the size of a bzImage kernel cannot be determined.
(3) Ignored, but safe to set, for boot protocols 2.02-2.09.

If the "HdrS" (0x53726448) magic number is not found at offset 0x202, the boot protocol version is "old". When loading an old kernel, the following parameters should be assumed:

Otherwise, the "version" field contains the protocol version, e.g. protocol version 2.01 will contain 0x0201 in this field. When setting fields in the header, you must make sure only to set fields supported by the protocol version in use.
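The detection logic described above can be sketched roughly as follows, assuming the first sectors of the kernel image have been read into memory. The function name is hypothetical; the offsets and the magic value are the ones given in the header table.

```python
import struct


def boot_protocol(image: bytes):
    """Return (setup_sects, version); version is (major, minor) or None for "old"."""
    setup_sects = image[0x1F1] or 4  # a stored 0 means 4, for backwards compatibility
    if image[0x202:0x206] != b"HdrS":
        return setup_sects, None  # no magic: pre-2.00 ("old") boot protocol
    (version,) = struct.unpack_from("<H", image, 0x206)
    return setup_sects, (version >> 8, version & 0xFF)


# "HdrS" read as a little-endian 32-bit integer gives the magic quoted above.
print(hex(struct.unpack("<I", b"HdrS")[0]))
```

A loader would typically compare the returned version against the "Proto" column before touching any header field.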

1.3. Details of Header Fields

For each field, some are information from the kernel to the bootloader ("read"), some are expected to be filled out by the bootloader ("write"), and some are expected to be read and modified by the bootloader ("modify").

All general purpose boot loaders should write the fields marked (obligatory). Boot loaders that want to load the kernel at a nonstandard address should fill in the fields marked (reloc); other boot loaders can ignore those fields.

The byte order of all fields is little-endian (this is x86, after all).

Field name:	setup_sects
Type:	read
Offset/size:	0x1f1/1
Protocol:	ALL

Field name:	root_flags
Type:	modify (optional)
Offset/size:	0x1f2/2
Protocol:	ALL

Field name:	syssize
Type:	read
Offset/size:	0x1f4/4 (protocol 2.04+) 0x1f4/2 (protocol ALL)
Protocol:	2.04+

Field name:	ram_size
Type:	kernel internal
Offset/size:	0x1f8/2
Protocol:	ALL

Field name:	vid_mode
Type:	modify (obligatory)
Offset/size:	0x1fa/2
Protocol:	ALL

Field name:	root_dev
Type:	modify (optional)
Offset/size:	0x1fc/2
Protocol:	ALL

Field name:	boot_flag
Type:	read
Offset/size:	0x1fe/2
Protocol:	ALL

Field name:	jump
Type:	read
Offset/size:	0x200/2
Protocol:	2.00+

Field name:	header
Type:	read
Offset/size:	0x202/4
Protocol:	2.00+

Field name:	version
Type:	read
Offset/size:	0x206/2
Protocol:	2.00+

Contains the boot protocol version, in (major << 8) + minor format, e.g. 0x0204 for version 2.04.

Field name:	realmode_swtch
Type:	modify (optional)
Offset/size:	0x208/4
Protocol:	2.00+

Field name:	start_sys_seg
Type:	read
Offset/size:	0x20c/2
Protocol:	2.00+

Field name:	kernel_version
Type:	read
Offset/size:	0x20e/2
Protocol:	2.00+

If set to a nonzero value, contains a pointer to a NUL-terminated human-readable kernel version number string, less 0x200. This can be used to display the kernel version to the user. This value should be less than (0x200*setup_sects).

For example, if this value is set to 0x1c00, the kernel version number string can be found at offset 0x1e00 in the kernel file. This is a valid value if and only if the "setup_sects" field contains the value 15 or higher, as 0x1c00 < 15*0x200 (= 0x1e00) but 0x1c00 >= 14*0x200 (= 0x1c00).
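A sketch of the lookup under the rules just stated, assuming the real-mode part of the image is in memory. The function name and the synthetic version string are hypothetical.

```python
import struct


def kernel_version_string(image: bytes):
    """Return the NUL-terminated version string, or None if the pointer is unusable."""
    setup_sects = image[0x1F1] or 4
    (ptr,) = struct.unpack_from("<H", image, 0x20E)
    if ptr == 0 or ptr >= 0x200 * setup_sects:
        return None  # unset, or outside the setup area
    start = ptr + 0x200  # the stored pointer is "less 0x200"
    end = image.index(b"\0", start)
    return image[start:end].decode("ascii", errors="replace")
```

With the pointer 0x1c00 from the example, the string is read from file offset 0x1e00, and the function returns None when setup_sects is 14.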

Field name:	type_of_loader
Type:	write (obligatory)
Offset/size:	0x210/1
Protocol:	2.00+

If your boot loader has an assigned id (see table below), enter 0xTV here, where T is an identifier for the boot loader and V is a version number. Otherwise, enter 0xFF here.

For boot loader IDs above T = 0xD, write T = 0xE to this field and write the extended ID minus 0x10 to the ext_loader_type field. Similarly, the ext_loader_ver field can be used to provide more than four bits for the bootloader version.

For example, for T = 0x15, V = 0x234, write type_of_loader = 0xE4, ext_loader_type = 0x05 and ext_loader_ver = 0x23.
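The split between the three fields can be sketched as below. The function name is hypothetical, and rejecting IDs 0xE and 0xF is an assumption of this sketch, since those values have special meanings in the ID table.

```python
def encode_loader_id(loader_type: int, version: int):
    """Return (type_of_loader, ext_loader_type, ext_loader_ver) for a loader ID/version."""
    if loader_type >= 0x10:
        nibble, ext_type = 0xE, loader_type - 0x10  # extended ID
    elif loader_type <= 0xD:
        nibble, ext_type = loader_type, 0
    else:
        raise ValueError("0xE and 0xF are not ordinary loader IDs")
    type_of_loader = (nibble << 4) | (version & 0x0F)  # low four version bits
    ext_loader_ver = (version >> 4) & 0xFF             # remaining version bits
    return type_of_loader, ext_type, ext_loader_ver


print([hex(v) for v in encode_loader_id(0x15, 0x234)])
```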

Assigned boot loader ids (hexadecimal):

0	LILO (0x00 reserved for pre-2.00 bootloader)
1	Loadlin
2	bootsect-loader (0x20, all other values reserved)
3	Syslinux
4	Etherboot/gPXE/iPXE
5	ELILO
7	GRUB
8	U-Boot
9	Xen
A	Gujin
B	Qemu
C	Arcturus Networks uCbootloader
D	kexec-tools
E	Extended (see ext_loader_type)
F	Special (0xFF = undefined)
10	Reserved
11	Minimal Linux Bootloader
12	OVMF UEFI virtualization stack

Please contact @ zytor . com> if you need a bootloader ID value assigned.

Field name:	loadflags
Type:	modify (obligatory)
Offset/size:	0x211/1
Protocol:	2.00+

This field is a bitmask.

Bit 0 (read): LOADED_HIGH

If 0, the protected-mode code is loaded at 0x10000.
If 1, the protected-mode code is loaded at 0x100000.

Bit 1 (kernel internal): KASLR_FLAG

Used internally by the compressed kernel to communicate KASLR status to kernel proper.

If 1, KASLR enabled.
If 0, KASLR disabled.

Bit 5 (write): QUIET_FLAG

If 0, print early messages.

If 1, suppress early messages.

This requests to the kernel (decompressor and early kernel) to not write early messages that require accessing the display hardware directly.

Bit 6 (write): KEEP_SEGMENTS

If 0, reload the segment registers in the 32bit entry point.

If 1, do not reload the segment registers in the 32bit entry point.

Assume that %cs %ds %ss %es are all set to flat segments with a base of 0 (or the equivalent for their environment).

Bit 7 (write): CAN_USE_HEAP

Field name:	setup_move_size
Type:	modify (obligatory)
Offset/size:	0x212/2
Protocol:	2.00-2.01

When using protocol 2.00 or 2.01, if the real mode kernel is not loaded at 0x90000, it gets moved there later in the loading sequence. Fill in this field if you want additional data (such as the kernel command line) moved in addition to the real-mode kernel itself.

The unit is bytes starting with the beginning of the boot sector.

This field can be ignored when the protocol is 2.02 or higher, or if the real-mode code is loaded at 0x90000.

Field name:	code32_start
Type:	modify (optional, reloc)
Offset/size:	0x214/4
Protocol:	2.00+

The address to jump to in protected mode. This defaults to the load address of the kernel, and can be used by the boot loader to determine the proper load address.

This field can be modified for two purposes:

as a boot loader hook (see Advanced Boot Loader Hooks below.)
if a bootloader which does not install a hook loads a relocatable kernel at a nonstandard address it will have to modify this field to point to the load address.

Field name:	ramdisk_image
Type:	write (obligatory)
Offset/size:	0x218/4
Protocol:	2.00+

Field name:	ramdisk_size
Type:	write (obligatory)
Offset/size:	0x21c/4
Protocol:	2.00+

Field name:	bootsect_kludge
Type:	kernel internal
Offset/size:	0x220/4
Protocol:	2.00+

Field name:	heap_end_ptr
Type:	write (obligatory)
Offset/size:	0x224/2
Protocol:	2.01+

Field name:	ext_loader_ver
Type:	write (optional)
Offset/size:	0x226/1
Protocol:	2.02+

This field is used as an extension of the version number in the type_of_loader field. The total version number is considered to be (type_of_loader & 0x0f) + (ext_loader_ver << 4).

Field name:	ext_loader_type
Type:	write (obligatory if (type_of_loader & 0xf0) == 0xe0)
Offset/size:	0x227/1
Protocol:	2.02+

This field is used as an extension of the type number in type_of_loader field. If the type in type_of_loader is 0xE, then the actual type is (ext_loader_type + 0x10).

This field is ignored if the type in type_of_loader is not 0xE.

Kernels prior to 2.6.31 did not recognize this field, but it is safe to write for protocol version 2.02 or higher.

Field name:	cmd_line_ptr
Type:	write (obligatory)
Offset/size:	0x228/4
Protocol:	2.02+

Set this field to the linear address of the kernel command line. The kernel command line can be located anywhere between the end of the setup heap and 0xA0000; it does not have to be located in the same 64K segment as the real-mode code itself.

Fill in this field even if your boot loader does not support a command line, in which case you can point this to an empty string (or better yet, to the string "auto"). If this field is left at zero, the kernel will assume that your boot loader does not support the 2.02+ protocol.

Field name:	initrd_addr_max
Type:	read
Offset/size:	0x22c/4
Protocol:	2.03+

Field name:	kernel_alignment
Type:	read/modify (reloc)
Offset/size:	0x230/4
Protocol:	2.05+ (read), 2.10+ (modify)

Alignment unit required by the kernel (if relocatable_kernel is true.) A relocatable kernel that is loaded at an alignment incompatible with the value in this field will be realigned during kernel initialization.

Starting with protocol version 2.10, this reflects the kernel alignment preferred for optimal performance; it is possible for the loader to modify this field to permit a lesser alignment. See the min_alignment and pref_address field below.

Field name:	relocatable_kernel
Type:	read (reloc)
Offset/size:	0x234/1
Protocol:	2.05+

Field name:	min_alignment
Type:	read (reloc)
Offset/size:	0x235/1
Protocol:	2.10+

This field, if nonzero, indicates as a power of two the minimum alignment required, as opposed to preferred, by the kernel to boot. If a boot loader makes use of this field, it should update the kernel_alignment field with the alignment unit desired; typically kernel_alignment = 1 << min_alignment.

There may be a considerable performance cost with an excessively misaligned kernel. Therefore, a loader should typically try each power-of-two alignment from kernel_alignment down to this alignment.
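The search order suggested above could look roughly like this. The function names are hypothetical, and the sketch assumes kernel_alignment is itself a power of two.

```python
def candidate_alignments(kernel_alignment: int, min_alignment: int):
    """Yield power-of-two alignments from kernel_alignment down to 1 << min_alignment."""
    floor = 1 << min_alignment  # min_alignment is stored as an exponent
    align = kernel_alignment
    while align >= floor:
        yield align
        align >>= 1


def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of a power-of-two alignment."""
    return (address + alignment - 1) & ~(alignment - 1)


print([hex(a) for a in candidate_alignments(0x200000, 12)])
```

A loader would try to place the kernel at each alignment in turn and write the one it actually used back into kernel_alignment.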

Field name:	xloadflags
Type:	read
Offset/size:	0x236/2
Protocol:	2.12+

This field is a bitmask.

Bit 0 (read): XLF_KERNEL_64

If 1, this kernel has the legacy 64-bit entry point at 0x200.

Bit 1 (read): XLF_CAN_BE_LOADED_ABOVE_4G

If 1, kernel/boot_params/cmdline/ramdisk can be above 4G.

Bit 2 (read): XLF_EFI_HANDOVER_32

If 1, the kernel supports the 32-bit EFI handoff entry point given at handover_offset.

Bit 3 (read): XLF_EFI_HANDOVER_64

If 1, the kernel supports the 64-bit EFI handoff entry point given at handover_offset + 0x200.

Bit 4 (read): XLF_EFI_KEXEC

If 1, the kernel supports kexec EFI boot with EFI runtime support.
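Decoding the bitmask is mechanical; a minimal sketch with a hypothetical function name, using only the bit assignments listed above:

```python
XLOADFLAGS = {
    0: "XLF_KERNEL_64",
    1: "XLF_CAN_BE_LOADED_ABOVE_4G",
    2: "XLF_EFI_HANDOVER_32",
    3: "XLF_EFI_HANDOVER_64",
    4: "XLF_EFI_KEXEC",
}


def decode_xloadflags(value: int):
    """Return the names of all flags set in the 16-bit xloadflags field."""
    return [name for bit, name in XLOADFLAGS.items() if value & (1 << bit)]


print(decode_xloadflags(0x0B))
```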

Field name:	cmdline_size
Type:	read
Offset/size:	0x238/4
Protocol:	2.06+

Field name:	hardware_subarch
Type:	write (optional, defaults to x86/PC)
Offset/size:	0x23c/4
Protocol:	2.07+

In a paravirtualized environment the hardware low level architectural pieces such as interrupt handling, page table handling, and accessing process control registers need to be done differently.

This field allows the bootloader to inform the kernel that we are in one of those environments.

0x00000000	The default x86/PC environment
0x00000001	lguest
0x00000002	Xen
0x00000003	Moorestown MID
0x00000004	CE4100 TV Platform

Field name:	hardware_subarch_data
Type:	write (subarch-dependent)
Offset/size:	0x240/8
Protocol:	2.07+

Field name:	payload_offset
Type:	read
Offset/size:	0x248/4
Protocol:	2.08+

If non-zero then this field contains the offset from the beginning of the protected-mode code to the payload.

The payload may be compressed. The format of both the compressed and uncompressed data should be determined using the standard magic numbers. The currently supported compression formats are gzip (magic numbers 1F 8B or 1F 9E), bzip2 (magic number 42 5A), LZMA (magic number 5D 00), XZ (magic number FD 37), and LZ4 (magic number 02 21). The uncompressed payload is currently always ELF (magic number 7F 45 4C 46).
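A sketch of format detection by magic number, using only the byte values listed above. The function name is hypothetical, and a two-byte prefix match is only a heuristic.

```python
PAYLOAD_MAGICS = [
    (b"\x1f\x8b", "gzip"),
    (b"\x1f\x9e", "gzip"),
    (b"\x42\x5a", "bzip2"),
    (b"\x5d\x00", "LZMA"),
    (b"\xfd\x37", "XZ"),
    (b"\x02\x21", "LZ4"),
    (b"\x7f\x45\x4c\x46", "ELF"),
]


def payload_format(payload: bytes):
    """Return the name of the format whose magic number starts the payload, or None."""
    for magic, name in PAYLOAD_MAGICS:
        if payload.startswith(magic):
            return name
    return None
```

In practice the payload would be the bytes found at payload_offset, relative to the start of the protected-mode code, with length payload_length.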

Field name:	payload_length
Type:	read
Offset/size:	0x24c/4
Protocol:	2.08+

Field name:	setup_data
Type:	write (special)
Offset/size:	0x250/8
Protocol:	2.09+

The 64-bit physical pointer to a NULL-terminated singly linked list of struct setup_data. This is used to define a more extensible mechanism for passing boot parameters. Each node is a struct setup_data with the fields next, type, len and data.

Here, next is a 64-bit physical pointer to the next node of the linked list, and the next field of the last node is 0; type identifies the contents of data; len is the length of the data field; data holds the real payload.

This list may be modified at a number of points during the bootup process. Therefore, when modifying this list one should always make sure to consider the case where the linked list already contains entries.
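The list layout can be modelled roughly as follows. The field widths (64-bit next, 32-bit type, 32-bit len, then the data bytes) follow the kernel's struct setup_data as I understand it and should be treated as an assumption here. The flat bytes object standing in for physical memory and the function names are hypothetical.

```python
import struct

NODE_HEADER = struct.Struct("<QII")  # next, type, len (little-endian, no padding)


def pack_setup_data(next_ptr: int, type_: int, data: bytes) -> bytes:
    """Serialize one node: header followed by the payload bytes."""
    return NODE_HEADER.pack(next_ptr, type_, len(data)) + data


def walk_setup_data(memory: bytes, head: int):
    """Yield (type, data) for every node, following next pointers until 0."""
    ptr = head
    while ptr != 0:
        next_ptr, type_, length = NODE_HEADER.unpack_from(memory, ptr)
        start = ptr + NODE_HEADER.size
        yield type_, memory[start:start + length]
        ptr = next_ptr
```

Code that appends a node would walk to the entry whose next is 0 and link the new node there, so that entries already in the list are preserved.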

Field name:	pref_address
Type:	read (reloc)
Offset/size:	0x258/8
Protocol:	2.10+

This field, if nonzero, represents a preferred load address for the kernel. A relocating bootloader should attempt to load at this address if possible.

A non-relocatable kernel will unconditionally move itself to, and run at, this address.

Field name:	init_size
Type:	read
Offset/size:	0x260/4
Protocol:	2.10+

This field indicates the amount of linear contiguous memory starting at the kernel runtime start address that the kernel needs before it is capable of examining its memory map. This is not the same thing as the total amount of memory the kernel needs to boot, but it can be used by a relocating boot loader to help select a safe load address for the kernel.

The kernel runtime start address is determined by the following algorithm:
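
A minimal sketch of that selection in C follows. It assumes the relocatable_kernel and kernel_alignment header fields described earlier in this document, and that kernel_alignment is a power of two; the function names are illustrative only.

```c
#include <stdint.h>
#include <stdbool.h>

/* Round value up to the next multiple of alignment (a power of two). */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/*
 * A relocatable kernel runs at its load address rounded up to
 * kernel_alignment; a non-relocatable kernel runs at pref_address.
 */
uint64_t kernel_runtime_start(bool relocatable_kernel, uint64_t load_address,
                              uint32_t kernel_alignment, uint64_t pref_address)
{
    if (relocatable_kernel)
        return align_up(load_address, kernel_alignment);
    return pref_address;
}
```

A relocating boot loader can then check that init_size bytes starting at the returned address are free before committing to load_address.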

Field name:	handover_offset
Type:	read
Offset/size:	0x264/4
Protocol:	2.11+

This field is the offset from the beginning of the kernel image to the EFI handover protocol entry point. Boot loaders using the EFI handover protocol to boot the kernel should jump to this offset.

See EFI HANDOVER PROTOCOL below for more details.

1.4. The Image Checksum

From boot protocol version 2.08 onwards the CRC-32 is calculated over the entire file using the characteristic polynomial 0x04C11DB7 and an initial remainder of 0xffffffff. The checksum is appended to the file; therefore the CRC of the file up to the limit specified in the syssize field of the header is always 0.
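
The "CRC of the whole file is 0" property can be checked with a short sketch. Two details are assumptions not spelled out in the text: the CRC is computed bit-reflected, as in the common CRC-32 (where 0x04C11DB7 appears as 0xEDB88320), and no final inversion is applied, with the checksum stored little-endian. The function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise reflected CRC-32 update; crc carries the running remainder. */
uint32_t boot_crc32(const uint8_t *buf, size_t len, uint32_t crc)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Store the checksum of image[0 .. len-4) in the last four bytes (len >= 4). */
void image_append_checksum(uint8_t *image, size_t len)
{
    uint32_t crc = boot_crc32(image, len - 4, 0xffffffffu);

    for (int i = 0; i < 4; i++)
        image[len - 4 + i] = (uint8_t)(crc >> (8 * i));   /* little-endian */
}

/* Returns 1 if the CRC over the image, checksum included, is 0. */
int image_checksum_ok(const uint8_t *image, size_t len)
{
    return boot_crc32(image, len, 0xffffffffu) == 0;
}
```

Under these assumptions a loader needs no knowledge of where the checksum sits; it runs the CRC over the whole range and compares the result with 0.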

1.5. The Kernel Command Line

The kernel command line has become an important way for the boot loader to communicate with the kernel. Some of its options are also relevant to the boot loader itself; see “Special Command Line Options” below.

The kernel command line is a null-terminated string. The maximum length can be retrieved from the field cmdline_size. Before protocol version 2.06, the maximum was 255 characters. A string that is too long will be automatically truncated by the kernel.
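
Although the kernel truncates on its own, a boot loader may prefer to clamp the string itself so that it knows how much memory to reserve. The sketch below assumes that cmdline_size counts characters without the terminating null and that dst has room for the maximum length plus one byte; both function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Maximum command line length in characters, excluding the null. */
uint32_t cmdline_max_len(uint16_t protocol, uint32_t cmdline_size)
{
    return protocol >= 0x0206 ? cmdline_size : 255;
}

/*
 * Copy cmdline into dst, truncated to the limit for this protocol
 * version. Returns the number of characters copied.
 */
size_t copy_cmdline(char *dst, const char *cmdline,
                    uint16_t protocol, uint32_t cmdline_size)
{
    size_t max = cmdline_max_len(protocol, cmdline_size);
    size_t n = strlen(cmdline);

    if (n > max)
        n = max;
    memcpy(dst, cmdline, n);
    dst[n] = '\0';
    return n;
}
```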

If the boot protocol version is 2.02 or later, the address of the kernel command line is given by the header field cmd_line_ptr (see above.) This address can be anywhere between the end of the setup heap and 0xA0000.

If the protocol version is not 2.02 or higher, the kernel command line is entered using the following protocol:

At offset 0x0020 (word), “cmd_line_magic”, enter the magic number 0xA33F.
At offset 0x0022 (word), “cmd_line_offset”, enter the offset of the kernel command line (relative to the start of the real-mode kernel).
The kernel command line must be within the memory region covered by setup_move_size, so you may need to adjust this field.

1.6. Memory Layout of The Real-Mode Code

The real-mode code requires a stack/heap to be set up, as well as memory allocated for the kernel command line. This needs to be done in the real-mode accessible memory in the bottom megabyte.

It should be noted that modern machines often have a sizable Extended BIOS Data Area (EBDA). As a result, it is advisable to use as little of the low megabyte as possible.

Unfortunately, under the following circumstances the 0x90000 memory segment has to be used:

When loading a zImage kernel ((loadflags & 0x01) == 0).
When loading a 2.01 or earlier boot protocol kernel.

For the 2.00 and 2.01 boot protocols, the real-mode code can be loaded at another address, but it is internally relocated to 0x90000. For the “old” protocol, the real-mode code must be loaded at 0x90000.

When loading at 0x90000, avoid using memory above 0x9a000.

For boot protocol 2.02 or higher, the command line does not have to be located in the same 64K segment as the real-mode setup code; it is thus permitted to give the stack/heap the full 64K segment and locate the command line above it.

The kernel command line should not be located below the real-mode code, nor should it be located in high memory.

1.7. Sample Boot Configuration

As a sample configuration, assume the following layout of the real mode segment.

When loading below 0x90000, use the entire segment:

0x0000-0x7fff	Real mode kernel
0x8000-0xdfff	Stack and heap
0xe000-0xffff	Kernel command line

When loading at 0x90000 OR the protocol version is 2.01 or earlier:

0x0000-0x7fff	Real mode kernel
0x8000-0x97ff	Stack and heap
0x9800-0x9fff	Kernel command line

Such a boot loader should enter the following fields in the header:
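
The field assignments for the two layouts above can be sketched as follows. The struct is an illustrative subset of the header, not its on-disk layout; the function name is hypothetical, and the sketch covers protocol 2.00 and later only. Copying the command line string to the returned offset is left to the caller.

```c
#include <stdint.h>
#include <string.h>

#define LOAD_HIGH    0x01   /* loadflags bit 0 */
#define CAN_USE_HEAP 0x80   /* loadflags bit 7 */

/* Illustrative subset of the real-mode header fields. */
struct rm_header {
    uint8_t  setup_sects;
    uint16_t protocol;         /* the version field, e.g. 0x0202 */
    uint8_t  type_of_loader;
    uint8_t  loadflags;
    uint16_t setup_move_size;
    uint16_t heap_end_ptr;
    uint32_t cmd_line_ptr;
    uint16_t cmd_line_magic;
    uint16_t cmd_line_offset;
};

/*
 * Fill in the header for real-mode code loaded at base_ptr. Returns
 * heap_end, the segment offset at which the command line belongs.
 */
uint16_t fill_sample_header(struct rm_header *h, uint32_t base_ptr,
                            uint8_t loader_type, const char *cmdline)
{
    uint16_t heap_end;

    if (h->setup_sects == 0)
        h->setup_sects = 4;
    h->type_of_loader = loader_type;

    /* The full 64K segment is only usable for 2.02+ bzImage kernels. */
    if (h->protocol >= 0x0202 && (h->loadflags & LOAD_HIGH))
        heap_end = 0xe000;
    else
        heap_end = 0x9800;

    if (h->protocol >= 0x0201) {
        h->heap_end_ptr = (uint16_t)(heap_end - 0x200);
        h->loadflags |= CAN_USE_HEAP;
    }

    if (h->protocol >= 0x0202) {
        h->cmd_line_ptr = base_ptr + heap_end;
    } else {
        h->cmd_line_magic = 0xA33F;
        h->cmd_line_offset = heap_end;
        h->setup_move_size = (uint16_t)(heap_end + strlen(cmdline) + 1);
    }
    return heap_end;
}
```

For a 2.04 bzImage loaded at 0x10000 this yields heap_end_ptr = 0xde00 and cmd_line_ptr = 0x1e000, matching the first layout.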

1.8. Loading The Rest of The Kernel

The 32-bit (non-real-mode) kernel starts at offset (setup_sects+1)*512 in the kernel file (again, if setup_sects == 0 the real value is 4.) It should be loaded at address 0x10000 for Image/zImage kernels and 0x100000 for bzImage kernels.

The kernel is a bzImage kernel if the protocol >= 2.00 and the 0x01 bit (LOAD_HIGH) in the loadflags field is set:
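
A short sketch of that test, together with the file offset and load address rules from this section; the function names are illustrative, not part of any real loader.

```c
#include <stdint.h>
#include <stdbool.h>

#define LOAD_HIGH 0x01   /* loadflags bit 0 */

/* bzImage: protocol 2.00 or later with LOAD_HIGH set. */
bool is_bzimage(uint16_t protocol, uint8_t loadflags)
{
    return protocol >= 0x0200 && (loadflags & LOAD_HIGH) != 0;
}

/* Where the 32-bit part of the kernel is loaded in memory. */
uint32_t pm_load_address(uint16_t protocol, uint8_t loadflags)
{
    return is_bzimage(protocol, loadflags) ? 0x100000 : 0x10000;
}

/* Offset of the 32-bit part within the kernel file. */
uint32_t pm_file_offset(uint8_t setup_sects)
{
    if (setup_sects == 0)
        setup_sects = 4;   /* 0 means 4, for backwards compatibility */
    return ((uint32_t)setup_sects + 1) * 512;
}
```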

Note that Image/zImage kernels can be up to 512K in size, and thus use the entire 0x10000-0x90000 range of memory. This means it is pretty much a requirement for these kernels to load the real-mode part at 0x90000. bzImage kernels allow much more flexibility.

1.9. Special Command Line Options

If the command line provided by the boot loader is entered by the user, the user may expect the following command line options to work. They should normally not be deleted from the kernel command line even though not all of them are actually meaningful to the kernel. Boot loader authors who need additional command line options for the boot loader itself should get them registered in Documentation/admin-guide/kernel-parameters.rst to make sure they will not conflict with actual kernel options now or in the future.

In addition, some boot loaders add the following options to the user-specified command line:

If these options are added by the boot loader, it is highly recommended that they be placed first, before the user-specified or configuration-specified command line. Otherwise, “init=/bin/sh” gets confused by the “auto” option.

1.10. Running the Kernel

The kernel is started by jumping to the kernel entry point, which is located at segment offset 0x20 from the start of the real mode kernel. This means that if you loaded your real-mode kernel code at 0x90000, the kernel entry point is 9020:0000.

At entry, ds = es = ss should point to the start of the real-mode kernel code (0x9000 if the code is loaded at 0x90000), sp should be set up properly, normally pointing to the top of the heap, and interrupts should be disabled. Furthermore, to guard against bugs in the kernel, it is recommended that the boot loader sets fs = gs = ds = es = ss.

In our example from above, we would do:
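
The register values can be worked out in C, although loading the segment registers and performing the far jump need assembly or compiler-specific intrinsics that are not shown here. The struct and function names are hypothetical; heap_end is the top of the heap from the sample configuration.

```c
#include <stdint.h>

/* Register state expected at the real-mode kernel entry point. */
struct rm_entry_state {
    uint16_t cs, ip;               /* far jump target */
    uint16_t ds, es, fs, gs, ss;   /* all point at the real-mode code */
    uint16_t sp;                   /* top of the heap */
};

/* Compute the entry state for real-mode code loaded at base_ptr. */
struct rm_entry_state compute_rm_entry(uint32_t base_ptr, uint16_t heap_end)
{
    uint16_t seg = (uint16_t)(base_ptr >> 4);
    struct rm_entry_state s = {
        .cs = (uint16_t)(seg + 0x20), .ip = 0,
        .ds = seg, .es = seg, .fs = seg, .gs = seg, .ss = seg,
        .sp = heap_end,
    };
    return s;
}
```

Interrupts must already be disabled before these values are loaded and the jump to cs:ip is made.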

If your boot sector accesses a floppy drive, it is recommended to switch off the floppy motor before running the kernel, since the kernel boot leaves interrupts off and thus the motor will not be switched off, especially if the loaded kernel has the floppy driver as a demand-loaded module!

1.11. Advanced Boot Loader Hooks

If the boot loader runs in a particularly hostile environment (such as LOADLIN, which runs under DOS) it may be impossible to follow the standard memory location requirements. Such a boot loader may use the following hooks that, if set, are invoked by the kernel at the appropriate time. The use of these hooks should probably be considered an absolutely last resort!

IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and %edi across invocation.

A 32-bit flat-mode routine jumped to immediately after the transition to protected mode, but before the kernel is uncompressed. No segments, except CS, are guaranteed to be set up (current kernels do, but older ones do not); you should set them up to BOOT_DS (0x18) yourself.

After completing your hook, you should jump to the address that was in this field before your boot loader overwrote it (relocated, if appropriate.)

1.12. 32-bit Boot Protocol

On machines with firmware other than a legacy BIOS, such as EFI or LinuxBIOS, and when booting via kexec, the kernel's 16-bit real-mode setup code, which depends on a legacy BIOS, cannot be used, so a 32-bit boot protocol needs to be defined.

In the 32-bit boot protocol, the first step in loading a Linux kernel is to set up the boot parameters (struct boot_params, traditionally known as the “zero page”). The memory for struct boot_params should be allocated and initialized to all zeros. Then the setup header, which starts at offset 0x01f1 of the kernel image, should be loaded into struct boot_params and examined. The end of the setup header is at offset 0x0202 plus the byte value stored at offset 0x0201 of the image.
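
A minimal sketch of this step, treating struct boot_params as a raw 4096-byte buffer. It assumes the header occupies the same offset (0x01f1) inside struct boot_params as in the kernel image and takes the header end to be 0x0202 plus the byte at offset 0x0201; the function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BOOT_PARAMS_SIZE   4096     /* the "zero page" */
#define SETUP_HEADER_START 0x01f1

/* End of the setup header: 0x0202 plus the byte at offset 0x0201. */
size_t setup_header_end(const uint8_t *image)
{
    return 0x0202 + (size_t)image[0x0201];
}

/*
 * Zero boot_params, then copy the setup header from the kernel image
 * to the same offset. Returns 0 on success, -1 if the image is too
 * short to contain the header.
 */
int init_boot_params(uint8_t *boot_params, const uint8_t *image,
                     size_t image_len)
{
    size_t end;

    if (image_len < 0x0202)
        return -1;
    end = setup_header_end(image);
    if (end > image_len || end > BOOT_PARAMS_SIZE)
        return -1;

    memset(boot_params, 0, BOOT_PARAMS_SIZE);
    memcpy(boot_params + SETUP_HEADER_START, image + SETUP_HEADER_START,
           end - SETUP_HEADER_START);
    return 0;
}
```

Zeroing first matters: any field the boot loader does not know about then reads as 0 to the kernel.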

In addition to reading, modifying and writing the setup header in struct boot_params as in the 16-bit boot protocol, the boot loader should also fill in the additional fields of struct boot_params as described in zero-page.txt.

After setting up struct boot_params, the boot loader can load the 32/64-bit kernel in the same way as in the 16-bit boot protocol.

In the 32-bit boot protocol, the kernel is started by jumping to the 32-bit kernel entry point, which is the start address of the loaded 32/64-bit kernel.

At entry, the CPU must be in 32-bit protected mode with paging disabled; a GDT must be loaded with the descriptors for selectors __BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat segments; __BOOT_CS must have execute/read permission, and __BOOT_DS must have read/write permission; CS must be __BOOT_CS and DS, ES, SS must be __BOOT_DS; interrupts must be disabled; %esi must hold the base address of the struct boot_params; %ebp, %edi and %ebx must be zero.

1.13. 64-bit Boot Protocol

For machines with 64-bit CPUs and a 64-bit kernel, a 64-bit boot loader can be used, which requires a 64-bit boot protocol.

In the 64-bit boot protocol, the first step in loading a Linux kernel is to set up the boot parameters (struct boot_params, traditionally known as the “zero page”). The memory for struct boot_params can be allocated anywhere (even above 4G) and must be initialized to all zeros. Then the setup header, which starts at offset 0x01f1 of the kernel image, should be loaded into struct boot_params and examined. The end of the setup header is at offset 0x0202 plus the byte value stored at offset 0x0201 of the image.

After setting up struct boot_params, the boot loader can load the 64-bit kernel in the same way as in the 16-bit boot protocol, except that the kernel may be loaded above 4G.

In the 64-bit boot protocol, the kernel is started by jumping to the 64-bit kernel entry point, which is the start address of the loaded 64-bit kernel plus 0x200.

At entry, the CPU must be in 64-bit mode with paging enabled. The range of setup_header.init_size bytes starting at the load address of the kernel, the zero page and the command line buffer must all be identity-mapped; a GDT must be loaded with the descriptors for selectors __BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat segments; __BOOT_CS must have execute/read permission, and __BOOT_DS must have read/write permission; CS must be __BOOT_CS and DS, ES, SS must be __BOOT_DS; interrupts must be disabled; %rsi must hold the base address of the struct boot_params.

1.14. EFI Handover Protocol

This protocol allows boot loaders to defer initialisation to the EFI boot stub. The boot loader is required to load the kernel/initrd(s) from the boot media and jump to the EFI handover protocol entry point which is hdr->handover_offset bytes from the beginning of startup_<32,64>.

The function prototype for the handover entry point looks like this:

‘handle’ is the EFI image handle passed to the boot loader by the EFI firmware, and ‘table’ is the EFI system table; these are the first two arguments of the “handoff state” as described in section 2.3 of the UEFI specification. ‘bp’ is the boot loader-allocated boot params.
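
A sketch of how a boot loader might locate the entry point. The argument order follows the description above; the opaque types, the void return type and the function names are assumptions made for illustration, and startup_addr stands for the address at which startup_32 or startup_64 was loaded.

```c
#include <stdint.h>

struct boot_params;                                  /* opaque in this sketch */
typedef struct efi_system_table efi_system_table_t;  /* opaque in this sketch */

/* Assumed shape of the handover entry point: handle, table, bp. */
typedef void (*efi_handover_fn)(void *handle, efi_system_table_t *table,
                                struct boot_params *bp);

/* The entry point lies handover_offset bytes past the startup code. */
uintptr_t efi_handover_address(uintptr_t startup_addr, uint32_t handover_offset)
{
    return startup_addr + handover_offset;
}

/* Same address, typed so that it can be called with the three arguments. */
efi_handover_fn efi_handover_entry(uintptr_t startup_addr,
                                   uint32_t handover_offset)
{
    return (efi_handover_fn)efi_handover_address(startup_addr, handover_offset);
}
```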

The boot loader must fill out the following fields in bp:
