- Landoflinux
- Basic Commands
- Display Current Directory
- Change Directory
- Listing Files
- Creating Directories — «mkdir»
- Removing directories
- Touch Command
- cp copy command
- Copying source to destination
- Copying multiple files to destination
- move (rename) file command
- move file to another location
- Deleting Files with rm command
- Delete a single file
- Confirm before deleting
- Deleting files that match a pattern
- File handling in the Linux kernel
- Layer architecture
- Application
- Library
- mount() support in VFS
- Opening a file
- Into the filesystem layer
- A digression: memory mapped files
- The ext2 filesystem handler
- Into the block generic device layer
- Device drivers, special files, and modules
- Finding the device numbers for a filesystem
- Registering the driver
- Request queue management
- Device driver interface
- Into the device driver
- Interrupts in Linux
- Port IO in Linux
- DMA in Linux
- The IDE disk driver
Landoflinux
Basic Commands
Within this section of the tutorial, we will look at some essential basic commands that allow you to work with files and directories.
We will be covering the commands «pwd», «cd», «ls», «mkdir», «rmdir», «touch», «cp», «mv» and «rm».
Display Current Directory
On a Linux system, the command that is used to display the current directory you are in is the command «pwd». This command stands for print working directory.
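For example (the user name and path here are purely illustrative):

    $ pwd
    /home/john/Music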
Change Directory
On a Linux system, the command that is used to navigate to a different directory is the «cd» command. This command stands for change directory.
If the «cd» command is invoked without any parameters, you will be taken to your default home directory.
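The kind of session the next paragraph describes might look like this (the user john and the Music directory are illustrative):

    $ pwd
    /home/john/Music
    $ cd
    $ pwd
    /home/john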
In the above example, we can see that we were originally in the directory «/home/john/Music». When we issued the command «cd» without any parameters, we were placed back into our home directory. This command is very useful for returning to your home area quickly.
To return to the directory you just navigated away from, you can issue the command «cd -».
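For example, continuing the illustrative session above:

    $ pwd
    /home/john/Music
    $ cd
    $ pwd
    /home/john
    $ cd -
    /home/john/Music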
From the above example, we can see that we were able to move from the original directory back to the home directory and then go back to the original directory.
Listing Files
To list files on a Linux system, we can use the command «ls». One of the most useful and common commands that you will encounter is the «ls» command (literally meaning list). When «ls» is run alone, it will list any normal files within the current directory. If you add the «-a» parameter, making «ls -a», all files including hidden files will be displayed. In Linux, hidden files are file names that begin with a dot «.». If you supply the «-l» option, you get an extended or long view of the files in the current directory. It is also possible to pass a directory to the «ls» command: «ls -l /home/john» would display all files in the /home/john area. Another useful parameter that can be passed to the «ls» command is the «-t» option, which sorts by modification time (newest first); adding the «-r» option reverses the order. Multiple options can be combined, as in «ls -rtla», which lists all files, including hidden ones, in reverse time order with the extended listing format.
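A few illustrative invocations (the file names and output are invented, and the listing is shortened):

    $ ls
    Music  notes.txt
    $ ls -a
    .  ..  .bashrc  Music  notes.txt
    $ ls -l /home/john
    drwxr-xr-x 2 john john 4096 Mar  3 10:15 Music
    -rw-r--r-- 1 john john   42 Mar  3 10:20 notes.txt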
Creating Directories — «mkdir»
The «mkdir» command is used to create new directories. Directories allow you to organise your files and keep your system tidy. You will notice while navigating around your Linux system that there are literally hundreds, if not thousands, of directories; important system files and user files are stored within them. In most cases we will be working within our home area /home/user; in the examples that follow we will be using /home/tux. Simply issuing the «mkdir» command followed by a name creates a single directory: «mkdir mydir1». If we wanted to create a directory with several sub-directories, we cannot just issue «mkdir one/two/three», because we would receive the error «mkdir: cannot create directory 'one/two/three': No such file or directory». To create nested directories, we have to pass the «-p» flag, which creates any missing parent directories, so the command becomes «mkdir -p one/two/three»
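For example, working in /home/tux (the exact wording of the error message varies slightly between versions):

    $ mkdir mydir1
    $ mkdir one/two/three
    mkdir: cannot create directory 'one/two/three': No such file or directory
    $ mkdir -p one/two/three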
Removing directories
The command used for removing a directory on a Linux system is «rmdir». To remove a directory using this command, all files within the directory need to be removed first; the command only works on an empty directory. Below is an example of trying to remove a directory that contains a file. Before the «rmdir» command will work, we need to remove the file from within that directory. Once the file «john.txt» is removed, the command succeeds.
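An illustrative session (the error text varies slightly between versions):

    $ rmdir mydir1
    rmdir: failed to remove 'mydir1': Directory not empty
    $ rm mydir1/john.txt
    $ rmdir mydir1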
Touch Command
The «touch» command can be used to create empty files. The command may also be used for modifying timestamps of a file. Below is an example of creating some files using the touch command.
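For example (the user, dates and sizes shown are illustrative):

    $ touch file1.txt file2.txt file3.txt
    $ ls -l
    -rw-r--r-- 1 tux tux 0 Mar  3 10:30 file1.txt
    -rw-r--r-- 1 tux tux 0 Mar  3 10:30 file2.txt
    -rw-r--r-- 1 tux tux 0 Mar  3 10:30 file3.txt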
cp copy command
The «cp» command is used to copy a file to another name or to another location. The command creates an exact copy of a file on disk with a different file name.
The basic syntax of the «cp» command is: «cp source_file destination_file»
Copying source to destination
To copy a single file to another name we pass the source and destination names after the «cp» command:
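For example (the file names are illustrative):

    $ cp file1.txt file1.bak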
Copying multiple files to destination
To copy multiple files to another destination we pass the source files and destination names after the «cp» command:
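For example, copying three files into a backup directory (the paths are illustrative):

    $ cp file1.txt file2.txt file3.txt /home/tux/backup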
move (rename) file command
The «mv» command is used to move a file to another destination or rename the file to a new name.
The basic syntax of the «mv» command is: «mv filename new_filename»
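An illustrative session matching the next sentence:

    $ ls
    old_file
    $ mv old_file new_file
    $ ls
    new_file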
In the above example we can see that the file «old_file» was renamed to «new_file».
move file to another location
The next example shows a file being moved from its current location to a new location.
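For example:

    $ mv new_file /tmp
    $ ls /tmp/new_file
    /tmp/new_file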
In the above example, we moved the file «new_file» from its current location to the /tmp directory.
Deleting Files with rm command
The «rm» command is used to delete a file or group of files or directories.
The basic syntax of the «rm» command is: «rm filename»
Delete a single file
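For example, to delete a single file (the name is illustrative):

    $ rm file1.txt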
Confirm before deleting
A safer way to delete files is to ask for confirmation before deleting. To do this, you must pass the parameter «-i».
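For example (the prompt wording varies slightly between versions):

    $ rm -i file2.txt
    rm: remove regular file 'file2.txt'? y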
Deleting files that match a pattern
Often you will need to delete a bunch of files from a directory. Where you have multiple files, you may be able to use a pattern matching wildcard.
For example, if you wanted to delete all files in the current directory that end with .txt, you could issue the command below:
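An illustrative session matching the next sentence:

    $ ls
    file1.txt  file2.txt  file3.txt  notes.doc
    $ rm *.txt
    $ ls
    notes.doc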
From the above we can see that only the files that ended with .txt have been deleted.
For a full list of parameters that can be used with the «rm» remove command, you can issue: «rm --help»
File handling in the Linux kernel
The kernel's file-handling machinery is examined layer by layer, in the following order: from the application layer at the top down to the device driver at the bottom.
File handling in the Linux kernel: application layer
Layer architecture
The Linux kernel is not ideal in this respect, and there are reasons for that. First, the kernel has been built by many people over many years; whole chunks of code disappear without a trace and are replaced by others. Second, besides vertical layers there are also horizontal ones, by which I mean subsystems that sit at the same level of abstraction. The layers discussed in this article are not an official convention within the kernel itself.
Each layer will be described in turn, starting at the top, with the application layer.
- The application layer. This is application code: C, C++, Java, and so on.
- The library layer. Applications do not talk to the kernel directly; that is what the GNU C library (`glibc') is for. This applies not only to applications written in C, but also to Tcl, Java, and so on.
- VFS. VFS is the topmost, most abstract part of the kernel's file management. It provides a set of APIs for the standard operations: open, read, write, and so on. VFS handles not only regular files but also pipes, sockets, character devices, etc. Most of the VFS code lives in the kernel's fs/ directory.
- The filesystem. The filesystem converts the high-level VFS operations (reading, writing) into low-level operations on disk blocks. Since most filesystems resemble one another, much of the code that manages them is generic VFS code. Part of that code can be found in mm/.
- The generic block device layer. A filesystem does not have to live on a block device; /proc, for example, is not stored on disk, and VFS does not care how a filesystem is implemented. The block device model is built around a sequence of fixed-size data blocks. The code lives in drivers/block. For the most part it provides functionality common to all kinds of block device, in particular buffer and queue management.
- The device driver. This is the lowest, least abstract level, the one that talks to the hardware directly. It deals with IO ports, memory-mapped IO, DMA, and interrupts. Most Linux drivers are written in C. Not all drivers service interrupts; drivers fall roughly into two groups, with disk and SCSI controllers, for example, at the lower level.
This architecture has its advantages. The various filesystems (ext2, ISO9660, UDF) know nothing about drivers, yet can work with any block device. The driver layer, in turn, supports SCSI, IDE, MFM, and other controllers, and those drivers can be used by any filesystem.
This article walks through an example with an application written in C and linked against the GNU C library. The filesystem is ext2 and the disk is an IDE disk.
Application
Library
There may be many applications but only one disk controller, so the kernel has to arrange to lock it while it services a given call. That locking happens in the lower layers of the kernel, and the application programmer does not need to worry about it.
File handling in the Linux kernel: VFS layer
The sys_open function looks roughly like this:
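The original listing is not reproduced here; the sketch below is a lightly trimmed version of sys_open() from a 2.4-era fs/open.c, the flavour of kernel this article describes (details differ between versions):

    asmlinkage long sys_open(const char *filename, int flags, int mode)
    {
        char *tmp;
        int fd;

        tmp = getname(filename);              /* copy the path in from user space */
        fd = PTR_ERR(tmp);
        if (!IS_ERR(tmp)) {
            fd = get_unused_fd();             /* find a free descriptor slot */
            if (fd >= 0) {
                struct file *f = filp_open(tmp, flags, mode);
                if (IS_ERR(f)) {
                    put_unused_fd(fd);
                    fd = PTR_ERR(f);
                } else {
                    fd_install(fd, f);        /* attach the file to the descriptor */
                }
            }
            putname(tmp);
        }
        return fd;
    }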
get_unused_fd() tries to find a free slot in the current process's file descriptor table; if too many files are already open, it can fail. If it succeeds, filp_open() is called; if that in turn succeeds, it returns a file structure, which fd_install() then attaches to the file descriptor.
So sys_open calls filp_open(), and the result of that call is attached to the file descriptor. filp_open() is defined in open.c and looks roughly like this:
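Again as a sketch of the 2.4-era code rather than a verbatim listing:

    struct file *filp_open(const char *filename, int flags, int mode)
    {
        int namei_flags, error;
        struct nameidata nd;

        namei_flags = flags;
        if ((namei_flags + 1) & O_ACCMODE)   /* translate open() flags into lookup flags */
            namei_flags++;
        if (namei_flags & O_TRUNC)
            namei_flags |= 2;

        error = open_namei(filename, namei_flags, mode, &nd);
        if (!error)
            return dentry_open(nd.dentry, nd.mnt, flags);

        return ERR_PTR(error);
    }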
filp_open works in two steps. First it calls open_namei() (fs/namei.c) to produce a nameidata structure, which links to the file's inode. It then calls dentry_open(), passing it the information from the nameidata structure.
Let us look more closely at the inode, dentry and file structures.
The concept of an inode is one of the oldest in Unix. An inode is a block of data that stores information about a file, such as its access permissions, its size and where its data lives. In Unix, several file names can point to the same file (links), but each file has exactly one inode. A directory is a kind of file and has an inode of its own. At the application level the inode appears, in disguised form, as the stat structure, which carries information about the owner, group, size and so on.
Since several processes may be working on the same file at once, their access to that file has to be tracked separately. In Linux this is done with the file structure, which holds information about the file as it relates to the current process. The file structure contains an indirect reference to the inode.
VFS caches inodes in memory; the cache is managed as a linked list of dentry structures. Each dentry holds a reference to an inode and a file name.
So an inode is the representation of a file, which may be shared; a file structure is the representation of a file within a particular process; and a dentry is the structure used to cache inodes in memory.
Back to filp_open(). It calls open_namei(), which looks up the dentry for the given file. The path is resolved relative to the root, slash-separated component by component; if those components are already cached as dentries, resolution is quick. If the file has not been accessed recently, its inode is probably not cached, in which case it is read in by the filesystem layer and then cached. open_namei() initializes a nameidata structure:
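The structure itself (include/linux/fs.h, 2.4 era, slightly abridged) looks like this:

    struct nameidata {
        struct dentry   *dentry;    /* the cached directory entry for the file */
        struct vfsmount *mnt;       /* the mounted filesystem it lives on      */
        struct qstr      last;      /* the last path component looked up       */
        unsigned int     flags;
        int              last_type;
    };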
dentry holds a reference to the cached inode, and vfsmount is a reference to the filesystem on which the file lives. The mount utility registers this vfsmount structure with the kernel.
The filp_open function has now located the right dentry, and through it the right inode, as well as the vfsmount structure. open() itself makes no direct use of the vfsmount; it is stored in the file structure for later use. filp_open then calls dentry_open(), which looks like this:
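A condensed sketch of the 2.4-era dentry_open() (fs/open.c), with error handling trimmed:

    struct file *dentry_open(struct dentry *dentry, struct vfsmount *mnt, int flags)
    {
        struct file *f = get_empty_filp();       /* allocate a struct file */
        struct inode *inode = dentry->d_inode;
        int error = 0;

        f->f_flags = flags;
        f->f_mode = (flags + 1) & O_ACCMODE;
        f->f_dentry = dentry;
        f->f_vfsmnt = mnt;                       /* remember the vfsmount for later  */
        f->f_pos = 0;
        f->f_op = fops_get(inode->i_fop);        /* file operations come from the inode */

        if (f->f_op && f->f_op->open)
            error = f->f_op->open(inode, f);     /* call into the filesystem handler */

        /* ... error handling omitted ... */
        return f;
    }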
This is where the address of the open function in the filesystem layer is resolved; it can be found through the inode itself.
mount() support in VFS
To get a list of the filesystems the kernel currently supports, you can type:
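The command and a typical fragment of its output (the exact list depends on the kernel configuration):

    $ cat /proc/filesystems
    nodev   proc
    nodev   devpts
            ext2
            iso9660
            vfat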
At a minimum, Linux has to support the ext2 and iso9660 (CD-ROM) types. Some filesystems are compiled into the kernel, for example proc, which backs the directory of the same name and lets user applications interact with the kernel and its drivers. In other words, a Linux filesystem does not necessarily correspond to a physical medium: /proc lives in memory and is generated dynamically.
Other filesystems are supported through loadable modules, since they are not in constant use. Support for FAT or VFAT, for example, may be modular.
A filesystem handler has to register itself with the kernel. This is normally done with register_filesystem(), defined in fs/super.c. That function checks whether the given filesystem type is already registered and, if not, stores its file_system_type structure in the kernel's table of filesystems. The handler initializes a struct file_system_type containing the name of the filesystem (for example ext3) and the address of its read_super function, which will be called when the filesystem is mounted. The job of read_super is to initialize a struct super_block, whose contents mirror the superblock of the physical disk: basic information about the filesystem, such as the maximum file size. The super_block structure also holds pointers to the operations that can be performed on the filesystem. A sketch of the structure and of a typical registration follows.
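As a sketch (2.4-era declarations; fields and macros vary between versions), the structure and the way the ext2 handler registers itself look roughly like this:

    struct file_system_type {
        const char *name;                     /* e.g. "ext2" */
        int fs_flags;
        struct super_block *(*read_super)(struct super_block *, void *, int);
        struct module *owner;
        struct file_system_type *next;
        /* ... */
    };

    /* how the ext2 handler registers itself (fs/ext2/super.c) */
    static DECLARE_FSTYPE_DEV(ext2_fs_type, "ext2", ext2_read_super);

    static int __init init_ext2_fs(void)
    {
        return register_filesystem(&ext2_fs_type);
    }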
From the user's point of view, a filesystem is accessed by mounting it with the mount command. The mount command calls the mount() function defined in libc-XXX.so, which in turn makes the VFS system call sys_mount(), defined in fs/namespace.c. sys_mount() consults the mount table:
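Not the real listing, but a heavily condensed paraphrase of the 2.4-era sys_mount() (fs/namespace.c):

    asmlinkage long sys_mount(char *dev_name, char *dir_name, char *type,
                              unsigned long flags, void *data)
    {
        unsigned long type_page, dev_page, data_page;
        char *dir_page;
        int retval;

        /* copy the arguments in from user space */
        copy_mount_options(type, &type_page);
        dir_page = getname(dir_name);
        copy_mount_options(dev_name, &dev_page);
        copy_mount_options(data, &data_page);

        lock_kernel();
        /* do_mount() resolves the mount point, finds the registered
           file_system_type, reads the superblock and grafts the new
           vfsmount onto the mount tree */
        retval = do_mount((char *)dev_page, dir_page, (char *)type_page,
                          flags, (void *)data_page);
        unlock_kernel();

        /* ... free the copied arguments ... */
        return retval;
    }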
The code shown here is simplified compared with the real thing; in particular, error handling has been dropped, some filesystems can serve several mount points, and so on.
How does the kernel manage to read the disk at all during boot? The kernel has a special filesystem type, rootfs. It is initialized at boot time and takes its information not from an on-disk inode but from a boot loader parameter. During boot it is mounted like an ordinary filesystem. If you list the kernel's mount table (for example via /proc/mounts), the first entry shown is of type rootfs.
sys_mount() creates an empty superblock structure and stores the block device structure in it. The filesystem handler has already registered itself with register_filesystem(), supplying the name of the filesystem and the address of its read_super() function. Once sys_mount() has initialized the super_block structure, read_super() is called: the handler reads the physical superblock from disk and validates it. It also initializes a super_operations structure and stores a pointer to it in the s_op field of the super_block. super_operations is a set of pointers to functions for basic filesystem operations, such as creating or deleting a file (an abridged version is sketched below).
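An abridged struct super_operations (include/linux/fs.h, 2.4 era) gives a feel for what the handler fills in:

    struct super_operations {
        void (*read_inode)   (struct inode *);
        void (*write_inode)  (struct inode *, int);
        void (*put_inode)    (struct inode *);
        void (*delete_inode) (struct inode *);
        void (*put_super)    (struct super_block *);
        void (*write_super)  (struct super_block *);
        int  (*statfs)       (struct super_block *, struct statfs *);
        int  (*remount_fs)   (struct super_block *, int *, char *);
        /* ... */
    };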
Once the super_block structure has been initialized, another structure, a vfsmount, is filled in from the superblock data, and the vfsmount is then attached to the dentry of the mount point. That is the signal that this directory is not just an ordinary directory but a mount point.
Opening a file
File handling in the Linux kernel: filesystem layer
Into the filesystem layer
We have seen how the VFS layer calls open through the inode which was created by the filesystem handler for the requested file. As well as open(), a large number of other operations are exposed in the inode structure. Looking at the definition of struct inode (in include/linux/fs.h), we have:
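The full structure is long; the fields relevant here (2.4-era include/linux/fs.h, heavily abridged) are:

    struct inode {
        /* ... */
        umode_t                  i_mode;
        uid_t                    i_uid;
        gid_t                    i_gid;
        kdev_t                   i_rdev;
        loff_t                   i_size;
        struct inode_operations *i_op;       /* inode operations (lookup, create, ...) */
        struct file_operations  *i_fop;      /* default file operations for this inode */
        struct super_block      *i_sb;
        struct address_space    *i_mapping;  /* page cache for this file */
        /* ... */
    };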
The interface for manipulating the file is provided by the i_op and i_fop structures. These structures contain the pointers to the functions that do the real work; these functions are provided by the filesystem handler. file_operations, for example, contains pointers for open, read, write, mmap and so on (an abridged version is sketched after the list below). You can see that the interface is clean: there are no references to lower-level structures, such as disk block lists, and there is nothing in the interface that presupposes a particular type of low-level hardware. This clean interface should, in principle, make it easy to understand the interaction between the VFS layer and the filesystem handlers. Conceptually, for example, a file read operation ought to look like this:
- The application asks the VFS layer to read a file.
- The VFS layer finds the inode and calls the read() function.
- The filesystem handler finds which disk blocks correspond to the part of the file requested by the application via VFS.
- The filesystem handler asks the block device to read those blocks from disk.
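The abridged struct file_operations promised above (2.4-era include/linux/fs.h) looks like this:

    struct file_operations {
        struct module *owner;
        loff_t  (*llseek)  (struct file *, loff_t, int);
        ssize_t (*read)    (struct file *, char *, size_t, loff_t *);
        ssize_t (*write)   (struct file *, const char *, size_t, loff_t *);
        int     (*readdir) (struct file *, void *, filldir_t);
        unsigned int (*poll) (struct file *, struct poll_table_struct *);
        int     (*ioctl)   (struct inode *, struct file *, unsigned int, unsigned long);
        int     (*mmap)    (struct file *, struct vm_area_struct *);
        int     (*open)    (struct inode *, struct file *);
        int     (*release) (struct inode *, struct file *);
        int     (*fsync)   (struct file *, struct dentry *, int datasync);
        /* ... */
    };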
No doubt in some very simple filesystems, the sequence of operations that comprises a disk read is just like the four steps above. But in most cases the interaction between VFS and the filesystem is far from straightforward. To understand why, we need to consider in more detail what goes on at the filesystem level.
The pointers in struct file_operations and struct inode_operations are hooks into the filesystem handler, and we can expect each handler to implement them slightly differently. Or can we? It's worth thinking, for example, about exactly what happens in the filesystem layer when an application opens or reads a file.

Consider the `open' operation first. What exactly is meant by `opening' a file at the filesystem level? To the application programmer, `opening' has connotations of checking that the file exists, checking that the requested access mode is allowed, creating the file if necessary, and marking it as open. At the filesystem layer, by the time we call open() in struct file_operations all of this has been done. It was done on the cached dentry for the file, and if the file did not exist, or had the wrong permissions, we would have found out before now. So the open() operation is more-or-less a no-brainer on most filesystem types.

What about the read() operation? This operation will involve doing some real work, won't it? Well, possibly not. If we're lucky, the requested file region will have been read already in the not-too-distant past, and will be in a cache somewhere. If we're very lucky, it will be available in a cache even if it hasn't been read recently. This is because disk drives work most effectively when they can read in a continuous stream. If we have just read physical block 123 from the disk, for example, there is an excellent chance that the application will need block 124 shortly. So the kernel will try to `read ahead' and load disk blocks into cache before they are actually requested. Similar considerations apply to disk write operations: writes will normally be performed on memory buffers, which will be flushed periodically to the hardware.
Now, this disk caching, buffering, and read-ahead support is, for the most part filesystem-independent. Of the huge amount of work that goes on when the application reads data from a file, the only part that is file-system specific is the mapping of logical file offsets to physical disk blocks. Everything else is generic. Now, if it is generic, it can be considered part of VFS, along with all the other generic file operations, right? Well, no actually. I would suggest that conceptually the generic filesystem stuff forms a separate architectural layer, sitting between the individual filesystem handlers and the block devices. Whatever the merits of this argument, the Linux kernel is not structured like this. You’ll see, in fact, that the code is split between two subsystems: the VFS subsystem (in the fs directory of the kernel source), and the memory management subsystem (in the mm directory). There is a reason for this, but it is rather convoluted, and you may not need to understand it to make sense of the rest of the disk access procedure which I will describe later. But, if you are interested, it’s like this.
A digression: memory mapped files
The ext2 filesystem handler
Let’s dispose of the open() operation first, as this is trivial (remember that VFS has done all the hard work by the time the filesystem handler is invoked). VFS calls open() in the struct file_operations provided by the filesystem handler. In the ext2 handler, this structure is initialized like this ( fs/ext2/file.c ):
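The listing is not reproduced here; in a 2.4-era fs/ext2/file.c the initialization is roughly:

    struct file_operations ext2_file_operations = {
        llseek:     generic_file_llseek,
        read:       generic_file_read,
        write:      generic_file_write,
        ioctl:      ext2_ioctl,
        mmap:       generic_file_mmap,
        open:       generic_file_open,
        release:    ext2_release_file,
        fsync:      ext2_sync_file,
    };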
Notice that most of the file operations are simply delegated to the generic filesystem infrastructure. open() maps onto generic_file_open(), which is defined in fs/open.c and sketched below. Not very interesting, is it? All this function does is to check whether we have requested an operation with large file support on a filesystem that can't accommodate it. All the hard work has already been done by this point.
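For reference, the 2.4-era generic_file_open() is essentially just:

    int generic_file_open(struct inode *inode, struct file *filp)
    {
        /* refuse to open a file that is too big without O_LARGEFILE */
        if (!(filp->f_flags & O_LARGEFILE) && inode->i_size > MAX_NON_LFS)
            return -EFBIG;
        return 0;
    }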
The read() operation is more interesting. This function results in a call on generic_file_read() , which is defined in mm/filemap.c (remember, file reads are part of the memory management infrastructure!). The logic is fairly complex, but for our purposes — talking about file management, not memory management — can be distilled down to something like this:
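What follows is not the real code but a distilled, pseudocode-flavoured paraphrase of the 2.4-era generic_file_read()/do_generic_file_read() path in mm/filemap.c (locking, error handling and many details omitted):

    for (each page spanned by the requested file region) {
        page = find_get_page(mapping, index);        /* already in the page cache? */
        if (page == NULL) {
            page = page_cache_alloc(mapping);        /* no: allocate a fresh page */
            add_to_page_cache(page, mapping, index);
            mapping->a_ops->readpage(filp, page);    /* ask the filesystem to fill it */
        }
        generic_file_readahead(...);                 /* schedule read-ahead of later pages */
        wait_on_page(page);                          /* sleep until the IO completes */
        copy the page contents to the user's buffer;
    }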
In this code we can see (in outline) the caching and read-ahead logic. It's important to remember that because the generic_file_read code is part of the memory management subsystem, its operations are expressed in terms of (virtual memory) pages, not disk blocks. Ultimately we will be reading disk blocks, but not here. In practice, disk blocks will often be 1kB, and pages 4kB, so we will have four block reads for each page read. generic_file_read can't get real data from a real filesystem, either in blocks or in pages, because only the filesystem knows where the logical blocks in the file are located on the disk. So, for this discussion, the most important feature of the above code is the call to the readpage() method: this is a call through the inode of the file, back into the filesystem handler. It is expected to schedule the read of a page of data, which may encompass multiple disk blocks.

In reality, reading a page is also a generic operation; it is only reading blocks that is filesystem-specific. A page read just works out the number of blocks that constitute a page, and then calls another function to read each block. So, in the ext2 filesystem example, the readpage function pointer points to ext2_readpage() (in fs/ext2/inode.c), which simply calls back into the generic VFS layer (a one-line sketch appears after this paragraph). block_read_full_page() (in fs/buffer.c) calls the ext2_get_block() function once for each block in the page. This function does not do any IO itself, or even delegate it to the block device. Instead, it determines the location on disk of the requested logical block, and returns this information in a buffer_head structure (of which, more later).

The ext2 handler does know the block device (because this information is stored in the inode object that has been passed all the way down from the VFS layer). So it could quite happily ask the device to do a read. It doesn't, and the reason for this is quite subtle. Disk devices generally work most efficiently when they are reading or writing continuously. They don't work so well if the disk head is constantly switching tracks. So for best performance, we want to try to arrange for the disk blocks to be read sequentially, even if that means that they are not read or written in the order they are requested. The code to do this is likely to be the same for most, if not all, disk devices. So disk drivers typically make use of the generic request management code in the block device layer, rather than scheduling IO operations themselves.
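To make the readpage() step concrete, the 2.4-era ext2_readpage() mentioned above really is a one-liner:

    static int ext2_readpage(struct file *file, struct page *page)
    {
        /* let the generic code split the page into blocks and call
           ext2_get_block() for each one */
        return block_read_full_page(page, ext2_get_block);
    }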
So, in short, the filesystem handler does not do any IO; it merely fills in a buffer_head structure for each block required. The salient parts of the structure are sketched at the end of this section. You can see that the structure contains a block number, an identifier for the device (b_dev), and a reference to the memory into which the disk contents should be read. kdev_t is an integer containing the major and minor device numbers packed together. buffer_head therefore contains everything the IO subsystem needs to do the real read. It also carries a pointer to a function, b_end_io, that the block device layer will call when it has loaded the requested block (remember this operation is asynchronous).

However, the VFS generic filesystem infrastructure does not hand this structure off to the IO subsystem as soon as it is returned from the filesystem handler. Instead, as the filesystem handler populates buffer_head objects, VFS builds them into a queue (a linked list), and then submits the whole queue to the block device layer. A filesystem handler can implement its own b_end_io function or, more commonly, make use of the generic end-of-block processing found in the generic block device layer, which we will consider next.
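The abridged buffer_head referred to above (include/linux/fs.h, 2.4 era; many fields omitted):

    struct buffer_head {
        unsigned long   b_blocknr;   /* block number on the device */
        unsigned short  b_size;      /* block size in bytes */
        kdev_t          b_dev;       /* device: major and minor numbers packed together */
        char           *b_data;      /* where in memory the block data goes */
        struct page    *b_page;      /* the page this buffer is mapped into */
        void          (*b_end_io)(struct buffer_head *bh, int uptodate);
                                     /* called by the block layer when IO is done */
        void           *b_private;
        /* ... */
    };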
File handling in the Linux kernel: generic device layer
Into the block generic device layer
When the application manipulates a file, the kernel’s VFS layer finds the file’s dentry structure in memory, and calls the file operations open() , read() , etc., through the file’s inode structure. The inode contains pointers into the filesystem handler for the particular filesystem on which the file is located. The handler delegates most of its work to the generic filesystem support code, which is split between the VFS subsystem and the memory management subsystem. The generic VFS code attempts to find the requested blocks in the buffer cache but, if it can’t, it calls back into the filesystem handler to determine the physical blocks that correspond to the logical blocks that VFS has asked it to read or write. The filesystem handler populates buffer_head structures containing the block numbers and device numbers, which are then built into a queue of IO requests. The queue of requests is then passed to the generic block device layer.
Device drivers, special files, and modules
Finding the device numbers for a filesystem
Registering the driver
Request queue management
If the driver is taking advantage of the kernel's request ordering and coalescing functions, then it registers itself using the function blk_init_queue()
(also defined in drivers/block/ll_rw_blk.c). The second argument to this function is a pointer to a function that will be invoked when a sorted queue of requests is available to be processed. The driver might use this function as sketched at the end of this section.

So we have seen how the device registers itself with the generic block device layer, so that it can accept requests to read or write blocks. We must now consider what happens when these requests have been completed. You may remember that the interface between the filesystem layer and the block device layer is asynchronous. When the filesystem handler added the specifications of blocks to load into the buffer_head structure, it could also write a pointer to the function to call to indicate that the block had been read. This function was stored in the field b_end_io. In practice, when the filesystem layer submits a queue of blocks to read to the submit_bh() function in the block device layer, submit_bh() ultimately sets b_end_io to a generic end-of-block handler: the function end_buffer_io_sync (in fs/buffer.c). This generic handler simply marks the buffer complete and unlocks its memory.

As the interface between the filesystem layer and the generic block device layer is asynchronous, the interface between the generic block device layer and the driver itself is also asynchronous. The request handling functions described above (named my_request_fn in the code snippets) are expected not to block. Instead, they should schedule an IO request on the hardware, then notify the block device layer by calling b_end_io on each block when it is completed. In practice, device drivers typically make use of utility functions in the generic block device layer, which combine this notification of completion with the manipulation of the queue. If the driver registers itself using blk_init_queue(), its request handler can expect to be passed a pointer to a queue whenever there are requests available to be serviced. It uses utility functions to iterate through the queue, and to notify the block device layer every time a block is completed. We will look at these functions in more detail in the next section.
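The sketch promised above, for a hypothetical 2.4-era driver (MY_MAJOR, my_fops and my_request_fn are invented names, not part of any real kernel API):

    /* request handler: called by the block layer with a sorted queue of requests */
    static void my_request_fn(request_queue_t *q)
    {
        /* walk the queue, start the hardware on each request, and call the
           end-of-request helpers as blocks complete (see the driver section) */
    }

    static int __init my_driver_init(void)
    {
        if (register_blkdev(MY_MAJOR, "mydisk", &my_fops) < 0)
            return -EIO;

        /* hand our request function to the generic block device layer */
        blk_init_queue(BLK_DEFAULT_QUEUE(MY_MAJOR), my_request_fn);
        return 0;
    }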
Device driver interface
File handling in the Linux kernel: device driver layer
Into the device driver
Interrupts in Linux
When the interrupt handler is finished, execution resumes at the point at which it was broken off. So, what the driver can do is to tell the hardware to do a particular operation, and then put itself to sleep until the hardware generates an interrupt to say that it’s finished. The driver can then finish up the operation and return the data to the caller.
Most hardware devices that are attached to a computer are capable of generating interrupts. Different architectures support different numbers of interrupts, and have different sharing capabilities. Until recently there was a significant chance that you would have more hardware than you had interrupts for, at least in the PC world. However, Linux now supports `interrupt sharing', at least on compatible hardware. On the laptop PC I am using to write this, interrupt 9 is shared by the ACPI (power management) system, the USB interfaces (two of them) and the FireWire interface. Interrupt allocations can be found by doing cat /proc/interrupts .
So, what a typical hard disk driver will usually do, when asked to read or write one or more blocks of data, will be to write to the hardware registers of the controller whatever is needed to start the operation, then wait for an interrupt.
In Linux, interrupt handlers can usually be written in C. This works because the real interrupt handler is in the kernel’s IO subsystem — all interrupts actually come to the same place. The kernel then calls the registered handler for the interrupt. The kernel takes care of matters like saving the CPU register contents before calling the handler, so we don’t need to do that stuff in C.
An interrupt handler is defined and registered as sketched below. request_irq takes the IRQ number (1-15 on x86), some flags, a pointer to the handler, and a name. The name is nothing more than the text that appears in /proc/interrupts . The flags dictate two important things: whether the interrupt is `fast' (SA_INTERRUPT, see below) and whether it is shareable (SA_SHIRQ). If the interrupt is not available, or is available but only to a driver that supports sharing, then request_irq returns a non-zero status. The last argument to the function is a pointer to an arbitrary block of data. This will be made available to the handler when the interrupt arrives, and is a nice way to supply data to the handler. However, this is a relatively new thing in Linux, and not all the existing drivers use it.
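A sketch of the registration, for a hypothetical driver (MY_IRQ, my_dev and the handler name are invented):

    /* the handler receives the IRQ number, the dev_id pointer that was passed
       to request_irq(), and the registers saved by the kernel */
    static void my_interrupt_handler(int irq, void *dev_id, struct pt_regs *regs)
    {
        /* acknowledge the hardware, collect status, wake up whoever is waiting */
    }

    /* ... during driver initialization ... */
    if (request_irq(MY_IRQ, my_interrupt_handler, SA_SHIRQ,
                    "mydisk", &my_dev) != 0)
        printk(KERN_WARNING "mydisk: could not get IRQ %d\n", MY_IRQ);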
The interrupt handler is invoked with the number of the IRQ, the registers that were saved by the interrupt service routine in the kernel, and the block of data passed when the handler was registered.
An important historical distinction in Linux was between `fast’ and `slow’ interrupt handlers, and because this continues to confuse developers and arouse heated debate, it might merit a brief mention here.
In the early days (1.x kernels), the Linux architecture supported two types of interrupt handler: a `fast' and a `slow' handler. A fast handler was invoked without the stack being fully formatted, and without all the registers preserved. It was intended for handlers that were genuinely fast, and didn't do much. They couldn't do much, because they weren't set up to. In order to avoid the complexity of interacting with the interrupt controller, which would have been necessary to prevent other instances of the same interrupt entering the handler re-entrantly and breaking it, a fast handler was entered with interrupts completely disabled. This was a satisfactory approach when the interrupts really had to be fast.

As hardware got faster, the benefit of servicing an interrupt with an incompletely formatted stack became less obvious. In addition, the technique was developed of allowing the handler to be constructed of two parts: a `top half' and a `bottom half'. The `top half' was the part of the handler that had to complete immediately. The top half would do the minimum amount of work, then schedule the `bottom half' to execute when the interrupt was finished. Because the bottom half could be pre-empted, it did not hold up other processes. The meaning of a `fast' interrupt handler therefore changed: a fast handler completed all its work within the main handler, and did not need to schedule a bottom half.
In modern (2.4.x and later) kernels, all these historical features are gone. Any interrupt handler can register a bottom half and, if it does, the bottom half will be scheduled to run in normal time when the interrupt handler returns. You can use the macro mark_bh to schedule a bottom half; doing so requires a knowledge of kernel tasklets, which are beyond the scope of this article (read the comments in include/linux/interrupt.h in the first instance). The only thing that the `fast handler' flag SA_INTERRUPT now does is to cause the handler to be invoked with interrupts disabled. If the flag is omitted, interrupts are enabled, but only for IRQs different to the one currently being serviced. One type of interrupt can still interrupt a different type.
Port IO in Linux
DMA in Linux
The IDE disk driver
When requests are delivered to the driver, the method do_ide_request is invoked. This determines the type of the request, and whether the driver is in a position to service the request. If it is not, it puts itself to sleep for a while. If the request can be serviced, then do_ide_request() calls the appropriate function for that type of request. For a read or write request, the function is __do_rw_disk() (in drivers/ide/ide-disk.c). __do_rw_disk() tells the IDE controller which blocks to read, by calculating the drive parameters and outputting them to the control registers. It is a fairly long and complex function, but the part that is important for this discussion looks (rather simplified) like this:
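A much-simplified paraphrase of the read path (the register macros are as in include/linux/ide.h; nsect, sect, cyl and head stand for the computed drive parameters, and the real code also handles LBA addressing, multi-sector transfers and DMA):

    /* tell ide_intr() to hand the next interrupt to read_intr() */
    ide_set_handler(drive, &read_intr, WAIT_CMD, NULL);

    /* program the task-file registers and start the transfer */
    outb(nsect,    IDE_NSECTOR_REG);   /* number of sectors to transfer */
    outb(sect,     IDE_SECTOR_REG);    /* starting sector */
    outb(cyl,      IDE_LCYL_REG);      /* cylinder, low byte */
    outb(cyl >> 8, IDE_HCYL_REG);      /* cylinder, high byte */
    outb(WIN_READ, IDE_COMMAND_REG);   /* issue the READ SECTOR(S) command */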
The outb function outputs bytes of data to the control registers IDE_SECTOR_REG, etc. These are defined in include/linux/ide.h, and expand to the IO port addresses of the control registers for specific IDE disks. If the IDE controller supports bus-mastering DMA, then the driver will initialize a DMA channel for it to use. read_intr is the function that will be invoked on the next interrupt; its address is stored in the pointer handler, so it gets invoked by ide_intr, the registered interrupt handler.

The convenience functions end_that_request_first() and end_that_request_last() are defined in drivers/block/ll_rw_blk.c. end_that_request_first() shuffles the next request to the head of the queue, so it is available to be processed, and then calls b_end_io on the request that was just finished. bh->b_end_io points to end_buffer_io_sync (in fs/buffer.c), which just marks the buffer complete, and wakes up any threads that are sleeping while waiting for it to complete.