Contents

File descriptor
Overview
Stdin, stdout, and stderr
Redirecting file descriptors
Security descriptors in file systems
What is a file descriptor in simple terms
How files get descriptors
What file descriptors are for
What is a bad file descriptor
What you can do with file descriptors
What is file-descriptor?
4 Answers

File descriptor

A file descriptor is a number that uniquely identifies an open file within a process running on a computer’s operating system. It describes a data resource, and how that resource may be accessed.

When a program asks to open a file — or another data resource, like a network socket — the kernel:

Grants access.
Creates an entry in the global file table.
Provides the software with the location of that entry.

The descriptor is identified by a unique non-negative integer, such as 0, 12, or 567. At least one file descriptor exists for every open file on the system.

File descriptors were first used in Unix, and are used by modern operating systems including Linux, macOS, and BSD. In Microsoft Windows, file descriptors are known as file handles.

Overview

When a process makes a successful request to open a file, the kernel returns a file descriptor which points to an entry in the kernel’s global file table. The file table entry contains information such as the inode of the file, byte offset, and the access restrictions for that data stream (read-only, write-only, etc.).
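As a minimal sketch of the paragraph above, the following Python snippet opens a scratch file, reads a few bytes, and asks the kernel for the current byte offset. The file contents and variable names are illustrative assumptions; only the `os` and `tempfile` standard-library calls are real.

```python
import os
import tempfile

# Create a small scratch file to open (its path is generated, not meaningful).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello, descriptor")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)         # the kernel hands back a small integer
first = os.read(fd, 5)                  # reading advances the kernel-side offset
offset = os.lseek(fd, 0, os.SEEK_CUR)   # ask where the offset is now

print(fd, first, offset)

os.close(fd)
os.remove(path)
```

The program never sees the file table entry itself; it only holds the integer and asks the kernel to act on it.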

Stdin, stdout, and stderr

On a Unix-like operating system, the first three file descriptors, by default, are STDIN (standard input), STDOUT (standard output), and STDERR (standard error).

Name	File descriptor	Description	Abbreviation
Standard input	0	The default data stream for input, for example in a command pipeline. In the terminal, this defaults to keyboard input from the user.	stdin
Standard output	1	The default data stream for output, for example when a command prints text. In the terminal, this defaults to the user’s screen.	stdout
Standard error	2	The default data stream for output that relates to an error occurring. In the terminal, this defaults to the user’s screen.	stderr
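To illustrate the table, this short Python sketch writes directly to descriptors 1 and 2 with `os.write`, bypassing the higher-level `print` machinery. It assumes the process was started with the usual standard streams attached; the message text is made up for the example.

```python
import os

out_msg = b"this goes to stdout (fd 1)\n"
err_msg = b"this goes to stderr (fd 2)\n"

# os.write takes a raw descriptor number and returns the number of bytes written.
out_bytes = os.write(1, out_msg)
err_bytes = os.write(2, err_msg)
```

In a terminal both lines appear on screen, but they travel through two separate streams.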

Redirecting file descriptors

File descriptors may be directly accessed using bash, the default shell on many Linux distributions and in Windows Subsystem for Linux, and also available on macOS.

For example, when you use the find command, successful output goes to stdout (file descriptor 1), and error messages go to stderr (file descriptor 2). Both streams display as terminal output.

If find tries to search system directories that you don’t have permission to read, it reports errors. All the lines that say "Permission denied" are written to stderr, and the other lines are written to stdout.

You can hide stderr by redirecting file descriptor 2 to /dev/null, the special device in Linux that "goes nowhere". In bash this is done by appending 2>/dev/null to the command.

The errors are sent to /dev/null, and are not displayed.
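The same idea can be sketched from Python, where `subprocess.DEVNULL` connects the child's file descriptor 2 to the null device. The child program here is a hypothetical stand-in that prints one line to each stream; it is not the find command from the text.

```python
import subprocess
import sys

# Stand-in child (an assumption for this example): one line per stream.
child = [
    sys.executable, "-c",
    "import sys; print('found it'); print('Permission denied', file=sys.stderr)",
]

# stderr=subprocess.DEVNULL discards everything the child writes to fd 2.
result = subprocess.run(child, stdout=subprocess.PIPE,
                        stderr=subprocess.DEVNULL, text=True)
print(result.stdout, end="")
```

Only the stdout line survives; the error line is written to the null device and never reaches the caller.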

Understanding the difference between stdout and stderr is important when you want to work with a program’s output. For example, if you try to grep the output of the find command, you’ll notice the error messages are not filtered, because only the standard output is piped to grep.

However, you can redirect standard error to standard output with 2>&1, and then grep will process the text of both.

Notice that in 2>&1, the target file descriptor (1) is prefixed with an ampersand ("&"). For more information about data stream redirection, see pipelines in the bash shell.
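A rough Python equivalent of merging the two streams uses `stderr=subprocess.STDOUT`, after which a filter sees both. The child program and the filter word are illustrative assumptions.

```python
import subprocess
import sys

# Stand-in child (an assumption for this example): one line per stream.
child = [
    sys.executable, "-c",
    "import sys; print('found it'); print('Permission denied', file=sys.stderr)",
]

# stderr=subprocess.STDOUT points the child's fd 2 at the same pipe as fd 1.
result = subprocess.run(child, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, text=True)

# A grep-like filter now sees the error text as well.
matches = [line for line in result.stdout.splitlines() if "denied" in line]
```

Without the merge, the "Permission denied" line would bypass the pipe and the filter would find nothing.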

For examples of creating and using file descriptors in bash, see our exec builtin command examples.

Security descriptors in file systems

Windows file systems may support the storage and management of security descriptors associated with individual storage units within the file system. The granularity of security control is entirely up to the file system. For example, one file system might maintain a single security descriptor that covers everything on a given storage volume, while another might provide security descriptors that cover different parts of a single given file. The models that most developers are comfortable with are those provided by the existing Windows file systems:

NTFS supports a per-file (or directory) security descriptor model. NTFS is efficient in its storage of security descriptors, storing only a single copy of each security descriptor, even if it is used by many different files.

FAT, CDFS, and UDFS do not support security descriptors.

RDBSS and the SMB Network Redirector provide support comparable to that provided by the remote volume.

These file systems, however, do not represent all possible implementations of Windows security for file systems.

A Windows security descriptor consists of four distinct pieces:

The security identifier (SID) of the owner of the object. An object’s owner always has the ability to reset the security on the object. This ensures that, for example, all access to an object can be removed safely: even if owners remove their ability to perform all operations, this inherent right allows them to restore their security rights on the object.

An optional security identifier (SID) of the default group of the object. The concept of group ownership is one that is not required in Windows, but is useful for some applications.

The system access control list (SACL) that describes the auditing policy of the security descriptor.

The discretionary access control list (DACL) that describes the access policy of the security descriptor.

The following figure illustrates a Windows security descriptor.

Security descriptors are variable-sized objects, with each of the individual sub-components being variable in size as well. To facilitate offline storage of security descriptors, a security descriptor may be in self-relative format, in which case the header contains offsets within the buffer to the specific components of the security descriptor. The in-memory format instead consists of pointer values to the various parts of the security descriptor. For a file system, the self-relative format is normally the most useful because it allows for simple storage and retrieval of the security descriptor from persistent storage. Applications that build security descriptors are more likely to use the in-memory format. The security reference monitor provides conversion routines to convert from one format to the other.

What is a file descriptor in simple terms

A file descriptor is a non-negative integer that identifies an I/O stream. A descriptor can be associated with a file, a directory, or a socket.

For example, when you open or create a file, the operating system creates a record to represent that file and to store information about it. In Linux, each open file gets its own file descriptor. Open 100 files, and somewhere in the kernel there are 100 records, each represented by an integer.

How files get descriptors

File descriptors are usually allocated sequentially. There is a pool of free numbers. When you create a new file or open an existing one, it is assigned a number from that pool; on POSIX systems this is the lowest number not currently in use. The next file gets the next number, for example 101, 102, 103, and so on.

A descriptor is unique within each process. But three indices are fixed by convention: the first three numbers (0, 1, 2).

0: standard input (stdin), the place from which the program receives interactive input.
1: standard output (stdout), to which most of the program’s output is sent.
2: standard error (stderr), to which error messages are sent.

When you finish working with a file and close it, the descriptor assigned to it is released and returned to the pool of free numbers. It is again available to be allocated to a new file.
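Descriptor reuse is easy to observe. The sketch below assumes a single-threaded process, so nothing else opens a file between the two calls; it opens the null device only because that path exists on every platform Python supports.

```python
import os

first = os.open(os.devnull, os.O_RDONLY)
os.close(first)                       # the number goes back to the free pool

second = os.open(os.devnull, os.O_RDONLY)
reused = (second == first)            # the freed number is handed out again
os.close(second)
```

This is also why holding on to a descriptor number after closing it is dangerous: the same integer may soon refer to a different file.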

In Unix-like systems, file descriptors can refer to any Unix file type: regular files, directories, block and character devices, Unix domain sockets, and named pipes. Descriptors can also refer to objects that do not exist in the file system: anonymous pipes and network sockets.

The notion of a file descriptor is also used in programming languages. For example, in Python the function os.open(path, flags, mode=0o777, *, dir_fd=None) opens the file at path with the given flags and mode, and returns a descriptor for the newly opened file. Since Python 3.4, file descriptors are not inherited by child processes by default. On Unix, such non-inheritable descriptors are closed in child processes when a new program is executed.
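A brief sketch of the inheritance behavior described above, using the standard `os.get_inheritable` and `os.set_inheritable` functions; the variable names are illustrative.

```python
import os

fd = os.open(os.devnull, os.O_RDONLY)

# Descriptors created by Python 3.4+ start out non-inheritable.
inheritable_by_default = os.get_inheritable(fd)

# A program that wants a child to receive the descriptor must opt in.
os.set_inheritable(fd, True)
inheritable_after_opt_in = os.get_inheritable(fd)

os.close(fd)
```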

What file descriptors are for

To appreciate the importance of file descriptors, you need to understand how the file system works.

In the traditional Unix implementation, descriptors index into a per-process descriptor table maintained by the kernel.
The file descriptor table, in turn, indexes into a system-wide table of files opened by all processes.
The file table records the mode in which the file or other resource was opened: for example, for reading, for writing, or for reading and writing.
Each file table entry, in turn, indexes into the inode table, which describes the actual underlying files. Each inode stores the attributes and disk block locations of the object.

When it needs to perform input or output, a process passes the descriptor of the relevant file to the kernel through a system call. The kernel accesses the file on behalf of the process. The process itself has no direct access to the file or to the inode table.
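The layering described above can be observed from user space. In this sketch, which uses an illustrative scratch file, the same file is opened twice: the two descriptors get separate file table entries (and so separate offsets) but lead to the same inode.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

fd_a = os.open(path, os.O_RDONLY)
fd_b = os.open(path, os.O_RDONLY)     # a second, independent open of the same file

os.read(fd_a, 4)                      # advances only fd_a's offset
pos_a = os.lseek(fd_a, 0, os.SEEK_CUR)
pos_b = os.lseek(fd_b, 0, os.SEEK_CUR)

# Both descriptors resolve to the same underlying inode.
same_inode = os.fstat(fd_a).st_ino == os.fstat(fd_b).st_ino

os.close(fd_a)
os.close(fd_b)
os.remove(path)
```

Every step goes through a system call (`open`, `read`, `lseek`, `fstat`); the process only ever supplies the integer.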

What is a bad file descriptor

Bad file descriptor is an error that can occur in multithreaded applications. To fix it, find the code that closes the same file descriptor more than once. Another situation is also possible: for example, one thread has already closed the file, and another thread tries to access it.

In single-threaded applications this problem usually does not occur.
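The error is easy to reproduce deliberately. This minimal sketch uses a descriptor after closing it, which is what effectively happens when one thread closes a file another thread is still using; the kernel answers with the EBADF error code.

```python
import errno
import os

fd = os.open(os.devnull, os.O_RDONLY)
os.close(fd)                          # the descriptor is no longer valid

try:
    os.read(fd, 1)                    # use after close
    error_code = None
except OSError as exc:
    error_code = exc.errno            # EBADF: "Bad file descriptor"
```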

What you can do with file descriptors

File descriptors can be used for troubleshooting. For example, if the disk has run out of free space but you cannot see the files that are taking it up, you can look at the open descriptors. This helps identify which application has consumed all the available space.

It is important to understand that once we have opened a file and it has received a file descriptor, we can keep interacting with it. It does not matter what happens to the file afterward. It can be renamed or deleted, its owner can be changed, and its read and write permissions can be revoked. If you have already started working with the file and hold its descriptor, you can continue working with it.
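A small sketch of this behavior, assuming a Unix-like system (Windows normally refuses to delete a file that is still open): the file's name is removed while a descriptor is held, and the data remains readable through that descriptor. The file contents are illustrative.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"still here")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
os.unlink(path)                       # remove the name from the file system

name_exists = os.path.exists(path)    # the path is gone...
data = os.read(fd, 100)               # ...but the open descriptor still works

os.close(fd)                          # only now can the kernel free the disk blocks
```

This is also why deleted-but-open files can keep consuming disk space until the process holding them closes the descriptor or exits.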

What is file-descriptor?

While trying to learn socket programming, I saw the following code:

I browsed through the man page and found that socket returns a file descriptor. I have tried searching the internet and other similar questions here but I couldn’t understand what file descriptor really is. I have to complete my socket programming coursework in two days. So if someone could explain file descriptor in easy language, that would be great.

4 Answers

There are two related objects: file descriptor and file description. People often confuse these two and think they are the same.

A file descriptor is an integer in your application that refers to the file description in the kernel.

A file description is the structure in the kernel that maintains the state of an open file (its current position, blocking/non-blocking, etc.). In Linux, a file description is a struct file.
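The distinction shows up when a descriptor is duplicated. In the sketch below, which uses an illustrative scratch file, `os.dup` creates a second descriptor that refers to the same open file description, so the two share one file position.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
alias = os.dup(fd)                    # new descriptor, same file description

os.read(fd, 4)                        # move the shared position via fd
alias_pos = os.lseek(alias, 0, os.SEEK_CUR)   # the alias sees the same position

os.close(fd)
os.close(alias)
os.remove(path)
```

Two descriptors, one description: the numbers differ, but the state in the kernel is shared.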

The open() function shall establish the connection between a file and a file descriptor. It shall create an open file description that refers to a file and a file descriptor that refers to that open file description. The file descriptor is used by other I/O functions to refer to that file. The path argument points to a pathname naming the file.

The open() function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process. The open file description is new, and therefore the file descriptor shall not share it with any other process in the system.

In Unix/Linux operating systems, a file descriptor is an abstract indicator (handle) used to access a file or other I/O (input/output) resource, such as a pipe or network socket. Normally, file descriptors index into a per-process file descriptor table maintained by the kernel, which in turn indexes into a system-wide table of files opened by all processes, called the file table. This table records the "mode" with which the file or other resource has been opened: for reading, writing, appending, and possibly other modes. It also indexes into a third table called the inode table that describes the actual underlying files.

I think of file descriptors as (indirect, higher-level) pointers to opaque file objects maintained by the kernel.

Normally, when you deal with objects maintained by a library, you pass to the library pointers to objects that you’re not supposed to dereference and manipulate yourself.

For kernel objects, it’s not just that you’re not supposed to manipulate them yourself; you literally can’t, because they live in a different address space that’s not at all accessible to you. And because they live in a different address space, pointers wouldn’t be a meaningful way of referring to them.

You need a token or handle which the kernel would internally resolve to a pointer that’s meaningful in the kernel address space (or to an EBADF error if a given file descriptor cannot be resolved to a file object pointer for the given process). File descriptors are such tokens in integer form.
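Since the question arose from socket programming, here is a minimal Python sketch showing that a socket is referred to by the same kind of integer token. The address family and socket type are arbitrary choices for the example; no connection is made.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd = sock.fileno()                    # the integer the kernel gave us for this socket

sock.close()
closed_fd = sock.fileno()             # Python reports -1 once the socket is closed
```

In C, the `socket()` call returns this integer directly; Python wraps it in an object and exposes it through `fileno()`.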

File Descriptors are nothing but mappings to a file. You can also say these are pointers to a file that the process is using.
FDs are just integer values which act as pointers to process resources.

Whenever a process starts, an entry of the running process is added to the /proc/ directory. This is the place where all of the data related to the process is kept. Also, on process start the kernel allocates 3 file descriptors to the process for communication with the 3 data streams referred to as stdin, stdout and stderr.
The Linux kernel uses an algorithm to always create an FD with the lowest possible integer value, so these data streams are mapped to the numbers 0, 1 and 2.

Let’s say in your code you opened a file to read from or to write to. This means the process needs access to a resource and it has to create a mapping/pointer for this new resource.
To do this, the kernel automatically creates a FD as soon as the file is opened by your code.

If you list the process’s fd/ directory under /proc/ with ls -l, you will see an additional FD created there with id 4 (it can be some other number if the program has used other resources).
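On Linux, a process can inspect its own descriptors through /proc/self/fd, where "self" is a kernel-provided alias for the current process. The sketch below assumes Linux and falls back to an empty list elsewhere; the variable names are illustrative.

```python
import os

fd = os.open(os.devnull, os.O_RDONLY)

fd_dir = "/proc/self/fd"              # Linux-specific view of this process's descriptors
if os.path.isdir(fd_dir):
    # Each entry's name is a descriptor number currently open in this process.
    open_fds = sorted(int(name) for name in os.listdir(fd_dir))
else:
    open_fds = []                     # not on Linux: nothing to list

os.close(fd)
```

The listing typically includes 0, 1 and 2, the newly opened descriptor, and a short-lived descriptor that `os.listdir` itself uses to read the directory.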
