Linux signal all threads

Contents

Working with signals in practice
Signal handler function
Blocking signals
sigwait
Sending a signal
An example of using signals
All about Linux signals
What is covered here
What is signaled in Linux
Signal handlers
Traditional signal() is deprecated
The recommended way of setting signal actions: sigaction
Example use of sigaction()
SA_SIGINFO handler
Compiler optimization and data in signal handler
Atomic Type
Signal-safe functions
Alternative method of handling signals: signalfd()
Handling SIGCHLD
Handling SIGBUS
Handling SIGSEGV
Handling SIGABRT
What happens when a process receives a signal?
Default actions
Interrupting system calls
What is interrupted?
Simple example of signal aware code
Data transferring and signals
Blocking signals
How to block signals
Preventing race conditions.
Waiting for a signal
Other functions to wait for a signal
Sending signals
Sending signal from keyboard
Sending signals to yourself
Sending data along with signal — sigqueue()
Real-time signals
Signals and fork()
Signals and threads
Which thread receives the signal?
Signal handlers
sigwaitinfo()/sigtimedwait() and process-directed signals
Real-time signals
Other uses of signals

Working with signals in practice

I want to record my modest experience of working with signals in Linux. Below are examples of the most important constructs in this area. I will try to lay everything out neatly, so that it is always easy to look up what to use and how.

Important facts about signals:

Signals in Linux serve as a means of inter-process communication (and of inter-thread communication as well)
Each process has a signal mask (the set of signals whose delivery is currently blocked; blocked signals stay pending, they are not discarded)
Each thread, like a process, has its own signal mask
When a signal is received (and it is not blocked), the process/thread is interrupted and control passes to the signal handler function; if that function does not terminate the process/thread, control returns to the point at which the process/thread was interrupted
You can install your own signal handler function, but only for the process as a whole. That handler will also be called for every thread created by the process

I will not go deep into the theory of signals: what comes from where, why, and where to. I am primarily interested in the mechanics of working with them. So the signals used here will be SIGUSR1 and SIGUSR2, the two standard signals left entirely at the user’s disposal. I will also try to pay more attention to using signals between threads.
So, let’s go.

Signal handler function

This function is called when a process (or thread) receives a signal that is not blocked. The default action for most signals, SIGUSR1 and SIGUSR2 included, terminates the whole process, not just one thread. But we can define our own handlers for the signals we are interested in. Be very careful when writing a signal handler: it is not just a function run as a callback. The current flow of execution is interrupted without any preparation, so global objects may be in an inconsistent state. The author does not attempt to give a set of rules, since he does not know them himself, and urges you to follow the advice of Kobolog (I hope he does not mind my referring to him) and study at least this FAQ.

A new signal handler can be installed with two functions.

The first is signal(), which takes a signal number and a pointer to the handler function (or SIG_IGN to ignore the signal, or SIG_DFL for the default action) and returns the previous handler. SIGKILL and SIGSTOP cannot be caught or ignored. Using this function is strongly discouraged, because with the historical System V semantics that portable code has to assume:

the function does not block other signals while the current handler is running, so the handler can be interrupted and a new handler will start executing
after the first delivery of the signal for which we installed our handler, its handler is reset to SIG_DFL

The function sigaction() is free of these shortcomings.

It also takes a signal number (anything except SIGKILL and SIGSTOP). The second argument is the new action for the signal, and the old one is returned through the third. struct sigaction has the following fields of interest:

sa_handler: the same as the sighandler_t argument of signal()
sa_mask: the mask of signals that will be blocked while our handler runs. By default the signal being handled is blocked as well
sa_flags: lets you request additional behavior when handling the signal; the flags are best looked up in the sigaction(2) manual page

Using this function is quite simple.

Here we installed our handler for SIGUSR1 and SIGUSR2, and also specified that these same signals must be blocked while the handler is running.
Signal handlers have one rather inconvenient property: a handler is installed for the whole process and for all threads it creates at once. We cannot install a separate handler for each thread.
Bear in mind, though, that when a signal is addressed to the process, the handler runs in one of the threads that does not have the signal blocked (in practice this is often the main thread, but that is not guaranteed). If the signal is addressed to a particular thread, the handler is called in the context of that thread. See example 1.

Blocking signals

To block certain signals for a process, add them to the signal mask of that process. The function for this is sigprocmask().

With it we can add new signals to the existing signal mask (SIG_BLOCK), remove some signals from the mask (SIG_UNBLOCK), or replace the mask entirely with our own (SIG_SETMASK).
To work with the signal mask inside a thread, use pthread_sigmask(),

which does the same thing, but for each thread separately.
SIGKILL and SIGSTOP cannot be blocked with these functions. Attempts to do so are silently ignored.

sigwait

This function suspends the process (or thread) until the required signal (or one of the signals in a set) arrives. Its distinguishing feature is that the signal handler is not called when the signal is received. The signals being waited for should be blocked beforehand, otherwise they may be delivered to a handler instead. See example 2.

Sending a signal

To send a signal to a process you can use two functions, kill() and raise().

The first needs no explanation. The second sends a signal to yourself and, in a single-threaded program, is essentially equivalent to kill(getpid(), signal). The getpid() function returns the PID of the current process.
To send a signal to an individual thread, use pthread_kill().
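As an illustration of `kill()`, the sketch below forks a child, sends it a signal and reports how the child ended. The name `signal_child` is hypothetical, and the sketch assumes `signo` is a signal whose default action terminates the process (such as SIGUSR1 or SIGTERM); with an ignored signal the child would never exit.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that only waits for signals, sends it signo with kill(),
 * and returns the number of the signal that terminated the child
 * (0 if it exited normally, -1 on error). */
int signal_child(int signo)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                 /* child: sleep until a signal arrives */
    }

    if (kill(pid, signo) == -1) {
        kill(pid, SIGKILL);          /* do not leave the child behind */
        waitpid(pid, NULL, 0);
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

It does not matter whether the signal reaches the child before or after it enters `pause()`: the default action terminates it in both cases.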

An example of using signals

Nothing I have described above answers the question “Why should I use signals?”. Now I would like to give a real example of using signals, one where you simply cannot do without them.
Imagine that you want to read or write some data from or to a device, but the call may block. For example, a read on a socket, or perhaps a write to a pipe. You can move this into a separate thread so as not to block the main work. But what do you do when you need to shut the application down? How do you correctly interrupt a blocking I/O operation? You could set a timeout, but that is not a very good solution. There are more convenient tools for this: the functions pselect and ppoll. They differ only in usability; their behavior is the same. These functions are primarily meant for I/O multiplexing (like select/poll). The ‘p’ prefix means that the function also takes a signal mask that is in effect only for the duration of the call, so a signal can interrupt the wait without a race.

So, let us state the requirement:
We need to develop an application that opens a socket (UDP for simplicity) and performs reads in a thread. The application must shut down cleanly and without delay at the user’s request.
The thread function looks like this

stop is a global boolean flag which our handler sets to true, telling the thread that it must finish.
The logic is as follows:

check that nobody asked the thread to finish while it was starting
block the terminating signal
check that nobody asked us to finish while we were blocking the signal
call ppoll, passing as the last parameter the signal mask to use while waiting for the signal
after ppoll returns, check that it was not woken by the termination signal

The main function looks like this

We install our handler for SIGINT, and when the child thread needs to finish we send it this signal.
For the full listing see example 3.
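The steps listed above can be sketched as follows. This is not the author’s listing (example 3 is not reproduced here): it is a single-threaded approximation that works on any readable descriptor rather than a UDP socket, and the names `prepare_stop_signal`, `wait_readable_or_stop` and `wait_mask` are assumptions. In a real worker thread `pthread_sigmask()` would replace `sigprocmask()`.

```c
#define _GNU_SOURCE              /* ppoll() is not in POSIX; glibc needs this */
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t stop = 0;   /* set by the handler */
static sigset_t wait_mask;               /* mask used only inside ppoll() */

static void request_stop(int signo)
{
    (void)signo;
    stop = 1;
}

/* Installs the SIGINT handler and blocks SIGINT, so that from now on
 * it can only be delivered while ppoll() is waiting. Returns 0 or -1. */
int prepare_stop_signal(void)
{
    struct sigaction act = {0};
    sigset_t block;

    act.sa_handler = request_stop;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGINT, &act, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, &wait_mask) == -1)
        return -1;
    return sigdelset(&wait_mask, SIGINT);  /* unblocked during the wait */
}

/* Returns 1 when fd is readable, 0 when a stop was requested, -1 on error. */
int wait_readable_or_stop(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    while (!stop) {                        /* checked with SIGINT blocked */
        int n = ppoll(&pfd, 1, NULL, &wait_mask);
        if (n > 0)
            return 1;
        if (n == -1 && errno != EINTR)
            return -1;
    }
    return 0;
}
```

Because the flag is tested while SIGINT is blocked and `ppoll()` swaps the mask atomically, a signal that arrives between the test and the wait is not lost: it interrupts `ppoll()` with EINTR and the loop sees `stop` set.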

In my view, the drawback of this approach is that with several threads we can only terminate them all at once. There is no way to install a separate signal handler for each thread. So there is no way to implement full-fledged inter-thread communication through signals. The Linux way does not provide for it.

PS. I put the source code on PasteBin (I am not giving the link, or it might be taken for advertising).
PPS. Please forgive the many mistakes. Language is my weak side. Thanks to everyone who helped fix them.

This article does not claim to be a complete (or deep) description of working with signals and is aimed primarily at those who have not met the notion of a “signal” before. For a deeper understanding of how signals work, the author urges you to turn to more competent sources and to read the constructive criticism in the comments.


All about Linux signals

In most cases if you want to handle a signal in your application you write a simple signal handler like:

void handler ( int sig )

and use the signal(2) system function to run it when a signal is delivered to the process. This is the simplest case, but signals are more interesting than that! Information contained in this article is useful for example when you are writing a daemon and must handle interrupting your program properly without interrupting the current operation or the whole program.

What is covered here

What is signaled in Linux

For a complete list of signals see the signal(7) manual page.

Signal handlers

Traditional signal() is deprecated

The recommended way of setting signal actions: sigaction

As you can see you don’t pass the pointer to the signal handler directly, but instead a struct sigaction object. It’s defined as:

For a detailed description of this structure’s fields see the sigaction(2) manual page. Most important fields are:

sa_handler — This is the pointer to your handler function that has the same prototype as a handler for signal(2).
sa_sigaction — This is an alternative way to run the signal handler. It has two additional arguments beside the signal number where the siginfo_t * is the more interesting. It provides more information about the received signal, I will describe it later.
sa_mask allows you to explicitly set signals that are blocked during the execution of the handler. In addition, unless you use the SA_NODEFER flag, the signal that triggered the handler is also blocked.
sa_flags allows you to modify the behavior of the signal handling process. For a detailed description of this field, see the manual page. To use the sa_sigaction handler you must set the SA_SIGINFO flag here.

What is the difference between signal(2) and sigaction(2) if you don’t use any additional feature the latter provides? The answer is: portability and no race conditions. The issue with resetting the signal handler after it’s called doesn’t affect sigaction(2), because the default behavior is not to reset the handler and to block the signal during its execution. So there is no race, and this behavior is documented in the POSIX specification. Another difference is that with signal(2) some system calls are automatically restarted and with sigaction(2) they’re not by default.

Example use of sigaction()

In this example we use the three-argument version of the signal handler for SIGTERM. Without setting the SA_SIGINFO flag we would use the traditional one-argument version of the handler and pass the pointer to it in the sa_handler field. It would be a replacement for signal(2). You can try to run it and do kill PID to see what happens.

In the signal handler we read two fields from the siginfo_t *siginfo parameter to get the sender’s PID and UID. This structure has more fields, I’ll describe them later.

The sleep(3) function is used in a loop because it’s interrupted when the signal arrives and must be called again.

SA_SIGINFO handler

We’ll see more examples of use of siginfo_t later.

Compiler optimization and data in signal handler

What does it do? It depends on compiler optimization settings. Without optimization it executes a loop that ends when the process receives SIGTERM or another signal that terminates the process and was not handled. When you compile it with the -O3 gcc flag it will not exit after receiving SIGTERM. Why? Because the while loop is optimized in such a way that the exit_flag variable is loaded into a processor register once and not read from memory inside the loop. The compiler isn’t aware that the loop is not the only place where the program accesses this variable while running the loop. In such cases, when a variable modified in a signal handler is also accessed in other parts of the program, you must instruct the compiler to always access the variable in memory when reading or writing it. You should use the volatile keyword in the variable declaration:

Atomic Type

It doesn’t work like a mutex: it’s guaranteed that read or write of this type translates into an uninterruptible operation but code such as:

Isn’t safe: there is read and update in the if operation but only single reads and single writes are atomic.

Don’t try to use this type in a multi-threaded program as a type that can be used without a mutex. It’s only intended for signal handlers and has nothing to do with mutexes!
You don’t need to worry about data that is modified or read both in a signal handler and in the rest of the program if the program touches it only in parts where the signal is blocked. Later I’ll show how to block signals. But you will still need the volatile keyword.

Signal-safe functions

Alternative method of handling signals: signalfd()

First we must block the signals we want to handle with signalfd(2) using sigprocmask(2). This function will be described later. Then we call signalfd(2) to create a file descriptor that will be used to read incoming signals. At this point, if SIGTERM or SIGINT is delivered to your program, it will not be interrupted and no handler will be called. The signal will be queued and you can read information about it from the sfd descriptor. You must supply a buffer large enough to hold the struct signalfd_siginfo object that will be filled with information similar to the previously described siginfo_t. The difference is that the fields are named a bit differently (like ssi_signo instead of si_signo). What is interesting is that the sfd descriptor behaves and can be used just like any other file descriptor, in particular you can:

Use it in select(2), poll(2) and similar functions.
Make it non-blocking.
Create many of them, each handling different signals, so that select(2) reports a different descriptor as ready for every different signal.
After fork() the file descriptor is not closed, so the child process can read signals that were sent to the parent.

This is perfect for a single-process server whose main loop executes a function like poll(2) to handle many connections. It simplifies signal handling because the signal descriptor can be added to the poll’s array of descriptors and handled like any other of them, without asynchronous actions. You handle the signal when you are ready for that because your program is not interrupted.

Handling SIGCHLD

This way, whenever a child exits it will be cleaned up, but the information about which process it was, why it exited and its exit status is lost. You could make the handler more intelligent, but remember not to use any function that is not listed as signal-safe.

You must remember that if you create child processes, SIGCHLD should have a handler. Older POSIX versions left the effect of ignoring this signal unspecified (POSIX.1-2001 and Linux define that children of a process that ignores SIGCHLD do not become zombies), so for portability at least a handler that doesn’t do anything is required.

Handling SIGBUS

You must keep in mind the list of signal-safe functions. In this example we never actually return from the signal handler. The stack is cleaned up, but the program is restarted in a completely different place, so if you’ve had, for example, a mutex locked during the operation like:

After longjmp(3) the mutex is still held although in every other situation the mutex is released.

So handling SIGBUS is possible but very tricky and can introduce bugs that are very hard to debug. The program’s code also becomes ugly.

Handling SIGSEGV

Exhausting stack space is one of the causes of a segmentation fault. In this case running a signal handler is not possible because it requires space on the stack. To allow handling SIGSEGV in such a condition the sigaltstack(2) function exists, which sets an alternative stack to be used by signal handlers.

Handling SIGABRT

What happens when a process receives a signal?

Default actions

For a complete list of default actions see the signal(7) manual page.

Interrupting system calls

What is interrupted?

Simple example of signal aware code

This example works, but if you try it and send a few signals during the sleep you can see that it may sleep for a different amount of time. This is because sleep(3) takes its argument and returns its value with 1s resolution, so it can’t tell you precisely how long it still needs to sleep after an interruption.

Data transferring and signals

This program reads from its standard input and copies the data to the standard output. Additionally, when SIGUSR1 is received it prints to stderr how many bytes have already been read and written. It installs a signal handler which sets a global flag to 1 if called. Whatever the program does at the moment it receives the signal, the numbers are immediately printed. It works because the read(2) and write(2) functions are interrupted by signals even during operation. In case of those functions two things might happen:

When read(2) waits for data or write(2) waits for stdout to accept some data, no data has been transferred yet in the call, and SIGUSR1 arrives, those functions exit with a return value of -1. You can distinguish this situation from other errors by reading the value of the errno variable. If it’s EINTR it means that the function was interrupted without any data transferred and we can call the function again with the same parameters.
Another case is that some data was transferred but the function was interrupted before it finished. In this case the functions don’t return an error but a value less than the supplied data size (or buffer size). Neither the return value nor the errno variable tells us that the function was interrupted by a signal; if we want to distinguish this case we need to set some flag in the signal handler (as we do in this example). To continue after the interruption we need to call the function again, keeping in mind that some data was consumed or read and we must restart from the right point. In our example only the write(2) must be properly restarted: we use the written variable to track how many bytes were actually written and call write(2) again if there is data left in the buffer.

Remember that not all system calls behave exactly the same way, consult their manual page to make sure.
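The restart logic for write(2) described above can be sketched as a small helper. The function name `write_all` is an assumption; the `written` counter plays the role of the variable mentioned in the text.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes all len bytes of buf to fd, continuing after EINTR and after
 * short writes. Returns 0 on success, -1 on any other error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *data = buf;
    size_t written = 0;

    while (written < len) {
        ssize_t n = write(fd, data + written, len - written);

        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before any byte was written */
            return -1;
        }
        written += (size_t)n;    /* short write: resume from the right offset */
    }
    return 0;
}
```

A signal handler that only sets a flag can then be checked between calls to `write_all()` without any risk of losing or duplicating data.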

Reading the sigaction(2) manual page you might think that setting the SA_RESTART flag is simpler than handling system call interruption. The documentation says that setting it will make certain system calls automatically restartable across signals. That page doesn’t specify which calls are restarted (on Linux the list is in signal(7)). This flag is mainly used for compatibility with older systems, don’t use it.

Blocking signals

How to block signals

This program will sleep for 10 seconds and will not react to the SIGTERM signal during the sleep. It works this way because we’ve blocked the signal with sigprocmask(2). The signal is not ignored, it’s blocked: it is held pending by the kernel and delivered when we unblock it. This is different from ignoring the signal with signal(2). First, sigprocmask(2) is more complicated: it operates on a set of signals represented by sigset_t, not on one signal. The SIG_BLOCK parameter says that the signals in the set are to be blocked (in addition to the already blocked signals). SIG_SETMASK says that the signals in the set are to be blocked, and signals that are not present in the set are to be unblocked. The third parameter, if not NULL, receives the signal mask that was in effect before the call. This allows restoring the mask after modifying the process’ signal mask. We do it in this example. The first sleep(3) call is executed with SIGTERM blocked; if the signal arrives at this moment, it stays pending. When we restore the original signal mask, we unblock SIGTERM and it’s delivered: the signal handler is called.

See the sigprocmask(2) manual on how to use this function and sigsetops(3) on how to manipulate signal sets.

Preventing race conditions.

Let’s say it’s an example of a network daemon that accepts connections using select(2) and accept(2). It can use select(2) because it listens on multiple interfaces or also waits for some events other than incoming connections. We want to be able to cleanly shut it down with a signal like SIGTERM (remove the PID file, wait for pending connections to finish etc.). To do this we define a handler for the signal which sets a global flag, and rely on the fact that select(2) will be interrupted when the signal arrives while we are just waiting for some events. If the main loop in the program looks similar to the above code, everything works. Almost. There is a specific case in which the signal will not interrupt the program even if it does nothing at all at the moment: when it arrives between checking the while condition and executing select(2). The select(2) function will not be interrupted (because the signal was already handled) and will sleep until some file descriptor it monitors becomes ready.

This is where the sigprocmask(2) and other «new» functions are useful. Let’s see an improved version:

What’s the difference between select(2) and pselect(2)? The most important one is that the latter takes an additional argument of type sigset_t with the signal mask that is in effect during the execution of the system call, so signals left out of it are unblocked only while it waits. The idea is that the signals are blocked, then global variables/flags that are changed in signal handlers are read and then pselect(2) runs. There is no race because pselect(2) unblocks the signals atomically. See the example: the exit_request flag is checked while the signal is blocked, so there is no race here that would lead to executing pselect(2) just after the signal arrives. In fact, in this example we block the signal all the time and the only place where it can be delivered to the program is the pselect(2) execution. In the real world you may block the signals only for the part of the program that contains the flag check and the pselect(2) call to allow interruption in other places in the program.
Another difference not related to signals is that select(2)’s timeout parameter is of type struct timeval * and pselect(2)’s is const struct timespec *. See the pselect(2) manual page for more information.

If you like poll(2) there is an analogous ppoll(2) function, but in contrast to pselect(2), ppoll(2) is not a standard POSIX function.

Waiting for a signal

The proper solution is to use a dedicated function to wait for a signal: see an example of using sigtimedwait().

This program creates a child process that sleeps a few seconds (in a real world application this process would do something like execve(2)) and waits for it to finish. We want to implement a timeout after which the process is killed. The waitpid(2) function does not have a timeout parameter, but we can use the SIGCHLD signal that is sent when the child process exits. One solution would be to have a handler for this signal and a loop with sleep(3) in it. The sleep(3) will be interrupted by the SIGCHLD signal or will sleep for the whole time, which means the timeout occurred. Such a loop would have a race because the signal could arrive not during the sleep(3), but somewhere else, like just before the sleep(3). To solve this we use the sigtimedwait(2) function, which allows us to wait for a signal without any race. We can do this because we block the SIGCHLD signal before fork(2) and then call sigtimedwait(2), which atomically waits for the blocked signal and takes it from the pending set when it arrives; the signal stays blocked and no handler is run. It can also take a timeout parameter so it will not sleep forever. So without any trick we can wait for the signal safely.

One drawback is that if sigtimedwait(2) is interrupted by another signal it returns with an error and doesn’t tell us how much time elapsed, so we don’t know how to properly restart it. The proper solution is to wait for all signals we expect at this point in the program or to block other signals. There is another small bug in the program: when we kill the process, SIGCHLD is sent and we don’t handle it anywhere. We should unblock the signal before waitpid(2) and have a handler for it.

Other functions to wait for a signal

sigsuspend(2): waits for any signal. It takes a signal mask that is atomically installed for the duration of the wait, so it doesn’t introduce race conditions.
sigwaitinfo(2): like sigtimedwait(2), but without the timeout parameter.
pause(2): simple function taking no argument. Just waits for any signal. Don’t use it, you will introduce a race condition similar to the one described previously; use sigsuspend(2).

Sending signals

Sending signal from keyboard

Sending signals to yourself

Sending data along with signal — sigqueue()

Real-time signals

What’s the difference between RT signals and standard signals? There are a couple:

More than one RT signal can be queued for the process if it has the signal blocked while someone sends it. With standard signals only one of a given type is queued, the rest are dropped.
The order of delivery of RT signals is guaranteed to be the same as the sending order (for instances of the same signal number).
The PID and UID of the sending process are written to the si_pid and si_uid fields of siginfo_t. For more information see the section about real-time signals in signal(7).

Signals and fork()

Signals and threads

With Native POSIX Threads Library things get more interesting. Since this is the POSIX compliant implementation the behavior described here also applies to other POSIX systems.

Which thread receives the signal?

As you can see, there is a process-wide signal queue and per-thread queues.

Signal handlers

sigwaitinfo()/sigtimedwait() and process-directed signals

Real-time signals

Other uses of signals

It’s possible to be notified of I/O availability by a signal. It’s an alternative to functions like select(2). It’s done by setting the O_ASYNC flag on the file descriptor. If you do so and if I/O is available (as select(2) would consider it) a signal is sent to the process. By default it’s SIGIO , but using Real-time signals is more practical and you can set up the file descriptor using fcntl(2) so that you get more information in siginfo_t structure. See the links at the bottom of this article for more information. There is now a better way to do it on Linux: epoll(7) and similar mechanisms are available on other systems.

The dnotify mechanism uses a similar technique: you are notified about file system actions using signals related to file descriptors of monitored directories or files. The recommended way of monitoring files is now inotify.
