Linux kernel copy to user

copy_from_user and copy_to_user are used very frequently in the kernel. They copy data from user space to kernel space and from kernel space to user space, respectively.

On the ARM architecture, the files relevant to copy_from_user are mainly arch/arm/include/asm/uaccess.h, arch/arm/lib/copy_from_user.S and arch/arm/lib/copy_template.S.

Let's look at copy_from_user first. Its implementation is in arch/arm/include/asm/uaccess.h.

The function first uses access_ok to perform a first-layer check that the address range is valid, and then calls __copy_from_user to do the actual copy. Only this first layer is checked up front, because the second layer (whether the address is actually backed by a physical page) can only be resolved through exception handling.

Next, look at how access_ok is implemented (the code is in the same file). Different architectures implement it differently, and even within one architecture the MMU and no-MMU variants differ.

Without an MMU, the check does nothing: no MMU means there is no virtual address mapping and physical addresses are used directly, so if something goes wrong it cannot be recovered.

With an MMU, addr is first passed to __chk_user_ptr, which normally expands to nothing. Its definition depends on the __CHECKER__ macro, which is defined only when the kernel source is being checked with Sparse (Semantic Parser for C). Sparse is invoked by building with make C=1 or make C=2, and it checks the functions and variables whose declarations carry attributes that Sparse understands. When __CHECKER__ is defined, __chk_user_ptr and __chk_io_ptr are only declared, with no function body, so that Sparse can check the type of the argument and report a mismatch during the build. When __CHECKER__ is not defined, __chk_user_ptr is an empty statement. The core of the check is the __range_ok macro.

__range_ok is written as GCC inline assembly. Note that the AT&T versus Intel operand-order distinction applies to x86 only; in ARM assembly the destination operand comes first, so adds %1, %2, %3 writes its result to operand %1. For the constraint syntax, refer to "GNU C embedded assembly language". The core idea is to determine whether the source address plus the size to be copied exceeds the address limit of the current process. The analysis below goes through it piece by piece, starting with the output and input operands:

The output operands are "=&r" (flag) and "=&r" (roksum): each is placed in a general-purpose register and bound to flag and roksum respectively. The & modifier marks an early-clobber output, meaning the register may be written before all inputs have been read, so the compiler must not let it share a register with an input. On the input side, addr is passed in a general-purpose register, size may be either an immediate or a register, and the "0" constraint places current_thread_info()->addr_limit in the same register as operand 0, which gives flag its initial value. Finally, "cc" tells the compiler that the embedded __asm__ instructions change the CPU's condition flags.

Now for the instruction section:

The three instructions are adds %1, %2, %3, then sbcccs %1, %1, %0, then movcc %0, #0. First addr and size are added and the result is stored in roksum, updating the CPSR flags. If that addition produced no carry, addr + size did not overflow the unsigned 32-bit range, and the conditional sbcccs computes roksum - flag - !C, that is addr + size - current_thread_info()->addr_limit - 1, again updating the flags. Finally, movcc %0, #0 runs only if C is still 0 and sets flag to 0. If C is 1 at that point, then either the addition overflowed or (addr + size) > (current_thread_info()->addr_limit), and flag keeps its initial value. Note that ARM's carry flag is inverted for subtraction: C is 1 when the subtraction did not borrow, and C is 0 when it did.
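The flag handling is easy to get backwards, so here is a minimal userspace sketch that models the three instructions one step at a time. The function name range_ok_model and the explicit carry variable are illustrative assumptions, not kernel code; the real macro works on registers and the CPSR.

```c
#include <stdint.h>

/* Userspace model of: adds %1,%2,%3 ; sbcccs %1,%1,%0 ; movcc %0,#0
 * (hypothetical helper, 32-bit addresses assumed). */
static uint32_t range_ok_model(uint32_t addr, uint32_t size, uint32_t addr_limit)
{
    uint32_t flag = addr_limit;   /* the "0" constraint: flag starts as addr_limit */
    uint32_t roksum;
    int carry;                    /* models the C flag */

    /* adds roksum, addr, size : C = 1 on unsigned overflow */
    uint64_t wide = (uint64_t)addr + size;
    roksum = (uint32_t)wide;
    carry = (wide >> 32) != 0;

    /* sbcccs roksum, roksum, flag : executed only if C == 0 */
    if (!carry) {
        /* roksum - flag - !C with C == 0, i.e. roksum - flag - 1 */
        uint64_t subtrahend = (uint64_t)flag + 1;
        carry = (uint64_t)roksum >= subtrahend;   /* C = 1 means "no borrow" */
        roksum = (uint32_t)(roksum - subtrahend);
    }

    /* movcc flag, #0 : executed only if C == 0 */
    if (!carry)
        flag = 0;

    (void)roksum;                 /* only the flags matter after the subtraction */
    return flag;
}
```

A range that ends exactly at the limit is accepted (returns 0), while a range that ends one byte past it, or wraps around, leaves flag at its initial value.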

Finally, a word on the "flag;" at the end of the __range_ok definition. This is a GNU extension (a statement expression): in code enclosed in ({ ... }), the value of the last expression becomes the value of the whole ({ ... }). In other words, flag is the return value of __range_ok. If the range is acceptable, __range_ok returns 0; if either check fails, it returns a non-zero value, because flag starts out as current_thread_info()->addr_limit rather than 0.
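For readers who have not met statement expressions before, here is a small standalone sketch of the same GNU C extension. The macro CLAMP_TO_BYTE and the function clamp_demo are made up for illustration and have nothing to do with the kernel; the example needs GCC or Clang.

```c
/* GNU C statement expression: the value of the last expression inside
 * ({ ... }) becomes the value of the whole construct. */
#define CLAMP_TO_BYTE(x) ({          \
    int clamped = (x);               \
    if (clamped < 0)                 \
        clamped = 0;                 \
    if (clamped > 255)               \
        clamped = 255;               \
    clamped; })

/* Thin wrapper so the macro can be called like a function. */
static int clamp_demo(int v)
{
    return CLAMP_TO_BYTE(v);
}
```

Just as flag is the last expression in __range_ok, clamped is the last expression here, so it is what the macro evaluates to.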

With __range_ok analyzed, now continue to __copy_from_user, still in the same file (again with MMU and no-MMU variants).

With an MMU, the corresponding implementation is in arch/arm/lib/copy_from_user.S.

The core implementation is in arch/arm/lib/copy_template.S. This ARM code is quite complicated, and most of the overviews available online describe the x86 architecture instead.

The reason the code is so complicated is explained in the book "Linux kernel source code scenario analysis":

When the kernel receives a pointer passed in from user space by a process, it is hard to guarantee that the pointer is valid, and even harder to guarantee that the entire range of length len is legal. For safety, one would first check the interval to see whether the virtual memory determined by the pointer and length has been mapped. Each process has an mm_struct representing its virtual address space, which records all the mapped regions of the process in user space. Searching the linked list in this data structure reveals whether the range has been mapped. Old Linux versions did this, but the check is a heavy burden, and the chance that a pointer is really bad is relatively small, so newer versions of Linux removed it. When a bad pointer is encountered, a page fault exception occurs.

So how is the bad pointer handled? When a bad pointer is actually encountered and a page fault really occurs, do_page_fault first searches the virtual memory areas of the current process through find_vma, and if the search fails it branches to bad_area. Although the target address of the failed access is in user space, the CPU was executing in kernel space at the time. In that case, if the kernel can find the address of the faulting instruction in an exception table and obtain the corresponding repair address (fixup), it replaces the address to resume at with the fixup address when the exception returns. Why do this? Because here the kernel cannot simply map a page in for the current process: the address is invalid, so the user-space data to be copied does not exist. This differs from an ordinary page fault on a valid user address, which is resolved by allocating a page for the faulting address during handling. If nothing were done, the current process would fault again on the same instruction after the exception returned, so it has to be pulled out of the mud pit. The carefully designed fixup code plays exactly that role.
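The lookup described here can be pictured with a small userspace model. This is only a sketch under assumptions: struct ex_entry, find_fixup and resume_address are hypothetical names, and the table is assumed to be sorted by instruction address so that a binary search works. The kernel's own lookup is done by search_exception_tables.

```c
#include <stddef.h>

/* Hypothetical model of one exception table entry: the address of an
 * instruction that may fault, and where to resume if it does. */
struct ex_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* Binary search over a table sorted by insn; returns the matching entry,
 * or NULL when the faulting instruction has no registered fixup. */
static const struct ex_entry *find_fixup(const struct ex_entry *table,
                                         size_t count, unsigned long fault_pc)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid].insn == fault_pc)
            return &table[mid];
        if (table[mid].insn < fault_pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Returns the address execution should resume at, or 0 if no fixup exists. */
static unsigned long resume_address(const struct ex_entry *table, size_t count,
                                    unsigned long fault_pc)
{
    const struct ex_entry *e = find_fixup(table, count, fault_pc);

    return e ? e->fixup : 0;
}
```

In this model a hit means "resume at the fixup instead of retrying the faulting instruction", which is what breaks the fault loop described above.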

The original text has a description of the address jump for the exception repair under x86. For more details, please refer to the original text.

1. copy_from_user

When learning Linux kernel driver development, you often encounter copy_from_user and copy_to_user; a device driver's ioctl function uses them frequently. These two functions pass data between user space and kernel space. First look at their definitions (linux/include/asm-arm/uaccess.h), starting with copy_from_user.

The function takes three parameters: to is a pointer into kernel space, from is a pointer into user space, and n is the number of bytes to copy from user space to kernel space. If the copy succeeds, 0 is returned; otherwise the return value is the number of bytes that were not copied.

Structurally, the function can be divided into two parts:

  1. First check whether the user-space pointer address is valid;
  2. Call the __arch_copy_from_user function.
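The two-part structure and the return convention can be sketched in userspace as follows. All names here (model_addr_limit, model_access_ok, model_copy_from_user) are hypothetical stand-ins, memcpy replaces the architecture-specific copy routine, and zeroing the destination when the range check fails is an assumption based on how ARM kernels of that era behaved.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-process limit the kernel reads from
 * current_thread_info()->addr_limit. */
static uintptr_t model_addr_limit;

/* Part 1: range check only; 1 means the range may be accessed. */
static int model_access_ok(const void *from, unsigned long n)
{
    uintptr_t start = (uintptr_t)from;

    return n <= model_addr_limit && start <= model_addr_limit - n;
}

/* Part 2: the copy itself; in the kernel this is the step that can fault. */
static unsigned long model_copy_from_user(void *to, const void *from,
                                          unsigned long n)
{
    if (!model_access_ok(from, n)) {
        memset(to, 0, n);   /* nothing copied: zero the kernel buffer */
        return n;           /* all n bytes are still outstanding */
    }
    memcpy(to, from, n);
    return 0;               /* 0 means every byte was copied */
}
```

A caller treats any non-zero return as failure, which in driver code usually means returning -EFAULT.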

1.1. access_ok

access_ok performs a validity check on a pointer that comes from user space. The macro is architecture-specific; on the ARM platform it is defined in linux/include/asm-arm/uaccess.h.

As can be seen, the first parameter of access_ok, type, is not used. The role of __range_ok is to determine whether addr + size still lies within the user-space range of the process. Let's look at it in detail. This code involves GCC inline assembly; readers unfamiliar with it can read this blog post (http://blog.csdn.net/ce123/article/details/8209702).
(1) unsigned long flag, sum; defines two variables:

  • flag: holds the result. A non-zero value means the address is not valid; zero means the address can be accessed. It initially holds a non-zero value (current_thread_info()->addr_limit), which is the upper limit of the current process's address space.
  • sum: holds the end of the address range to be accessed, for comparison against the current process's address space limit.

(2) __chk_user_ptr(addr); its definition is in linux/compiler.h.
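The definitions in linux/compiler.h have roughly the following shape in kernels of that era. This is reproduced from memory and trimmed so that it compiles in userspace, so treat it as a sketch rather than the exact header; checked_addr is a hypothetical helper added only to show the macro in use.

```c
/* Under Sparse (__CHECKER__ defined) __user carries real attributes and
 * __chk_user_ptr is a declared-only function, so a wrong pointer type is
 * reported. In a normal build both expand to nothing. */
#ifdef __CHECKER__
# define __user __attribute__((noderef, address_space(1)))
extern void __chk_user_ptr(const volatile void __user *ptr);
#else
# define __user
# define __chk_user_ptr(x) ((void)0)
#endif

/* Hypothetical helper: annotate and check a pointer, then use its value. */
static unsigned long checked_addr(const void __user *p)
{
    __chk_user_ptr(p);          /* no code is generated in a normal build */
    return (unsigned long)p;
}
```

In an ordinary compile the check costs nothing at run time, which is why the macro can sit inside a hot path like __range_ok.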

(3) Next comes the assembly:
adds %1, %2, %3
sum = addr + size. The s suffix makes the instruction update the status flags (the aim is to set the carry flag C). The following two instructions carry the CC condition, so they execute only when C = 0.

If the addition above produced a carry (C = 1), the following instructions are not executed, flag keeps its initial value current_thread_info()->addr_limit (non-zero), and that value is returned.
If there was no carry (C = 0), the following instruction is executed:
sbcccs %1, %1, %0
sum = sum - flag - 1, i.e. (addr + size) - (current_thread_info()->addr_limit) - 1, and the flags are updated again.
If (addr + size) > (current_thread_info()->addr_limit), the subtraction does not borrow and C = 1
If (addr + size) <= (current_thread_info()->addr_limit), the subtraction borrows and C = 0
When C = 0 the following instruction is executed; otherwise it is skipped (flag stays non-zero).
movcc %0, #0
flag = 0, i.e. 0 is assigned to flag.

To sum up, the __range_ok macro is equivalent to:

  • If addr + size overflows, or (addr + size) > (current_thread_info()->addr_limit), it returns a non-zero value
  • If (addr + size) <= (current_thread_info()->addr_limit), it returns zero
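Written in plain C, the check comes down to a single comparison. The sketch below assumes 32-bit addresses and takes the condition to be "the range is acceptable when addr + size, computed without wrap-around, is at most addr_limit"; range_check_c is a hypothetical name.

```c
#include <stdint.h>

/* Returns 0 when the range ending at addr + size is acceptable and the
 * (non-zero) limit otherwise, like the flag value of the macro. */
static uint32_t range_check_c(uint32_t addr, uint32_t size, uint32_t addr_limit)
{
    uint64_t end = (uint64_t)addr + size;   /* 33-bit sum, cannot wrap */

    return end <= addr_limit ? 0 : addr_limit;
}
```

Widening to 64 bits plays the role of the carry flag: an overflowing sum simply becomes a value larger than any 32-bit limit.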

access_ok therefore tests whether the user-space address range to be operated on lies within the user address space limit of the current process. The macro's function is very simple and could be implemented in C without any assembly, but because these two functions are used so frequently, part of the work is done in assembly to improve efficiency.
This also shows once more that copy_from_user must be used in process context: it accesses "user" memory, and that "user" must be a specific process. From the source above we know that current_thread_info() is used to check whether the range can be accessed. In a driver, these functions must therefore be used in code that implements system calls, not in interrupt handling. In interrupt context, the code is most likely operating on an address space unrelated to the process in question. Second, because user pages can be swapped out, these two functions may sleep, which is another reason they cannot be used in interrupt context.

1.2. __arch_copy_from_user

We will analyze this function in detail in another blog post.

copy_to_user failing with linux kernel page tables

I’m writing a system call in the Linux Kernel that given a virtual address and an unsigned long pointer, finds the corresponding page table entry and then copies its contents into the unsigned long pointer. Here is the system call:

Here is the test program that’s calling the system call:

Currently the call to copy_to_user is failing with a return value of 8, meaning that it copied none of kernel_pte into pte. I checked pte with access_ok for VERIFY_WRITE and it returns 1. However, access_ok called on kernel_pte with VERIFY_READ returns 0. I’m not sure if that is what is causing copy_to_user to fail, but looking at the source code for copy_to_user it looks like it only checks the user pointer again. So I’m at a bit of a loss as to why the call is failing.

2 Answers

You don’t initialize pte in the test program. You should declare it as unsigned long and pass its address to the syscall. – Tsyvarev
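The caller-side fix suggested in this comment might look like the sketch below. Since the real system call is not available here, fake_read_pte is a hypothetical stand-in that simply writes through the pointer it is given, and the value it stores is a placeholder rather than a real page table entry.

```c
#include <stddef.h>

/* Hypothetical stand-in for the system call from the question: it writes
 * one value through the pointer it is given, as copy_to_user would. */
static long fake_read_pte(unsigned long vaddr, unsigned long *pte_out)
{
    if (pte_out == NULL)
        return -1;           /* a real syscall would typically return -EFAULT */
    *pte_out = vaddr >> 12;  /* placeholder value, not a real page table entry */
    return 0;
}

/* Caller side: reserve storage for the result and pass its address,
 * instead of passing an uninitialized pointer. */
static long query_pte(unsigned long vaddr, unsigned long *result)
{
    unsigned long pte = 0;   /* real storage owned by the caller */
    long rc = fake_read_pte(vaddr, &pte);

    if (rc == 0)
        *result = pte;
    return rc;
}
```

The point is only that the destination must be memory the caller actually owns; an uninitialized unsigned long * points nowhere in particular, so the copy into it cannot succeed reliably.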

This is completely wrong. What are you trying to accomplish?

The curly bracket is supposed to go on separate line. readMMU is a bad name inconsistent with other syscalls. The star near pte is misplaced.

Misplaced stars. Are you sure these types are even correct here?

What is this for?

What’s the purpose of this kmalloc? This code just frees it when it’s done, so given the super small size this should just be a local variable instead. Further, because of other bugs, you actually leak this memory in case of errors, like below.

Where did you get this error checking from? You are supposed to check != 0.

Incorrect. All printks need to be prefixed. For meh purposes you can just use KERN_CRIT.

Incorrect on several levels. First of all, it should be obvious the kernel already has some access to ‘task_struct’ of the executing thread. It so happens the macro used to obtain it is named ‘current’, and you use it here to get the pid. You then use find_get_pid and get_pid_task without RCU held, which is an error the kernel would have told you about if you had debugging enabled. Finally, get_pid_task increments the task’s reference counter, which you leak since you don’t decrement it anywhere. Note that the task you are looking for is in ‘current’, which you use anyway.

If this is supposed to be used later for querying the state of other threads it is still incorrect.

current will definitely have valid mm at this point, but should you want to inspect other threads, you must check for validity of mm first.

This is reasonably valid, but serves no purpose (see below).

What’s the point of this check? Not only can this not succeed, it’s straight up wrong.

access_ok is already done by copy_to_user. What you seem not to realize is that it’s only a range check. Checking anything else would require holding the mm semaphore in order to prevent invalidation of the result.

copy_to_user not working in kernel module

I was trying to use copy_to_user in a kernel module read function, but I am not able to copy the data from the kernel to the user buffer. Can anyone tell me if I am making a mistake? My kernel version is 2.6.35. I am giving the relevant portion of the kernel module as well as the application used to test it. Right now my focus is on why this copy_to_user is not working. Any help would be great.

2 Answers

There are a few problems in the code snippet you wrote. First of all, it is not a good idea to call try_module_get(THIS_MODULE);
This statement tries to increase the refcount of the module from within the module itself. Instead, you should set the owner field of the file_ops structure (your struct file_operations) to THIS_MODULE in your init method. This way, the reference handling happens outside the module code, in the VFS layer. You might take a look at "Linux Kernel Modules: When to use try_module_get / module_put".
Then, as Vineet stated, you should retrieve the pointer from the private_data field of the struct file passed to your read function.

And last but not least, here is the reason why it seems an error happened while, actually, it did not: the copy_to_user call returns 0 if it has successfully copied all the desired bytes into the destination memory area, and a strictly positive value stating the number of bytes that were NOT copied in case of error. That said, when you run:

You should only get one character upon a successful copy, that is only a single "I" in your case. Moreover, you use your msg_Ptr pointer as a safeguard but you never update it. This might result in a wrong call to copy_to_user.
copy_to_user checks the user-space pointer with a call to access_ok, but if the kernel-space pointer and the given length are not valid, this might end in a kernel Oops or panic.
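To make the return-value convention concrete, here is a userspace model of a read handler that treats 0 from the copy as success and advances its position afterwards. The names model_copy_to_user and model_read are hypothetical, and memcpy stands in for the real copy; this is a sketch of the pattern, not the asker's module.

```c
#include <string.h>
#include <errno.h>

/* Hypothetical stand-in for copy_to_user: returns the number of bytes
 * NOT copied, so 0 means success. */
static unsigned long model_copy_to_user(void *to, const void *from,
                                        unsigned long n)
{
    if (to == NULL)
        return n;
    memcpy(to, from, n);
    return 0;
}

/* Model of a read handler: copies from msg starting at *pos, advances *pos,
 * and returns the byte count, 0 at end of message, or -EFAULT. */
static long model_read(const char *msg, char *buf, unsigned long len,
                       unsigned long *pos)
{
    unsigned long total = strlen(msg);
    unsigned long remaining;

    if (*pos >= total)
        return 0;                       /* end of file */
    remaining = total - *pos;
    if (len > remaining)
        len = remaining;                /* never read past the message */
    if (model_copy_to_user(buf, msg + *pos, len) != 0)
        return -EFAULT;                 /* non-zero means bytes were left over */
    *pos += len;                        /* update the position after success */
    return (long)len;
}
```

Clamping len to the remaining bytes keeps the kernel-side pointer and length valid, which is the part access_ok does not check for you.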
