How does the OS detect a null pointer access without checking every pointer address? - pointers

It is known that address 0 (represented by the macro NULL) is not legal to access.
I was wondering how the operating system (say Linux) can determine when there is an access to the null address somewhere in the code, without having to check each and every pointer access in the code.
I assume it has something to do with signals, specifically the SIGSEGV signal.
But I'm not sure how it's done.

First of all, a null pointer access is not necessarily invalid. Typically, either the operating system's program loader or the linker (depending upon the system) sets up processes so that the lowest page in the virtual address space is not mapped.
Many systems that do this also allow the application to map the first page, making a null reference valid.
The NULL pointer is checked the same way all other memory addresses are checked: through the logical address translation of the CPU.
Each time the processor accesses memory (ignoring caching) it looks up the address in the process's page table. If there is no corresponding entry, the processor triggers an access fault (which in Unix variants gets translated into a signal).
If there is an entry in the page table for the address, the processor checks the access allowed for the page. If you are in user mode and try to access a kernel protected page, that triggers a fault. If you are trying to write to a read only page, that triggers a fault. If you try to execute a non-executable page, that triggers a fault.
This is a rather lengthy topic. You need to understand logical memory translation (sometimes misnamed virtual memory) if you want to learn more on the topic.

Pointers refer to virtual address space. In the virtual address space, each page of memory can be mapped to real physical memory. The operating system takes care of this mapping separately for each process.
When you access memory through a pointer, the CPU looks at the mapping for the virtual address your pointer specifies and checks whether there is real, physical memory behind it. Additional checks verify that you have read or write access to that piece of memory, depending on the operation you are attempting.
If there is no memory mapped at that address, the CPU generates a hardware fault. The OS catches that fault and - usually - sends SIGSEGV to the calling process.
The zero page containing the NULL address is usually intentionally left unmapped, so that NULL pointer accesses, which usually result from programming errors, are easily trapped.

Linux gets this support from the hardware. The processor is told the purpose of individual memory regions and their availability. If an "unavailable" memory region is accessed, the processor informs the operating system about the problem, and the operating system informs the application.
It means two things:
There is no software overhead from checking all pointers against the NULL value.
There is no precise check for allowed pointer values.
In other words, if your pointer points anywhere into "available" memory, the hardware is unable to recognize the problem.

The Memory Management Unit plays the key role in triggering the exception when a NULL pointer is dereferenced or an invalid address is accessed.
During the normal virtual-to-physical mapping that the MMU performs on each memory access, the invalid address is simply not found among the virtual address ranges defined in the MMU descriptors. This can have catastrophic consequences if it occurs in OS kernel space, or just leads to the process being killed and cleaned up if it happens in user space.

...how is it that the operating system (say linux) can determine when there is an access to null address, somewhere in the code, without having to access each and every pointer address in the code?
Well, the OS cannot determine a NULL dereference without the pointer actually being accessed. From the Wikipedia article on segmentation fault:
In computing, a segmentation fault (often shortened to segfault) or access violation is a fault raised by hardware with memory protection, notifying an operating system (OS) about a memory access violation; on x86 computers this is a form of general protection fault. The OS kernel will in response usually perform some corrective action, generally passing the fault on to the offending process by sending the process a signal....
The memory access violation is a run-time incident, and unless there is an invalid access, there is no way the OS will raise the signal to the process.
FWIW, a process is allowed to access the memory allocated to it (in its virtual address space). Any address outside the allocated virtual address space, if accessed, generates a fault (through the MMU), which in turn generates the segmentation fault.
TL;DR - SIGSEGV is generated on encountering the NULL-pointer dereference, not before that. Also, the OS does not detect the erroneous access itself; rather, the Memory Management Unit informs the OS by raising a fault.

Related

Can a memory address tell you anything about how/where the object is stored

Is there a way you can identify whether an object is stored on the stack or heap solely from its memory address? I ask because it would be useful to know this when debugging and a memory address comes up in an error.
For instance:
If I have a memory address: 0x7fd8507c6
Can I determine anything about the object based on this address?
You don't mention which OS you are using. I'll answer for Microsoft Windows as that's the one I've been using for the last 25 years. Most of what I knew about Unix/Linux I've forgotten.
If you just have the address and no other information - for 32 bit Windows you can tell if it's user space (lower 2GB) or kernel space (upper 2GB), but that's about it (assuming you don't have the /3GB boot option).
If you have the address and you can run some code, you can use VirtualQuery() to get information about the address. If you get a non-zero return value you can use the data in the returned MEMORY_BASIC_INFORMATION structure.
The State, Type, and Protect values will tell you about the possible uses for the memory - whether it's memory mapped, a DLL (Type & MEM_IMAGE != 0), etc. You can't infer from this information if the memory is a thread's stack or if it's in a heap. You can however determine if the address is in memory that isn't heap or stack (memory in a DLL is not in a stack or heap, non-accessible memory isn't in a stack or a heap).
To determine where a thread stack is you could examine all pages in the application looking for a guard page at the end of a thread's stack. You can then infer the location of the stack space using the default stack size stored in the PE header (or if you can't do that, just use the default size of 1MB - few people change it) and the address you have (is it in the space you have inferred?).
To determine if the address is in a memory heap you'd need to enumerate the application heaps (GetProcessHeaps()) and then enumerate each heap (HeapWalk()) found checking the areas being managed by the heap. Not all Windows heaps can be enumerated.
To get any further than this you need to have tracked allocations/deallocations etc to each heap and have all that information stored for use.
You could also track when threads are created/destroyed/exit and calculate the thread addresses that way.
That's a broad brush answer informed by my experience creating a memory leak detection tool (which needs this information) and numerous other tools.

Use of pci_iomap() and ioremap_nocache() functions in UART(8250) driver

I'm studying the driver code for the UART - 8250.c and 8250_pci.c from Linux.
I have a problem understanding the use of the pci_iomap and ioremap_nocache function calls.
1) Why are they used in the code?
2) What is the significance of the address returned by both functions?
Need help. Thanks.
The way to access I/O memory depends on the computer architecture, bus, and device being used, although the principles are the same everywhere. Depending on the computer platform and bus being used, I/O memory may or may not be accessed through page tables. When access passes through page tables, the kernel must first arrange for the physical address to be visible from your driver, and this usually means that you must call ioremap before doing any I/O. If no page tables are needed, I/O memory locations look pretty much like I/O ports, and you can just read and write to them using proper wrapper functions.
Allocation of I/O memory is not the only required step before that memory may be accessed. You must also ensure that this I/O memory has been made accessible to the kernel. Getting at I/O memory is not just a matter of dereferencing a pointer; on many systems, I/O memory is not directly accessible in this way at all. So a mapping must be set up first. This is the role of the ioremap function. Once equipped with ioremap (and iounmap), a device driver can access any I/O memory address, whether or not it is directly mapped to virtual address space. Remember, though, that the addresses returned from ioremap should not be dereferenced directly; instead, accessor functions provided by the kernel should be used.
ioremap_nocache: Quoting from one of the kernel headers: "It's useful if some control registers are in such an area, and write combining or read caching is not desirable"
A few of the accessor functions for reading from I/O memory:
unsigned int ioread8(void *addr);
unsigned int ioread16(void *addr);
unsigned int ioread32(void *addr);
Here, addr should be an address obtained from ioremap (perhaps with an integer offset); the return value is what was read from the given I/O memory.
There is a similar set of functions for writing to I/O memory:
void iowrite8(u8 value, void *addr);
void iowrite16(u16 value, void *addr);
void iowrite32(u32 value, void *addr);
pci_iomap() maps PCI resources such as IORESOURCE_MEM and IORESOURCE_IO (legacy). ioremap() makes a physical I/O address range usable through the I/O accessors. ioremap() works for any device regardless of which bus it is attached to and enumerated on.
To answer your question: ioremap_nocache() is used in only two places in the driver, and each of them is custom to some hardware. I suspect this happened due to chronological changes, since 8250 is one of the oldest drivers in the kernel.
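To make the map/access/unmap pattern concrete, here is a kernel-module fragment (not a standalone program) sketching how the accessors are meant to be used. The physical address, length, and REG_STATUS offset are invented for illustration:

```c
/* Kernel-side sketch: map a device's MMIO region uncached, read a
 * hypothetical status register through the accessors, tear down.
 * REG_STATUS is an invented offset, not from any real device. */
#include <linux/io.h>
#include <linux/errno.h>
#include <linux/types.h>

#define REG_STATUS 0x04           /* hypothetical register offset */

static void __iomem *base;

static int example_map(unsigned long phys_addr, unsigned long len)
{
    base = ioremap_nocache(phys_addr, len);  /* uncached MMIO mapping */
    if (!base)
        return -ENOMEM;

    /* never dereference 'base' directly; go through the accessors */
    u32 status = ioread32(base + REG_STATUS);
    (void)status;
    return 0;
}

static void example_unmap(void)
{
    iounmap(base);                /* release the kernel virtual mapping */
}
```

The address ioremap_nocache() returns is a cookie into kernel virtual address space; it is only meaningful to ioread*/iowrite*/iounmap, which is why the answer above warns against plain pointer dereference.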

Why is PCD bit set when I don't use ioremap_cache?

I am using Ubuntu 12.10 32-bit on an x86 system. I have physical memory (about 32MB, sometimes more) which is enumerated and reserved through the ACPI tables as a device, so that Linux/the OS cannot use it. I have a Linux driver for this memory device. The driver implements mmap() so that when a process calls mmap(), the driver can map this reserved physical memory to user space. Sometimes I also do nothing in the mmap except set up the VMAs and point vma->vm_ops to the vm_operations_struct with the open, close and fault functions implemented. When the application accesses the mmapped memory, I get a page fault and my .fault function is called. Here I use vm_insert_pfn to map the virtual address to any physical address in the 32MB that I want.
Here is the problem I have: in the driver, if I call ioremap_cache() during init, I get good cache performance from the application when I access data in this memory. However, if I don't call ioremap_cache(), any access to these physical pages results in a cache miss and gives horrible performance. I looked into the PTEs and see that the PCD bit is set for these virtual-to-physical translations, which means caching of these physical pages is disabled. We tried setting _PAGE_CACHE_WB in the vm_page_prot field and also used remap_pfn_range with the new vm_page_prot, but the PCD bit was still set in the PTEs.
Does anybody have any idea how we can ensure caching is enabled for this memory? The reason I don't want to use ioremap_cache() for the whole 32MB is that kernel virtual addresses are limited on 32-bit systems and I don't want to tie them up.
Suggestions:
Read linux/Documentation/x86/pat.txt
Boot Linux with debugpat
After trying the set_memory_wb() APIs, check /sys/kernel/debug/x86/pat_memtype_list

Does printing to the screen cause a switch to kernel mode and running OS code in Unix?

I'm studying for a test in OS (Unix is our model).
I have the following question:
Which of the following 2 does NOT cause the user's program to stop and switch to OS code?
A. the program found an error and is
printing it to the screen.
B. the program allocated memory that
will be read later on from the disk.
Well, I have answers, however, I'm not sure how good they are.
They say the answer is B.
But B is when the user uses malloc, which is a system call, no? Doesn't allocating memory go through the OS?
And why should printing to the screen need the OS?
Thanks for your help
malloc is not a system call. It's just a function.
When you call malloc it checks to see if it (internally) has enough memory to give you. If it does, it just returns the address - no need to trap into kernel mode. If it doesn't have it, it asks the operating system (indeed a system call).
Depending on the way printing is done, that too might or might not elicit a system call. For instance if you use stdio, then printing is user-buffered. What that means is that a printf means copying to some stdio buffer without any actual I/O. However, if printf decides to flush, then indeed a system call must be performed.
printf() and malloc() calls invoke the C runtime library (libc). The C runtime library is a layer on top of the kernel, and may end up calling the kernel depending on circumstances.
The kernel provides somewhat primitive memory allocation via brk() (extend/shrink the data segment), and mmap() (map pages of memory into the process virtual address space). Libc's malloc() internally manages memory it has obtained from the kernel, and tries to minimize system calls (among other things, it also tries to avoid excessive fragmentation, and tries to have good performance on multi-threaded programs, so has to make some compromises).
stdio input/ouput (via *printf()/*scanf()) is buffered, and ends up calling the kernel's write()/read() system calls. By default, stderr (the error stream) is unbuffered or line-buffered (ISO C §7.19.3 ¶7), so that errors can be seen immediately. stdin and stdout are line-buffered or unbuffered unless it can be determined they aren't attached to an interactive device, so that interactive prompts for input work correctly. stdin and stdout can be fully-buffered (block-buffered) if they refer to a disk file, or other non-interactive stream.
That means that error output is by default guaranteed to be seen as soon as you output a '\n' (newline) character (unless you use setbuf()/setvbuf()). Normal output additionally requires the stream to be connected to a terminal or other interactive device for that guarantee.
In A the user program is responsible for detecting the error and deciding how to provide that information. However in most cases actually rendering characters to a display device or terminal will involve an OS call at some point.
In B the OS is certainly responsible for memory management, and allocation may at some point request memory from the OS or the OS may have to provide disk swapping.
So the answer is probably strictly neither. But A will require a system call, whereas B may require a system call.
The answer is A. Handling an error after it is detected is done by the programming language runtime and the user-space application. On the other hand, mmap'ing a file requires entering kernel mode to allocate the necessary pages and queue up any disk IO. So B is definitely not the correct option.

Where does the shared memory get allocated?

In Linux, when we are sharing data between 2 or more processes using shared memory, where does the shared memory get allocated?
Will it become part of the process address space at run time? The process cannot access memory outside its address space.
Could someone please clarify?
When you have shared memory, then that memory gets mapped into the virtual address space of each process that shares the memory (not necessarily at the same virtual addresses in each process). The virtual memory manager ensures that the virtual addresses both map to the same physical addresses so that the sharing actually happens.
Assuming System V: one process takes memory which is allocated inside its process space and makes it available to others via IPC. The most common way to share it is to map the memory into the other processes' virtual address space, in which case they can access the memory as though it were part of their own address space.