The objectives of this article are to make you :
This article drastically simplifies the presentation of virtual memory so that it suits CEG 233.
A typical computer system might have 2 GB of RAM (physical memory) installed in it. Yet a modern OS gives processes the "illusion" that they have, and can use, far more than 2 GB of memory. This is known as virtual memory. It is not only (i) a method of extending the available physical memory on a computer, but also (ii) a method of protecting a process's memory from being altered by another (possibly malicious) process.
Recall the distinction between a program (a static file of machine code and initialized data, stored in an OS-specific internal structure) and a process (the corresponding dynamic entity created and managed by the OS). In a virtual memory system, the operating system divides physical memory into units called frames. The virtual memory is similarly divided into pages.
Imagine that all the machine code of a process is spread out as a long tape, one byte wide. (To simplify, we are ignoring initialized data, stack and heap.) Each byte has an address. We cut the tape up into fixed-size rectangles from the beginning (left edge, address 0) to the end (right edge, highest address of the process). These cut-up rectangles are known as pages; the fixed size is the page size. The last rectangle may or may not be a full page.
Imagine all the physical memory (so-called RAM) also as a long byte-wide tape. We cut it up into page-size rectangles from the beginning (left edge, byte address 0) to the end (right edge, max memory size - 1). These cut-up rectangles are known as frames.
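The tape-cutting picture above amounts to integer division: the quotient of an address by the page size is the page number, and the remainder is the offset within that page. The following is a minimal sketch; the page size of 4096 bytes and the function names are assumptions made for illustration, not values taken from this article.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (illustrative, not from the article)

def split_address(addr, page_size=PAGE_SIZE):
    """Return (page_number, offset) for a byte address on the tape."""
    return divmod(addr, page_size)

def page_count(process_size, page_size=PAGE_SIZE):
    """Number of pages needed; the last page may be only partly filled."""
    return -(-process_size // page_size)  # ceiling division

print(split_address(10000))  # (2, 1808): byte 10000 lies in page 2, at offset 1808
print(page_count(10000))     # 3: two full pages plus one partial page
```

The same arithmetic applies to frames, since a frame is simply a page-sized slice of physical memory.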
Initially, the pages are slices of program or data files stored on the HDD. As the OS loads and unloads pages, the frames come to hold pages of several different processes. It is possible that no process has all its pages in frames. Recently referenced pages are located in the frames. The OS loads pages into frames as needed, and unloads a page either because it no longer needs to be in a frame (i.e., in RAM), or because a fresh page needs to be loaded into a frame and all frames are already occupied.
The operating system creates a swap volume, or a pagefile, on the hard disk. If a page is not used ("referenced") for a while, it is written to the pagefile, and the frame it occupied until then is marked as vacant. This is called "swapping out" or "paging out" memory. If that piece of memory is later referenced by a program, the operating system reads the page back from the pagefile into physical memory, which is called "swapping in" or "paging in" memory.
In Windows, there is a large file named pagefile.sys usually located on the C: drive. In Linux, it is possible to set aside an entire partition as a swap partition, or specify that a large file be used as a swap file. This swap area is used by the OS to increase the amount of physical storage for virtual memory.
A page fault is a trap (or an interrupt/exception) raised by the hardware when a process accesses a page that is not in any frame. It is not a "fault" in the everyday English sense. The hardware that detects this situation is the memory management unit in a processor. The software that handles the page fault is part of the OS. The OS either loads the required page into a frame, or kills the process if the access is illegal, i.e., to a page that the process does not have.
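The division of labour described above (hardware detects, OS handles) can be imitated in a few lines. This is only a toy sketch: the class and method names (`TinyMemory`, `mmu_lookup`, `os_handle_fault`, `ProcessKilled`) are hypothetical, frames are handed out in order, and eviction is ignored entirely.

```python
class PageFault(Exception):
    """Raised by the simulated MMU when a page is not in any frame."""

class ProcessKilled(Exception):
    """Raised by the simulated OS on an illegal page access."""

class TinyMemory:
    def __init__(self, valid_pages):
        self.valid_pages = set(valid_pages)  # pages the process actually owns
        self.page_table = {}                 # page number -> frame number
        self.faults = 0

    def mmu_lookup(self, page):
        # "Hardware" side: translate the page, or trap if it is not resident.
        if page not in self.page_table:
            raise PageFault(page)
        return self.page_table[page]

    def os_handle_fault(self, page):
        # "OS" side: load the page, or kill the process on an illegal access.
        if page not in self.valid_pages:
            raise ProcessKilled(f"illegal access to page {page}")
        self.faults += 1
        self.page_table[page] = len(self.page_table)  # next unused frame

    def access(self, page):
        try:
            return self.mmu_lookup(page)
        except PageFault:
            self.os_handle_fault(page)
            return self.mmu_lookup(page)  # retry the access after loading

mem = TinyMemory(valid_pages={0, 1, 2})
print(mem.access(1), mem.faults)  # 0 1  (first touch: fault, loaded into frame 0)
print(mem.access(1), mem.faults)  # 0 1  (second touch: no fault)
```

Calling `mem.access(7)` in this sketch raises `ProcessKilled`, mirroring the case where the OS terminates a process for touching a page it does not own.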
Soon after booting we may have free frames available (i.e., frames unoccupied by a page of any process). Thereafter, it is unlikely that we have free frames. In order to bring a new page in, an old page must be evicted.
An OS deploys carefully designed "Page Replacement Algorithms" that choose a page to be evicted.
LRU (Least Recently Used) is the name of a page replacement algorithm: when necessary, replace the least recently used page. Exact LRU is expensive to implement, so both Windows and Linux use an approximation of it.
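To make the LRU rule concrete, here is a small simulation that counts page faults for a sequence of page references. It is a sketch of exact LRU, not of the approximations that real operating systems use; the function name `lru_faults` and the reference string are chosen for illustration.

```python
from collections import OrderedDict

def lru_faults(references, num_frames):
    """Count page faults under exact LRU with num_frames frames."""
    frames = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1                     # miss: page fault
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = True
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_faults(refs, 3))  # 12
```

The expense mentioned above shows up in `move_to_end`: exact LRU must update its bookkeeping on every single memory reference, which is cheap in a simulation but costly if done for every access on real hardware.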
When a page is needed but is not present in physical memory, a page fault occurs. The OS, with the help of the hardware, removes an existing page from a frame and brings the just-demanded page into the newly vacant frame. It may so happen that the just-evicted page is demanded again soon. When this vicious cycle keeps repeating, it is known as "thrashing," and system performance deteriorates drastically. What would have taken a second to compute may take hours or days.
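A small experiment shows how abruptly thrashing sets in. The sketch below cycles over four pages, first with four frames and then with three, using an exact LRU policy. The timing constants are assumptions picked only to give a sense of scale (roughly 100 ns for a RAM access and 10 ms for a disk-backed page fault); real figures vary widely by hardware.

```python
from collections import OrderedDict

def count_faults(references, num_frames):
    """Count page faults under exact LRU with num_frames frames."""
    frames = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)
        else:
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict least recently used
            frames[page] = True
    return faults

loop = [0, 1, 2, 3] * 1000    # 4000 references cycling over 4 pages

print(count_faults(loop, 4))  # 4: every page is loaded once, then always hits
print(count_faults(loop, 3))  # 4000: one frame short, and every access faults

ACCESS_NS = 100               # assumed cost of a RAM access (illustrative)
FAULT_NS = 10_000_000         # assumed cost of servicing a page fault (illustrative)
slowdown = FAULT_NS // ACCESS_NS
print(slowdown)               # 100000
```

Under these assumed costs, a workload that faults on every access runs about 100,000 times slower, so one second of work stretches to more than a day.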
A common misconception is that 64-bit architectures are no better than 32-bit architectures unless the computer has more than 4 GB of physical memory. This is not entirely true:
The main disadvantage of 64-bit architectures is that relative to 32-bit architectures the same data occupies more space in memory (due to swollen pointers and possibly other types and alignment padding). This increases the memory requirements of a given process and can have implications for efficient processor cache utilization. Maintaining a partial 32-bit model is one way to handle this and is in general reasonably effective.
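The effect of "swollen pointers" and alignment padding can be worked out by hand for a simple linked-list node holding a 32-bit integer and one pointer. The sketch below assumes natural alignment (a pointer is aligned to its own size), which is typical of common ABIs but is an assumption here; `align_up` and `node_size` are illustrative helper names.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def node_size(pointer_size):
    """Size in bytes of a node {int32 value; pointer next;} under natural alignment."""
    offset = 4                               # the int32 value occupies bytes 0..3
    offset = align_up(offset, pointer_size)  # padding so the pointer is aligned
    offset += pointer_size                   # the pointer itself
    return align_up(offset, pointer_size)    # round up to the struct's alignment

print(node_size(4))  # 8  bytes with 32-bit pointers
print(node_size(8))  # 16 bytes with 64-bit pointers (4 of them padding)
```

In this example the node doubles in size even though it stores the same data, so a list of such nodes fills twice as many cache lines.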
Currently (2011), most commercial x86 software is still 32-bit, not 64-bit code, so it does not take advantage of the larger 64-bit address space or wider 64-bit registers and data paths on x86 processors, or the additional registers in 64-bit mode. [From http://en.wikipedia.org/wiki/64-bit]
In a 32-bit OS (including Linux 32-bit versions, XP 32-bit, Vista 32-bit), a process has at most 4 GB of address space, and part of this space is reserved for OS kernel usage: 2 GB by default on 32-bit Windows, and typically 1 GB on 32-bit Linux.
In the following table, the increased maximum resources of computers that are based on 64-bit versions of Windows and the 64-bit Intel processor are compared with existing 32-bit resource maximums. (1 TB = 1024 GB)
| Architectural component | 64-bit Windows | 32-bit Windows |
|---|---|---|
| Virtual memory | 16 TB | 4 GB |
| Paging file size | 256 TB | 16 TB |
| Hyperspace | 8 GB | 4 MB |
| Paged pool | 128 GB | 470 MB |
| Non-paged pool | 128 GB | 256 MB |
| System cache | 1 TB | 1 GB |
| System PTEs | 128 GB | 660 MB |
Hyperspace: This is a special region that is used to map the process working set list and to temporarily map other physical pages for such operations as zeroing a page on the free list (when the zero list is empty and a zero page is needed), invalidating page table entries in other page tables (such as when a page is removed from the standby list), and, during process creation, setting up the address space of a new process.
Paged pool: This is a region of virtual memory in system space that can be paged in and out of the working set of the system process. Paged pool is created during system initialization and is used by Kernel-mode components to allocate system memory. Uniprocessor systems have two paged pools, and multiprocessor systems have four. Having more than one paged pool reduces the frequency of system code blocking on simultaneous calls to pool routines.
Non-paged pool: This is a memory pool that consists of ranges of system virtual addresses that are guaranteed to be resident in physical memory at all times and thus can be accessed from any address space without incurring paging input/output (I/O). Non-paged pool is created during system initialization and is used by Kernel-mode components to allocate system memory.
System PTEs: A pool of system Page Table Entries (PTEs) that is used to map system pages such as I/O space, Kernel stacks, and memory descriptor lists.
64-bit programs use a 16-terabyte tuning model (8 terabytes User and 8 terabytes Kernel). 32-bit programs still use the 4-GB tuning model (2 GB User and 2 GB Kernel). This means that 32-bit processes that run on 64-bit versions of Windows run in a 4-GB tuning model (2 GB User and 2 GB Kernel). 64-bit versions of Windows do not support the use of the /3GB switch in the boot options. Theoretically, a 64-bit pointer could address up to 16 exabytes. 64-bit versions of Windows have currently implemented up to 16 terabytes of address space.
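The address-space sizes quoted above follow directly from powers of two. The short check below uses binary units (1 TB = 1024 GB, as in the table); the observation that 16 TB corresponds to 44 address bits is plain arithmetic, not a statement about how Windows implements its address space.

```python
GB = 2**30
TB = 2**40
EB = 2**60

print(2**32 // GB)               # 4:  a 32-bit pointer spans 4 GB
print(2**64 // EB)               # 16: a 64-bit pointer could span 16 exabytes
print((16 * TB).bit_length() - 1)  # 44: 16 TB is 2**44 bytes
print(2**64 // (16 * TB))        # 1048576: 16 TB is a tiny slice of the 64-bit space
```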
Imagine that you invoked a program P and another user on the same system also invoked the same program P. The OS will have two separate processes, but the two will share the code segments/pages. The data manipulated by your process will be disjoint from the data manipulated by the other process. The same sharing of code also happens with standard libraries.