CEG 233: Linux and Windows

Notes on Virtual Memory

Table of Contents

  1. Educational Objectives
  2. Lab Experiment
  3. Acknowledgements
  4. References

Educational Objectives

The objective of this article is to make you:

  1. Familiar with virtual memory

Virtual Memory

This article drastically simplifies the presentation of virtual memory so that it suits CEG 233.

A typical computer system might have 2 GB of RAM (physical memory) installed. Yet a modern OS gives its processes the "illusion" that they have, and can use, far more than 2 GB of memory. This is known as virtual memory. It is not only (i) a method of extending the available physical memory on a computer, but also (ii) a method of protecting a process's memory from being altered by another (possibly malicious) process.

Pages and Frames

Recall the distinction between a program (a static file of machine code and initialized data, stored in an OS-specific internal structure) and a process (the corresponding dynamic entity created and managed by the OS). In a virtual memory system, the operating system divides physical memory into units called frames. The virtual memory is similarly divided into pages.

Imagine that all the machine code of a process is spread out as a long, byte-wide tape. (To simplify, we are ignoring initialized data, stack and heap.) Each byte has an address. We cut the tape into fixed-size rectangles from the beginning (left edge, address 0) to the end (right edge, highest address of the process). These rectangles are known as pages; the fixed size is the page size. The last rectangle may or may not be a full page.

Imagine all the physical memory (so-called RAM) also as a long, byte-wide tape. We cut it into page-size rectangles from the beginning (left edge, byte address 0) to the end (right edge, max memory size - 1). These rectangles are known as frames.
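The tape-cutting picture can be sketched in a few lines of Python. This is only an illustration: the 4096-byte page size and the function names are assumptions, not something these notes prescribe.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; the notes do not fix a value


def split_address(addr, page_size=PAGE_SIZE):
    """Return (page_number, offset) for a byte address on the tape."""
    return addr // page_size, addr % page_size


def pages_needed(process_size, page_size=PAGE_SIZE):
    """Number of pages covering process_size bytes; the last may be partial."""
    return (process_size + page_size - 1) // page_size


if __name__ == "__main__":
    print(split_address(10000))  # (2, 1808): byte 1808 of page 2
    print(pages_needed(10000))   # 3: two full pages and one partial page
```

The same arithmetic applies to frames: a physical byte address splits into a frame number and an offset within that frame.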

Initially, the pages are slices of program or data files stored on the HDD. As the OS loads and unloads pages, the frames will hold pages of several different processes. It is possible that no process has all its pages in frames. Recently referenced pages are located in the frames. The OS loads pages into frames as needed, and unloads a page either because it no longer needs to be in a frame (i.e., in RAM), or because a fresh page needs to be loaded into a frame and all frames are already occupied.

Swapping

The operating system creates a swap volume or a pagefile on the hard disk. If a page is not used ("referenced") for a while, it is written to the pagefile, and the frame it occupied until then is marked as vacant. This is called "swapping" or "paging out" memory. If that piece of memory is later referenced by a program, the operating system reads the page back from the pagefile into physical memory; this is also called "swapping", or "paging in" memory.

Paging File or Volume

In Windows, there is a large file named pagefile.sys, usually located on the C: drive. In Linux, it is possible to set aside an entire partition as a swap partition, or to specify that a large file be used as a swap file. The OS uses this swap area as backing storage for virtual memory, holding the pages that do not fit in physical memory.

Page Faults

Typically the number of frames is orders of magnitude smaller than the total number of pages of all the processes. So we cannot have all the pages available in physical memory. We load the pages as needed into available frames, evicting them when necessary. As a process executes, it will ask for a machine instruction or a piece of data that belongs to one of its pages but is not in any of the frames it has. Modern computer architectures and operating systems are designed to detect this situation, known as a page fault, extremely fast. The OS services it (in a component called the "page fault handler") by locating the page (on the HDD) and loading it into an available frame.

A page fault is a trap (or an interrupt/exception) raised by the hardware when a process accesses a page that is not in any frame. It is not a "fault" in the everyday English sense of an error. The hardware that detects this situation is the memory management unit in a processor. The software that handles the page fault is part of the OS. The OS either loads the required page into a frame or, if the access is illegal (to a page that the process does not own), kills the process.
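The two outcomes of a page fault can be sketched as a toy model. The class and attribute names below are hypothetical, "loading from disk" is reduced to taking a free frame, and eviction is deliberately left out; a real handler lives inside the OS kernel and is far more involved.

```python
class IllegalAccess(Exception):
    """Raised when a process touches a page it does not own."""


class Process:
    def __init__(self, num_pages, free_frames):
        self.num_pages = num_pages            # pages 0 .. num_pages-1 belong to the process
        self.free_frames = list(free_frames)  # frame numbers still unoccupied
        self.resident = {}                    # page number -> frame number
        self.faults = 0                       # page faults serviced so far

    def access(self, page):
        """Touch a page and return the frame that holds it."""
        if not 0 <= page < self.num_pages:
            raise IllegalAccess(page)         # the OS would kill the process here
        if page not in self.resident:         # page fault: page is in no frame
            self.faults += 1
            if not self.free_frames:
                raise RuntimeError("no free frame; a replacement policy is needed")
            self.resident[page] = self.free_frames.pop(0)  # stands in for a disk read
        return self.resident[page]
```

For example, with `p = Process(num_pages=4, free_frames=[7, 3])`, the first `p.access(2)` faults and places page 2 in frame 7, a second `p.access(2)` finds it there without a fault, and `p.access(9)` raises `IllegalAccess`.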

Page Replacement Algorithms

Soon after booting, we may have free frames available (i.e., frames unoccupied by a page of any process). Thereafter, it is unlikely that we have free frames. In order to bring a new page in, an old page must be evicted.

An OS deploys carefully designed "Page Replacement Algorithms" that choose a page to be evicted.

LRU (Least Recently Used) is the name of a page replacement algorithm: when necessary, replace the least recently used page. Exact LRU is expensive to implement, so both Windows and Linux use an approximation of LRU.
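A minimal sketch of exact LRU follows, assuming a reference string given as a list of page numbers and at least one frame. The function name is hypothetical, and this is the textbook policy, not the approximation that Windows or Linux actually use.

```python
from collections import OrderedDict


def simulate_lru(references, num_frames):
    """Return (fault_count, evicted_pages) for exact LRU; num_frames >= 1."""
    resident = OrderedDict()  # pages currently in frames, least recently used first
    faults = 0
    evicted = []
    for page in references:
        if page in resident:
            resident.move_to_end(page)        # hit: page becomes most recently used
        else:
            faults += 1                       # miss: page fault
            if len(resident) >= num_frames:   # all frames occupied
                victim, _ = resident.popitem(last=False)  # least recently used page
                evicted.append(victim)
            resident[page] = True
    return faults, evicted


if __name__ == "__main__":
    print(simulate_lru([1, 2, 3, 1, 4, 2], num_frames=3))  # (5, [2, 3])
```

The cost hinted at above is visible here: every single reference, hit or miss, updates the ordering, which is impractical to do in hardware on each memory access.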

Page Tables

There is a Page Table per process. In the context of OS, the word table almost always means an array. The page table is an array, let us call it P, of numbers. The indices are the page numbers; the elements in the array are frame numbers. So, the i-th page is located at frame number P[i]. Note that frame 0 is a legitimate frame number, so it cannot double as a marker for "page not in memory". Instead, each entry of the table carries an extra one-bit column that indicates whether the page is present in a frame.
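The lookup through P can be sketched as follows. The entry layout (a pair of present bit and frame number), the 4096-byte page size, and the names are assumptions made for illustration; real page table entries are packed bit fields interpreted by the hardware.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when the referenced page is not present in any frame."""


def translate(page_table, vaddr, page_size=PAGE_SIZE):
    """Map a virtual byte address to a physical byte address.

    page_table[i] is a pair (present, frame) describing page i.
    """
    page, offset = divmod(vaddr, page_size)
    if not 0 <= page < len(page_table):
        raise IndexError("address outside the process's pages")
    present, frame = page_table[page]
    if not present:
        raise PageFault(page)                 # the OS would now load the page
    return frame * page_size + offset


if __name__ == "__main__":
    table = [(True, 5), (False, 0), (True, 0)]   # page 2 really is in frame 0
    print(translate(table, 100))                 # 20580 = 5 * 4096 + 100
    print(translate(table, 2 * PAGE_SIZE + 8))   # 8: frame 0, offset 8
```

Pages 1 and 2 both store frame number 0 in this table; only the present bit tells them apart.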

Thrashing

When a page is needed but is not present in physical memory, a page fault occurs. The OS, with the help of the hardware, removes an existing page from a frame and brings the just-demanded page into the newly vacant frame. It may so happen that the just-evicted page is itself demanded soon afterwards. When this vicious cycle repeats continually, it is known as "thrashing." When this happens, system performance deteriorates drastically. What would have taken a second to compute may take hours or days.
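The vicious cycle is easy to reproduce in a small simulation. The sketch below assumes exact LRU replacement and a process that loops over four pages; the function name and the numbers are chosen only for illustration.

```python
from collections import OrderedDict


def lru_faults(references, num_frames):
    """Count page faults under exact LRU with num_frames frames (num_frames >= 1)."""
    resident = OrderedDict()  # pages in frames, least recently used first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)
        else:
            faults += 1
            if len(resident) >= num_frames:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults


loop = [0, 1, 2, 3] * 25      # 100 references cycling over 4 pages
print(lru_faults(loop, 3))    # 100: every reference faults
print(lru_faults(loop, 4))    # 4: only the first touch of each page faults
```

With one frame too few, the page about to be needed is always the one that was just evicted, so every reference pays for a disk access; one more frame removes nearly all the faults.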

64-bit vs. 32-bit OS

A common misconception is that 64-bit architectures are no better than 32-bit architectures unless the computer has more than 4 GB of physical memory. This is not entirely true:

The main disadvantage of 64-bit architectures is that relative to 32-bit architectures the same data occupies more space in memory (due to swollen pointers and possibly other types and alignment padding). This increases the memory requirements of a given process and can have implications for efficient processor cache utilization. Maintaining a partial 32-bit model is one way to handle this and is in general reasonably effective.

Currently (2011), most commercial x86 software is still 32-bit, not 64-bit code, so it does not take advantage of the larger 64-bit address space or wider 64-bit registers and data paths on x86 processors, or the additional registers in 64-bit mode.   [From  http://en.wikipedia.org/wiki/64-bit]

In a 32-bit OS (including Linux 32-bit versions, XP 32-bit, Vista 32-bit), a process has at most 4 GB of address space. Part of this space is reserved for OS kernel usage: 2 GB by default on 32-bit Windows, and typically 1 GB on 32-bit Linux.

Windows 32-bit or 64-bit

In the following table, the increased maximum resources of computers that are based on 64-bit versions of Windows and the 64-bit Intel processor are compared with existing 32-bit resource maximums. (1 TB = 1024 GB)

Architectural component   64-bit Windows   32-bit Windows
Virtual memory            16 TB            4 GB
Paging file size          256 TB           16 TB
Hyperspace                8 GB             4 MB
Paged pool                128 GB           470 MB
Non-paged pool            128 GB           256 MB
System cache              1 TB             1 GB
System PTEs               128 GB           660 MB

Hyperspace

This is a special region that is used to map the process working set list and to temporarily map other physical pages for such operations as zeroing a page on the free list (when the zero list is empty and the zero page is needed), invalidating page table entries in other page tables (such as when a page is removed from the standby list), and in regards to process creation, setting up the address space of a new process.

Paged pool

This is a region of virtual memory in system space that can be paged in and out of the working set of the system process. Paged pool is created during system initialization and is used by kernel-mode components to allocate system memory. Uniprocessor systems have two paged pools, and multiprocessor systems have four. Having more than one paged pool reduces the frequency of system code blocking on simultaneous calls to pool routines.

Non-paged pool

This is a memory pool that consists of ranges of system virtual addresses that are guaranteed to be resident in physical memory at all times and thus can be accessed from any address space without incurring paging input/output (I/O). Non-paged pool is created during system initialization and is used by kernel-mode components to allocate system memory.

System cache

These are pages that are used to map open files in the system cache.

System PTEs

This is a pool of system Page Table Entries (PTEs) that is used to map system pages such as I/O space, kernel stacks, and memory descriptor lists.

64-bit programs use a 16-terabyte tuning model (8 terabytes User and 8 terabytes Kernel). 32-bit programs still use the 4-GB tuning model (2 GB User and 2 GB Kernel). This means that 32-bit processes that run on 64-bit versions of Windows run in a 4-GB tuning model (2 GB User and 2 GB Kernel). 64-bit versions of Windows do not support the use of the /3GB switch in the boot options. Theoretically, a 64-bit pointer could address up to 16 exabytes. 64-bit versions of Windows have currently implemented up to 16 terabytes of address space.
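The address-space sizes quoted above follow from the pointer width. A short check, assuming binary units as in the table (1 TB = 1024 GB) and a hypothetical helper name:

```python
# Binary units, matching the convention 1 TB = 1024 GB used in these notes.
GB = 2**30
TB = 2**40
EB = 2**60


def address_space_bytes(address_bits):
    """Number of bytes addressable with addresses of the given width in bits."""
    return 2**address_bits


print(address_space_bytes(32) // GB)  # 4:  a 32-bit pointer reaches 4 GB
print(address_space_bytes(64) // EB)  # 16: a 64-bit pointer could reach 16 exabytes
print(address_space_bytes(44) // TB)  # 16: 16 TB corresponds to 44 address bits
```

In other words, a 16 TB address space uses only 44 of the 64 bits that a pointer could in principle supply.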

Sharing of Pages and Segments

Imagine that you invoked a program P and another user on the same system also invoked the same program P. The OS will have two separate processes, but the two will share the code segments/pages. The data manipulated by your process will be disjoint from the data manipulated by the other process. The sharing of code also happens with standard libraries.
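The idea can be sketched with a toy loader that hands out frame numbers. Everything here (the class, its methods, the frame numbering) is hypothetical and only illustrates that code pages map to the same frames while data pages do not.

```python
class Loader:
    """Toy loader: shares code frames per program, gives each process its own data frames."""

    def __init__(self):
        self.next_frame = 0
        self.code_frames = {}  # program name -> frames holding its code pages

    def _allocate(self, count):
        """Hand out the next `count` unused frame numbers."""
        frames = list(range(self.next_frame, self.next_frame + count))
        self.next_frame += count
        return frames

    def start(self, program, code_pages, data_pages):
        """Create a process; return the frames backing its code and data pages."""
        if program not in self.code_frames:   # first invocation loads the code
            self.code_frames[program] = self._allocate(code_pages)
        return {
            "code": list(self.code_frames[program]),  # shared with other processes
            "data": self._allocate(data_pages),       # private to this process
        }


loader = Loader()
yours = loader.start("P", code_pages=2, data_pages=1)
theirs = loader.start("P", code_pages=2, data_pages=1)
print(yours)   # {'code': [0, 1], 'data': [2]}
print(theirs)  # {'code': [0, 1], 'data': [3]}
```

Both processes see the code in frames 0 and 1, so it occupies physical memory only once, while each has its own data frame.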

Lab Experiment

Acknowledgements

References

  1. Microsoft, Comparison of 32-bit and 64-bit memory architecture for 64-bit editions of Windows XP and Windows Server 2003, http://support.microsoft.com/kb/294418

  2. Wikipedia, Virtual memory, http://en.wikipedia.org/wiki/Virtual_memory (for further info and pointers)

Copyright © 2012 pmateti@wright.edu