Operating System Fundamentals

Authors:
  • University of Doha for Science and Technology

Abstract and Figures

This textbook is designed to give you an overview of what an operating system is, and how a modern operating system works. There are many different examples of operating systems available for computers. The most popular operating systems belong to the Microsoft Windows family (such as Windows 98, XP and Vista). Other examples are Unix and Linux, Mac OS X, and specialized operating systems for handheld devices like mobile phones. An operating system is the software that controls (or operates) all of the parts of your computer. It manages all of your resources, and lets you interface with the computer. This textbook takes a look at the main problems that an operating system must be able to overcome, and the main functions that it must be able to perform. We begin by reviewing the architecture (or structure) of the physical parts of a computer, and how they communicate with each other. Then we take a look at fundamental operating system concepts, processes and process management, memory management, controlling input and output devices, and file system management.
Operating System Fundamentals
Robert Power & Robert Ford
School of Information Technology
College of the North Atlantic-Qatar
© 2009
Table of Contents
Introduction

Unit 1: Computer Architecture Review
Why Review Computer Architecture?
A Map of Your Computer’s Architecture
The CPU
Memory
Talking to Devices
Unit Summary
Key Terms
Review Questions

Unit 2: Operating System Principles
What is an Operating System?
Examples of Operating Systems
The Structure of Operating Systems
Interfacing with an Operating System
Managing System Resources
Unit Summary
Key Terms
Review Questions

Unit 3: Processes
Processes and Multitasking
Process
State Changing
Process Creation
Stopping a Process
Threads
Inter Process Communication
Process Scheduling Algorithms
Unit Summary
Key Terms
Review Questions

Unit 4: Memory Management
What is Memory Management?
The Memory Manager
Virtual Memory
Unit Summary
Key Terms
Review Questions

Unit 5: Input/Output
What are Input/Output Resources?
I/O Resources
I/O Management Software
Managing Magnetic Disks
System Clocks
Unit Summary
Key Terms
Review Questions

Unit 6: File Systems
Managing Files
Long Term Storage of Information
The File Manager
Data Storage Strategies
File Systems
The Master Boot Record and Partition Table
Interacting with File Systems
Unit Summary
Key Terms
Review Questions

Glossary
References
Introduction
This textbook is designed to give you an overview of what an operating system is, and how a
modern operating system works. There are many different examples of operating systems
available for computers. The most popular operating systems belong to the Microsoft Windows
family (such as Windows 98, XP and Vista). Other examples are Unix and Linux, Mac OS X, and
specialized operating systems for handheld devices like mobile phones.
An operating system is the software that controls (or
operates) all of the parts of your computer. It manages all
of your resources, and lets you interface with the
computer. This textbook takes a look at the main problems
that an operating system must be able to overcome, and
the main functions that it must be able to perform. We
begin by reviewing the architecture (or structure) of the
physical parts of a computer, and how they communicate
with each other. Then we take a look at fundamental
operating system concepts, processes and process
management, memory management, controlling input and
output devices, and file system management.
Unit 1: Computer Architecture Review
Why Review Computer Architecture?
This textbook focuses on how operating systems work. Operating systems (or system software)
are one of the two main types of software (the other is application software). However, we need
to know some important things about the hardware inside of a computer in order to understand
some of the critical functions of any operating system.
Computer architecture refers to the overall design of the
physical parts of the computer. That is, it refers to:
what the main parts are;
how they are physically connected to each other; and
how they work together.
Although all of the parts of a computer are connected to
each other by the motherboard, the operating system is
essential in order to control how those parts talk to each
other. Without the operating system, the parts of the
computer would not be able to do anything that the user
needs them to do! We need to know some basics about how
the main parts of a computer are physically connected to
each other before we can truly understand what an
operating system does.
In this chapter we will look at a general map of a computer’s architecture, showing the physical
structure that allows all of the parts to exchange information and instructions. We will also take
a brief look at some of the major components of a computer that an operating system must be
concerned with, including the CPU, memory and how devices actually talk to each other.
A motherboard
A Map of Your Computer’s Architecture
Lines (traces) on a motherboard
are like roads in a city
CPU: Central Processing Unit
This is the brain of your computer. It performs all of the calculations.
RAM: Random Access Memory
This is your system memory.
This is like a desk, or a workspace, where your computer temporarily stores all of the
information (data) and instructions (software or program code) that it is currently using.
Most computers today have between 1 and 4 Gigabytes (GB) of RAM.
Graphics
Many computers have a dedicated system bus and expansion card slot just for a video card.
Many video cards include their own memory so that you do not need to use up all of the
RAM to run your monitor.
I/O Busses
Special busses (roads) connecting all of your input/output devices to your motherboard.
The three main types of I/O busses are ISA, PCI and USB.
ISA: Industry Standard Architecture
o This was the industry standard in the 1980s and early 1990s.
o It is now used to provide support for older and slower devices.
o Common devices connected to the ISA bus might include an older modem, a
joystick, a mouse, or a printer (using the older, wide-style printer port).
PCI: Peripheral Component Interconnect
o This is for newer and faster devices than ISA.
o You can think of this like a wider road, with a faster speed limit!
o Some common devices connected to the PCI bus include your network card and EIDE
devices (hard disk, CD/DVD drive, etc.).
USB: Universal Serial Bus
o Many new devices can connect to your computer using a USB port.
o Examples include webcams, MP3 players, printers, PDAs, etc.
Figure 1.1 (below) is a diagram of the architecture of these main components (how they are all
connected).
Figure 1.1
Diagram of system bus architecture
Busses and Bridges
In Figure 1.1 we see that the major components are all connected by different busses. The
Front Side Bus provides the main connection between the CPU, RAM, the graphics card (AGP, or
Accelerated Graphics Port), and the Northern and Southern Bridges. The Front Side Bus looks
wider than the I/O busses because it is wider and faster. It contains more wires (traces) for the
transmission of data between the devices. In the comparison to a city, the Front Side Bus is like
a major freeway with a fast speed limit. The smaller I/O busses are like smaller side streets.
Some of the I/O busses are narrower and slower than others.
The two bridges in Figure 1.1 perform the same function inside your computer that would be
performed by bridges or roundabouts in a city. They are major intersections where data from
different devices cross paths. Of course, like any bridge or roundabout, there need to be traffic
laws to govern who goes first. If there were no rules (and no police to enforce the rules) then
everyone would crash together. In computer terms, your data would become corrupted, and no
information would ever reach its destination.
Figure 1.2 (below) compares the system bus architecture to a series of city streets with
roundabouts:
Figure 1.2
A different view of system bus architecture
(Figure labels: Front Side Bus, Northern Bridge, Southern Bridge, ISA Bus, PS/2 Mouse and Keyboard, PCI Bus, USB)
The Central Processing Unit (CPU)
The CPU is the brain of your computer. It performs all of the calculations.
Of course, in order to do its job, the CPU needs commands to perform, and
data to work with. These instructions and data travel to and from the CPU
on the system bus. The operating system provides rules for how that
information gets back and forth, and how it will be used by the CPU.
Inside the CPU
Inside the CPU there are many important parts:
Figure 1.3
The parts inside a typical CPU
The Arithmetic Logic Unit (ALU), which
performs the calculations
The Control Unit, which controls the flow of
data inside the CPU
The Interface Unit, or the I/O Unit, which
acts like a gate for information entering and
leaving the CPU
Registers, which temporarily hold data and
instructions waiting to be used
The Program Counter (PC Register), which is
a special register holding the address of the
next instruction the CPU needs from the RAM
The Fetch-Decode-Execute Cycle
The CPU finds, interprets, and executes program code using a
specific cycle, as follows:
1. The CPU looks in the PC register for the location of the next
program instruction.
2. The CPU retrieves (fetches) that instruction from RAM, and places
it in a register.
3. The CPU updates the PC register with the address in RAM of
the instruction that follows.
4. The CPU decodes and executes the instruction it fetched, and repeats
the cycle until the power is lost.
Figure 1.4
The Fetch-Decode-Execute Cycle
(Figure labels: check the PC register for the address of the next instruction; retrieve the instruction from RAM and place it in a register; place the address of the next instruction in the PC register; execute the instruction; repeat the cycle)
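The cycle can be sketched in a few lines of Python. This is only a toy model, not how a real CPU is built: the instruction names (LOAD, ADD, JUMP, HALT), the single working register A, and the HALT instruction that ends the loop are all assumptions made for illustration.

```python
# Toy CPU: instruction names and registers are invented for illustration.
def run(program, max_steps=100):
    """Run a tiny program using a fetch-decode-execute loop."""
    registers = {"A": 0, "PC": 0}      # PC holds the address of the next instruction
    ram = list(program)                # the instructions live in "RAM"
    for _ in range(max_steps):
        address = registers["PC"]      # 1. look in the PC register
        instruction = ram[address]     # 2. fetch the instruction from RAM
        registers["PC"] = address + 1  # 3. point the PC at the following instruction
        opcode, operand = instruction  # decode the instruction
        if opcode == "LOAD":           # 4. execute it
            registers["A"] = operand
        elif opcode == "ADD":
            registers["A"] += operand
        elif opcode == "JUMP":
            registers["PC"] = operand
        elif opcode == "HALT":         # assumption: a real CPU runs until power is lost
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers


program = [("LOAD", 2), ("ADD", 3), ("ADD", 10), ("HALT", 0)]
result = run(program)
print(result["A"])  # 15
```

Notice that the PC register is updated before the instruction is executed. That is what lets a JUMP instruction simply overwrite the PC to change which instruction comes next.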
Memory
Memory is stored on Random Access Memory (RAM)
chips. A typical computer today has between 1
Gigabyte (GB) and 4 GB of RAM.
Memory is used to:
Store data
Store commands (instructions)
Store system settings
Figure 1.5
Address structure of RAM
Each RAM chip contains millions of address
spaces. Each address space is the same size,
and has its own unique identifying number
(address). The operating system provides the
rules for using these memory spaces, and
controls storage and retrieval of information
from RAM.
(Figure 1.5 shows RAM as a column of 32-bit values, each labelled with its address: 0, 4, 8, 12, 16, 20, 24, and so on.)
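The addresses in Figure 1.5 go up in steps of 4 because each 32-bit value takes up 4 bytes. The following sketch shows that arithmetic, and stores a value at one address in a tiny pretend RAM. The 64-byte size, the function names and the little-endian byte order are assumptions chosen for illustration.

```python
WORD_SIZE_BYTES = 4  # a 32-bit value occupies 4 bytes


def word_address(index):
    """Return the byte address of the index-th 32-bit value."""
    return index * WORD_SIZE_BYTES


ram = bytearray(64)  # a tiny 64-byte "RAM", all zeros to begin with


def store(address, value):
    """Write a 32-bit value into RAM starting at the given address."""
    ram[address:address + WORD_SIZE_BYTES] = value.to_bytes(WORD_SIZE_BYTES, "little")


def load(address):
    """Read the 32-bit value that starts at the given address."""
    return int.from_bytes(ram[address:address + WORD_SIZE_BYTES], "little")


print([word_address(i) for i in range(6)])  # [0, 4, 8, 12, 16, 20]
store(8, 123456)
print(load(8))  # 123456
```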
Fast fact: RAM is a device!
Without an operating system, a computer would
not be able to use RAM chips. This is because
your computer treats the RAM chips like a
device that has been installed (just like a
webcam, or a printer). When your computer first
starts up, it can only use a small amount of RAM
memory (1 Megabyte (MB)) that is built into the
motherboard. Device drivers for RAM chips are
included with the operating system, and must be
loaded as part of the boot process in order for
the RAM to work!
Problem: If RAM needs an operating system to
work, and an operating system needs RAM in
order to work, how does your computer activate
its RAM to load the operating system?
Talking to Devices
Devices talk to each other and to the CPU. They need to communicate in order to share
information, and in order to be told what to do! There are two types of devices that are
controlled by information from the CPU:
Programmed devices, and
Interrupt-driven devices
Programmed Input/Output Devices
Programmed I/O devices need to be completely controlled by the CPU. That means the CPU
must stop whatever task it is doing, and focus on the device until it has finished whatever it has
been told to do. This wastes a lot of processing time!
Interrupt-Driven Devices
A more efficient way to control devices is by using an interrupt controller. The
interrupt controller keeps track of whichever devices need to talk to the CPU,
and gives different priority to different devices. For example, the keyboard
gets higher priority than a modem. When a device needs new instructions, or
when it has finished a task, the interrupt controller issues an interrupt to the
CPU (like raising your hand in class). The CPU stops whatever it is doing long
enough to talk to the device. Although this is more difficult to program, it
results in better computer performance.
Of course, the operating system provides all of the rules for communicating with both
programmed and interrupt-driven devices.
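The idea of an interrupt controller that gives different priorities to different devices can be sketched with a priority queue. The class and method names below are hypothetical, and the rule that a lower number means a higher priority is an assumption made for this example.

```python
import heapq


class InterruptController:
    """Keeps track of which devices are waiting to talk to the CPU."""

    def __init__(self):
        self._pending = []  # min-heap of (priority, arrival order, device)
        self._order = 0

    def raise_interrupt(self, device, priority):
        """A device signals that it needs attention (lower number = more urgent)."""
        heapq.heappush(self._pending, (priority, self._order, device))
        self._order += 1

    def next_interrupt(self):
        """Return the most urgent waiting device, or None if nothing is waiting."""
        if not self._pending:
            return None
        _, _, device = heapq.heappop(self._pending)
        return device


def service_all(controller):
    """The CPU handles every pending interrupt, most urgent first."""
    handled = []
    while True:
        device = controller.next_interrupt()
        if device is None:
            break
        handled.append(device)
    return handled


controller = InterruptController()
controller.raise_interrupt("modem", priority=5)
controller.raise_interrupt("keyboard", priority=1)
controller.raise_interrupt("printer", priority=7)
print(service_all(controller))  # ['keyboard', 'modem', 'printer']
```

The keyboard is served first even though the modem raised its interrupt earlier, which matches the example in the text.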
Direct Memory Access
Sometimes devices may want to talk to
each other without ‘going through’ the
CPU. The DMA Controller controls access
to the system bus, and RAM, and
bypasses the CPU. The CPU does not
need to get involved in the process,
other than to set up the transfer. The
CPU will get an interrupt when the
transfer is complete.
Direct Memory Access is like adding
police officers to a roundabout who will
let traffic go through to other streets
when the road is clear.
Figure 1.6
DMA is like an extra police officer who guides cars
through a busy intersection without bothering
anyone back at the police station first
(Figure labels: the CPU, whose attention not everyone needs; the Interrupt Controller; the DMA Controller, which is like a second traffic officer who handles traffic not going to the CPU, because some devices don’t need to talk to the CPU)
Unit Summary
A computer is a system of devices that are all connected together, just like buildings throughout
a city. All of these devices are connected through the motherboard. A system of wires, called
traces, provides a means for information to be exchanged between all of the devices. These
wires are called busses, and they are like the roads throughout a city. Just like in a city, there
must be traffic laws and police officers to enforce the flow of traffic, or it will all crash together
and become useless. The rules for data transfer, and the control of devices installed in a
computer, are provided and enforced by the operating system.
Key Terms
Address
AGP
ALU
Application software
Busses
Computer architecture
Control Unit (CU)
CPU
DMA
DMA Controller
Fetch-Decode-Execute cycle
Front Side Bus
Graphics
I/O Busses
Instructions
Interface (I/O) Unit
Interrupt
Interrupt Controller
Interrupt-driven I/O devices
ISA
Memory
Motherboard
Northern Bridge
Operating System
PCI
Program Counter
Programmed I/O devices
RAM
Registers
Southern Bridge
System bus
Traces
USB
Review Questions
1. Describe the difference between pre-emptive multitasking and cooperative multitasking.
2. What are registers?
3. List and briefly describe any three busses that are used to transport data.
4. Briefly describe the Fetch-Decode-Execute cycle.
5. Describe the difference between programmed I/O devices and interrupt-driven I/O
devices.
6. What is the benefit of DMA?
Unit 2: Operating System Fundamentals
What is an Operating System?
You need two types of software in order to use your computer (or any other computerized
device). These are applications and system software. Applications are the programs you use to
do tasks, such as write a document, surf the web, or play games. System software runs the
computer system for you. Another name for system software is an operating system. There are
many different operating systems, but they all have a similar architecture (or structure). That is
because they must all overcome the same problems and perform the same basic functions. An
operating system must be able to:
Manage system resources
o CPU scheduling
o Process management
o Memory management
o Input/Output device management
o Storage device management (hard disks, CD/DVD drives, etc)
o File System Management
Simplify the development and use of applications
Examples of Operating Systems
A number of operating systems are available for personal computers. The most popular is
Microsoft Windows, which is the operating system used on over ninety percent of the world’s
personal computer systems. Another popular operating system is Mac OS X, which is the
operating system used for Apple Macintosh computers (like the MacBook Pro laptop series).
While IBM compatible PCs (mostly Windows) and Mac computers are not directly compatible, it is possible to
use virtualization to run one operating system on an incompatible computer.
Another group of widely used operating systems is based on UNIX. UNIX is a command line
interface operating system that was first developed at the end of the 1960s. One of the most
widely used operating systems modelled on UNIX is Linux. It is a free, open-source
operating system that is supported by most computer platforms.
Special Purpose Operating Systems
Operating systems are not limited to just personal computers.
Most electronic devices today use an operating system to
manage their physical components and to make it easier to
develop applications for use on the devices. Examples include
the Symbian, Blackberry, Palm and Windows Mobile operating
systems used for personal digital assistants (PDAs) and mobile
phones. Specialized operating systems have even been
developed to control computerized aircraft systems (VxWorks,
pSOS and QNX are examples).
The Structure of Operating Systems
Layers
Accessing computer resources is divided into layers. The user represents
one layer at one end of the system. Your computer’s hardware represents
the layer at the opposite end of the system. In order to use your hardware
to do anything with the computer, you need software. Software forms the
layers in between the user and the hardware and is divided up into
application software and the operating system. The operating system must
be able to manage resources from both the applications and hardware
layers.
In the computer layer system the user interacts directly with software
applications. The applications interact with both the user and the operating
system. The operating system interacts with the applications and controls
the hardware.
Each layer is isolated and only interacts directly with the layer below or
above it. If you make changes to any one layer, they only directly affect
the layer next to it. For example, if you install a new hardware device you
do not need to change anything about the user or applications. However,
you do need to make changes to the operating system. You need to install
the device drivers that the operating system will use to control the new
device. If you install a new software application you do not need to make
any changes to your hardware. But you do need to make sure the
application is supported by the operating system and the user will need to
learn how to use the new application. If you change the operating system
you need to make sure that both your applications and your hardware will
work with the new operating system.
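The layer idea can be sketched as three small classes, where each layer only talks to the layer directly below it. All of the class names, the method names and the "printer" driver here are hypothetical, invented for this example.

```python
class Hardware:
    """Bottom layer: records the raw data sent to each device."""

    def __init__(self):
        self.log = []

    def signal(self, device, data):
        self.log.append((device, data))


class OperatingSystem:
    """Middle layer: controls the hardware through installed device drivers."""

    def __init__(self, hardware):
        self._hardware = hardware
        self._drivers = {}

    def install_driver(self, device, driver):
        self._drivers[device] = driver

    def request(self, device, data):
        if device not in self._drivers:
            raise LookupError(f"no driver installed for {device}")
        # The driver translates the request into something the device understands.
        self._hardware.signal(device, self._drivers[device](data))


class Application:
    """Top software layer: talks to the operating system, never to the hardware."""

    def __init__(self, os_layer):
        self._os = os_layer

    def print_document(self, text):
        self._os.request("printer", text)


hardware = Hardware()
os_layer = OperatingSystem(hardware)
os_layer.install_driver("printer", lambda text: text.encode("ascii"))
Application(os_layer).print_document("hello")
print(hardware.log)  # [('printer', b'hello')]
```

Adding a new device only requires a new call to install_driver. The Application class does not change, which is the isolation between layers that the text describes.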
Running Multiple Operating Systems
It is possible to install more than one operating system
on a computer. You can do this by partitioning your
hard disk(s) and installing different operating systems
on different partitions. This can be very useful,
because you may want to use different operating
systems to perform different tasks. For example, you
may have specialized applications that will only work
with one operating system, making them incompatible
with the rest of your software. When you turn your
computer on, you are given a choice of which
operating system to use. You can only run one
operating system at a time. Figure 2.2 (right) shows
the system of layers when multiple operating systems
are installed on the same computer.
Figure 2.1
Layers in a computer system
(Figure labels: User, Applications, Operating System, Hardware)
Figure 2.2
Layers in a computer with multiple
partitions and operating systems
Running a Virtual Operating System
What happens if you want to work on
applications in two operating systems at the
same time? What about if you want to run an
operating system that is not compatible with
your computer’s hardware? (For example, you
cannot install the Mac OS X operating system
on an IBM compatible PC). You can get around
these problems by running a virtual computer.
A virtual computer is really an application
within one operating system that lets you
pretend you have a different operating system
installed. Virtual computer applications like
VMware and Virtual PC act as translators.
They convert instructions from the virtual
operating system into instructions for the real
operating system, which then controls your
computer’s hardware.
Figure 2.3 (left) shows the structure of layers
when you run a virtual operating system within
a Windows operating system. As far as
Windows is concerned, it is simply running
another application. Notice that the layers
between the virtual computer application and
the user are just like the layers for a single
operating system (Figure 2.1).
Operating System Modes
A typical operating system has two modes of operation. These are like layers of operation within
the operating system layer (Figure 2.1). The User Mode is concerned with the actual interface
between the user and the system. It controls things like running applications and accessing
files. The Kernel Mode is concerned with everything running in the background. It controls
things like accessing system resources, controlling hardware functions and processing program
instructions. The Kernel forms the core of the operating system, and it acts like a supervisor for
everything that is happening in the computer. In the client-server model of an operating
system, the User Mode is considered a client. That is, the User Mode accesses resources
provided by the Kernel (the server). Figure 2.4 (below) shows what operating system functions
are controlled by the User Mode and Kernel Mode.
Figure 2.3
Layers with a virtual operating system
(Figure labels: User, Windows Applications, Linux Applications, Linux, Virtual Computer Application, Windows, Hardware)
Figure 2.4
Typical structure in the Client (User Mode) Server (Kernel Mode) model of an operating system
(Figure labels: User mode (client); Kernel mode (server); Applications; Services; Graphics System; Application Interface; Scheduler; Dispatcher; Memory Manager; I/O Device Manager; File System; Security System; Kernel; Hardware)
Starting an Operating System
Most personal computers have similar architecture and can use a variety of different operating
systems. When a computer is first made, there is no operating system installed. Even after you
have an operating system installed, you can remove it and install a different one. As we
discussed earlier, you can even have multiple operating systems installed on the same personal
computer. This raises the question: how does your computer start the operating system? If
you have more than one operating system installed, how does your computer choose which
operating system to use?
Your computer is designed to start in stages. In the first stage, you turn on the power supply to
your computer. This sends electricity to the motherboard on a wire called the Voltage Good
line. If the power supply is good, then the BIOS (Basic Input/Output System) chip takes over.
At this stage the computer’s CPU is operating in Real Mode (or real address mode), which means
that it is only capable of using approximately 1 MB of memory built into the motherboard. RAM
will be initialized later using device drivers from the operating system.
The BIOS chip contains basic instructions for starting up the rest of the computer system. The
first thing that it will do is a Power-On Self Test (POST), which will check to make sure all your
hardware is working properly. If the hardware is all working, BIOS will then look for a small
sector at the very beginning of your primary hard disk called the Master Boot Record (MBR).
The MBR contains a list, or map, of all of the partitions on your computer’s hard disk (or disks).
After the MBR is found the Bootstrap Loader follows basic instructions for starting up the rest of
the computer, including the operating system. If multiple operating systems are installed, the
user will be given a choice of which operating system to use.
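To make the idea of a partition "map" more concrete, here is a sketch that reads the partition table out of an MBR sector. The text above does not describe the byte layout. The sketch assumes the classic PC MBR format: a 512-byte sector, four 16-byte partition entries starting at byte 446, and the signature bytes 0x55 0xAA at the very end. The function names and the example partition are invented, and the sector is built in memory so that no real disk is read.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446           # the partition table follows the boot code area
ENTRY_SIZE = 16              # each of the 4 entries is 16 bytes long
BOOT_SIGNATURE = b"\x55\xaa"
ENTRY_FORMAT = "<B3sB3sII"   # status, CHS start, type, CHS end, first sector, sector count


def parse_mbr(sector):
    """Return a list of the partitions described in a classic MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        status, _, ptype, _, first_sector, count = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE])
        if ptype != 0:  # a type of 0 marks an unused entry
            partitions.append({"bootable": status == 0x80, "type": ptype,
                               "first_sector": first_sector, "sectors": count})
    return partitions


def make_example_mbr():
    """Build a made-up MBR holding one bootable partition (for illustration only)."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack(ENTRY_FORMAT, 0x80, b"\0\0\0", 0x07, b"\0\0\0", 2048, 204800)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


print(parse_mbr(make_example_mbr()))
```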
The next stage is called Early Kernel Initialization.
Remember that the Kernel is the core of the
operating system, and it regulates all of the
background functions of your computer. In the
Early Kernel Initialization stage, a smaller core of
the Kernel is activated. This core includes the
device drivers needed to use your computer’s RAM
chips. Without the extra memory provided by
RAM, it is not possible to run the more complicated
code for the remainder of the operating system.
Once the Early Kernel Initialization is complete,
the CPU switches to Protected Mode. The
computer can now take advantage of the extended
memory address system provided by RAM, and the
operating system’s Kernel is fully initialized. Only
at this stage are the first User Mode processes
initialized, and the user can begin interacting with
the operating system, applications and hardware.
Figure 2.5 (below) shows the stages in starting an
operating system.
Remember: RAM is a device!
In the first unit we said that without an
operating system a computer would not be
able to use RAM chips. Your computer
treats RAM chips like a device that has been
installed. When your computer first starts
up, it can only use a small amount of RAM
memory (1 MB) that is built into the
motherboard. Device drivers for RAM chips
are included with the operating system, and
must be loaded as part of the boot process
in order for the RAM to work!
Problem: If RAM needs an operating
system to work, and an operating system
needs RAM in order to work, how does your
computer activate its RAM to load the
operating system?
Solution: Device drivers for RAM are loaded
during the Early Kernel Initialization stage.
Figure 2.5
Stages in the startup of an operating system
Interfacing with an Operating System
Types of User Interfaces
An operating system operates the functions of a computer. It also provides a way for users to
interface with, or access, a computer’s applications, resources and hardware. There are two
main types of user interfaces for an operating system:
Command Line Interface
Graphical User Interface (GUI)
A command line interface uses typed commands to issue instructions to the computer. It can be
more difficult to use because the user must type the precise commands and locations of files.
DOS (Disk Operating System) and UNIX are examples of command line interface operating
systems.
A GUI uses graphics (or pictures) and menus to help the user access resources and issue
commands. Windows XP, Linux and Mac OS X are examples of GUI operating systems.
The Command Line Interpreter
Applications are accessed at the User Mode level. This means that they do not have the
authority to directly access system resources that are controlled at the Kernel Mode level. When
a user types a command (in a command line interface) or performs a task within an application
(using a GUI), processes are initiated. Since those processes usually require access to system
resources, the command line interpreter converts them into system actions (called system calls).
Most interpreters execute applications to perform the system calls.
Figure 2.6
Examples of a command line and GUI interface
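A short Python sketch can show what a system call looks like from the application side. The functions os.pipe, os.write, os.read and os.close in Python's standard library are thin wrappers around services provided by the operating system. The helper name echo_through_kernel is invented for this example.

```python
import os


def echo_through_kernel(message):
    """Send bytes into a pipe and read them back, using operating system services."""
    read_end, write_end = os.pipe()              # ask the operating system for a pipe
    try:
        os.write(write_end, message)             # request: write these bytes
        return os.read(read_end, len(message))   # request: read them back
    finally:
        os.close(read_end)                       # hand the resources back
        os.close(write_end)


print(echo_through_kernel(b"hello kernel"))  # b'hello kernel'
```

The application never touches the memory that holds the pipe. It can only ask the operating system to do the work on its behalf.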
Managing System Resources
An operating system needs to manage a wide range of system resources. Some of the main
resources controlled by the operating system include CPU scheduling and process management,
memory (RAM), access to peripheral devices and file system management.
CPU Scheduling
Memory is like a workspace for the information and program instructions that are being used by
the computer. The Central Processing Unit (CPU) is the component that actually does the work.
The CPU performs all of the program instructions, sends commands to devices, and receives
information back from those devices. Just like memory, something needs to regulate which
devices and applications get to use the CPU, and for how long. This task is handled by the
operating system.
Most modern CPUs and operating systems can handle multitasking and multithreading. That is,
they can run more than one application at a time and they can process threads from more than
one device and application at a time. However, the CPU has limited resources. It needs a
schedule of processes to carry out, or nothing will run properly.
In older operating systems, it was up to each
application to determine how long it needed to use
the CPU and what priority it should be given over
other applications or interrupts from devices. This
was called cooperative multitasking. Unfortunately,
this system was rather like having roads with no
traffic laws or police officers. If someone wanted to
take complete control and cut off all other traffic, it
was possible. Newer operating systems use
preemptive multitasking. That is, the operating
system sets out the rules for the use of the CPU and
enforces those rules. Preemptive multitasking
means that the operating system shares the CPU
between everything that needs its attention. It also
gives priority to certain devices and applications
based on how critical they are to keeping the whole
system functioning.
Figure 2.7
An operating system shares
the CPU between applications
(Figure labels: Applications, Word, Microsoft Outlook, Adobe Acrobat, Internet Explorer)
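The difference between cooperative and preemptive multitasking can be seen in a small simulation. This is a simplified sketch: each task is just a name and an amount of work, the time slice of 2 units is an arbitrary choice, and priorities are left out.

```python
from collections import deque


def cooperative(tasks):
    """Each task keeps the CPU until it decides that it is finished."""
    timeline = []
    for name, work in tasks:
        timeline.extend([name] * work)
    return timeline


def preemptive(tasks, time_slice=2):
    """The operating system takes the CPU back after every time slice."""
    queue = deque(tasks)
    timeline = []
    while queue:
        name, work = queue.popleft()
        used = min(time_slice, work)
        timeline.extend([name] * used)
        if work > used:
            queue.append((name, work - used))  # unfinished: go to the back of the line
    return timeline


tasks = [("Word", 5), ("Outlook", 2), ("Acrobat", 3)]
print(cooperative(tasks).index("Outlook"))  # 5
print(preemptive(tasks).index("Outlook"))   # 2
```

In the cooperative run, Outlook has to wait until Word has finished all 5 units of work. In the preemptive run, Outlook gets the CPU after only 2 units, even though the total amount of work done is the same.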
The Process Table
As previously discussed, processes need to share the
CPU. Sometimes the CPU does not complete an
entire process before the operating system tells it to
start working on another one. This system of
sharing is what makes multitasking possible.
Keeping track of all of the processes is done with the
Process Table. The Process Table lists all of the
processes that are currently being run, those that
are waiting to be executed and those that have been
temporarily suspended. It also keeps track of the
current status, or state, of each process. This allows
the CPU to restart those processes again when they
are needed. Figure 2.8 (right) shows processes from
the Windows XP process table, as displayed in the
Task Manager. We will take a closer look at
processes and process management in Unit 3.
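A process table can be sketched as a collection of records, one per process. The fields, the three state names and the method names below are assumptions for illustration. A real operating system stores much more information about each process.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RUNNING = "running"
    WAITING = "waiting"
    SUSPENDED = "suspended"


@dataclass
class ProcessEntry:
    pid: int
    name: str
    state: State
    program_counter: int = 0  # where to resume when the process runs again


class ProcessTable:
    def __init__(self):
        self._entries = {}
        self._next_pid = 1

    def add(self, name):
        """Register a new process; it waits until it is given the CPU."""
        entry = ProcessEntry(self._next_pid, name, State.WAITING)
        self._entries[entry.pid] = entry
        self._next_pid += 1
        return entry.pid

    def get(self, pid):
        return self._entries[pid]

    def switch_to(self, pid, saved_pc=0):
        """Pause the running process (saving where it stopped) and run pid instead."""
        for entry in self._entries.values():
            if entry.state is State.RUNNING:
                entry.state = State.WAITING
                entry.program_counter = saved_pc
        self._entries[pid].state = State.RUNNING

    def states(self):
        return {entry.name: entry.state.value for entry in self._entries.values()}


table = ProcessTable()
word = table.add("Word")
outlook = table.add("Outlook")
table.switch_to(word)
table.switch_to(outlook, saved_pc=42)  # Word is paused at instruction 42
print(table.states())  # {'Word': 'waiting', 'Outlook': 'running'}
```

Because the table remembers that Word stopped at instruction 42, the CPU can pick up that process again later from exactly the same point.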
Memory Management
Memory is used by a computer to temporarily hold data and
instructions that are being used by applications, the operating
system and hardware devices. Since a typical computer has
between 1 and 4 GB of memory (RAM), and since modern
operating systems can run many devices and applications at the
same time, there is a lot of memory to keep track of. As we
noted in the previous chapter, RAM is divided up into small
spaces (usually 32 bits). Each space has its own address. An
operating system must be able to keep track of all of those
memory addresses and how they are currently being used. The
operating system typically performs three major functions with
respect to memory management:
1. Gives memory to each application and device as needed;
2. Protects applications (and their data) from each other;
3. Protects the system from ‘bad’ applications (that might
try to use too much memory, or corrupt data from other
applications).
We will take a more detailed look at how operating systems
manage memory in Unit 4.
Figure 2.8
The Windows XP Task Manager showing
processes from the process table
Figure 2.9
An operating system shares memory between applications
(RAM divided among Windows, Word, two Internet Explorer instances, and unused space)
Peripherals
Peripheral devices are hardware devices that are connected to the computer by connection ports
on the motherboard. Examples include the monitor, keyboard, mouse, webcam and printer.
Peripheral devices are difficult to program and manage. Although many different applications
need to use peripheral devices, the task of accessing them is simplified by the operating system.
Applications do not directly access peripheral devices. These devices are programmed and
controlled using device drivers provided by the operating system. When an application needs to
use a device it talks to the device drivers. The device drivers then tell the device what to do.
When a new device is installed, the operating system looks for built-in
device drivers or adds new drivers to control the device. Most newer
operating systems and devices are Plug and Play compatible, which
means that the operating system will handle everything related to
installing the new device and its drivers without any action from the
user (other than confirming installation options).
We'll take a closer look at Input/Output management in Unit 5.
File System
Your computer contains more than
just your hardware resources. It
also contains all of the information
that you use and manipulate. This
information is stored on your hard
disk, CD/DVD discs, and removable
storage devices. Your operating
system controls the actual physical
operation of these storage devices.
It also helps you to manage the files
stored on these devices.
Different operating systems use
different file systems to encode and
organize your information. For
example, older versions of Windows
used either FAT16 or FAT32 (FAT
stands for File Allocation Table).
These older file systems limited the
amount of information you could
store on a hard disk, so newer
versions of Windows (XP, Vista, and
Windows 7) use NTFS (New Technology File System). NTFS supports much larger files and volumes
(many terabytes on a single volume) and provides greater file security than the older FAT file
systems. Other operating systems use different file systems such as EXT3 for Linux, or HFS+
for Mac OS X.
Regardless of which file system an operating system uses, the operating system must perform
certain key file management tasks for the user:
Figure 2.10
Windows Explorer is a tool for viewing
and navigating your computer’s file system
• Manage the storage and retrieval of information; and
• Provide a common, easy to navigate system for viewing and accessing storage devices
and the information stored on them.
We'll take a closer look at file system management in Unit 6.
Unit Summary
Computerized devices need an operating system to control the actual functioning of the device.
Whether the device is a personal computer, mobile phone, or computerized aircraft controls, an
operating system must provide some way for the user to interface with the device. Modern
operating systems use a graphical user interface approach to simplify access to applications and
hardware resources. Operating systems act as one layer in the functioning of a device. Other
layers include the hardware, applications and the user. It is possible to install more than one
operating system on a computer, which creates multiple sets of layers (however, only one of
these sets of identical layers can operate at any given time). It is also possible to use
virtualization to simulate the use of two different operating systems at the same time.
Regardless of which operating system is being used, there are similar tasks that the operating
system must perform. The primary tasks performed by the operating system include the
management of CPU scheduling and tracking processes, memory management, management of
Input/Output systems and file system management.
Key Terms
Applications
BIOS
Blackberry OS
Bootstrap loader
Client-server model
Command Line Interface
Command Line Interpreter
Cooperative multitasking
Device drivers
DOS
Early Kernel Initialization
EXT3
FAT16
FAT32
File system
Full Kernel Initialization
Graphical User Interface
HFS+
Kernel mode
Layers
Linux
Master Boot Record (MBR)
Mac OS X
Mode
Multitasking
Multithreading
NTFS
Operating system
Palm OS
Partition
PDA
Preemptive multitasking
Process
Process state
Process Table
Protected mode
pSOS
QNX
Real mode
Special purpose operating system
Symbian
System call
System software
UNIX
User mode
Virtual computer
Virtualization
Voltage good line
VxWorks
Windows Mobile
Review Questions
1. Describe two tasks that are performed by an operating system.
2. Describe the four layers of interaction in an operating system model.
3. Briefly describe how you can install multiple operating systems on the same computer.
4. Draw a diagram to demonstrate how virtualization can be used to run multiple operating
systems at the same time.
5. Briefly describe the client-server model of an operating system.
6. Draw a simple diagram to show the two modes of operations of an OS.
7. Briefly describe the function of the Command Line Interpreter.
8. What are the major differences between System Calls and Interrupts?
9. Describe the purpose of the Process Table.
Unit 3: Processes
Processes and Multitasking
Many people like to try to speed up several tasks by performing them at the same time, such as
using a mobile phone while driving. While this seems like you are accomplishing two things at
the same time, the truth is that your brain specifically focuses on just a single task at any
specific time. The act of talking only occurs while you are not actively making decisions about
the task of driving. To put this into perspective, if you know you are about to have an accident,
you will stop talking.
To further illustrate the idea of performing simultaneous actions consider the problem of reading
a text message while watching TV. Your eyes can only look at one device at a time. You must
switch back and forth between the two devices, or look at the mobile phone only while
unimportant things are happening on the TV.
A CPU inside a computer is simply a high speed calculator that can perform relatively simple
operations on a set of data. If we ignore for the moment the idea of dual and quad core CPUs,
the CPU can only process a single instruction at any given time from a program. If we would
like to have more than one program executing on the processor at the same time, the programs
will need to take turns using the CPU. Since the computer switches back and forth between the
two programs often enough, then it will look like both applications are running at the same time.
This unit takes a detailed look at the definitions of processes and threads, and how the operating
system manages processes and threads in multitasking environments. The first section deals
with the definition of a process, and the concept of process states. We then take a look at state
changing, process creation and stopping processes. From there, we compare processes to
threads, and take a look at why threads are important. This is followed with a detailed look at
inter process (and inter thread) communication, including process synchronization, memory
sharing, the use of signals (or semaphores), critical sections, and the use of message queues.
We conclude by looking at how an operating system actually handles process scheduling. This
will include topics like completion scheduling, round robin scheduling, priority-based scheduling,
and scheduling in multi-core/multi-processor environments.
Process
Definition
In order to manage individual applications (or what we often refer to as programs) most
operating systems use the term process. An application or program is a set of instructions. A
process is the actual execution of those instructions, along with the memory and I/O devices
assigned to execute the given instructions.
State Machines
Without the need to switch from one process to the next, the work of creating an operating system is
significantly reduced. In fact, many embedded systems that do no multitasking often do not
include an operating system at all, for cost reasons.
If we consider the act of attempting to read a text message while we are driving, we know that
the driver will only take their eyes off the road when the traffic appears to be under control. The
driver will then quickly look at some of the text on the mobile phone and then return to looking
at the road. This process will continue as long as there are text messages to be read or places
to drive to.
If we draw a small diagram showing these two actions we have something like this:
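Figure 3.1
Example of a simple state machine
(states: Road, Mobile; the transition from Road to Mobile is labeled "Road looks okay" and the
transition from Mobile to Road is labeled "We heard a horn/screeching tires or we looked at the
mobile long enough")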
This diagram shows that we start off by watching the road and when we feel that the road
conditions look okay (no cars or pedestrians) we then look at the mobile phone. We then read
the message on the mobile phone until one of two things occurs:
1. We hear something that needs our attention, such as screeching tires, or we notice
something in our peripheral vision.
2. We have been looking at the mobile phone for some time and realize that we should
probably see if we are about to hit something.
Diagrams like Figure 3.1 are used in many computer design documents and are called state machines.
The circles represent the states of the machine and the arrows represent events that cause us to
change from one state to another. In the case of our texting driver, there are two states:
looking at the road, or looking at the mobile. Although it may be possible to hold the phone in
such a way that your peripheral vision takes in most of the road, it remains a fact that you
cannot actually look at both the phone and the road at the same time. Whatever your opinion
of texting while driving, the reason for this analogy is that a single-core CPU can likewise
perform only one instruction at a time. The important thing to take away from this analogy is
that some things in life are modeled really well by state machines and that events can cause
some resources to change from one state to the next.
Computer Process
We now return to the world of computers and processes again. The fact that a single-core CPU
can only process one instruction stream at a time means that two applications must take turns
running on the CPU. Although it is a rather obvious statement, if we consider just one single
process we realize that it must be either running or not running. This sounds
suspiciously like a set of states. In addition to the two states, the next question that should be
asked is how does a process go from the 'not-running' state to the 'running' state?
We will call the 'not-running' state the Ready state because it indicates that the process would
like to run but currently cannot (which suggests that some other process is actually running). We will
also introduce two new terms:
• Dispatch means that the operating system has decided that the process should start
running now.
• Interrupt means that the operating system has decided that the process should now
stop running so that another process can have a turn.
Putting all of this information into a single state diagram produces this version.
Figure 3.2
A computer-based simple state machine
(states: Ready, Running; transitions: dispatched, interrupted)
Many operating system students see diagrams such as this and easily understand the concept,
but many textbooks fail to remind the students that there are in fact probably several processes
(each with their own state machine) going at the same time.
The operating system would normally keep a list of all processes currently loaded in a table known
simply as the process table. The process table would need to keep the current state value. So
if an operating system were running four programs (Word, Internet Explorer, Excel, and Visio)
the table might look like this:
Process ID    Name             State
4             Word.exe         Running
6             IExplorer.exe    Ready
7             Excel.exe        Ready
8             Visio.exe        Ready
Figure 3.3
A sample process table
At no time would it be possible to have two processes both with the state Running as we are
assuming for now that the CPU has only a single core.
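The sample table can be sketched as a small data structure. The following Python sketch is for illustration only: the names State, ProcessEntry and running_process are assumptions made for this example and do not come from any real operating system.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    READY = "Ready"
    RUNNING = "Running"


@dataclass
class ProcessEntry:
    pid: int
    name: str
    state: State


# The four processes from the sample process table (Figure 3.3).
process_table = [
    ProcessEntry(4, "Word.exe", State.RUNNING),
    ProcessEntry(6, "IExplorer.exe", State.READY),
    ProcessEntry(7, "Excel.exe", State.READY),
    ProcessEntry(8, "Visio.exe", State.READY),
]


def running_process(table):
    """Return the single Running entry, or None if nothing is running.

    Raises ValueError if more than one entry is marked Running, which
    cannot happen under the single-core assumption used in this unit.
    """
    running = [p for p in table if p.state is State.RUNNING]
    if len(running) > 1:
        raise ValueError("only one process may be Running on a single core")
    return running[0] if running else None
```

Calling running_process(process_table) returns the entry for Word.exe. Marking a second entry as Running would make the check fail, which mirrors the rule stated above.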
Processor Preservation
An important concept in operating system design is the ability to hide multi-tasking concepts
from the applications so that they do not know that other applications are running at the same
time and sharing the CPU.
The registers in the CPU become a resource that is shared by every process on the operating
system, including the operating system itself. Keeping these values correct is critical to making
an operating system that functions correctly. As an example, if one application performs the
instruction "mov $50, %eax" it means that the program expects the value 50 to be placed into
the EAX register. If this instruction were executed and then another process ran that put the value 0
into the EAX register, the old value of 50 needs to be restored before the first process runs
again.
Each process on the system requires a place where the registers can be stored while the process
is not running. The operating system can either put this information directly into the process
table or it can store the information somewhere else but keep a pointer to the information in the
process table.
State Changing
Changing the state of a process from one state to another is usually the responsibility of the
operating system. The actual switch could occur because the process has informed the
operating system that it is finished, or the operating system could determine that the process
has simply used too much time and it is another process's turn to run.
Of course, the operating system itself is software that must run on the CPU along with the
applications. So how does the operating system actually become active so that it can stop Word in the
example above? Many hardware platforms include a clock that periodically fires interrupts, and
the operating system has likely attached a function called a scheduler to the interrupt so that
every so many milliseconds (a typical number is 10 ms) the operating system scheduler gets
executed.
When the scheduler is called (by the interrupt) the scheduler code can manipulate the process
table and set the state of the processes involved and then allow the new process to take over.
As an example suppose that the process table looks like the example above with the four
processes and Word.exe executing. Here is the order in which things happen.
1. The Word process is executing normally using the CPU registers as it requires.
2. The clock triggers an interrupt.
3. The interrupt service routine attached to the clock runs the OS scheduler.
4. The scheduler searches the process table for the process in the Running state.
5. The scheduler changes the state of the running process to "Ready".
6. The scheduler copies the current values of all of the registers into the process table (or
information block).
7. The scheduler looks in the table to find the next process to run. The topic of selecting a
process to run is covered later, but for now let's assume that iexplorer is
about to run.
8. The scheduler marks iexplorer as "Running".
9. The scheduler copies all of the registers for iexplorer into the CPU from the process table.
10. The scheduler then jumps to the next instruction that iexplorer should perform.
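Steps 4 to 10 above can be simulated in a few lines of Python. This is only a sketch under simplifying assumptions: the CPU is modeled as a dictionary holding one register and a program counter, the register values and the name context_switch are invented for the example, and the choice of the next process is left to the caller.

```python
# Simulated CPU state while Word.exe is running (values are made up).
cpu = {"eax": 50, "pc": 120}

# Hypothetical process table: each entry keeps a state and saved registers.
process_table = {
    "Word.exe": {"state": "Running", "saved": {}},
    "IExplorer.exe": {"state": "Ready", "saved": {"eax": 0, "pc": 300}},
}


def context_switch(table, cpu, next_name):
    """Switch the simulated CPU from the Running process to next_name."""
    # Steps 4-5: find the running process and mark it Ready.
    current = next(n for n, e in table.items() if e["state"] == "Running")
    table[current]["state"] = "Ready"
    # Step 6: save the current register values into the process table.
    table[current]["saved"] = dict(cpu)
    # Steps 7-8: the caller has chosen the next process; mark it Running.
    table[next_name]["state"] = "Running"
    # Steps 9-10: restore its registers; restoring "pc" stands in for the jump.
    cpu.clear()
    cpu.update(table[next_name]["saved"])
    return current


context_switch(process_table, cpu, "IExplorer.exe")
```

After the call, the simulated CPU holds the saved registers of IExplorer.exe, and the values that Word.exe was using are kept in its table entry, ready to be restored the next time Word.exe is dispatched.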
Preemptive and Non-preemptive Switching
The steps above describe a type of process switching known as preemptive switching. The
operating system always has control of the computer by way of the interrupt service routine and
will always be able to stop the currently executing process.
In a non-preemptive operating system, the operating system never interrupts the currently
executing process but waits for the process to release control voluntarily. In such a system, the
operating system is generally less complex. However, if a process does not wish to cooperate
with the other processes on the computer then this could lead to other processes, including
some operating system parts, never having a chance to execute.
A well-known example of a non-preemptive operating system was Windows 3.1. In this operating
system, a programmer who accidentally created certain types of infinite loops would often have
to reboot their computer in order to stop their program. As a result, most operating systems
today utilize preemptive process switching so that no single process can monopolize the
computer.
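Non-preemptive (cooperative) switching can be imitated with Python generators, where each task gives up the CPU only when it reaches a yield. This is a minimal sketch, not how any particular operating system is implemented; the names task and run_cooperative are assumptions made for the example.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: do one unit of work, then yield the CPU."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntarily hand control back to the scheduler


def run_cooperative(tasks):
    """Run tasks in turn; a task keeps the CPU until it yields or finishes."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the task's next yield
            ready.append(current)  # it still has work: back of the line
        except StopIteration:
            pass                   # the task finished; drop it


log = []
run_cooperative([task("A", 2, log), task("B", 3, log)])
```

The scheduler here never interrupts anything. If a task contained an infinite loop with no yield, run_cooperative would never regain control, which is the weakness of non-preemptive switching described above.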
Blocking
Generally the CPU executes instructions very quickly and transferring data to and from RAM also
takes place with very little delay. Unfortunately, if a process wishes to interact with some
external device, it may have to wait a long time (by CPU standards) for that device, and during the wait the process cannot do any useful work.
As an example, suppose we have written a very simple program that does nothing until the user
presses a key on the keyboard. The computer instructions may look something like this:
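public class Main {
    public static void main(String[] args) throws java.io.IOException {
        int value = System.in.read(); // blocks until input is available
    }
}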
When this process is started, there will be some instructions required to get the application
ready, but eventually the instruction that waits for a key is reached. When the process reaches this
instruction what does the operating system do with the process state? In our two state process
model introduced earlier (see Figure 3.2), leaving the process as 'Running' is probably not a
good idea because it really is not doing anything. Moving the process to the 'Ready' state does
not help either because the process is not actually ready to run. It needs to wait for the user to
do something.
We solve this problem by the introduction of a third state that we will call Blocked. This state
tells the operating system that the process is currently waiting for something, and therefore it
should never be considered by the scheduler when a new process is being selected for running.
The term blocked exists because the execution is being "blocked" by some external event.
Some operating systems might use the term wait.
Figure 3.4
A state machine with a blocked state
(states: Ready, Running, Blocked; transitions: dispatched, interrupted, wait, event complete)
In this model, a process can never become blocked unless it is actually running. Once a process
has been put into the blocked state, it must eventually be unblocked or released and put back into
the 'Ready' state before it can run again.
How does the operating system manage these extra state changes? As mentioned in the earlier
sections of the book, the operating system is responsible for managing all system resources
including I/O devices. In the case of waiting for the user to press a key, the program actually
issues a request to the operating system (through a function call) asking to wait for a key. In
response, the operating system changes the state of the process to blocked and runs the
scheduler algorithm to find a new process to run.
How does the process get out of the blocked state? Each I/O device has its own mechanism that
it uses to interact with the operating system. We will consider just the keyboard example here.
During boot up, the operating system has probably attached a service routine to the keyboard
interrupt. Each time that the user presses a key, it generates an interrupt which causes the
operating system routine to generate a key event and the scheduler routine will be called. The
scheduler will then look through all of the blocked processes waiting for key events and the state
will be changed from blocked to ready. Whether this keyboard interrupt causes the currently
running process to be moved from Running to Ready depends on the operating system itself.
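The blocking and unblocking described above can be sketched as two small routines. This is a simplified simulation, not a real keyboard driver: the process names, the waiting_for_key list and the functions wait_for_key and key_pressed are all assumptions made for this example.

```python
process_table = {
    "Editor": "Running",
    "Browser": "Ready",
}
waiting_for_key = []  # processes blocked until a key event arrives


def wait_for_key(table, name):
    """Handle a request from a Running process to wait for a key."""
    if table[name] != "Running":
        raise ValueError("only a Running process can block")
    table[name] = "Blocked"
    waiting_for_key.append(name)
    # The scheduler now dispatches some Ready process instead.
    for other, state in table.items():
        if state == "Ready":
            table[other] = "Running"
            break


def key_pressed(table):
    """Keyboard interrupt routine: processes waiting for a key become Ready."""
    while waiting_for_key:
        table[waiting_for_key.pop(0)] = "Ready"


wait_for_key(process_table, "Editor")
```

After wait_for_key runs, Editor is Blocked and Browser is Running. A later call to key_pressed(process_table) moves Editor back to Ready. In this sketch the running process is left alone when the key arrives, which is one of the choices the text says depends on the operating system.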
If you were to examine the states of processes on most computers, you would probably find that
just about every process on the system is blocked most of the time.
Having a lot of blocked processes is a good thing. It means that the CPU has very little work
to do. Normally when a CPU does work it consumes power, while if it is not
doing work it can be put into a low power state. Tricks such as these are used in mobile devices
such as phones in order to extend the life of the battery. A process that is actively using the
CPU will consume more power, which is why batteries do not last as long when you are watching
videos on your mobile phone than when you are simply on stand-by waiting for calls.
A Typical State Model
Most operating systems include two more states in addition to the Ready,
Running, and Blocked states. These two extra states are used for housekeeping, and only exist
for a short amount of time during the creation and the removal of processes.
When a new process is created there needs to be a new entry placed into the process table.
Setting up the process table entry actually takes a bit of time. If the new process were marked as
Ready while the operating system was still in the middle of initializing the other columns in the
process table, and suddenly a reschedule operation occurred, how would the operating system
know that it should not select this new process? We control this by the introduction of a New
state. This is simply a placeholder that is used by the operating system until the process is
actually ready to start going.
Similarly there is an Exit state that is used when a process is being cleaned up. This state is
also temporary, and the operating system will mark the process first then remove it from the
process table.
Figure 3.5
A typical state machine
(states: New, Ready, Running, Blocked, Exit; transitions: created, dispatched, interrupted, wait,
event complete, terminate)
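A state machine like this one can be written down as a table of allowed transitions. The sketch below is one reading of the model, offered as an assumption: it takes "created" to lead from New to Ready and "terminate" to lead from Running to Exit. The names TRANSITIONS, next_state and run_events are invented for the example.

```python
# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("New", "created"): "Ready",
    ("Ready", "dispatched"): "Running",
    ("Running", "interrupted"): "Ready",
    ("Running", "wait"): "Blocked",
    ("Blocked", "event complete"): "Ready",
    ("Running", "terminate"): "Exit",
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state!r}") from None


def run_events(events, state="New"):
    """Apply a sequence of events and return the list of states visited."""
    history = [state]
    for event in events:
        state = next_state(state, event)
        history.append(state)
    return history
```

For example, run_events(["created", "dispatched", "wait", "event complete"]) walks a process from New through Ready, Running and Blocked and back to Ready. Asking for next_state("Blocked", "dispatched") raises an error, because a blocked process must return to Ready before it can run.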
Switch Prevention
There may be situations where a process is in the middle of doing something really important
and it would like to ensure that no other process is allowed to run in the meantime. There are
two mechanisms that are typically available: disabling the scheduler and disabling interrupts. By
disabling the scheduler the process is requesting to the operating system that even if the timer
fires, the process would like the operating system to skip the selection of a new process. In the
case of disabling interrupts, the program is actually asking the CPU to ignore any interrupts that
occur.
Both of these techniques have the ability to seriously hinder the functionality of the computer.
As such, most operating systems will refuse to comply with such requests unless the programs
doing the request have enough privileges. However, some parts of the operating system itself
are very vulnerable to interruption and the operating system itself is allowed to prevent
switching and even interrupts if required. How privileges are enforced is a topic for later
discussion.
Process Creation
The actual creation of a new process on a computer is very specific to the operating system itself
and is often tied very closely to the hardware. We will attempt to describe the creation of
processes in a high level view.
Most operating systems utilize a process table to track all of the processes currently residing on
the system. The creation of a new process will usually require a new entry to be created within
the process table. During the addition of the new entry, the table needs to be locked or
switching needs to be prevented so that the full details can be added without interruption.
Marking the state as 'New' solves only part of the problem. The table itself might have issues if the
scheduler sees it partially updated.
In addition to the process table entry, most operating systems keep information about the
process in a separate space called the "process information block" (although this can be viewed
as part of the process table). The space for this process information block needs to be set aside
as part of the process creation step.
Each process will be executing some code. Part of the process creation step will be to set aside
enough memory for the code, and then possibly load the code from the executable (some
operating systems work a little differently as explained next).
Process Creation in Unix/Linux
In the Unix (and Linux) operating system, new processes are created by a simple system call
named fork(). The term fork comes from the idea of reaching a fork in the road, in other
words a place where a single path splits in two. This is illustrated in Figure 3.6.
Suppose a program has the following statements:
Statement_A();
Statement_B();
Statement_C();
fork();
Statement_D();
Statement_E();
Statement_F();
If the fork() call was not included it is easy to see that the program simply executes the
statements one at a time. However with the inclusion of the fork() command, things get a bit
more interesting.
When the system call is made at fork(), the
operating system will actually duplicate the
current process so that there are now two.
This means that the original process will keep
running statements D through F, but there will
also be another process running
statements D through F.
When a Unix program issues the fork()
system call, the operating system duplicates
the current process exactly and once
completed there are now two processes
executing that look identical. However, they
will be two distinct processes sharing the CPU.
Any variables (memory) associated with one
process will be different than the memory
associated with the other process.
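The separation of memory after fork() can be observed with a short experiment. The sketch below uses Python's os.fork(), which is available only on Unix-like systems; the function name fork_demo and the values used are assumptions made for the example.

```python
import os


def fork_demo():
    """Fork once; return (parent's counter, child's exit code)."""
    counter = 10
    pid = os.fork()
    if pid == 0:
        # Child process: it has its own copy of counter.
        counter += 5
        os._exit(counter)  # report the child's value as its exit code
    # Original process: wait for the child, then look at our own counter.
    _, status = os.waitpid(pid, 0)
    return counter, os.WEXITSTATUS(status)
```

The child changes its copy of counter to 15 and reports that value through its exit code, while the counter in the original process is still 10 when fork_demo returns.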
Who issues the fork command? Most
applications that developers write would never
use the fork system call unless they were
trying to create a program that starts another
program. In fact the fork() system call
remains a mystery to most software
developers, so if you never see it again
outside of this text, that would be quite
normal. When you are working in the
GNOME desktop environment, you select
applications from a launch menu. When you
finally select OpenOffice Writer, guess what
happens? Yes, GNOME will issue a fork()
system call.
If you are comfortable with the idea of the
fork() system call, you should now ask the
question: "If GNOME issues the fork() call,
why then do we see OpenOffice Writer instead
of another copy of GNOME?" As soon as the new process is created, the second process will
replace its current executable code with new code from the drive (in Unix this is done with the exec family of system calls).
Figure 3.6
A program issuing the fork() command in Unix
(Statement_A, Statement_B, Statement_C and fork() run in one process; after fork(), two processes
each run Statement_D, Statement_E and Statement_F)
The source code for launching a new application then typically looks like this:
Main()
{
    While (true) {
        Display prompt on screen
        Wait for user to enter command (i.e. block!)
        If command is a shell command, process the command (e.g. set, echo)
        Otherwise assume the command is a program and check that the program exists
        Fork() a new process
        If (we are original process) {
            Wait for new program to finish
        }
        Else { // we must be the new process
            Load the program and run it
        }
    }
}
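The fork, load and wait steps of the pseudocode above can be tried on a Unix-like system with Python's os module. This is a minimal sketch rather than a real shell: the function name launch and the exit code 127 used when the program cannot be started are choices made for this example.

```python
import os
import sys


def launch(argv):
    """Run the program argv[0] in a new process and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # New process: replace this copy of the launcher with the program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed (for example, program not found).
            os._exit(127)
    # Original process: block until the new program finishes.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


# Launch a second Python interpreter that exits immediately with code 3.
exit_code = launch([sys.executable, "-c", "import sys; sys.exit(3)"])
```

The original process spends the whole time blocked in os.waitpid, just as the shell in the pseudocode waits for the new program to finish before showing the prompt again.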
Stopping a Process
There are three typical ways that a process will terminate and release the memory back to the
operating system:
1. The process can ask the operating system to stop it. This is actually accomplished
when the "main" function of a program finishes. The
programs that you write are actually linked with some library functions that look after
making the system call for you. Generally an application will always have permission to
terminate itself.
2. The process could generate an unhandled exception. An exception is simply an error
that has occurred that cannot be dealt with by the program. Examples of exceptions
include division by zero and the process trying to perform an instruction that is not
permitted by the operating system. You have likely seen the failure on Windows which
states something like "This program has performed an illegal operation".
3. One process can request that another process be terminated. Most operating systems
provide some form of security to prevent processes from terminating other processes
unless the originating process has some sort of privilege. In Linux, the kill system call is
used to send a variety of signals, including a signal to terminate a target program.
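The three ways of stopping a process can be observed from a parent process using Python's subprocess module. The sketch assumes a Unix-like system, where a child ended by a signal is reported with a negative return code; the helper name run_child is invented for the example, and the division by zero here is a Python-level exception rather than a hardware fault.

```python
import signal
import subprocess
import sys


def run_child(code):
    """Start a child Python process running `code` and return its handle."""
    return subprocess.Popen([sys.executable, "-c", code],
                            stderr=subprocess.DEVNULL)


# 1. The process finishes on its own: exit code 0.
normal = run_child("pass").wait()

# 2. An unhandled exception ends the process with a non-zero exit code.
crashed = run_child("1 / 0").wait()

# 3. Another process asks for termination by sending a signal.
victim = run_child("import time; time.sleep(60)")
victim.send_signal(signal.SIGTERM)
killed = victim.wait()
```

Here normal is 0, crashed is non-zero, and killed is the negative of the SIGTERM signal number, which is how subprocess reports a child that was ended by a signal.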
Threads
Although on a CPU with a single core it is not possible to execute more than one instruction at
the same time, there are often reasons why a programmer would prefer to structure a program
in such a way that it is really programmed more like two or more separate programs. Take for
example the program Outlook. We are all familiar with the fact that, while using Outlook, you can be in the
middle of creating a mail message and even while you are writing the message, a new mail
message can arrive, causing Outlook to play a sound and show the arrival flag in the system tray.
One way that the program could be organized is in the following pseudocode:
Forever(;;) {
    Wait for a single key
    Add the character to the message
    If there is a new message raise flag and play sound
}
The pseudocode above is valid but not really efficient. Notice that the algorithm only checks for
new mail every time that a key is pressed. On the positive side, it is expected that the program
will not be very intensive on the CPU because it will spend a great deal of time waiting for the
keyboard.
In order to solve the problem of only checking after each key is pressed we could change the
algorithm as follows:
Forever(;;) {
    Wait up to 1 second for a new key
    If a key was pressed add it to the message
    If there is new mail, raise the flag
}
This approach solves the problem of only checking for mail when a key is pressed,
but it comes at the expense of efficiency. If a user starts to type a message and then leaves for
several hours, the program will wake up every second to see if a key was pressed and
to check whether there is mail. It would be more efficient to do neither if nothing is happening.
A second point about this solution is that as we add more features to our e-mail program, this
main loop becomes more and more complex.
If we are able to organize the program into two tasks, each solving just one problem, then we
could have the following:
While (!messageSent) {
    Wait for key
    Add character to message
}
And
Forever() {
    Sleep 10 seconds
    Check for mail
    Raise flag if new mail
}
If both of these small algorithms can run at the same time within the same application then we
have organized our program into a much easier to understand system, and in a way that adding
more features can be easier.
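The two small algorithms above can be run side by side using Python's threading module. This is a sketch under several assumptions: key presses are simulated by putting characters on a queue, None stands in for "message sent", the mailbox is a plain list, and the 10 second sleep is shortened to a hundredth of a second so that the example finishes quickly. The function names compose and check_mail are invented for the example.

```python
import queue
import threading


def compose(keys, message):
    """First task: wait for each key and add it to the message."""
    while True:
        key = keys.get()   # blocks until a key is available
        if key is None:    # None stands in for "message sent"
            break
        message.append(key)


def check_mail(mailbox, flags, stop, interval=0.01):
    """Second task: sleep, then check for mail, until told to stop."""
    while not stop.wait(interval):        # sleeps for up to `interval` seconds
        while mailbox:
            flags.append(mailbox.pop(0))  # "raise the flag" for new mail


keys, message = queue.Queue(), []
mailbox, flags, stop = ["hello from Bob"], [], threading.Event()

composer = threading.Thread(target=compose, args=(keys, message))
checker = threading.Thread(target=check_mail, args=(mailbox, flags, stop))
composer.start()
checker.start()

for ch in "hi":        # simulate the user typing two keys
    keys.put(ch)
keys.put(None)         # simulate sending the message
composer.join()
stop.set()             # ask the mail checker to finish
checker.join()
```

Each function solves just one problem, and neither needs to know about the other. Whether the mail flag is raised before the message is finished depends on timing, which is a first hint of the scheduling questions that arise with multiple tasks.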
Now that we have a model to help make the program easier to create, we need a mechanism by
which a program can be organized. These two independent mini-programs have their own set
of instructions that need to be executed on a single processor. This concept sounds very similar
to multi-processing that was covered in the previous chapter, only in this case these mini-
programs are actually part of a single large application.
In an operating system a single sequence of instructions being executed is called a thread.
Every process contains at least one thread of execution, and it contains only one unless it has been written in a
special way such as what we have described for our e-mail program. In our application we have
one thread whose task is to read the keyboard and compose the message, and another thread
responsible for occasionally checking to see if there is new mail.
Because there are now two separate threads of execution, there will need to be the concept of
switching and scheduling within our application. Luckily, a lot of operating systems look after
this as part of their services, and the way they function is nearly identical to process switching.
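As a rough sketch of how the two mini-programs could become two threads of one Java application, consider the following. The class name, the fixed "hello" keystrokes, and the three mail checks are hypothetical stand-ins chosen so the example finishes on its own; a real program would read the keyboard and the mail server instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

class TwoThreadsSketch {
    // Starts both threads, waits for them, and returns "message/mailChecks".
    static String runBoth() {
        StringBuilder message = new StringBuilder();      // touched only by the composer thread
        AtomicInteger mailChecks = new AtomicInteger();   // touched only by the checker thread

        Thread composer = new Thread(() -> {
            for (char key : "hello".toCharArray()) {      // stand-in for "wait for key"
                message.append(key);                      // "add character to message"
            }
        });
        Thread checker = new Thread(() -> {
            for (int i = 0; i < 3; i++) {                 // stand-in for the "forever" loop
                mailChecks.incrementAndGet();             // "check for mail"
            }
        });

        composer.start();   // from here on the scheduler decides which thread runs when
        checker.start();
        try {
            composer.join();   // wait for both threads to finish
            checker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return message + "/" + mailChecks.get();
    }

    public static void main(String[] args) {
        System.out.println(runBoth());
    }
}
```

Each thread body stays as small as the pseudocode it came from; the switching between them is left entirely to the operating system and the Java runtime.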
In addition to making certain types of programs easier to create, a second advantage of using
multiple threads arises when a processor with more than one core is used (dual-core, quad-core,
etc.). If a program is created that equally divides a lot of work into four separate threads, then
running the application on a quad-core processor can make that application run up to four times faster.
The art of developing programs this way is the topic of a full course.
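To make the idea of dividing work equally among threads concrete, here is a minimal sketch that sums the numbers 1 to n using several threads. The class and method names are illustrative assumptions, and the actual speed-up on a multi-core processor depends on the machine and on how well the work divides.

```java
class ParallelSumSketch {
    // Sums 1..n by splitting the range across `workers` threads.
    static long parallelSum(int n, int workers) {
        long[] partial = new long[workers];
        Thread[] threads = new Thread[workers];
        int chunk = n / workers;
        for (int w = 0; w < workers; w++) {
            final int id = w;
            final int from = w * chunk + 1;
            final int to = (w == workers - 1) ? n : (w + 1) * chunk;  // last worker takes the remainder
            threads[w] = new Thread(() -> {
                long sum = 0;
                for (int i = from; i <= to; i++) {
                    sum += i;
                }
                partial[id] = sum;   // each thread writes only its own slot
            });
            threads[w].start();
        }
        long total = 0;
        for (int w = 0; w < workers; w++) {
            try {
                threads[w].join();   // wait for worker w before reading its result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            total += partial[w];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(1_000_000, 4));
    }
}
```

Because every thread works on its own part of the range and writes to its own array slot, the threads never need to coordinate until the final join.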
Why threads?
Performing a quick search on the Internet or looking at most textbooks will often provide the
description that a thread is a light-weight process. It is starting to appear that processes and
threads are almost interchangeable terms. If you recall from the section on the overview
of an operating system, most operating systems have a responsibility of protecting one
application from another application (if Internet Explorer crashes it should not cause Word to
crash or lose data). As a result, having two processes communicate or share information is a bit
complex due to the security. However, within a single application there is usually no such
protection between threads. As a result, it is often very easy for two threads to cooperate on a
common set of data.
If the lack of protection between threads sounds like it is a bit risky, then you are completely
correct! Improperly designed applications that rely on multiple threads can result in very
difficult to solve problems that only sometimes occur. With multiple threads, the scheduler is
still allowed to select any thread to run that is ready and this means that if you run a multi-
threaded program ten times, it might actually run ten different ways (sometimes one thread
might get more time at the start). This is often why a program might work fine today, but might
crash unexpectedly tomorrow. Multi-threaded programs written correctly have a great
advantage but it takes the right type of programming design and testing to remove bugs.
In many life critical systems such as medical equipment or avionics systems, and even high
demand systems such as telephone equipment, the design team will often avoid using more
than one thread to reduce unpredictability at the cost of increasing the complexity of the code.
Inter Process Communication
The term inter process communication refers to information being exchanged between two
processes. Although the title of the section seems to suggest that the techniques described here
are only for processes, they actually work equally for multiple threads within a single process.
When two or more threads (regardless if they are in the same process or not) are trying to
cooperatively work on a solution to some problem, they need some sort of mechanism to
exchange information so that they can decide which part of the problem will be solved by which
thread, and to ensure that they do not both try to work on a single part of the problem at the
same time. There may be other situations where one thread would like to wait for the other
thread to finish first before continuing on. All of these cases require some sort of
communication so that the two threads can coordinate.
Synchronization
We will start with a very simple task. Suppose that we have two threads (again, if they are in
the same or different processes the same problems exist) as follows:
threadA()
{
    System.out.print("Thread A");
    System.out.println();
}
threadB()
{
    System.out.print("Thread B");
    System.out.println();
}
Suppose we were to start the threads at exactly the same time (this is not actually possible, by
the way…why?). The result should be that both messages will appear in the output window.
The question that has to arise however is which message appears first? Does it look like this?
Thread A
Thread B
Does the result look like this?
Thread B
Thread A
Does the result look like this?
Thread A Thread B
Unfortunately, with the information provided the output is actually unknown! This type of
solution is called non-deterministic because we cannot predict (or determine) what will happen
every time. In fact every time that we run the program we might see a different result.
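The experiment can be tried with a small Java sketch along the following lines. As an assumption made for the example, a shared StringBuffer stands in for the output window so that one run's result can be inspected afterwards; the class and method names are hypothetical.

```java
class InterleavingSketch {
    // Runs threadA and threadB once and returns whatever they "printed".
    static String runOnce() {
        StringBuffer out = new StringBuffer();   // stand-in for the output window
        Thread threadA = new Thread(() -> {
            out.append("Thread A");              // like System.out.print(...)
            out.append("\n");                    // like System.out.println()
        });
        Thread threadB = new Thread(() -> {
            out.append("Thread B");
            out.append("\n");
        });
        threadA.start();
        threadB.start();
        try {
            threadA.join();
            threadB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Repeat the experiment; the order may differ from run to run.
        for (int run = 1; run <= 5; run++) {
            System.out.println("Run " + run + ":");
            System.out.print(runOnce());
        }
    }
}
```

Both messages always appear, but nothing in this code controls which one comes first or whether the two are interleaved.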
Why do we end up with so many possible results? It looks like in the first example the first
thread ran to completion, then the scheduler switched to the second thread. In the last
example, the first thread got to run but was not completed (it did not make it to the println()
function), then threadB ran and then threadA finished. Non-deterministic solutions are
extremely dangerous in critical systems where people’s lives are at stake.
Unfortunately we did not actually say what the desired output should look like. Let us
suppose that we want the output to be in the order of “Thread A” followed by “Thread B” on the
next line, and we want to be guaranteed of this order, regardless of what else happens.
This suggests that we need some mechanism for making sure threadB waits for threadA to finish.
As always, we would like to minimize the amount of CPU time actually used. The first
thing we will talk about is called synchronization.
Shared Memory
Our first technique that we will try to introduce is the concept of shared memory. This is a very
simple to understand concept but, unfortunately, not a very good method in most cases.
The term shared memory means that both threads have access to a piece of memory that they
can both read and write to at the same time. If one thread puts a value into the memory
location, then when the second thread looks at the memory it will see exactly the same thing.
Of course, unless the two threads are executing on a multi-core processor, there will only be a
single thread executing at a given instant. Configuring a variable to be shared between two
processes on an operating system is usually possible. However, it is not particularly easy because of
the protection between processes. The creation of shared variables between two threads of the same process is
generally easy (in fact, too easy, because many mistakes come from assuming that sharing is always safe).
For the purpose of discussion we will assume that there is a variable called “turn” which is a
simple integer that is available to both threadA and threadB. We will assume that the sharing is
already set up. Now consider the following pseudocode:
Configure shared variable turn
turn = 1
Start threadA
Start threadB
You might jump to the conclusion that threadA will be ahead of threadB because we asked it to
start first. Keep in mind that the scheduler is responsible for picking the order. Just because
you asked to start A first does not mean it actually had a chance to run! Now let’s try to modify
the two threads to make use of the ‘turn’ variable to make sure that threadA runs first then
threadB.
threadA()
{
    System.out.print("Thread A");
    System.out.println();
    turn = 2;      // tell threadB that threadA is finished
}
threadB()
{
    while (turn == 1)
        ;          // do nothing until threadA sets turn to 2
    System.out.print("Thread B");
    System.out.println();
}
Let us take a quick look to see if this actually does what we want. Notice that we started by
setting the shared variable called turn (both A and B can see this) to the value 1. When
threadA is finished it sets the value to 2. If we look at threadB() we see that it looks at the
value of turn and if it is equal to 1 it does nothing, but we expect that eventually threadA will
finish and the value will be set to 2, allowing threadB to execute.
This is good; this provides us a way that allows the two threads to synchronize. It does not
matter if threadB gets to run for ten minutes before threadA even starts, because it will do
nothing until threadA is finished. The variable called turn is an example of a synchronization
variable.
Although this solution works, it does pose a bit of a problem. Look again at threadB: we see
that it starts with a loop that continuously checks the value called turn. Now if threadB were to
run for five seconds, would the value of turn ever change? Of course not, it is threadA that
changes it. This type of checking is called polling and burns a lot of CPU time, which may
consume more power from our battery (or may just cause the processor to heat up for no
reason).
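A runnable Java version of the polling approach might look like the sketch below. One detail is specific to Java and not part of the discussion above: the shared variable is declared volatile, because without it the Java memory model does not guarantee that threadB ever sees threadA's write. The class name and the StringBuffer used in place of the screen are assumptions for the example.

```java
class TurnFlagSketch {
    // volatile: a write by one thread becomes visible to the other thread.
    static volatile int turn = 1;

    static String runOnce() {
        turn = 1;
        StringBuffer out = new StringBuffer();   // stand-in for the output window
        Thread threadA = new Thread(() -> {
            out.append("Thread A\n");
            turn = 2;                            // threadA is finished
        });
        Thread threadB = new Thread(() -> {
            while (turn == 1) {
                // polling: burns CPU time until threadA changes turn
            }
            out.append("Thread B\n");
        });
        threadB.start();    // deliberately started first; it still has to wait
        threadA.start();
        try {
            threadA.join();
            threadB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(runOnce());
    }
}
```

Starting threadB before threadA makes no difference to the output order, which is exactly the guarantee the synchronization variable was meant to provide.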
Self-Yield
A better option would be to have threadB look, and if the turn variable is not set then it should
voluntarily go into the blocked state. Moving into the blocked state would allow the scheduler
to run the other thread. We will introduce a new command called sleep which causes the
current thread (and possibly process) to go into the blocked state until a certain amount of time
has passed.
threadB()
{
    while (turn == 1)
        Thread.sleep(5000);    // sleep for 5000 ms (5 seconds)
    System.out.print("Thread B");
    System.out.println();
}
This code is much better for the CPU because now even if threadB runs first, it will immediately
go into a blocked state and will wake up only every five seconds to see if the turn variable has
been set. Therefore, even if threadA takes hours to complete, at least threadB is not using too
much CPU time. Of course there is still a bit of a problem with this solution. Suppose that
threadB executed and saw that the turn variable was still 1 and went to sleep for five seconds.
Then threadA ran and set the turn value one second later… threadB would only wake up in
another four seconds to see if the variable was set. This means that our program has been
slowed down because threadB was asleep. You may be tempted to reduce the amount of
sleeping time but this means that threadB will wake up more often and use more CPU time. A
better option would be to have threadA actually wake up threadB when it is finished.
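In real Java code the sleeping version needs one extra detail: Thread.sleep can throw InterruptedException, which must be handled. The following is a sketch under the same assumptions as before (a volatile turn variable, a StringBuffer standing in for the screen, hypothetical class name), with a 10 ms sleep instead of five seconds so that it finishes quickly.

```java
class SleepPollSketch {
    static volatile int turn = 1;

    static String runOnce() {
        turn = 1;
        StringBuffer out = new StringBuffer();   // stand-in for the output window
        Thread threadB = new Thread(() -> {
            try {
                while (turn == 1) {
                    Thread.sleep(10);            // blocked for 10 ms instead of spinning
                }
                out.append("Thread B\n");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // asked to stop while asleep
            }
        });
        Thread threadA = new Thread(() -> {
            out.append("Thread A\n");
            turn = 2;
        });
        threadB.start();
        threadA.start();
        try {
            threadA.join();
            threadB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(runOnce());
    }
}
```

The trade-off described above is visible in the sleep length: a longer sleep uses less CPU time, but threadB may notice the change to turn up to one full sleep period late.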
Signals or Semaphores
A signal (or semaphore) is a special type of variable supporting two operations called wait and
raise (when using the term semaphore the operation names are usually take and give).
Because the term signal is used in Unix to mean something critical has occurred (such as a
crash), we will avoid using the term signal here. However, keep in mind that many books use
the terms interchangeably. Semaphores can be used in a number of ways, but the first thing
that we will consider is using them for synchronization.
The two operations are:
1. Take: When a thread asks to take a semaphore, either the thread is given immediate
control because the semaphore is available or else it goes into a blocked state waiting for
the semaphore to become available. As soon as the semaphore is made available (usually
by some other thread) the requesting thread will be immediately woken up.
2. Give: When a thread gives a semaphore it is potentially waking up another thread that is
waiting.
We turn back to our example of the two threads again and consider the pseudocode to get
things started:
Create a semaphore variable called “turn”
Make sure the semaphore turn is not available
Start threadA
Start threadB
Now we change the thread code slightly:
threadA()
{
    System.out.print("Thread A");
    System.out.println();
    turn.give();
}
threadB()
{
    turn.take();
    System.out.print("Thread B");
    System.out.println();
}
We have changed our startup code to indicate that we need to actually create the variable called
turn. We have also included an instruction to make sure that the semaphore is not actually
available to start. This is very important. We want to make sure that the “give” instruction in
threadA is the one that makes the semaphore available.
When threadB starts it will try to take the semaphore. If the semaphore is not available, the
operating system scheduler will move the thread (and possibly the process) into the blocked
state. While in this blocked state, the thread (and process) will consume no CPU. When
threadA finishes its tasks, it will give the semaphore and the give operation will cause the
operating system scheduler to wake up the blocked thread and mark it as ready.
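In Java, one way to try this out is with java.util.concurrent.Semaphore, where take corresponds to acquire and give corresponds to release. The sketch below assumes that mapping; a semaphore created with zero permits plays the role of a semaphore that is "not available" at the start. The class name and the StringBuffer standing in for the screen are illustrative.

```java
import java.util.concurrent.Semaphore;

class SemaphoreSyncSketch {
    static String runOnce() {
        Semaphore turn = new Semaphore(0);       // 0 permits: not available at the start
        StringBuffer out = new StringBuffer();   // stand-in for the output window
        Thread threadA = new Thread(() -> {
            out.append("Thread A\n");
            turn.release();                      // "give": wakes threadB if it is waiting
        });
        Thread threadB = new Thread(() -> {
            turn.acquireUninterruptibly();       // "take": blocks without using CPU time
            out.append("Thread B\n");
        });
        threadB.start();
        threadA.start();
        try {
            threadA.join();
            threadB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(runOnce());
    }
}
```

Unlike the polling and sleeping versions, threadB here neither spins nor wakes up periodically; it runs again only once threadA has given the semaphore.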
The creation of these semaphores is very specific to each operating system, and there are often
a lot of options available for dealing with very complex problems. As an example, suppose there
are two threads waiting on a single semaphore. Which thread gets the semaphore when it is
given? You will probably have an opinion of this immediately by saying the first thread to ask
should receive it, but perhaps the second thread was actually much more important. The
options provided for the semaphores can be used to control specific behavior depending on the
needs of the application. Another issue that we are deferring to the section on scheduling is how
long does threadB wait? In our particular example we probably want to wait until threadA is
finished, regardless of how long it actually takes. In some programs waiting a really long time
might not be the right answer, and often the semaphore take operations allow for an alarm clock
to wake them if the semaphore is not actually given. Of course what you do when you wake up
from an alarm, rather than actually receiving the semaphore, is very specific to the problem at
hand. It is impossible to suggest in this text how to properly handle the situation.
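As an illustration of the "alarm clock" idea, Java's Semaphore offers a timed variant of take called tryAcquire, which gives up after a chosen time. The helper name takeWithAlarm and the 100 ms limit below are assumptions made for this sketch.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class TimedTakeSketch {
    // Returns true if the semaphore was taken, false if the time limit expired first.
    static boolean takeWithAlarm(Semaphore sem, long millis) {
        try {
            return sem.tryAcquire(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Semaphore neverGiven = new Semaphore(0);   // nobody will ever give this one
        if (takeWithAlarm(neverGiven, 100)) {
            System.out.println("got the semaphore");
        } else {
            // What to do here is specific to the problem at hand.
            System.out.println("timed out waiting for the semaphore");
        }
    }
}
```

The return value is what lets the caller tell the two wake-up reasons apart: the semaphore was actually received, or the alarm went off.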
Critical Sections
A critical section is a part of code (or more often a set of variables) that must be accessed in a
controlled way when dealing with a multi-threaded system. Again, as previously mentioned, it
does not really matter if the multiple threads are within the same process or spread across
multiple processes. Critical code sections are actually quite difficult to visualize, so we will start
with a very simple example and build up a solution technique. We will then introduce some
more typical programming requirements.
Suppose again that we have two threads which we will call threadA and threadB. This time both
threads are responsible for doing some long calculation (the actual calculation is not relevant),
and once the calculation is finished then each thread prints some information on the screen. We
will add some additional complexity to cover a new topic.
threadA()
{
    // long calculation
    System.out.print("Thread A is finished: ");
    for (int i = 0; i < 10; i++) {
        System.out.print(i);
        System.out.print(" ");
    }
    System.out.println();
}
Let us assume that threadB looks identical except for the message “Thread B is finished: ”.
Although these examples are nonsense, they are easy to describe and introduce the concept of a
critical section of code.
First we consider what happens when we run threadA without adding threadB to the mix. This
thread will perform some long calculation and then print out the numbers 0 through 9 on a
single line with a space between each. The behavior of the program is really quite simple.
However, if we now launch both threads at the same time we might end up with the scheduler
switching back and forth several times. The result could look something like this:
Thread A is finished: 0Thread B is finished: 0 1 2 1 2 3 4 5 6 7 3 4 5 6 7 8 8 9
9
We have marked the output from threadB using italics so that you can see what is happening.
The problem is that the scheduler keeps switching back and forth between the two threads while
they are printing to the screen. This is quite common. Most operating systems will take an I/O
request as a chance to reschedule the threads or processes. Unfortunately, there are no actual
guarantees of when the rescheduling will occur. Our goal for this small program is to make
sure that the output does not get mixed up. The problem is that all of the output statements in
our threads are part of a critical section, which means that we do not want the other thread
interfering while we are trying to write to the screen.
Preventing Rescheduling
The first technique that we consider is having the thread/process make a request to the
operating system to disallow rescheduling while they are performing the output. The thread
code then looks something like this:
threadA {
    // long calculation
    Lock Rescheduler
    System.out.print("Thread A is finished: ");
    for (int i = 0; i < 10; i++) {
        System.out.print(i);
        System.out.print(" ");
    }
    System.out.println();
    Unlock Rescheduler
}
By putting a lock and an unlock around the print statements we are asking the operating system
not to do any type of scheduling operation while these statements are being executed.
This solution is completely valid, and if the other thread does the same operation the outcome
will function perfectly fine. Again as expected, this simple approach does have some drawbacks.
Suppose that we execute the program containing these two threads but there is another
application (such as Outlook) running on the same computer. If the first thread requests that
the scheduler be disabled then the other applications will not have a chance to run at all, even
though they may have no interest in the screen. The problem here is that the thread has asked
that no other threads, regardless of who they are or what they want, are allowed to run.
An alternative solution that might be suggested is to use a synchronization semaphore, and
make sure that threadA runs to completion before threadB. This is also a valid solution.
However, one should ask if that is actually a valid requirement. We have not provided any
details about the length of the “long calculation”. Suppose that we were to implement a
synchronization semaphore and always forced threadA to finish before threadB. Now suppose that
the “long” calculation in threadB takes one second, but the long calculation in threadA takes 100
seconds. Using the synchronization semaphore means that threadB has to wait 100 seconds before it
can even run. If threadB could have run after one second, maybe we should have let it.
To solve the problem in the best way it seems that we should have a race between threadA and
threadB to see who gets to the end of the long calculation first, and then once the first thread
passes a gate we lock out the other thread. We will start by trying to use a simple variable...
threadA {
    // long calculation
    while (busy == 1)
        ;   // do nothing
    busy = 1;
    System.out.print("Thread A is finished: ");
    for (int i = 0; i < 10; i++) {
        System.out.print(i);
        System.out.print(" ");
    }
    System.out.println();
    busy = 0;
}
Here we assume that a simple integer variable called busy has been created and initialized to 0
before starting. Both threadA and threadB look nearly identical with the exception of the printed
message. At first glance this code looks like it is going to solve our problem. The first thread to
finish the long calculation will see that the busy variable is zero and will then set it to 1. If the
second thread comes along it will see that the busy flag is 1 and will wait for the other thread to
clear it. There are two problems with this solution. The first problem is the while loop, which is a
form of busy waiting that burns CPU time. The second problem, which is much more important,
is that the solution does not actually work.
Suppose that threadA checks the busy flag and sees that it is zero. This means that the next
instruction is “busy = 1”. But let us suppose that just as threadA is about to change the value
to 1, the scheduler causes threadB to execute and threadB checks the value of busy. The
value of busy is still 0 because nobody has set it. The result now is that both threads are
executing within their critical section and you will end up with a messed up output. It looks like
there is actually another critical section between checking the value of the flag and setting it.
Let us look again at the use of a semaphore, because it allowed one task to block (without
consuming CPU) until another thread gave the semaphore. However, this time we are going to
initialize the semaphore just a little bit differently. Here is the pseudocode that starts
everything:
Main()
{
Create semaphore called lock
Make sure that the lock is available
Start threadA
Start threadB
}
Now for the thread code, again threadB looks identical except for the message displayed.
threadA {
    // long calculation
    lock.take();
    System.out.print("Thread A is finished: ");
    for (int i = 0; i < 10; i++) {
        System.out.print(i);
        System.out.print(" ");
    }
    System.out.println();
    lock.give();
}
In this case the first thread through the long calculation will take the semaphore, and the second
thread to complete will have to wait because the first thread already took it. The operating
system code would have ensured that if there were critical sections in the take and give code,
they would be protected against switches (likely by disabling interrupts). As soon as the first
thread gets through the critical section, the semaphore lock is given and the other thread would
be woken up and would execute.
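The locking pattern just described can be sketched in Java with a Semaphore created with one permit, so that it is available at the start. As assumptions for the example, the long calculation is omitted, a StringBuffer stands in for the screen, and the class and method names are hypothetical.

```java
import java.util.concurrent.Semaphore;

class CriticalSectionSketch {
    static final Semaphore lock = new Semaphore(1);   // 1 permit: available at the start

    // The critical section: one thread's whole line is written before the other may start.
    static void report(String name, StringBuffer out) {
        lock.acquireUninterruptibly();                // take
        try {
            out.append("Thread ").append(name).append(" is finished: ");
            for (int i = 0; i < 10; i++) {
                out.append(i).append(' ');
            }
            out.append('\n');
        } finally {
            lock.release();                           // give, even if something went wrong
        }
    }

    static String runOnce() {
        StringBuffer out = new StringBuffer();        // stand-in for the output window
        Thread threadA = new Thread(() -> report("A", out));
        Thread threadB = new Thread(() -> report("B", out));
        threadA.start();
        threadB.start();
        try {
            threadA.join();
            threadB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(runOnce());
    }
}
```

Which thread prints first is still decided by the race, but each line now comes out whole.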
This example of a program that does some calculations and prints some numbers to the screen
is just meant to be an easy to understand example. We now consider a real problem involved
with our e-mail program.
To set this up, let us suppose that an individual e-mail message is stored in a Java object of type
EmailMessage. Our e-mail program keeps the e-mail messages in a simple array of
EmailMessage, and there is an integer variable called inboxCount that keeps track of how many
messages are actually in the inbox.
EmailMessage[] inbox = new EmailMessage[1000];
int inboxCount;
Suppose that the software developer responsible for the e-mail program has decided that if
there are 10 messages in the inbox, they will be stored in the array in locations 0 through 9.
New messages that arrive will always be placed directly after the last message and the inboxCount will
be increased by 1. Suppose also that the developer has created the following member function to handle
new messages:
void newMessage(EmailMessage msg)
{
    inbox[inboxCount] = msg;
    inboxCount++;
}
So we put the new message into the array at the current count position and increment the
counter by one for the next message that comes in.
Now suppose that the software developer has decided that there needs to be a function that
removes a single message from the inbox given the “index” of the message. This code would
likely look something like this:
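void deleteMessage(int index)
{
    inbox[index] = null;                           // release the deleted message
    for (int i = index; i < inboxCount - 1; i++)   // stop before the last message so i+1 stays valid
        inbox[i] = inbox[i + 1];                   // shift later messages one place left
    inboxCount--;
}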
The function operates by releasing all of the information for the message at the given location
then shifts all the messages to the left to fill in the gap. It then finishes by reducing the number
of messages by one.
Now suppose that the software developer decides that new messages coming in should be handled
by a thread separate from the thread interacting with the user in order to make programming
easier.
Unfortunately the above code may lead to some difficulties with multiple threads. Suppose that
a user is trying to delete a message and is part way through the deleteMessage function. Just
before the inboxCount is about to be reduced a new message arrives, and the newMessage()
function is called. The fact that both the variables inbox and inboxCount are being updated by
two separate threads creates what is called a race condition. A race condition is simply a
sequence of code executed by two or more threads in which the end result depends on which
thread finishes first. In this case, if either function runs to completion without being interrupted, no harm is done.
However, if one of the threads is interrupted during the critical section, the results may not be
as expected.
This example should show that although simple code solutions work, when more than a single
thread is introduced into the solution there is a risk when it comes to shared variables.
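One possible way to protect the inbox in Java is sketched below. It uses Java's built-in synchronized keyword, which is a different mechanism from the semaphore lock discussed earlier but serves the same purpose: only one thread at a time may execute newMessage or deleteMessage on the same inbox. As assumptions for the example, String stands in for EmailMessage, and the class name and message counts are hypothetical.

```java
class InboxSketch {
    private final String[] inbox = new String[1000];   // String stands in for EmailMessage
    private int inboxCount = 0;

    // synchronized: the whole method is a critical section guarded by this object's lock.
    synchronized void newMessage(String msg) {
        inbox[inboxCount] = msg;
        inboxCount++;
    }

    synchronized void deleteMessage(int index) {
        for (int i = index; i < inboxCount - 1; i++) {
            inbox[i] = inbox[i + 1];                    // shift later messages left
        }
        inboxCount--;
        inbox[inboxCount] = null;                       // clear the now-unused last slot
    }

    synchronized int count() {
        return inboxCount;
    }

    // One thread receives mail while another deletes it; returns the final count.
    static int demo() {
        InboxSketch box = new InboxSketch();
        for (int i = 0; i < 500; i++) {
            box.newMessage("old " + i);
        }
        Thread receiver = new Thread(() -> {
            for (int i = 0; i < 400; i++) box.newMessage("new " + i);
        });
        Thread user = new Thread(() -> {
            for (int i = 0; i < 300; i++) box.deleteMessage(0);
        });
        receiver.start();
        user.start();
        try {
            receiver.join();
            user.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return box.count();   // 500 + 400 - 300
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Without the synchronized keyword the same demo can lose or duplicate messages, and the final count may differ from run to run.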
Message Queues
A popular mechanism provided by many operating systems for the purpose of inter process
communication is the concept of a message queue. In many multi-threaded programs there is
often a lot of information that one thread will want to exchange with another thread rather than
just doing synchronization.
In order to send actual data from one thread to another, most operating systems provide a
mechanism known as a message queue or a pipe. In a message queue there is always at least
one sender and one receiver. The sender creates a message and puts it into a queue for
delivery by the operating system. The receiver listens to the queue for messages and will often
block waiting for messages to arrive. Once the message arrives, the receiver is woken up and
the message is processed.
As an example we will slightly reconsider our e-mail program. Suppose that we have a single
thread that is responsible for adding and removing e-mail from the inbox. We have seen that by
having only one thread we remove critical sections. Is it possible in this type of solution to still
have a separate thread that checks for new messages? Yes! A solution is to configure a
message queue between the checking thread and the inbox manager thread as follows:
The main message processor will accept messages either from a receiving thread or from a user
interface thread. The processing thread can only process one single message at a time and its
pseudocode would look like this:
Forever(;;) {
Wait for message from queue
If (message is delete)
Call delete function
Else if (message is new)
Call new message function
}
The message queues in many operating systems will allow for the listener to time-out if that is
important for the program being developed. Most operating systems send data across their
message queues in a first-in-first-out (FIFO) arrangement, and some allow for queue jumping so
that important messages can be inserted at the very front of the list.
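Within a single Java program, a message queue between threads can be approximated with a BlockingQueue. The sketch below follows the processing loop above under some assumptions: messages are plain strings ("new" and "delete"), a hypothetical "stop" message ends the loop so the example terminates, and a counter stands in for the real inbox.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MessageQueueSketch {
    // Sends the given commands through a queue to a processing thread;
    // returns the number of messages left in the (simulated) inbox.
    static int process(String... commands) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();   // first-in-first-out
        int[] inboxCount = {0};              // touched only by the processor thread
        Thread processor = new Thread(() -> {
            try {
                while (true) {
                    String msg = queue.take();          // blocks until a message arrives
                    if (msg.equals("delete")) {
                        inboxCount[0]--;                // "call delete function"
                    } else if (msg.equals("new")) {
                        inboxCount[0]++;                // "call new message function"
                    } else if (msg.equals("stop")) {
                        return;                         // hypothetical shutdown message
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        processor.start();
        for (String command : commands) {
            queue.add(command);              // sender side: messages may back up in the queue
        }
        queue.add("stop");
        try {
            processor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return inboxCount[0];
    }

    public static void main(String[] args) {
        System.out.println(process("new", "new", "delete", "new"));
    }
}
```

Because only the processor thread ever touches the inbox, no lock is needed around it. For a listener that should time out, BlockingQueue also provides poll with a timeout in place of take.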
Most operating system message queues provide for a ―backlog‖ of messages. If a sender quickly
sends ten messages, the receiver must request each of the messages to remove them from the
queue. This leads to a significant problem in that if the receiver is not able to process the
messages fast enough, you can end up with some strange results. Most people have probably
used a computer at one time that was not responding even though you were typing. After a few
seconds, the program suddenly responded and all of the characters typed appeared instantly.
This is an example where the typed characters were put into a message queue but the receiver
(the program) was busy doing other tasks.
Figure 3.7
Configuring a message queue
Process Scheduling Algorithms
In this section we will describe the common algorithms that are used for scheduling processes
and threads. Here the word scheduling will refer to the task of deciding which process or thread
should be selected from the process table and put into the Run state.
Most current operating systems actually implement thread scheduling so that it does not matter
if the two threads are in the same process or different processes. In this section we will try to
use only the term “process scheduling.” However, if the term thread is mentioned it really does
mean the same thing.
Scheduler and the Process Table
The scheduler is a piece of software code (it is generally a function) that is included as part of
the operating system. The scheduler is often tied to the periodic clock interrupt and to most of
the operating system calls. By connecting to both a clock and the I/O, it means that processes
can be scheduled each time there is a clock tick and they can be scheduled each time that the
process talks to the OS (such as when it asks to take a semaphore, asks to sleep, or performs
an I/O operation).
Each time that the scheduler code is executed, the code will examine the process table and
make a decision (based on what it reads from the process table) about which process should be
run next. Over the next few sections we may realize that some additional items are required to
be added to the process table to help the scheduler.
Round-Robin Scheduling
If a computer has four processes currently running, it would seem logical that one of the fairest
ways to divide the CPU time is to give each process a turn in a certain order and continue to use
that order. For example if the processes are called A, B, C, and D then the order could be: A,
B, C, D, A, B, C, D, A, B, etc. This type of scheduling is called round-robin scheduling. Most
modern operating systems provide this type of scheduling algorithm and it is one of the easiest
algorithms to understand.
A common way of showing the algorithm is to create a small table that looks like this:
Process  T1  T2  T3  T4  T5  T6  T7  T8  T9
A        X               X               X
B            X               X
C                X               X
D                    X               X
The labels T1, T2, etc. refer to small units of time called time slices or quanta. The actual
number of seconds (or more likely milliseconds) is dependent on the operating system but can
sometimes be configured. Typical values for the time slices are on the order of 10 ms.
Each process is given a number of milliseconds to execute and then interrupted, and the
scheduling algorithm will pick the next process in the list. However, if the process blocks (waits
for a semaphore or sleeps) the scheduler would wake up early and schedule the next process.
While a process is in the blocked state it will not be considered until it gets back into the ready
state.
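The round-robin rule can be simulated in a few lines of Java. The sketch below makes some simplifying assumptions: each process needs a known number of time slices, no process ever blocks, and a process that finishes simply leaves the list. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RoundRobinSketch {
    // Returns the order in which the processes receive time slices.
    static String schedule(String[] names, int[] slicesNeeded) {
        int[] remaining = slicesNeeded.clone();
        Deque<Integer> ready = new ArrayDeque<>();      // processes in the Ready state
        for (int p = 0; p < names.length; p++) {
            if (remaining[p] > 0) {
                ready.addLast(p);
            }
        }
        StringBuilder order = new StringBuilder();
        while (!ready.isEmpty()) {
            int p = ready.removeFirst();                // next process in line gets the CPU
            order.append(names[p]);                     // it runs for one time slice
            remaining[p]--;
            if (remaining[p] > 0) {
                ready.addLast(p);                       // not finished: back to the end of the line
            }
        }
        return order.toString();
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C", "D"};
        int[] slicesNeeded = {3, 1, 2, 2};
        System.out.println(schedule(names, slicesNeeded));   // prints ABCDACDA
    }
}
```

With these numbers, B finishes in its first slice, so the second round visits only A, C and D, and A alone needs a third round.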
The size of the time slice is quite important. If you pick a really large time slice (say ten
seconds), then it means that each process gets to execute for up to ten seconds. If Windows
used a ten second time slice, then it means when you click on the start menu (part of the
Explorer process) you might have to wait up to ten seconds for Word to finish what it is doing
before seeing the menu appear. On the other extreme, if you set the time slice to a really small
value (such as 1 ms) then it means that the scheduler is executing 1,000 times per second.
When the scheduler code is busy selecting a process it means that no other process can be
running. The result is an inefficient system that spends more time trying to decide what to do
than actually doing something.
Priority-Based Scheduling
The round-robin type scheduling appears to be
a very fair algorithm, and in most cases the use
of this algorithm works well. However, there
are some situations where another form of
scheduling leads to a better system from the
user’s point of view.
Most people have probably used a computer at
one time when the computer did not seem to
be responding. While using MS Word you may
find yourself typing a few keys but nothing
appears on the screen for a few seconds, then
suddenly all of the characters you typed appear
at the same time. Although it might be okay if
this happens just once a month, it should not
be happening every few minutes. While the
user is typing they generally like to have some
sort of instant feedback so they know their
keys are being accepted. If it took five seconds
for your phone to show the digit you pressed
when dialing a number, you would find this
quite unacceptable.
Unfortunately, with round-robin scheduling
instant feedback is not possible unless we
choose a very small time slice and we do not
have a lot of processes. While your computer (or mobile) is running another process, we need
some way of having the user application interrupt other applications. In order to keep track of
which processes are more important, the process table generally contains an entry called the
priority. Each time the scheduling algorithm is executed, the scheduler will look at the process
table for the process that is in the Ready state and pick the process with the highest priority.
Figure 3.8
Windows Task Manager Showing Process Priorities
An operating system could allow the process itself to pick its own priority, or the priority could
be picked by the operating system. In some operating systems, priorities are simply low,
medium or high. In other operating systems, the priorities may be arranged between 0 and 255
with 0 being the highest and 255 being the lowest (the meanings could be reversed with a larger
number being higher in priority). What do we do if there are two processes with the same
priority? The usual implementation is to utilize round-robin scheduling for all processes that are
of equal priority.
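The selection rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the code of any real operating system: the class name, the method names and the process names are hypothetical, and we assume that a larger number means a higher priority.

```python
from collections import deque


class PriorityScheduler:
    """Toy scheduler: highest priority first, round-robin among equals."""

    def __init__(self):
        # Maps a priority number to a queue of Ready process names.
        # Assumption: a larger number means a higher priority.
        self.ready = {}

    def make_ready(self, name, priority):
        """Put a process at the back of the queue for its priority level."""
        self.ready.setdefault(priority, deque()).append(name)

    def pick_next(self):
        """Return (name, priority) of the process to run next, or None."""
        if not self.ready:
            return None
        top = max(self.ready)          # highest priority that has a Ready process
        queue = self.ready[top]
        name = queue.popleft()         # front of the queue = waited the longest
        if not queue:
            del self.ready[top]
        return name, top


sched = PriorityScheduler()
sched.make_ready("editor", 2)
sched.make_ready("backup", 1)
sched.make_ready("browser", 2)

order = []
for _ in range(4):
    name, prio = sched.pick_next()
    order.append(name)
    # Pretend the time slice expired while the process was still runnable,
    # so it rejoins the back of the queue for its priority level.
    sched.make_ready(name, prio)

print(order)  # ['editor', 'browser', 'editor', 'browser']
```

Notice that the two priority 2 processes take turns, while the priority 1 process is never selected as long as they remain ready.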
When a process is put into the Run state, it will get to run until one of the following things
happens:
1. it finishes,
2. it blocks,
3. a higher priority process becomes ready.
What if the process never blocks or finishes? Then the process gets to monopolize the CPU and
never allows any other process to run. When a process never gets to run because a high
priority process is using all of the time, it is called starvation. Starvation is a danger of any
operating system relying on priorities for scheduling and it requires careful programming to
ensure that it does not happen. You may be wondering how a higher priority process would
ever become ready if another process is running all of the time. As long as the currently running
process eventually waits or gives a semaphore that the higher priority process is waiting for,
then the high priority process gets to run.
In most operating systems, processes started by the user (application) are assigned the same
priority, and generally the processes created by the operating system are provided a higher
priority so that the developers do not need to worry about this aspect. In the Windows Task
Manager, it is possible to change the priority of the processes. Linux also allows for the user to
lower the priority of a process.
Case Study: Priority Does Not Mean Importance
A nuclear power plant produces power by having a nuclear reaction heat up water which
generates steam and the steam is used to turn a turbine. To slow down the reaction the plant
can insert control rods into the reactor. If the reactor gets too hot, you might cause a nuclear
meltdown (you do not want this to happen!). Putting in the control rods is usually done by a
robot and takes a certain amount of time.
Suppose there is a sensor on the reactor that notifies a controlling computer when the
temperature gets too hot. Also let us pretend that the computer program to move the control
rods into place takes 30 seconds to complete, but if you do not get the task done in 60 seconds
there is a meltdown.
Meanwhile, inside the break room at the power plant is a coffee maker. The coffee maker has a
sensor which can detect when the pot is about to overflow, and it sends a signal to a computer
program which is responsible for switching off the pot. It takes one second for the computer to
switch off the pot, but if you do not switch it off within five seconds the pot will overflow.
Due to cost cutting, the owner of the nuclear power plant has decided to buy only one computer
to control both the reactor rods and the coffee pot! You of course have two processes running,
one for the coffee and the other for the reactor. Which process should have the higher priority?
We will assume that both alarms happen at the same time.
Scenario One: The Control Rods get Higher Priority
This makes sense because preventing a meltdown is a lot more important!
1. Time = 0s. Both alarms trigger. Control rod robot activated, ignoring coffee pot.
2. Time = 5s. Coffee pot starts overflowing.
3. Time = 30s. Control rods in place. Lots of water on the floor. Coffee pot is instructed to
switch off.
4. Time = 31s. Coffee pot is off.
Is this a good solution? The plant did not melt down, therefore this is good. The coffee pot
overflowed which means there is a mess in the break room, but we only need a mop.
Scenario Two: The Coffee Pot gets Higher Priority
This sounds like a silly idea but consider the sequence of
events.
1. Time = 0s. Both alarms sound. Coffee pot starts
to be switched off.
2. Time = 1s. Coffee pot is switched off. Reactor rod
robot started.
3. Time = 31s. Reactor rods inserted and meltdown
prevented.
In this solution, the control rods were inserted well within
the required time and we also avoided the flooded break
room.
Conclusion
Processes with important responsibilities do not always need the highest priority. The selection
of priorities is based on a combination of how quickly a process must react and how long it will take
to complete.
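The two scenarios can be checked with a small calculation. The sketch below assumes, as the case study does, that both alarms fire at time zero and that each task runs to completion once started. The function name and the tuple layout are hypothetical.

```python
def run_in_priority_order(tasks):
    """Run tasks back to back, highest priority first.

    tasks: list of (name, run_time_s, deadline_s) tuples.
    Returns {name: (finish_time_s, deadline_met)}.
    """
    clock = 0
    results = {}
    for name, run_time, deadline in tasks:
        clock += run_time
        results[name] = (clock, clock <= deadline)
    return results


rods = ("control rods", 30, 60)   # takes 30 s, must finish within 60 s
coffee = ("coffee pot", 1, 5)     # takes 1 s, must finish within 5 s

scenario_one = run_in_priority_order([rods, coffee])
scenario_two = run_in_priority_order([coffee, rods])

print(scenario_one)  # coffee pot finishes at 31 s and misses its deadline
print(scenario_two)  # both tasks finish before their deadlines
```

Giving the short, urgent task the higher priority lets both deadlines be met, which is the point of the case study.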
Other Scheduling Algorithms
There are a number of other scheduling algorithms that are described in various texts, but they
are often based on older operating systems where programs ran without a user sitting in front of
the computer. We include a brief description of each of these for completeness.
First Come First Serve
In this algorithm, there is no pre-emptive switching. When a process is scheduled it runs until it
is completely finished (or perhaps until it blocks). The idea is quite simple: once a process is
started, the most efficient thing to do is to let it run until it is completely finished before allowing
another process to run.
This scheduling algorithm does not work for situations where two processes are working
together to solve some problem. However, from a purely efficiency point of view it works well, and
there is no fear of having one process being interrupted by any other process.
In a system running this type of scheduling, the process table is created with a list of all the
processes, which are then run without modification until all tasks are done (new tasks can be added only
between processes).
Shortest Task Remaining
The idea for this algorithm is to figure out which process will finish in the shortest amount of time
and schedule the processes in an order that finishes as quickly as possible. It tends to favor
short processes over longer processes. The danger is that a lot of short processes could easily
starve a longer process. It is also not practical because you cannot actually know for certain
how much more time is remaining for a given process.
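To see why running the shortest tasks first finishes work sooner on average, we can compare it with first come first serve on a small made-up workload. The sketch assumes all tasks are ready at time zero and that their run times are known in advance, which, as noted above, a real system cannot know for certain. The run times are hypothetical.

```python
def completion_times(run_times):
    """Run tasks back to back in the given order; return each finish time."""
    clock = 0
    finished = []
    for t in run_times:
        clock += t
        finished.append(clock)
    return finished


arrival_order = [8, 1, 3]   # hypothetical run times, in seconds

fcfs = completion_times(arrival_order)                 # first come first serve
shortest_first = completion_times(sorted(arrival_order))

print(fcfs, sum(fcfs) / len(fcfs))                               # [8, 9, 12]
print(shortest_first, sum(shortest_first) / len(shortest_first))  # [1, 4, 12]
```

Both orders finish all of the work at 12 seconds, but the short tasks finish much earlier when they go first. The long task always goes last, which is how it can starve if short tasks keep arriving.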
Scheduling and Blocking
The scheduling algorithms presented earlier in this section have completely ignored tasks which
are currently blocked. We finish the section on scheduling by looking at one particular problem
of blocking from a scheduling point of view.
Suppose we have the following processes:
Process A (high priority)
    Wait for 10 seconds
    Take semaphore S (block until it is available)
    Do calculation
Process B (medium priority)
    Sleep 5 seconds
    Do calculation that takes 3000 seconds
Process C (low priority)
    Take semaphore S
    Do calculation that takes 10 seconds
    Give semaphore S
Suppose that all three processes are started at exactly the same time and that the semaphore S
is available to the first process that requests it.
As soon as all processes are started the process table will look like this:
Name         Priority   State
Process A    High       Ready
Process B    Medium     Ready
Process C    Low        Ready
This means that the scheduler will select Process A because it is the highest priority process that is
'Ready'. Process A will execute, but the first thing that it does is wait for 10 seconds. This means
that its state will become 'Blocked'.
Now the scheduler will wake up and select Process B because it is the highest priority process that is
ready to run, but as soon as Process B runs its state turns to 'Blocked' because of the Sleep. Next,
Process C gets a chance to run, and it takes the semaphore S and starts doing some
calculations. Process C does not block because it is running some long calculation.
After five seconds, Process B wakes up and runs its 3000 second calculation. Remember that
Process C is still 'Ready' because it has another five seconds to go and it is also holding
semaphore S.
At ten seconds, Process A wakes up from its sleep but immediately goes into a blocked state
because it requests semaphore S (but C has it). The scheduler then runs Process B because it
is the process with the highest priority that is available.
The problem here is that Process A has the highest priority and wants to run, but it cannot
because Process C has a semaphore it needs. But Process C is not able to run because Process
B is monopolizing the CPU. This problem is called priority inversion because the medium priority
process is preventing the high priority process from running.
Solutions to this problem are not covered in this text, but two observations are noted:
1. There is a shared semaphore between a high and low priority task; this is generally a bad
idea.
2. The task in the middle has a higher priority than the task holding the semaphore, but it
monopolizes the CPU for a relatively long period of time. Such a lengthy task should probably
have had a lower priority.
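The timeline above can be reproduced with a small simulation. This is a simplified sketch under several assumptions that are not in the original example: time advances in one-second ticks, sleeping and semaphore operations take no CPU time, a larger number means a higher priority, and Process A's final calculation takes one second. The function name and data layout are hypothetical.

```python
def simulate(b_compute, horizon=5000):
    """Return the second at which each process finishes (1-second ticks)."""
    prio = {"A": 3, "B": 2, "C": 1}       # larger number = higher priority
    steps = {
        "A": [["sleep", 10], ["take"], ["compute", 1]],
        "B": [["sleep", 5], ["compute", b_compute]],
        "C": [["take"], ["compute", 10], ["give"]],
    }
    wake = {name: 0 for name in steps}    # time at which a sleeper wakes up
    holder = None                         # current owner of semaphore S
    done = {}

    def runnable(name, now):
        if name in done or wake[name] > now:
            return False
        # Blocked if it wants S while some process is holding it.
        return not (steps[name][0][0] == "take" and holder is not None)

    for now in range(horizon):
        used_cpu = False
        while not used_cpu:
            ready = [n for n in steps if runnable(n, now)]
            if not ready:
                break                     # the CPU idles for this second
            name = max(ready, key=prio.get)
            step = steps[name][0]
            if step[0] == "compute":
                step[1] -= 1              # one second of CPU time
                used_cpu = True
                if step[1] == 0:
                    steps[name].pop(0)
            else:
                if step[0] == "sleep":
                    wake[name] = now + step[1]
                elif step[0] == "take":
                    holder = name
                elif step[0] == "give":
                    holder = None
                steps[name].pop(0)        # these steps cost no CPU time
            if not steps[name]:
                done[name] = now + 1 if used_cpu else now
    return done


print(simulate(3000))   # A cannot finish until after B's long calculation
print(simulate(1))      # with a short B, A finishes soon after C gives S
```

With the 3000 second calculation in Process B, the high priority Process A finishes only after both B and C have finished. Shortening B's calculation to one second lets A finish at about twelve seconds, which shows that the delay comes from the medium priority process and not from A itself.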
Unit Summary
Processes are instances of programs that are currently being run in a computer system. In
order to improve the efficiency of process execution and CPU scheduling, processes are often
broken down into smaller units called threads. Processes and threads are managed by the
operating system using a Process Table, which lists all processes and threads and their current
state. A process or thread can either be running, ready to run, or blocked (which means that
it will be ignored by the process scheduler until its state has been changed back to
"Ready"). Processes can also be marked in the Process Table as either "new" or "exit," which is
a form of blocking preventing them from being executed because they are not actually ready.
A number of different strategies are used to allow processes and threads to exchange
information so that they can synchronize themselves, or arrange themselves to be scheduled
only in the correct sequence (or when specific required information is available). The most
common methods of inter process communication include the use of signals (often called
semaphores) and message queues. When exchanging information between processes or
threads, certain critical sections of code must be handled carefully to make sure that they are
not corrupted, that they are accessible when needed by other processes or threads, and that the
desired outcome is achieved by the instructions being executed.
The scheduling of processes can be handled using a variety of algorithms, but the most common
methods are to handle all processes or threads in sequence (round-robin scheduling), or to
schedule based on process priority (priority-based scheduling).
Key Terms
Algorithm
Blocked
Blocking
Critical section
Dispatch
Dual-core
Exit
First come first serve
fork()
Give
Inter process communication
Interrupt
Message queues
Multicore
Multiprocessor
Multitasking
Multithreading
New
Non-deterministic solution
Non-preemptive switching
Pipe
Polling
Preemptive switching
Priority
Priority inversion
Priority-based scheduling
Process
Process scheduling
Process table
Processor preservation
Quad-core
Quantum
Race condition
Raise
Ready
Round-robin scheduling
Running
Scheduler
Self-yield
Semaphore
Shared memory
Shortest task remaining
Signal
Sleep
Starvation
State changing
State machine
Synchronization
Synchronization variable
Take
Thread
Time slice
Unmanaged exception
Variable
Wait
Review Questions
1. What is a thread? What is the relationship between a process and its threads?
2. Define multitasking and multithreading.
3. List and describe five major process states.
4. Draw a diagram of a typical state machine, showing the five major process states.
5. List and describe two common problems that may overload a CPU during program
execution.
6. Why is inter process communication important?
7. Briefly define Process Synchronization.
8. What is a Critical Section?
9. Describe the purpose of a signal/semaphore.
10. Describe the purpose of a message queue.
11. Fully describe how the "round-robin scheduling" and "priority-based scheduling"
strategies work.
12. With regard to the "round-robin scheduling" and "priority-based scheduling"
strategies, give some situations where one strategy works better than the other.
Unit 4: Memory Management
What is Memory Management?
Memory is vital to the functioning of any computer or
computerized device. As discussed in Units 1 and 2,
memory consists primarily of RAM chips that are installed
on the motherboard of a computer. A typical computer
now has between 1 GB and 4 GB of RAM. RAM is divided
into segments (typically 32 bits) that are used to
temporarily store data and instructions that are being
used by the computer. This is different from your hard
drive, which permanently stores files and programs. RAM
is much faster than your hard drive, so your data and
instructions are loaded from your hard drive into RAM
when the computer is using them. When you turn off the power, everything stored in RAM is
gone. In addition to the physical RAM installed in your computer, most modern operating
systems allow your computer to use a virtual memory system. Virtual memory allows your
computer to use part of a permanent storage device (such as a hard disk) as extra memory.
Memory resources are managed by the operating system. The operating system is responsible
for allocating memory address ranges as needed to run applications and processes. In this unit,
we will look at the function of the memory manager in an operating system, and the types of
problems that the memory manager must resolve. We will look at some of the techniques that
operating systems can use to actually allocate memory, and some of the problems that can
occur when using specific memory allocation strategies. We will also look at the purpose of
virtual memory and how it works. This will include a look at page files, the page table, and the
page replacement policies used by operating systems to manage virtual memory.
The Memory Manager
The memory manager is part of the kernel (core) of the operating system. It is responsible for
efficiently managing all memory resources including RAM and virtual memory, and allocating
memory space to applications and processes as needed. The memory manager must keep track
of all memory addresses and what they are currently being used for. It must also protect those
address ranges, and the data and instructions stored in them, from data and instructions being
used by other processes. The memory manager is also responsible for freeing up memory when
it is no longer being used so that other processes can use it.
The memory manager is responsible for four critical tasks:
1. Allocating main memory to processes
2. Retrieving and storing the contents to and from main memory when requested
3. Effective sharing of main memory
4. Minimizing memory access time
Efficient Memory Management
A well designed operating system should make access and the management of resources as
efficient as possible for both the user and the system. Efficient memory management includes
keeping track of all memory resources that have been allocated to different processes. It also
includes using different strategies for allocating free memory as it is needed by new processes.
When it comes to the allocation of memory spaces and the retrieval of data from memory, you
can compare the memory manager to the local telephone company. There are many ways that
the phone company can keep track of telephone numbers that have been assigned to customers.
One way is to keep a sequential list of all phone numbers, along with the names of the
subscribers. Another way is to keep track of the names of subscribers, along with their assigned
phone numbers (like a phone book). An efficient memory manager would use both types of
strategies to keep track of data stored in RAM. The first strategy is useful for examining address
ranges and finding free memory that is available for use. The second strategy is useful for
examining individual processes, and keeping track of what memory each process is using.
An operating system's memory manager is also responsible for using different strategies to
allocate free memory to new processes. Each strategy has its advantages and disadvantages.
The aim of the operating system is to use the most effective strategy to both minimize the time
needed to save and access data, and to maximize the amount of usable space left in memory.
You can compare this task to parking vehicles in a parking lot. For example, you could park a
motorcycle in the first available space. If it is a very big space, then you will end up reducing
the maximum number of vehicles you can park in the lot, because the rest of the space may not
be quite big enough to fit another vehicle. Conversely, you could try looking for the smallest
possible parking space where the motorcycle will fit. The drawback here is that it might take
you longer to find such a spot.
There are three main strategies that the memory manager can use when allocating free memory
to processes:
Best Fit
o Find the smallest free memory block that will fit the process needs
o The idea is to minimize wastage of free memory space
Worst Fit
o Find the largest free memory block that will fit the process needs
o The idea is to increase the possibility that another process can use the left-over space
First Fit
o Find the first space to fit the memory needs
o The idea is to minimize the time needed to analyze the memory space available
Figure 4.1
Using memory allocation strategies (first fit, worst fit and best fit) to assign
12KB of memory
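The three strategies can be compared with a short sketch. The list of free block sizes below is made up for illustration and is not taken from Figure 4.1. Each function returns the position of the chosen block in the list, or None if no block is large enough.

```python
def first_fit(free_blocks, size):
    """Index of the first free block that is large enough."""
    for i, block in enumerate(free_blocks):
        if block >= size:
            return i
    return None


def best_fit(free_blocks, size):
    """Index of the smallest free block that is large enough."""
    fits = [(block, i) for i, block in enumerate(free_blocks) if block >= size]
    return min(fits)[1] if fits else None


def worst_fit(free_blocks, size):
    """Index of the largest free block that is large enough."""
    fits = [(block, i) for i, block in enumerate(free_blocks) if block >= size]
    return max(fits)[1] if fits else None


free_kb = [10, 20, 40, 14, 30]   # hypothetical free block sizes, in KB
request_kb = 12

for strategy in (first_fit, best_fit, worst_fit):
    i = strategy(free_kb, request_kb)
    print(strategy.__name__, "uses the", free_kb[i], "KB block and leaves",
          free_kb[i] - request_kb, "KB")
```

With these sizes, first fit leaves an 8 KB piece, best fit leaves only 2 KB, and worst fit leaves 28 KB. First fit stops at the first suitable block, while the other two must examine every free block.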
Fragmentation
The use of particular memory allocation strategies may
result in wasting valuable free memory. That is because
using either the best fit or first fit strategies may leave
small chunks of free memory that are not large enough to
be useful for any other processes (like parking a
motorcycle in a parking space, and not having enough
space left for another vehicle). The allocation and
de-allocation of memory creates a condition called
fragmentation. This means that there are lots of small
fragments (or holes) of free memory that cannot be used
by any other process. This results in a reduction in the
amount of total available memory.
The worst fit memory allocation strategy attempts to
minimize fragmentation by finding the largest chunk of
free memory available to allocate to a process. By doing
this, the memory manager tries to ensure that the unused
memory "holes" it leaves are as large as possible. The
goal is to leave as many big chunks of memory as
possible so that as many fragments as possible are large
enough to be used by other processes.
Relocation
Another task that must be handled by the
Memory Manager is the relocation of applications
in memory. Applications have certain memory
requirements when they are loaded. However, it
is not possible for the application to know in
advance which memory range it will be
allocated, because the application itself does not
know what other processes will be running, how
much physical memory the system will have
available to it, and whether or not some of its
required memory will be comprised of virtual
memory pages (see next topic). For this reason,
the application will only make reference to
"relative" or logical address ranges. When the
application is loaded into memory as a process, the operating system will allocate the
actual address range based upon the relative ranges provided by the application, and the actual
memory resources available. As demonstrated in Figure 4.3, the process that is loaded
will provide a logical address, which is combined with a base address provided by the operating
system, to determine its actual location in memory. The application itself will be unaware of its
location in memory. However, once a process has been started, it is not possible for the
Memory Manager to relocate that process in memory unless it is aware of the relocation, and the
new base address.
Figure 4.2
The worst fit strategy assigns the largest free memory chunk so that it leaves behind
the largest possible fragment
Figure 4.3
Allocation of Physical Memory
(Logical Address, from the program + Base Address, from the OS = Physical Address;
the addition is done by the CPU)
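The address calculation shown in Figure 4.3 is a single addition. The sketch below also checks the logical address against a size limit, which is an assumption added here to illustrate how address ranges can be protected. The function name, the limit parameter and the example addresses are all hypothetical.

```python
def to_physical(logical_address, base_address, limit):
    """Translate a logical address into a physical address.

    base_address: where the OS placed the process in memory.
    limit: size of the process's address range (assumed extra check).
    """
    if not 0 <= logical_address < limit:
        raise ValueError("logical address outside the process's range")
    return base_address + logical_address


# The program only ever refers to logical address 0x0100.
# The same program loaded at two different base addresses:
print(hex(to_physical(0x0100, base_address=0x4000, limit=0x1000)))  # 0x4100
print(hex(to_physical(0x0100, base_address=0x9000, limit=0x1000)))  # 0x9100
```

The program's own addresses never change. Only the base address supplied by the operating system differs between the two loads.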
Virtual Memory
Despite the fact that most modern computers have between 1 GB and 4 GB of RAM, it is possible
to run out of memory. This can happen when you are running too many processes at the same
time, or when a process requires more memory than is currently available (unused). One
solution to this problem is to use virtual memory.
Virtual memory means that the operating system's memory manager will use a portion of
another storage device to act as if it is extra RAM. In most cases, this virtual memory space will
be on a permanent storage device such as a hard disk. However, some newer operating
systems (such as Windows Vista and Windows 7) also allow you to use a removable media
device like a USB flash drive as virtual memory (called Windows ReadyBoost). The memory
manager will manage this resource so that it gives the illusion that your computer has more RAM
than is actually installed.
How Virtual Memory Works
Virtual memory works by breaking down large programs into smaller units called pages. The
memory manager will store these pages on an area of the storage device (hard disk) that has
been allocated for use as virtual memory. This area is called a Page File (or Paging File).
All of the pages that are currently loaded in main memory (RAM) will be listed in a Page Table,
which is maintained by the memory manager. If a process requests a page that is not currently
loaded into RAM (not listed in the Page Table), then a page fault will be generated. The memory manager
will attempt to resolve the page fault by locating the requested page in virtual memory and then
loading it into RAM (and listing it in the Page Table). The memory manager will then try to
re-execute the instruction that caused the fault.
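The page fault sequence described above can be sketched as follows. This is a simplified model, not a real memory manager: the class and attribute names are hypothetical, and we assume that RAM always has a free frame, so no page ever needs to be swapped out.

```python
class MemoryManager:
    """Toy model of a page table lookup with page fault handling."""

    def __init__(self, pages_on_disk):
        self.page_file = set(pages_on_disk)   # pages stored in the page file
        self.page_table = {}                  # page number -> frame in RAM
        self.page_faults = 0

    def access(self, page):
        """Return the RAM frame holding the page, loading it if needed."""
        if page not in self.page_table:
            # Page fault: the page is not currently loaded in RAM.
            self.page_faults += 1
            if page not in self.page_file:
                raise KeyError(f"page {page} does not exist")
            # Load the page into the next free frame and list it
            # in the page table (assumes RAM is never full).
            self.page_table[page] = len(self.page_table)
        # The original request can now be retried and will succeed.
        return self.page_table[page]


mm = MemoryManager(pages_on_disk=[1, 2, 3])
print(mm.access(2), mm.page_faults)   # first access causes a page fault
print(mm.access(2), mm.page_faults)   # second access finds the page in RAM
```

Only the first access to a page generates a fault. Later accesses find the page in the page table and are served directly from RAM.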
Pages, Virtual Addresses and Physical Memory
Figure 4.4 shows how pages, virtual
addresses and physical memory are combined to
create a virtual memory system. The memory
manager maintains both a virtual memory space
and a Page Table. All processes and data that are
loaded into either main memory or the page file
(hard disk) are given addresses in the virtual
memory space. The Page Table keeps track of
the locations of all the pages. Although pages
may be located on different physical devices, the
memory manager treats them as if they are all
contained in one large memory system. When a
page is actually needed for processing, it is
loaded into main memory (RAM). Older pages
that are not currently being processed may be
swapped from RAM back to the paging device
(usually a hard disk).
Figure 4.4
Combining Elements to Create a Virtual Memory System
(The virtual memory space is divided into 4 K ranges from 0-4 K to 20-24 K, holding pages
of Processes A, B and C. The page table maps each range either to physical memory (RAM)
or to the paging device (hard disk).)
Swapping Pages
Whenever main memory (RAM) is full, or when a page is requested that is located in virtual
memory, the memory manager needs to swap old or unused pages from RAM into virtual
memory in order to make space for the new pages being loaded. There are a number of
strategies used by the memory manager to handle this task. These strategies are often called
replacement policies. There are five main replacement policies that are used by operating
systems:
Random Replacement
First In First Out (FIFO)
Second Chance
Least Recently Used
Least Frequently Used
Each of these replacement policies uses different algorithms for selecting pages to be swapped
from RAM into the page file.
Random Replacement
Replace pages in main memory randomly.
On the average, does not work well.
FIFO
Uses a queue data structure to keep track of the pages in main memory.
Oldest page at the front (head) and newest page at the back (tail).
Always replace (get rid of) the oldest page.
Does not always work, because the oldest page may still be used by the process.
Second Chance
A variation of FIFO that addresses the problem with FIFO.
All pages in the page table are tracked to see if they have been referenced recently by a
process.
A Reference (R) bit for each page is used for this purpose:
o R = 1 when the page is being referenced.
o R = 0 when the page has not been used after a time period.
o The OS will periodically check the (R) bit for each page and move the pages whose
(R) = 1 to the tail of the queue, thus giving them a second chance.
Least Recently Used (LRU)
Replace the page in main memory that has not been used for the longest time.
Least Frequently Used (LFU)
Counters are used to record the number of times each page has been used.
The pages that have been used the least (lowest count) are replaced.
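As an illustration of one of these policies, the sketch below counts hits under Least Recently Used. It is a minimal model using Python's OrderedDict to remember the order in which pages were last used. The function name and the request sequence are hypothetical.

```python
from collections import OrderedDict


def lru_hits(sequence, frames):
    """Count how many page requests are found in RAM under LRU."""
    ram = OrderedDict()        # pages in RAM, least recently used first
    hits = 0
    for page in sequence:
        if page in ram:
            hits += 1
            ram.move_to_end(page)          # now the most recently used
        else:
            if len(ram) == frames:
                ram.popitem(last=False)    # evict the least recently used
            ram[page] = True
    return hits


# Page 1 is used again just before page 4 arrives, so LRU keeps it
# and evicts page 2 instead.
print(lru_hits([1, 2, 3, 1, 4, 1], frames=3))  # 2
```

FIFO would have replaced page 1 when page 4 arrived, because page 1 was loaded first, and the final request for page 1 would then have been a page fault.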
Hit Ratios: Determining Which Replacement Policy to Use
Using each of the five main replacement policies to swap pages between main memory and the
page file will result in different hit ratios. A hit ratio is the fraction of page requests for which
the page is actually found in main memory, as opposed to generating a page fault (requiring
the memory manager to retrieve the page from virtual memory). To calculate the hit ratio, we
divide the number of requests that did not cause a page fault by the total number of page
requests (the number of times that data has been sent between the CPU and main memory).
Remember, page faults occur when a page is requested by the CPU that is not currently in the
page table (in other words, not currently in RAM). The memory manager resolves the page fault
by swapping an old page from RAM to the page file (on the hard disk), then retrieving the
requested page from virtual memory to RAM, and then re-executing the instruction. Since there
are extra steps involved in order to execute the instruction, and since retrieving a page from a
hard disk is slower than RAM, page faults result in longer processing time. We use hit ratios to
determine which replacement policy will result in the fewest page faults, and the fastest overall
processing time. The policy that produces the lowest number of page faults (and therefore the
highest hit ratio) is usually the policy we want to use.
Calculating Hit Ratios
To understand how to calculate hit ratios, we will examine an example that uses a RAM space of
three (3) frames, and the following processing sequence:
1 2 1 3 1 4 1 5 2 3 2 4 1 5
In these examples, Y = a hit (the page is found in RAM), and N = a page fault (the page must be
retrieved from virtual memory). A frame is a segment of RAM that can be allocated to hold a
page, and is typically the same size as a page. Frames 1-3 are located in RAM. Frames 4-5
represent pages located in virtual memory.
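The hit and fault bookkeeping for this sequence can also be checked with a short simulation. The sketch below applies the FIFO policy with three frames that all start out empty. The function name is hypothetical.

```python
from collections import deque


def fifo_hit_ratio(sequence, frames):
    """Return (hits, hit ratio) for a page request sequence under FIFO."""
    ram = deque()              # pages in RAM, oldest at the left
    hits = 0
    for page in sequence:
        if page in ram:
            hits += 1          # hit: FIFO does not reorder on a hit
        else:
            if len(ram) == frames:
                ram.popleft()  # page fault with RAM full: evict the oldest
            ram.append(page)
    return hits, hits / len(sequence)


sequence = [1, 2, 1, 3, 1, 4, 1, 5, 2, 3, 2, 4, 1, 5]
hits, ratio = fifo_hit_ratio(sequence, frames=3)
print(hits, "hits out of", len(sequence), "requests")  # 3 hits out of 14
print(round(ratio, 3))                                 # 0.214
```

The same function can be run with a different number of frames to see how the amount of RAM changes the hit ratio.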
Using FIFO:
Sequence   1  2  1  3  1  4  1  5  2  3  2  4  1  5
Hit?       N  N  Y  N  Y  N  N  N  N  N  Y  N  N  N
Frame 1    1  1  1  1  1  4  4  4  2  2  2  2  1  1
Frame 2    -  2  2  2  2  2  1  1  1  3  3  3  3  5
Frame 3    -  -  -  3  3  3  3  5  5  5  5  4  4  4
Hit Ratio