Fifty Years of Operating Systems
The Founding of the SOSP Conferences
Perspectives on OS Foundations
My question is: how and when did the key OS principles emerge? Timelines of the evolution of operating systems follow available technologies and respond to market concerns. There were four stages from the 1950s to the present: batch, interactive, distributed network, and cloud-mobile. The SOSP symposia, founded to focus on developing and validating OS principles, have involved thousands of researchers over the past fifty years. OS research has contributed a dozen great principles to all of computer science, including processes, locality, interactive computing, concurrency control, location independent naming, and virtualization. I will look more closely at the research around two principles I was involved with: locality and location independent naming.

Virtual memory -- a new, alluring, but controversial technology in the 1960s -- motivated both areas. The early concerns were whether the automation of paging would perform well, and whether name-to-location mappings could be done with no significant performance degradation. Performance was a major concern for virtual memory because the speed gap between a main memory access and a disk access was a factor of 10,000 or more; even a few page faults hurt performance. (The gap is worse today.) We hypothesized that paging would perform well if memory managers could guarantee that each process’s working set is in memory. We justified this with intuitions about locality, which predict that the working set is the maximum likelihood predictor of the process’s memory demand in the immediate future. These ideas were extensively validated through years of study of paging algorithms, multiprogramming, and thrashing, leading to control systems that measured working sets, avoided thrashing, and optimized system throughput. Locality is harnessed today at all levels of systems, including the many layers of cache built into chips and memory control systems, the platforms powering cloud computing, and the Internet itself, which caches pages near their frequent users to avoid bottlenecks at popular servers.

Location independent naming is the other principle that has permeated all generations of virtual memory over the years. This principle gave us hierarchical systems to generate names and very fast mappings from names to the physical locations of objects. It was present in the original virtual memory, which had a contiguous address space made of pages, and is present in today's Internet, which provides a huge address space made of URLs, DOIs, and capabilities.
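To make the working-set idea concrete, here is a minimal sketch (written for this summary, not a reconstruction of any historical memory manager). It computes W(t, tau), the set of distinct pages a process referenced during its last tau references, over a made-up reference string; the window size and the reference string are arbitrary choices for illustration.

    # Minimal sketch of the working-set idea: W(t, tau) is the set of
    # distinct pages referenced during the last tau references.
    # The reference string and window size are illustrative only.

    def working_set(references, t, tau):
        """Return the working set at virtual time t with window tau."""
        start = max(0, t - tau + 1)
        return set(references[start:t + 1])

    # Hypothetical page-reference string for one process.
    refs = [1, 2, 3, 2, 1, 4, 4, 4, 5, 1, 2, 5]
    tau = 4
    for t in range(len(refs)):
        print(f"t={t:2d}  ref={refs[t]}  W(t,{tau})={sorted(working_set(refs, t, tau))}")

Under the locality hypothesis, a memory manager that keeps each process's current working set resident will rarely fault in the near future, and a scheduler that admits only processes whose working sets fit in memory is, in essence, a thrashing-avoidance controller.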
Perspectives on Protection and Security
Perspectives on System Languages and Abstraction
Evolution of File and Memory Management
Mahadev Satyanarayanan (Satya) presented his thoughts on "The
Evolution of Memory and File Systems". He observed that over a
60-year period, there have been four drivers of progress: the quests
for scale, performance, transparency, and robustness. At the dawn of
computing, the quest for scale was dominant. Easing the memory
limitations of early computers was crucial to the growth of computing
and the creation of new applications, because memory was so scarce and
so expensive. That quest has been phenomenally successful. On a cost
per bit basis, volatile and persistent memory technologies have
improved by nearly 13 orders of magnitude. The quest for performance
has been dominated by the growing gap between processor performance
and memory performance. This gap has been most apparent since DRAM
technology came into use in the early 1980s, but it was already a serious
issue 20 years earlier, in the era of core memory. Over time,
memory hierarchies of increasing depth have improved average case
performance by exploiting temporal and spatial locality. These have
been crucial in overcoming the processor-memory performance gap, with
clever prefetching and write-back techniques also playing important
roles.
For the first decade or so, the price of improving scale and
performance was the need to rewrite software as computers were
replaced by new ones. By the early 1960s, this cost was becoming
significant. Over time, as people costs have increased relative to
hardware costs, disruptive software changes have become unacceptable.
This has led to the quest for transparency. In its System/360, IBM
pioneered the concept of an invariant architecture with multiple
implementations at different price/performance points. The principle
of transparent management of data across levels of a memory hierarchy,
which we broadly term "caching", was pioneered at the software level
by the Atlas computer in the early 1960s. At the hardware level, it
was first demonstrated in the IBM System/360 Model 85 in 1968. Since
then, caching has been applied at virtually every system level and is
today perhaps the most ubiquitous and powerful systems technique for
achieving scale, performance and transparency.
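The transparent-caching idea can be conveyed with a toy sketch (illustrative only; it is not modeled on any specific hardware or system mentioned above). A small fast level holds recently used blocks, a miss falls through to a slower backing store, and dirty blocks are written back on eviction; the capacity and block identifiers are arbitrary.

    # Toy write-back LRU cache over a slower backing store, illustrating
    # transparent management of data between two levels of a hierarchy.
    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity, backing):
            self.capacity = capacity      # blocks kept in the fast level
            self.backing = backing        # dict standing in for the slow level
            self.blocks = OrderedDict()   # block id -> (data, dirty flag)

        def read(self, block_id):
            if block_id in self.blocks:                # hit: served from the fast level
                self.blocks.move_to_end(block_id)
                return self.blocks[block_id][0]
            data = self.backing.get(block_id)          # miss: fetch from the slow level
            self._install(block_id, data, dirty=False)
            return data

        def write(self, block_id, data):
            self._install(block_id, data, dirty=True)  # write-back: defer the slow write

        def _install(self, block_id, data, dirty):
            self.blocks[block_id] = (data, dirty)
            self.blocks.move_to_end(block_id)
            while len(self.blocks) > self.capacity:    # evict the least recently used block
                victim, (vdata, vdirty) = self.blocks.popitem(last=False)
                if vdirty:
                    self.backing[victim] = vdata       # flush dirty data on eviction

    # The caller sees one uniform read/write interface; the hierarchy is invisible.
    store = {n: f"block-{n}" for n in range(10)}
    cache = LRUCache(capacity=3, backing=store)
    print(cache.read(1), cache.read(2), cache.read(1))
    cache.write(7, "updated")

The same structure, with different policies and block sizes, recurs at the many levels where caching is applied, from processor caches to distributed file systems.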
By the late 1960s, as computers began to be used in mission-critical
contexts, the negative impact of hardware and software failures
escalated. This led to the emergence of techniques to improve
robustness even at the possible cost of performance or storage
efficiency. The concept of separate address spaces emerged partly
because it isolated the consequences of buggy software. Improved
resilience to buggy software has also been one of the reasons that
memory and file systems have remained distinct, even though systems
based on the single-level storage concept have been proposed and
experimentally demonstrated. In addition, to cope with hardware,
software and networking failures, techniques such as RAID, software
replication, and disconnected operation emerged. The quest for
robustness continues to rise in importance as the cost of failures
increases relative to memory and storage costs.
In closing, Satya commented on recent predictions that the classic
hierarchical file system will soon be extinct. He observed that such
predictions are not new. Classic file systems may be overlaid by
non-hierarchical interfaces that use different abstractions (such as
the Android interface for Java applications). However, they will
continue to be important for unstructured data that must be preserved
for very long periods of time. Satya observed that the deep reasons
for the longevity of the hierarchical file system model were
articulated in broad terms by Herb Simon in his 1962 work, "The
Architecture of Complexity". Essentially, hierarchy arises due to the
cognitive limitations of the human mind. File system implementations
have evolved to be a good fit for these cognitive limitations. They
are likely to be with us for a very long time.
Reflections on the History of Operating Systems Research in Fault Tolerance
Ken Birman's talk focused on controversies surrounding fault-tolerance
and consistency. Looking at the 1990s, he pointed to the debate around the
so-called CATOCS question (CATOCS refers to causally and totally ordered
communication primitives) and drew a parallel to the more modern debate
about consistency at cloud scale (often referred to as the CAP
conjecture). Ken argued that the underlying tension is actually one
that opposes basic principles of the field against the seemingly
unavoidable complexity of mechanisms strong enough to solve consensus,
particularly the family of protocols with Paxos-like structures. Over
time, this tension was resolved: he concluded that today we finally know how to
build very fast and scalable solutions (those who attended SOSP 2015
itself saw ten or more papers on such topics). On the other hand,
Ken sees a new generation of challenges on the horizon: cloud-scale
applications that will need a novel mix of scalable consistency and
real-time guarantees, will need to leverage new hardware options
(RDMA, NVRAM and other "middle memory" options), and may need to be
restructured to reflect a control-plane/data-plane split. These trends
invite a new look at what has become a core topic for the SOSP community.
Past and Future of Hardware and Architecture
We start by looking back at 50 years of computer architecture, where
philosophical debates on instruction sets (RISC vs. CISC, VLIW vs. RISC)
and parallel architectures (NUMA vs clusters) were settled with billion
dollar investments on both sides. In the second half, we look forward.
First, Moore’s Law is ending, so the free ride of software-oblivious
performance increases is over. Since we’ve already played
the multicore card, the most-likely/only path left is domain-specific
processors. The memory system is radically changing too. First, Jim
Gray’s decade-old prediction is finally true: “Tape is dead; flash is
disk; disk is tape.” New ways to connect to DRAM and new non-volatile
memory technologies promise to make the memory hierarchy even deeper.
Finally, and surprisingly, there is now widespread agreement on
instruction set architecture, namely Reduced Instruction Set Computers.
However, unlike most other fields, despite this harmony there has been no
open alternative to the proprietary offerings from ARM and Intel. RISC-V (“RISC
Five”) is the proposed free and open champion. It has a small base of
classic RISC instructions that run a full open-source software stack;
opcodes reserved for tailoring a System-on-a-Chip (SoC) to
applications; standard instruction extensions optionally included in an
SoC; and it is unrestricted: there is no cost, no paperwork, and anyone
can use it. The ability to prototype using ever-more-powerful FPGAs and
astonishingly inexpensive custom chips combined with collaboration on
open-source software and hardware offers hope of a new golden era for
hardware/software systems.
Parallel Computing and the OS
Frans Kaashoek's talk divided research on parallelism in operating
systems into four periods. Around the first SOSP, the OS community introduced
foundational ideas for parallel programming, covering three types of
parallelism in operating systems: user-generated parallelism, I/O
parallelism, and processor parallelism. With the advent of distributed
computing, the OS community focused its attention on making it easy for
server programmers to exploit parallelism, in particular I/O
parallelism. With the arrival of commodity small-scale multiprocessors,
the OS community "rediscovered" the importance of processor parallelism
and contributed techniques to scale operating systems to large numbers of
processors. These techniques found their way into today's mainstream
operating systems because today's processors contain several cores by
default. Because software must be parallel to exploit multicore
processors, the OS community is going through a "rebirth" of research in
parallel computing.
The Rise of Cloud Computing Systems
In this talk I will describe the development of the systems that underlie
modern cloud computing. This development shares much of its
motivation with the related fields of transaction processing systems and
high performance computing, but because of scale, these systems tend to
have more emphasis on fault tolerance using software techniques.
Important milestones in the development of modern cloud systems
include very high performance distributed file systems, such as the
Google File System (Ghemawat et al., SOSP 2003), reliable computational
frameworks such as MapReduce (Dean & Ghemawat, OSDI 2004) and Dryad
(Isard et al., 2007), and large scale structured storage systems such as
BigTable (Chang et al. 2006), Dynamo (DeCandia et al., 2007), and
Spanner (Corbett et al., 2012). Computations can be scheduled either in
virtual machines (exemplified by VMware's products) or as individual
processes or containers. Public cloud platforms such as AWS, Microsoft
Azure, and Google Cloud Platform allow
external developers to utilize these large-scale services to build new
and interesting services and products, benefiting from the economies of
scale of large datacenters and the ability to grow and shrink computing
resources on demand across millions of customers.
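As a rough illustration of the programming model behind frameworks such as MapReduce, the sketch below runs a word count in a single process; it is only a toy, omitting the distribution, scheduling, and fault tolerance that the systems cited above actually provide.

    # Toy, single-process illustration of the MapReduce programming model:
    # the user supplies map and reduce functions, and the framework handles
    # grouping by key (and, in real systems, distribution and fault tolerance).
    from collections import defaultdict

    def map_fn(document):
        for word in document.split():
            yield (word.lower(), 1)

    def reduce_fn(word, counts):
        return (word, sum(counts))

    def run_mapreduce(inputs, mapper, reducer):
        groups = defaultdict(list)
        for item in inputs:                  # "map" phase
            for key, value in mapper(item):
                groups[key].append(value)    # shuffle: group values by key
        return [reducer(k, v) for k, v in groups.items()]   # "reduce" phase

    docs = ["the cat sat", "the cat ran", "a dog sat"]
    print(sorted(run_mapreduce(docs, map_fn, reduce_fn)))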
Is achieving security a hopeless quest?
Mark Miller:
In the 1970s, there were two main access control models: the identity-centric model of access-control lists and the authorization-centric model of capabilities. For various reasons the world went down the identity-centric path, resulting in the situation we are now in.

On the identity-centric path, why is security likely a hopeless quest? When we build systems, we compose software written by different people. These composed components may cooperate as we intend, or they may destructively interfere. We have gotten very good at avoiding accidental interference by using abstraction mechanisms and designing good abstraction boundaries. By composition, we have delivered astonishing functionality to the world.

Today, when we secure systems, we assign authority to identities. When I run a program, it runs as me. The square root function in my math library can delete my files. Although it does not abuse this excess authority, if it has a flaw enabling an attacker to subvert it, then anything it may do, the attacker can do. It is this excess authority that invites most of the attacks we see in the world today.

By contrast, when we secure systems with capabilities, we work with the grain of how we organize software for functionality. At every level of composition, from programming language to operating systems to distributed services, we design abstraction boundaries so that a component's interface only requires arguments that are somehow relevant to its task. If such argument passing were the only source of authority, we would have already taken a huge step towards least authority. If most programs only ran with the least authority they need to do their jobs, most abuses would be minor.

I do not imagine a world with fewer exploitable bugs. I imagine a world in which much less is at risk to most bugs.
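A toy sketch (not drawn from any particular capability system, and written in a language that does not itself enforce capability discipline) may help convey the design style Miller describes: the caller passes a component only the objects it needs, and that argument passing is treated as the sole source of authority.

    # Toy illustration of least authority: the word_count component receives
    # only a read capability for one file, not the caller's full authority.
    # (Python does not enforce this discipline; the sketch shows the style.)
    import os
    import tempfile

    class ReadOnlyFile:
        """A capability that permits reading one file and nothing else."""
        def __init__(self, path):
            self._path = path
        def read(self):
            with open(self._path) as f:
                return f.read()

    def word_count(readable):
        # This component can call read() on the object it was given, but it
        # has no way to name, open, or delete any other file.
        return len(readable.read().split())

    # The caller decides which objects, and hence how much authority, to hand over.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("least authority in action")
        path = f.name
    print(word_count(ReadOnlyFile(path)))   # prints 4
    os.unlink(path)

A square root function written in this style would receive only a number, so even a compromised implementation could do little more than return a wrong answer.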
Reminiscences on SOSP History Day