Zev Weiss
Tyler Harter
Andrea C. Arpaci-Dusseau
Remzi H. Arpaci-Dusseau
University of Wisconsin-Madison
Department of Computer Sciences
I/O trace replay...why?
SOSP'11: "A File is not a File..." (Harter et al.)
Modern workloads make trace replay more challenging
Why?
Threads.
More specifically, interactions between threads:
Multithreaded replay can be slightly tricky:
T1: open(path1, ...) = 3
T2: read(3, buf, 1K) = 1K
T1: open(path2, ...) = 4
T2: read(3, buf, 1K) = 1K
T1: close(3) = 0
T2: fstat(4) = 0
Sneakier problems:
T1: open(path1, O_TRUNC) = 3
T2: open(path1, O_RDONLY) = 4
T1: write(3, ..., 1K) = 1K
T2: read(4, ..., 1K) = 1K
* actual scenario!
One possibility: preserve ordering
Maintains correctness...but very pessimistic! (Limits utility)
ROOT: Resource-Oriented Ordering for Trace replay
Aim:
Correct replay and realistic performance!
Implemented ROOT in ARTC replay system
Evaluation:
Microbenchmarks: ≤ 6% timing error, vs. up to 600%
LevelDB: 95% of original syscall concurrency, vs. 60%
Resource-oriented: just what is a resource?
Some referred to directly, others indirectly via names
All uses after creation, before destruction
Addresses FD use-after-close, etc.
open(path, ...) = 3
read(3, ..., 1K) = 1K
close(3) = 0
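The lifetime rule above can be sketched as a check on a candidate replay order. This is a minimal illustration, not ARTC's implementation: the `Event` record, the thread labels, and the `respects_lifetime` helper are hypothetical names introduced only for this example.

```python
from collections import namedtuple

# Hypothetical event record: issuing thread, call name, and the file
# descriptor that the call creates, uses, or destroys.
Event = namedtuple("Event", "thread call fd")

def respects_lifetime(order):
    """Return True if every use of an FD falls between its open and close."""
    live = set()
    for ev in order:
        if ev.call == "open":
            live.add(ev.fd)
        elif ev.call == "close":
            if ev.fd not in live:
                return False
            live.remove(ev.fd)
        elif ev.fd not in live:
            # A use before creation or after destruction.
            return False
    return True

trace = [
    Event("T1", "open", 3),
    Event("T2", "read", 3),
    Event("T1", "close", 3),
]
# A replay that lets T1's close overtake T2's read breaks the rule.
reordered = [trace[0], trace[2], trace[1]]
```

Here `respects_lifetime(trace)` holds while `respects_lifetime(reordered)` does not, which is the FD use-after-close case.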
All uses in same order as trace
Addresses file size problems, etc.
open(path1, O_TRUNC) = 3
open(path1, O_RDONLY) = 4
write(3, ..., 1K) = 1K
read(4, ..., 1K) = 1K
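One way to picture this rule is as dependency edges between consecutive uses of the same resource. The sketch below is an assumption-laden illustration rather than ARTC's actual data model: the `Event` record, the resource labels such as `"file:path1"`, and the `resource_edges` helper are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record: position in the trace, issuing thread, call text,
# and the set of resources the call touches (labels invented for this sketch).
Event = namedtuple("Event", "index thread call resources")

def resource_edges(trace):
    """Link each event to the previous event that touched the same resource."""
    last_use = {}
    edges = set()
    for ev in trace:
        for res in ev.resources:
            prev = last_use.get(res)
            if prev is not None:
                edges.add((prev, ev.index))
            last_use[res] = ev.index
    return sorted(edges)

trace = [
    Event(0, "T1", "open(path1, O_TRUNC) = 3", {"file:path1", "fd:3"}),
    Event(1, "T2", "open(path1, O_RDONLY) = 4", {"file:path1", "fd:4"}),
    Event(2, "T1", "write(3, ..., 1K) = 1K", {"file:path1", "fd:3"}),
    Event(3, "T2", "read(4, ..., 1K) = 1K", {"file:path1", "fd:4"}),
]
edges = resource_edges(trace)
```

Because both descriptors refer to the same file, the edge `(2, 3)` keeps the read behind the write even though the two calls use different FDs and run on different threads.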
Preserves ordering of generations of a name
Addresses problems with lock files, etc.
creat("foo", O_EXCL)
unlink("foo")
creat("foo", O_EXCL)
unlink("foo")
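The idea of name generations can be sketched by tagging each call with the generation of the name it refers to. This is only an illustration under stated assumptions: the `(call, path)` tuples and the `name_generations` helper are hypothetical, and treating `unlink` as the sole event that ends a generation is a simplification.

```python
def name_generations(trace):
    """Tag each call on a path with the generation of the name it refers to."""
    next_gen = {}
    tagged = []
    for call, path in trace:
        gen = next_gen.get(path, 0)
        tagged.append((call, path, gen))
        if call == "unlink":
            # The name is free again; the next creat starts a new generation.
            next_gen[path] = gen + 1
    return tagged

trace = [
    ("creat", "foo"),
    ("unlink", "foo"),
    ("creat", "foo"),
    ("unlink", "foo"),
]
tagged = name_generations(trace)
```

A replayer following this rule would finish every call tagged with generation 0 of "foo" before starting any call tagged with generation 1, so the second exclusive create cannot race ahead of the first unlink.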
An Approximate-Replay Trace Compiler
What goes in: compiler inputs
ARTC components: compiler and replayer
What comes out: trace replay
*.so
Linux, OS X, FreeBSD, Illumos: all UNIX...but all different!
Common syscalls compatible, others often not.
ARTC emulates trace semantics on target system API
exchangedata: link/rename/rename
fsync vs. OS X fcntl(F_FULLFSYNC)
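The fsync case can be illustrated with a small wrapper that picks a flush call based on what the target platform offers. This is a sketch, not ARTC's code: the `replay_fsync` name is hypothetical, and preferring `F_FULLFSYNC` whenever it exists is an assumption made for this example. Python exposes `fcntl.F_FULLFSYNC` only on platforms that define it, so the lookup is guarded.

```python
import fcntl
import os
import tempfile

def replay_fsync(fd):
    """Flush fd using F_FULLFSYNC where the platform defines it, else fsync."""
    full_sync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_sync is not None:
        # OS X: ask the drive to flush its cache as well.
        # Assumption: the underlying file system supports this request.
        fcntl.fcntl(fd, full_sync)
    else:
        os.fsync(fd)

with tempfile.TemporaryFile() as f:
    f.write(b"x" * 1024)
    f.flush()
    replay_fsync(f.fileno())
    synced = True
```

The point of the design is that the trace records the intent (make the data durable) and the replayer maps that intent onto whichever call carries the closest semantics on the target system.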
Replay ordering:
I/O via mmap not reproduced
Alternate strategies:
Single-threaded
Temporally-ordered
Unconstrained
Two criteria:
Semantic correctness
Performance accuracy
Workload: Magritte
Error rate on iPhoto import400 trace (827,964 syscalls):
Microbenchmarks:
LevelDB macrobenchmarks
Two random-read threads, 1xHDD → 2xHDD RAID 0
Two random-read threads, one with seq. warmup, 4GB → 1.5GB
Eight random-read threads, same system
Two sequential-read threads, CFQ slice_sync 100ms → 1ms
Event dependencies
Temporal-ordering dependencies
Average edge length: 10ms
ARTC ordering dependencies
Average edge length: 8.9 seconds
Syscall overlap
Temporally-ordered: good correctness, bad performance
Single-threaded: good correctness, worse performance
Unconstrained: unusable correctness
ARTC: correctness and performance.
Code available at: https://research.cs.wisc.edu/adsl/Software/artc/
Thanks!
Questions?