[originally taken from http://www.scs.stanford.edu/07wi-cs244b/notes/l13d.txt]
Plan 9
Why are we reading this paper?
A different kind of distributed system--day-to-day computing infrastructure
Addresses different kinds of problems:
Simplicity, ease of use, ease of development
Heterogeneity of nodes in a computing environment
Making better use of hardware resources in day-to-day computing
Takes a "clean slate" approach--re-build everything from the ground up
Often lets one test new, better, but previously impractical ideas
Means maybe less likely to be relevant to other work
...but always good to have seen some such ideas just in case
In particular: We've talked about Network Objects/RPC
But here's a distributed system based entirely around network file systems
What is the motivation for this work?
People dislike overloaded, "bureaucratic" time-sharing systems
What are the system's fundamental principles?
1. Resources are named and accessed like files
I.e., like Unix you might have a /dev/cd for the cdrom
But you have more extreme examples, like /proc, /net, /dump, etc.
2. Resources are accessed through a standard protocol, 9P
Basically a network file system protocol. Why is this important?
Combined with 1, means you can access all kinds of devices remotely
(Try this with NFS on UNIX and you will not get the desired results...)
3. Disjoint hierarchies of services joined into *private* namespaces
In other words, my /bin != your /bin. Very different from UNIX. Why?
Compare "my office" to "353 Serra Mall #290":
Former's meaning depends on who says it, but simpler & more intuitive
How do namespaces work?
Three system calls manipulate the namespace: mount, bind, unmount
- int mount(int fd, int afd, char *old, int flag, char *aname)
attaches file descriptor "speaking" the 9P protocol to namespace
Think of descriptor as pipe or socket
Mount regular file? Kernel just writes attach message to file
afd is socket to authentication process, if you want authentication
aname sent to server in attach message so it can have multiple FSes
- int bind(char *name, char *old, int flag)
Replicates name at old
- int unmount(char *name, char *old)
name can be NULL, else it specifies one component of union to remove
flags: MREPL, MBEFORE, MAFTER, MCREATE
When using bind w/o MREPL, creates a union directory
What's this? Example: /bin
Can you create files in a union directory?
Yes, if MCREATE is set on one or more entries
Note: Unlike UNIX, can mount over either a directory, or a file. Why?
E.g., might want to replace /dev/kbd
Can they get rid of the PATH environment variable? How?
Paper claims yes, but short answer is almost, by manipulating namespace
But some people still want "." in PATH, so they kept it
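Sketch (not from the paper): a toy model of how a union directory such as /bin resolves names and picks a place for new files. Dir, Union, ubind, ulookup and ucreatedir are made-up names for illustration, and the flag values are written from memory of Plan 9's libc.h, so treat them as assumptions. Real bind works on the kernel's mount table, not on arrays like these.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Flag values as recalled from Plan 9's <libc.h>; treat as assumptions. */
enum { MREPL = 0, MBEFORE = 1, MAFTER = 2, MORDER = 3, MCREATE = 4 };
enum { MAXDIRS = 8, MAXFILES = 8 };

typedef struct Dir {
    const char *name;               /* where the directory really lives */
    const char *files[MAXFILES];    /* entries; unused slots are NULL */
} Dir;

typedef struct Union {
    int n;                          /* number of member directories */
    const Dir *member[MAXDIRS];     /* search order: index 0 first */
    int create[MAXDIRS];            /* nonzero if bound with MCREATE */
} Union;

/* Toy bind: add dir to the union as the flag asks. 0 on success, -1 on error. */
int ubind(Union *u, const Dir *dir, int flag)
{
    int order = flag & MORDER;
    int create = (flag & MCREATE) != 0;

    assert(u != NULL && dir != NULL);
    if (order == MORDER)
        return -1;                  /* MBEFORE|MAFTER together makes no sense */
    if (order == MREPL)
        u->n = 0;                   /* replace whatever was there */
    if (u->n == MAXDIRS)
        return -1;
    if (order == MBEFORE) {
        /* shift existing members down so the new one is searched first */
        memmove(&u->member[1], &u->member[0], (size_t)u->n * sizeof u->member[0]);
        memmove(&u->create[1], &u->create[0], (size_t)u->n * sizeof u->create[0]);
        u->member[0] = dir;
        u->create[0] = create;
    } else {
        u->member[u->n] = dir;
        u->create[u->n] = create;
    }
    u->n++;
    return 0;
}

/* Return the member directory that supplies file; first match wins. */
const Dir *ulookup(const Union *u, const char *file)
{
    for (int i = 0; i < u->n; i++)
        for (int j = 0; j < MAXFILES && u->member[i]->files[j] != NULL; j++)
            if (strcmp(u->member[i]->files[j], file) == 0)
                return u->member[i];
    return NULL;
}

/* Return the directory a new file would land in: first member with MCREATE. */
const Dir *ucreatedir(const Union *u)
{
    for (int i = 0; i < u->n; i++)
        if (u->create[i])
            return u->member[i];
    return NULL;
}
```

With an architecture-specific directory bound first and a script directory bound after it with MAFTER|MCREATE, a name present in both resolves to the first, which is roughly the job PATH does on UNIX.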
Serial port example: /dev/eia1, /dev/eia1ctl - how do these work?
On Plan 9, eia1 is just for I/O
eia1ctl (control device) sets baud rate w. text commands like "b1200". Why?
In distributed environment, no worries about byte-order
Easy to use/debug stuff from shell scripts - this is actually a big deal!
Cf. UNIX /dev/tty00: set baud rate, etc. w. complicated ioctl calls
How does /dev/eia1 get into your namespace in the first place?
Each kernel device has a letter -- t for the UART (serial port)
Access "root" of device with special pathname #t
Startup scripts during boot run: bind -a '#t' /dev
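Sketch of why text control commands are easy to handle: a toy parser for the "b1200" style command mentioned above. Uart and uartctl are hypothetical names, only the baud command from the notes is modeled, and this is not the real driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef struct Uart {
    long baud;
} Uart;

/* Apply one text control command. Returns 0 on success, -1 if not understood. */
int uartctl(Uart *u, const char *cmd)
{
    char *end;
    long n;

    assert(u != NULL && cmd != NULL);
    if (cmd[0] != 'b')                  /* only "b<rate>" is modeled here */
        return -1;
    if (cmd[1] < '0' || cmd[1] > '9')   /* reject signs and whitespace strtol would accept */
        return -1;
    errno = 0;
    n = strtol(cmd + 1, &end, 10);
    if (*end == '\n')                   /* tolerate the newline a shell echo appends */
        end++;
    if (errno != 0 || *end != '\0' || n <= 0)
        return -1;
    u->baud = n;
    return 0;
}
```

Because the command is plain text, the same bytes work from a shell script or from a machine with a different byte order; there is no binary struct to marshal.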
What is system like to user?
Login process - type your name and password to terminal
Password is used to authenticate you to file & CPU servers
Terminal doesn't care who you are--you already have console access!
Just reboot to log in as a different user
How does the 8½ windowing system work?
Click right mouse button, menu allows you to create a new window
8½ then runs new shell
Binds over /dev/mouse, /dev/bitblt, and /dev/cons with pipes
Filters input events, so shell only gets them when window selected
Different from X windows: graphical apps typically take over the window they are run in
Note: Can run 8½ recursively inside a window!
"The text-editing features of 8½ are strong enough to displace
special features such as history in the shell, paging and
scrolling, and mail editors." (p. 4) - How does this work?
Everything is editable
Can edit current line you are typing
For history, scroll back, edit old command, highlight, and "send"
Note many people configure their prompt as no-op alias
Means you can copy and send old command including prompt (easier)
Terminal has "hold mode" toggled by ESC key
In hold mode, even pressing return does not send input to program
Mail program puts terminal in hold mode by default--easy to edit messages
What is the cpu command?
Opens shell on another machine--Plan9's SSH. How does it work?
Runs a shell on a remote machine (authenticates with your password)
Attempts to replicate your local namespace on remote machine
Binds /dev/mouse, /dev/bitblt, /dev/cons on server to local devices
Re-creates your file server mounts (using proxy authentication)
What about /bin? Might not want exact same /bin
Substitutes binaries for CPU server arch, which might be different
Isn't there a limit to how seamless heterogeneous hardware can be?
What if CPU server has different endianness?
That's why almost everything uses text commands
Won't cc produce wrong output?
There is no cc command. Explicitly specify architecture in command
8a, 8c, 8l - x86 assembler, compiler, linker; object files named .8
ka, kc, kl - sparc assembler, compiler, linker, use ".k" files
How does 9P network protocol work?
All operations done in terms of "fids" - fid is 32-bit handle
fids are chosen by the client
First fid is set in attach message, will correspond to root directory
Some operations on fids:
walk - traverses the namespace, like "cd" for an individual fid
clone - creates new fid as copy of current fid
open - pins fid to file (performs access check), can then read/write
(can no longer walk a fid after opening)
read/write - as expected
stat - return attributes:
type - type of file on server (e.g., 't' for UART device)
dev - instance of device on server (for devices w. multiple mount points)
path - 8-byte unique id for file (the triple type,dev,path is unique for server)
version - 32-bit number changed every time file modified
wstat - set attributes
remove - deletes the file
clunk - closes a fid; fid becomes invalid
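Sketch of the fid rules listed above, as a toy server-side fid table over a four-node tree. Session, Fid, Node and the p9* functions are invented for illustration; the access check in open is omitted, and rejecting clone of an open fid is this sketch's own choice, not something the notes state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NFID = 16, NNODE = 4 };

typedef struct Node { const char *name; int parent; } Node;

/* Hypothetical tree: /, /dev, /dev/cons, /dev/eia1 */
static const Node tree[NNODE] = {
    { "/", -1 }, { "dev", 0 }, { "cons", 1 }, { "eia1", 1 },
};

typedef struct Fid { int inuse; int node; int isopen; } Fid;
typedef struct Session { Fid fid[NFID]; } Session;

/* Look up a client-chosen fid number; it must (or must not) be in use. */
static Fid *getfid(Session *s, unsigned fid, int wantinuse)
{
    if (fid >= NFID || s->fid[fid].inuse != wantinuse)
        return NULL;
    return &s->fid[fid];
}

/* attach: bind a fresh fid to the root directory. */
int p9attach(Session *s, unsigned fid)
{
    Fid *f = getfid(s, fid, 0);
    if (f == NULL)
        return -1;
    f->inuse = 1;
    f->node = 0;
    f->isopen = 0;
    return 0;
}

/* walk: move the fid one component down; not allowed once opened. */
int p9walk(Session *s, unsigned fid, const char *name)
{
    Fid *f = getfid(s, fid, 1);
    if (f == NULL || f->isopen)
        return -1;
    for (int i = 0; i < NNODE; i++)
        if (tree[i].parent == f->node && strcmp(tree[i].name, name) == 0) {
            f->node = i;
            return 0;
        }
    return -1;
}

/* clone: make newfid a copy of fid (sketch rejects cloning an open fid). */
int p9clone(Session *s, unsigned fid, unsigned newfid)
{
    Fid *f = getfid(s, fid, 1);
    Fid *nf = getfid(s, newfid, 0);
    if (f == NULL || nf == NULL || f->isopen)
        return -1;
    *nf = *f;
    return 0;
}

/* open: pin the fid to its file (access check omitted in this sketch). */
int p9open(Session *s, unsigned fid)
{
    Fid *f = getfid(s, fid, 1);
    if (f == NULL || f->isopen)
        return -1;
    f->isopen = 1;
    return 0;
}

/* clunk: the fid becomes invalid and its number can be reused. */
int p9clunk(Session *s, unsigned fid)
{
    Fid *f = getfid(s, fid, 1);
    if (f == NULL)
        return -1;
    f->inuse = 0;
    return 0;
}
```

The usual client pattern is clone the root fid, walk the clone to the target, then open it, which leaves the root fid free for the next lookup.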
Can you have hard links?
Protocol doesn't support creating hard links
Might also be hard with remove message? Which link would you remove?
How does version field compare to UNIX mtime/ctime combo?
Incremental copies/backups much easier with UNIX model
How do they implement a file cache (p. 15)?
Not built into kernel--run user-level proxy if on slow network
Keep cached contents until version field changes
What about writes? They claim it's write through
But what if someone else is writing, too?
Maybe flush cache if version field increases by > # of writes you did
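Sketch of the flush rule guessed at above: keep cached contents only while the server's version has not advanced by more than the number of writes this client sent through. CacheEntry, cachewrote and cachecheck are hypothetical names, and the rule assumes the server bumps the version at most once per write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct CacheEntry {
    int valid;          /* nonzero while cached contents may be used */
    uint32_t version;   /* version seen when contents were cached */
    uint32_t mywrites;  /* write-through writes issued since then */
} CacheEntry;

/* Record one write-through write by this client. */
void cachewrote(CacheEntry *c)
{
    assert(c != NULL);
    c->mywrites++;
}

/*
 * Compare against the version the server reports now.
 * Returns 1 if the cache is still usable, 0 if it was (or already is) flushed.
 */
int cachecheck(CacheEntry *c, uint32_t serverversion)
{
    uint32_t delta;

    assert(c != NULL);
    if (!c->valid)
        return 0;
    /* unsigned subtraction stays meaningful if the 32-bit counter wraps */
    delta = serverversion - c->version;
    if (delta > c->mywrites) {
        c->valid = 0;           /* someone else wrote too: flush */
        return 0;
    }
    c->version = serverversion; /* our own writes account for the change */
    c->mywrites = 0;
    return 1;
}
```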
For comparison, how does 9P differ from NFS?
NFS does not show opens and closes
Makes it hard to implement the kinds of user-level file servers they have
E.g., connection server cs needs to garbage collect client channels
NFS handles are server-chosen identifiers bound to particular files
Opaque, but usually
fids are easier to pipeline (but doesn't sound like they do that)
NFS3 WRITE reply tells you file attributes after AND before the operation
So you know if someone else wrote file since your cached version
Kernel architecture
What is the channel abstraction in the kernel? Structure contains:
- Type of device, which indexes table of function pointers
(w. eiaread for serial port, procread for proc, etc)
- Server device number (in case multiple instances of Type)
- Qid = (path, version)
- Flags, device-specific information
Where are channels used?
File descriptors, text segment, current working directory, mount device
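Sketch of the channel idea: a struct whose type field indexes a table of per-device function pointers. Chan, Dev, devtab, devno and chanread are simplified stand-ins rather than the kernel's actual definitions, the read functions just return fixed strings, and 'p' as the letter for proc is an assumption ('t' is from the notes).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Chan Chan;

typedef struct Dev {
    int dc;                                     /* device letter, e.g. 't' */
    const char *name;
    long (*read)(Chan *c, void *buf, long n);   /* per-device read routine */
} Dev;

struct Chan {
    int type;                   /* index into devtab */
    int dev;                    /* instance number of that device type */
    unsigned long long path;    /* qid path */
    unsigned long version;      /* qid version */
    int flag;
};

/* Copy at most n bytes of src into buf; stand-in for real device I/O. */
static long copyout(void *buf, long n, const char *src)
{
    long len = (long)strlen(src);

    assert(n >= 0);
    if (n < len)
        len = n;
    memcpy(buf, src, (size_t)len);
    return len;
}

static long eiaread(Chan *c, void *buf, long n)  { (void)c; return copyout(buf, n, "serial"); }
static long procread(Chan *c, void *buf, long n) { (void)c; return copyout(buf, n, "proc"); }

static const Dev devtab[] = {
    { 't', "eia",  eiaread },
    { 'p', "proc", procread },
};

/* Map a device letter to its devtab index, or -1 if unknown. */
int devno(int dc)
{
    for (size_t i = 0; i < sizeof devtab / sizeof devtab[0]; i++)
        if (devtab[i].dc == dc)
            return (int)i;
    return -1;
}

/* Generic read: dispatch through the table using the channel's type. */
long chanread(Chan *c, void *buf, long n)
{
    assert(c != NULL && c->type >= 0);
    return devtab[c->type].read(c, buf, n);
}
```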
How does mount device work?
Mount device instance allocated by mount system call
int mount(int fd, int afd, char *old, int flag, char *aname)
Creates a new channel of type 'M', whose function table holds the mount device's functions
Associates that channel with target channel corresponding to 'fd'
Any operations on new channel will invoke, e.g., mountread, mountwrite, ...
Translate these into 9P messages sent to fd's channel
What happens if you write to fd after mounting it?
Would be bad, because could mess up kernel's 9P messages
Kernel actually sets flag (CMSG) on mounted fd's channel, so can't do this
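Sketch of the two points above: the mount device turns a read on a mounted file into a request on the underlying channel, and user writes to that channel are refused once it is flagged. Everything here is a toy: Msg is an in-memory struct rather than the 9P wire format, TREAD and the CMSG bit value are placeholders, and mntattach, mountread and userwrite only loosely mirror what the kernel does.

```c
#include <assert.h>
#include <stddef.h>

enum { CMSG = 1 << 0 };     /* placeholder bit; the real value is in the kernel headers */
enum { TREAD = 1 };         /* placeholder message code, not the 9P wire value */

typedef struct Msg {
    int type;
    unsigned tag;           /* matches a reply to its request */
    unsigned fid;
    long long offset;
    long count;
} Msg;

typedef struct Chan {
    int flag;
    int nsent;              /* how many messages were "sent" */
    Msg last;               /* last message sent; stand-in for a pipe or socket */
} Chan;

typedef struct Mnt {
    Chan *to;               /* channel that speaks 9P to the server */
    unsigned nexttag;
} Mnt;

/* mount: remember the target channel and mark it as carrying 9P messages. */
void mntattach(Mnt *m, Chan *target)
{
    assert(m != NULL && target != NULL);
    target->flag |= CMSG;
    m->to = target;
    m->nexttag = 0;
}

/* A user-level write is refused on a channel the kernel uses for 9P. */
long userwrite(Chan *c, const void *buf, long n)
{
    (void)buf;
    if (c->flag & CMSG)
        return -1;
    return n;
}

/* mountread: translate a read on a mounted file into a read request. */
unsigned mountread(Mnt *m, unsigned fid, long long offset, long count)
{
    Msg req;

    req.type = TREAD;
    req.tag = m->nexttag++;
    req.fid = fid;
    req.offset = offset;
    req.count = count;
    m->to->last = req;
    m->to->nsent++;
    return req.tag;
}
```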
Process/thread model
To create process: cp /bin/date /proc/clone/mem?
No - e.g., would cause problems when mounting /proc from other arch
Instead? rfork - control over namespace, environment, memory, fds, etc.
What does exportfs do?
Implements 9P protocol in terms of the open/close/read/write/etc. syscalls
Is this straightforward?
Actually, ensuring unique type,dev,path is very annoying
Why do you need to do this? E.g., so that mount tables work properly
Particularly bad if multiple instances of mount device
Proposal: allow streams to be 'popped' off the mount device
Allow reading and writing 9P messages to fd when CMSG flag set
I actually implemented this; was not too hard
Just need to do tag mapping (tags are their version of RPC xids)
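Sketch of what tag mapping might look like when several sources share one 9P connection: each (source, original tag) pair gets a fresh outgoing tag, and the reply is mapped back. TagMap, taginit, tagalloc and tagfree are made-up names; this is not the implementation referred to above.

```c
#include <assert.h>
#include <stddef.h>

enum { NTAG = 32 };

typedef struct TagMap {
    struct {
        int inuse;
        int client;                 /* which source sent the request */
        unsigned short origtag;     /* tag that source chose */
    } slot[NTAG];
} TagMap;

void taginit(TagMap *m)
{
    for (int i = 0; i < NTAG; i++)
        m->slot[i].inuse = 0;
}

/* Pick an outgoing tag for (client, origtag); the slot index is the new tag. */
int tagalloc(TagMap *m, int client, unsigned short origtag)
{
    for (int i = 0; i < NTAG; i++)
        if (!m->slot[i].inuse) {
            m->slot[i].inuse = 1;
            m->slot[i].client = client;
            m->slot[i].origtag = origtag;
            return i;
        }
    return -1;                      /* too many outstanding requests */
}

/* On a reply: recover who asked and with which tag, then free the slot. */
int tagfree(TagMap *m, int newtag, int *client, unsigned short *origtag)
{
    assert(client != NULL && origtag != NULL);
    if (newtag < 0 || newtag >= NTAG || !m->slot[newtag].inuse)
        return -1;
    *client = m->slot[newtag].client;
    *origtag = m->slot[newtag].origtag;
    m->slot[newtag].inuse = 0;
    return 0;
}
```

Two sources can both use tag 5 without colliding, since each request goes out under a distinct slot number.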
Possible lesson here--always strive for lowest common denominator interfaces:
It's easy to implement system call interface in terms of 9P
This is what mount device does
It's harder to implement 9P in terms of system call interface
Which is what exportfs does
So 9P is more general--maybe should replace syscall interface
And Plan 9 syscalls orders of magnitude easier to distribute than UNIX
Imagine trying to export the functionality of the UNIX syscall interface!
What is file server architecture?
Have memory, hard disks, and WORM drive - multi-layer architecture
View WORM drive as infinite
Assume hardware will improve faster than group generates data
How do you set up a TCP connection?
Usually don't care that it's a TCP connection!
E.g., want to talk SMTP (mail protocol) to host mail.stanford.edu
Open /net/cs, write that you want SMTP channel to mail.stanford.edu
Cs says: open file /net/tcp/clone, write "connect 171.67.20.25!25"
Do that, then read clone file, get back, e.g., "5".
This means the file you opened as clone is really /net/tcp/5/ctl
So open /net/tcp/5/data to read and write data to SMTP server
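Sketch of the string handling behind the steps above, kept off the real /net so it runs anywhere. It assumes the reply from cs is one line of the form "clonefile address" (the notes only say what cs tells you, not the exact format), and csparse and datapath are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Split an assumed cs reply such as "/net/tcp/clone 171.67.20.25!25"
 * into the clone file to open and the address to connect to.
 * Returns 0 on success, -1 if the reply does not fit that shape.
 */
int csparse(const char *reply, char *clone, size_t nclone, char *addr, size_t naddr)
{
    const char *sp = strchr(reply, ' ');
    size_t len, alen;

    if (sp == NULL)
        return -1;
    len = (size_t)(sp - reply);
    alen = strlen(sp + 1);
    if (len == 0 || len >= nclone || alen == 0 || alen >= naddr)
        return -1;
    memcpy(clone, reply, len);
    clone[len] = '\0';
    memcpy(addr, sp + 1, alen + 1);     /* includes the terminating NUL */
    return 0;
}

/*
 * Given the clone path and the connection number read back from it
 * (e.g. "5"), build the data file path, e.g. "/net/tcp/5/data".
 */
int datapath(const char *clone, const char *conn, char *out, size_t nout)
{
    const char *slash = strrchr(clone, '/');
    int n;

    if (slash == NULL || strcmp(slash + 1, "clone") != 0)
        return -1;
    n = snprintf(out, nout, "%.*s/%s/data", (int)(slash - clone), clone, conn);
    if (n < 0 || (size_t)n >= nout)
        return -1;
    return 0;
}
```

On a real system the program would open the clone file, write "connect" followed by the address, read back the connection number, and then open the path datapath builds.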
How do gateways work?
Just import /net from gateway machine to get outside
How does ftp work?
Just use ls and cp, it shows up as a file system
What is IL and why?
Need reliability and order (which UDP doesn't have)
Need message boundaries (which TCP doesn't have)
In keeping with clean-slate approach, just design new protocol
Why is Plan9 better than an old time-sharing system?
Couldn't you have multiple machines people can log into for less overload?
Use NFS to give a single-system image on a bunch of old-school UNIX machines?
Point is users have control of their terminals
Can construct your own namespace even on CPU servers
On UNIX, can be a pain to install software if you don't have root
In Plan9 there isn't even really a notion of root
What is the difference between a file system interface and RPC / Net Objects?
Network Objects have inheritance, while methods fixed for FS protocol
FS has uniform and familiar access, protection, and naming mechanisms
"the way things are named has profound influence on the system" (p. 20)
What has been the impact of Plan 9?
/proc file system now standard
UTF-8 now standard
Why isn't everyone using Plan 9 today instead of Linux?
Plan 9 was certainly ready in time
Licensing issues prevented redistribution
Maybe would-be early adopters didn't like centralized storage model
Administrative issues Plan 9 addresses might not matter to basement hackers
Not fundamental to Plan 9, but large installation was focus of Plan 9 group
Building from the ground up made for a different user experience
Goal of Linux was to replicate experience people already had
E.g., people might want emacs, not sam / acme
Software portability issues
Had POSIX environment, but still easier to build UNIX software on Linux