SYSCALLS

on linux/i386, the machine code puts the syscall number in AX and the
arguments in BX, CX, DX, SI, DI, then makes a soft interrupt 0x80.

as the plan9 kernel doesn't care about interrupt vector 0x80, it
sends a note to the process that trapped and, if the note is not
handled, kills it. in a note handler, it is possible to access the
machine state of the process at the time of the trap/interrupt through
the ureg argument.

in linuxemu, we install a note handler that checks if the trap was a
linux syscall and calls our handler function from our systab.

after our syscall handler has returned, we move the program counter
in the machine state structure past the int 0x80 instruction and
continue execution by accepting the note as handled with a call to
noted(NCONT).
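
the handler logic above can be sketched in portable c. this is only a
model, not the linuxemu source: the Ureg subset, handletrap, the memory
buffer argument and the stand-in sys_getpid are made up for
illustration. it assumes a few facts not stated above: int 0x80
encodes as the two bytes CD 80, linux returns the result in AX,
getpid is syscall 20 on i386 and ENOSYS is 38.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical subset of the saved machine state */
typedef struct Ureg {
	uint32_t ax, bx, cx, dx, si, di, pc;
} Ureg;

typedef long (*Syscallfn)(Ureg*);

enum { NSYS = 32, LINUX_ENOSYS = 38 };

/* stand-in handler; a real one would do the work of getpid */
static long
sys_getpid(Ureg *u)
{
	(void)u;
	return 42;
}

/* indexed by linux syscall number; getpid is 20 on i386 */
static Syscallfn systab[NSYS] = { [20] = sys_getpid };

/*
 * returns 1 if the trap was a linux syscall and was handled
 * (a real note handler would now call noted(NCONT)),
 * 0 if the trap is not ours.
 */
int
handletrap(Ureg *u, const uint8_t *mem, size_t memlen)
{
	assert(u != NULL && mem != NULL);
	if(memlen < 2 || u->pc > memlen - 2)
		return 0;
	if(mem[u->pc] != 0xCD || mem[u->pc+1] != 0x80)
		return 0;		/* not int 0x80 */
	if(u->ax < NSYS && systab[u->ax] != NULL)
		u->ax = (uint32_t)systab[u->ax](u);
	else
		u->ax = (uint32_t)-LINUX_ENOSYS;
	u->pc += 2;			/* skip the int 0x80 instruction */
	return 1;
}
```

in this model mem stands for the address space of the process and pc
is an offset into it; the real handler reads the instruction bytes
directly through the pc in the ureg.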

to do automatic conversion to a plan9 function call, the number of
arguments and the function name of the handler must be known. this
information is provided by the linuxcalltab input file that is fed
through linuxcalltab.awk to build the necessary tables.
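
one way such a generated table could drive the argument conversion,
as a sketch only: Callent, calltab, docall and the three stand-in
handlers are hypothetical names, and the tables that linuxcalltab.awk
really builds may look quite different.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Anyfn)(void);
typedef long (*Fn0)(void);
typedef long (*Fn1)(long);
typedef long (*Fn3)(long, long, long);

/* one table row: handler name, argument count, handler */
typedef struct Callent {
	const char *name;
	int nargs;
	Anyfn fn;
} Callent;

/* stand-in handlers with plain c signatures */
static long sys_getpid(void) { return 42; }
static long sys_close(long fd) { return fd >= 0 ? 0 : -9; }
static long sys_lseek(long fd, long off, long whence)
{
	(void)fd; (void)whence;
	return off;
}

Callent calltab[] = {
	{ "sys_getpid", 0, (Anyfn)sys_getpid },
	{ "sys_close", 1, (Anyfn)sys_close },
	{ "sys_lseek", 3, (Anyfn)sys_lseek },
};

/*
 * call the handler with as many register values as the table says.
 * each pointer is cast back to its true type before the call.
 */
long
docall(const Callent *e, const long arg[])
{
	assert(e != NULL && e->fn != NULL);
	switch(e->nargs){
	case 0:
		return ((Fn0)e->fn)();
	case 1:
		return ((Fn1)e->fn)(arg[0]);
	case 3:
		return ((Fn3)e->fn)(arg[0], arg[1], arg[2]);
	}
	return -38;	/* argument count not covered by this sketch */
}
```

the handlers themselves never see registers, which is what keeps them
independent of the syscall personality.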

the linux specific syscall handling and argument conversion is done in
linuxcall.c only. the idea is to later add support for other syscall
personalities like bsd without having to change the handler code.


MEMORY

unlike shared libraries, which are position independent, binaries have
to be loaded at a fixed address. (elf supports position independent
programs that can be loaded anywhere, but it is not used on i386)

the emulator doesn't need to load and relocate shared libraries itself.
this is done by the runtime linker (/lib/ld-linux.so). it just needs to
load the binary and the runtime linker at their preferred locations and
jump to the entry point. then the runtime linker will parse the elf
sections of the binary and call mmap to load further shared libraries.

the first thing we need is an implementation of mmap that allows us
to copy files to fixed addresses in memory. to do that on plan9,
segments are used.

it is not possible to create a segment for every memory mapping
because plan9 limits the number of segments per process to a small
number. instead we create a fixed number of segments and
expand/shrink them on demand. the linux stack area has a fixed size
and relies on the fact that plan9 doesn't allocate physical memory
until pages are touched.
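
the expand/shrink idea can be modeled with plain bookkeeping. this is
a sketch under assumptions, not the linuxemu allocator: Seg, segmap,
segtrim and the 4096 byte page size are made up here, and the model
only tracks addresses. real code would also have to resize the plan9
segment itself at the marked places.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PAGESIZE = 4096 };

/* hypothetical model of one emulator-managed segment */
typedef struct Seg {
	uintptr_t base;		/* fixed start address */
	uintptr_t top;		/* current end, exclusive */
	uintptr_t limit;	/* highest end allowed */
} Seg;

static uintptr_t
pageround(uintptr_t a)
{
	return (a + PAGESIZE-1) & ~(uintptr_t)(PAGESIZE-1);
}

/*
 * make [addr, addr+len) usable, growing the segment if needed.
 * returns 0 on success, -1 if the range does not fit.
 */
int
segmap(Seg *s, uintptr_t addr, uintptr_t len)
{
	uintptr_t end;

	assert(s != NULL && s->base <= s->top && s->top <= s->limit);
	if(len == 0 || addr < s->base || addr > s->limit)
		return -1;
	if(len > s->limit - addr)
		return -1;
	end = pageround(addr + len);
	if(end > s->limit)
		return -1;
	if(end > s->top)
		s->top = end;	/* real code: grow the segment here */
	return 0;
}

/* shrink the segment so it ends just after newend */
void
segtrim(Seg *s, uintptr_t newend)
{
	uintptr_t end;

	assert(s != NULL);
	end = pageround(newend);
	if(end < s->base)
		end = s->base;
	if(end < s->top)
		s->top = end;	/* real code: shrink the segment here */
}
```

a mapping at a fixed address then costs no new segment, only a larger
top for the segment that contains it.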

there are 3 segments created for a linux process:

"private" is used for all MAP_PRIVATE mappings and can be shared if
processes run in the same address space. code, data and files are
mapped there.

"shared" is used for shared memory mappings.

"stack" is like "private", but lives just below the plan9 stack segment.
this is needed because glibc expands the stack downwards by mmap()ing
pages below the current stack area. we cannot use the plan9 stack
segment because that segment is copied on rfork and is never shared
between processes.

the data structures of the emulator itself ("kernel memory") need to
be shared by all processes even if the linux process runs in its own
private address space, so the plan9 Bss and Data segments are made
shared on startup: the contents of the original segment are copied
into a temporary file, the segment is removed with segdetach(), a new
shared segment is attached at the same place with segattach(), and the
data is copied back in from the file.

with this memory layout, it is possible for the linux process to damage
data structures in the emulator. but we seem to be lucky for now :)


USER PROCESSES (UPROCS)

linuxemu does not switch and schedule linux processes itself. every
user process has its own plan9 process. memory sharing semantics are
translated to rfork flags on fork/clone.
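
a minimal sketch of that translation, covering only the memory
sharing bit. rforkflags and the prefixed constant names are
hypothetical. the values are written from memory of the linux and
plan9 headers (CLONE_VM, RFPROC, RFMEM) and should be checked against
them before use.

```c
#include <assert.h>

enum {
	LINUX_CLONE_VM = 0x100,	/* assumed value of linux CLONE_VM */
	P9_RFPROC = 1<<4,	/* assumed value of plan9 RFPROC */
	P9_RFMEM = 1<<5,	/* assumed value of plan9 RFMEM */
};

/*
 * always create a new process; share memory with the parent only
 * if the linux caller asked for a shared address space.
 */
int
rforkflags(unsigned long cloneflags)
{
	int flags;

	flags = P9_RFPROC;
	if(cloneflags & LINUX_CLONE_VM)
		flags |= P9_RFMEM;
	assert(flags & P9_RFPROC);
	return flags;
}
```

a plain fork passes no CLONE_VM and so gets a private copy of the
memory, while a thread-like clone gets RFMEM.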

we have a global process table of Uproc structures to track state and
resources for all user processes:

fs: filesystem mount table
fdtab: the file descriptor table
mem: memory mappings
signal: signal handlers and queue
trace: debug trace buffer

resources that can be shared are reference counted and get freed when
the last process referencing them exits.
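
the reference counting rule can be sketched as follows. Fdtab here is
a hypothetical stand-in for any shareable resource, and newfdtab,
fdtabref and fdtabunref are invented names. the sketch is not safe
for concurrent use; code shared between real processes would need a
lock or an atomic counter.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical shareable resource */
typedef struct Fdtab {
	int ref;	/* number of processes using this table */
	int nfd;
} Fdtab;

/* create a table owned by one process; returns NULL if out of memory */
Fdtab*
newfdtab(void)
{
	Fdtab *t;

	t = calloc(1, sizeof *t);
	if(t != NULL)
		t->ref = 1;
	return t;
}

/* called when a new process starts sharing the table */
Fdtab*
fdtabref(Fdtab *t)
{
	assert(t != NULL && t->ref > 0);
	t->ref++;
	return t;
}

/*
 * called when a process exits or stops sharing.
 * returns 1 if this was the last reference and the table was freed.
 */
int
fdtabunref(Fdtab *t)
{
	assert(t != NULL && t->ref > 0);
	if(--t->ref > 0)
		return 0;
	free(t);
	return 1;
}
```

a clone that shares the resource would take a reference, and one
that does not share would make its own copy with a count of 1.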


KERNEL PROCESSES (KPROCS)

if linuxemu needs to defer work or do asynchronous i/o, it can spawn a
kernel process with kprocfork. kernel processes don't have a Uproc
structure associated with them and have the userspace memory segments
detached, so they can't access userspace memory.

bufprocs and timers are implemented with kernel processes.


DEVICES

earlier versions mapped linux files directly to plan9 files. this made
the implementation of ioctls, symlinks, remove on close, and
select/poll hard, and also caused problems with implementing fork
sharing semantics.

current linuxemu does it all by itself. there is a global device table
of Udev structures. devices can implement all i/o related syscalls by
providing function pointers in their Udev. when a device has to deal
with asynchronous i/o on real plan9 files, it uses bufprocs.
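
a device table of this kind could look like the sketch below. only
the name Udev comes from the text above; the fields, Ufile, devtab,
fileread, filewrite and the null device are assumptions made for the
example, as is returning -22 (linux EINVAL) for an operation a device
does not provide.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Ufile Ufile;

/* one entry per device; unimplemented operations stay NULL */
typedef struct Udev {
	const char *name;
	long (*read)(Ufile*, void*, long);
	long (*write)(Ufile*, const void*, long);
} Udev;

struct Ufile {
	Udev *dev;	/* device that implements this file */
};

enum { LINUX_EINVAL = 22 };

/* a null device: reads hit end of file, writes are discarded */
static long
nullread(Ufile *f, void *buf, long n)
{
	(void)f; (void)buf; (void)n;
	return 0;
}

static long
nullwrite(Ufile *f, const void *buf, long n)
{
	(void)f; (void)buf;
	return n;
}

Udev devtab[] = {
	{ "null", nullread, nullwrite },
};

/* syscall side: dispatch through the device of the file */
long
fileread(Ufile *f, void *buf, long n)
{
	assert(f != NULL && f->dev != NULL);
	if(f->dev->read == NULL)
		return -LINUX_EINVAL;
	return f->dev->read(f, buf, n);
}

long
filewrite(Ufile *f, const void *buf, long n)
{
	assert(f != NULL && f->dev != NULL);
	if(f->dev->write == NULL)
		return -LINUX_EINVAL;
	return f->dev->write(f, buf, n);
}
```

the syscall code only ever goes through the function pointers, so a
new kind of file is added by filling in another Udev.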