1 | SYSCALLS |
2 | |
3 | on linux/i386, the machine code puts the syscall number in AX and the |
4 | arguments in BX, CX, DX, SI, DI, BP, then makes a soft interrupt 0x80. |
5 | |
6 | as the plan9 kernel doesn't care about the interrupt vector 0x80, it |
7 | sends a note to the process that trapped and, if not handled, kills it. |
8 | in a note handler, the machine state of the process at the time of |
9 | the trap/interrupt can be accessed through the ureg argument. |
10 | |
11 | in linuxemu, we install a note handler that checks if the trap was a |
12 | linux syscall and calls our handler function from our systab. |
13 | |
14 | after our syscall handler has returned, we move the program counter |
15 | in the machine state structure past the int 0x80 instruction and |
16 | continue execution by accepting the note as handled with a call to |
17 | noted(NCONT). |
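
the flow above can be sketched in portable c. this is only an illustration under stated assumptions: the Ureg here is a cut-down stand-in, handletrap, sys_demo and the Handled/Nothandled results are made-up names, and checking the instruction bytes at the pc is just one plausible way to recognise a linux syscall. the real handler runs inside a plan9 note handler and ends with noted(NCONT).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical, cut-down stand-in for the saved machine state */
typedef struct Ureg Ureg;
struct Ureg {
	uint32_t ax, bx, cx, dx, si, di;
	uint32_t pc;
};

typedef int (*Sysfn)(Ureg *);

enum {
	Handled,	/* resume the process, as noted(NCONT) would */
	Nothandled,	/* not a linux syscall, leave it to the default action */
};

/* toy handler: returns its first argument plus one */
static int
sys_demo(Ureg *u)
{
	return (int)u->bx + 1;
}

static Sysfn systab[] = { sys_demo };

/*
 * mem is the process memory as a flat array and u->pc indexes the
 * trapping instruction (an assumption made for this sketch).
 */
static int
handletrap(Ureg *u, const uint8_t *mem, size_t memlen)
{
	uint32_t nr;

	assert(u != NULL && mem != NULL);

	/* int 0x80 is encoded as the two bytes CD 80 */
	if(memlen < 2 || u->pc > memlen - 2)
		return Nothandled;
	if(mem[u->pc] != 0xCD || mem[u->pc + 1] != 0x80)
		return Nothandled;

	/* syscall number in AX, result goes back into AX */
	nr = u->ax;
	if(nr < sizeof(systab)/sizeof(systab[0]))
		u->ax = (uint32_t)systab[nr](u);
	else
		u->ax = (uint32_t)-38;	/* -ENOSYS */

	/* skip the two byte instruction so it is not executed again */
	u->pc += 2;
	return Handled;
}
```

without the pc adjustment the process would trap on the same instruction again as soon as it resumed.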
18 | |
19 | to do automatic conversion to a plan9 function call, the number of |
20 | arguments and the function name of the handler must be known. this |
21 | information is provided by the linuxcalltab input file that is fed through |
22 | linuxcalltab.awk to build the necessary tables. |
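
a minimal sketch of why the argument count has to be known: a table entry carries the handler name and the number of arguments, and the dispatcher picks the matching c call. the layout and every name here (Linuxcall, calltab, docall, the demo_ handlers) are assumptions for illustration, not the tables that linuxcalltab.awk actually generates.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Anyfn)(void);	/* generic function pointer, cast before use */
typedef long (*Fn0)(void);
typedef long (*Fn1)(long);
typedef long (*Fn2)(long, long);
typedef long (*Fn3)(long, long, long);

typedef struct Linuxcall Linuxcall;
struct Linuxcall {
	const char *name;	/* handler name, useful for tracing */
	int nargs;		/* number of arguments the handler takes */
	Anyfn func;
};

/* toy handlers standing in for the real ones */
static long demo_getpid(void) { return 1234; }
static long demo_close(long fd) { return fd >= 0 ? 0 : -9; /* -EBADF */ }
static long demo_sum3(long a, long b, long c) { return a + b + c; }

static Linuxcall calltab[] = {
	{ "demo_getpid", 0, (Anyfn)demo_getpid },
	{ "demo_close", 1, (Anyfn)demo_close },
	{ "demo_sum3", 3, (Anyfn)demo_sum3 },
};

/* arg holds the register arguments in syscall order */
static long
docall(const Linuxcall *c, const long *arg)
{
	assert(c != NULL && arg != NULL);

	switch(c->nargs){
	case 0:
		return ((Fn0)c->func)();
	case 1:
		return ((Fn1)c->func)(arg[0]);
	case 2:
		return ((Fn2)c->func)(arg[0], arg[1]);
	case 3:
		return ((Fn3)c->func)(arg[0], arg[1], arg[2]);
	}
	return -38;	/* -ENOSYS */
}
```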
23 | |
24 | the linux specific syscall handling and argument conversion is done in |
25 | linuxcall.c only. the idea is to later add support for other syscall |
26 | personalities like bsd without having to change the handler code. |
27 | |
28 | |
29 | MEMORY |
30 | |
31 | unlike shared libraries, which are position independent, binaries have to be |
32 | loaded to a fixed address location. (elf supports position independent |
33 | programs that can be loaded anywhere, but it's not used on i386) |
34 | |
35 | the emulator doesn't need to load and relocate shared libraries itself. this is |
36 | done by the runtime linker (/lib/ld-linux.so). linuxemu just needs to load |
37 | the binary and the runtime linker to their preferred locations and jump into |
38 | the entry point. then the runtime linker will parse the elf sections of the |
39 | binary and call mmap to load further shared libraries. |
40 | |
41 | the first thing we need is an implementation of mmap that allows us |
42 | to copy files into memory at fixed addresses. to do that on plan9, |
43 | segments are used. |
44 | |
45 | it is not possible to create a segment for every memory mapping |
46 | because plan9 limits the number of segments per process to a small |
47 | number. instead we create a fixed number of segments and |
48 | expand/shrink them on demand. the linux stack area is fixed size and |
49 | relies on the fact that plan9 doesn't allocate physical memory until pages |
50 | are touched. |
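
the expand/shrink idea can be sketched as simple bookkeeping on one segment. this is an assumption-laden illustration: Seg, segcover and segtrim are hypothetical names, the page size is taken to be 4096, and the sketch only tracks the bounds. real code would also have to resize the plan9 segment itself.

```c
#include <assert.h>
#include <stdint.h>

enum { Pagesize = 4096 };	/* assumed page size */

typedef struct Seg Seg;
struct Seg {
	uintptr_t base;		/* fixed start of the segment */
	uintptr_t top;		/* current end, exclusive */
	uintptr_t limit;	/* the segment may never grow past this */
};

static uintptr_t
pageround(uintptr_t a)
{
	return (a + Pagesize - 1) & ~(uintptr_t)(Pagesize - 1);
}

/*
 * make sure [addr, addr+len) lies inside the segment, growing it
 * if needed. returns 0 on success, -1 if the range cannot fit.
 */
static int
segcover(Seg *s, uintptr_t addr, uintptr_t len)
{
	uintptr_t end;

	assert(s->base <= s->top && s->top <= s->limit);

	if(addr < s->base || addr > s->limit || len > s->limit - addr)
		return -1;
	end = pageround(addr + len);
	if(end > s->limit)
		return -1;
	if(end > s->top)
		s->top = end;	/* real code would grow the segment here */
	return 0;
}

/* shrink the segment down to the end of the highest mapping still in use */
static void
segtrim(Seg *s, uintptr_t used)
{
	uintptr_t end;

	assert(s->base <= s->top && s->top <= s->limit);

	if(used < s->base)
		used = s->base;
	end = pageround(used);
	if(end < s->top)
		s->top = end;	/* real code would shrink the segment here */
}
```

a mapping that already lies below the current top costs nothing, which is what lets many linux mappings share a single plan9 segment.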
51 | |
52 | there are 3 segments created for a linux process: |
53 | |
54 | "private" is used for all MAP_PRIVATE mappings and can be shared if |
55 | processes run in the same address space. code, data and files are mapped there. |
56 | |
57 | "shared" for shared memory mappings. |
58 | |
59 | "stack" is like "private", but lives just below the plan9 stack segment. |
60 | this is needed because glibc expands the stack down by mmap()ing pages |
61 | below the current stack area. we cannot use the plan9 stack segment |
62 | because that segment is copied on rfork and is never shared between |
63 | processes. |
64 | |
65 | the data structures of the emulator itself ("kernel memory") need to |
66 | be shared between all processes even if the linux process runs in its own |
67 | private address space, so the plan9 Bss and Data segments are made |
68 | shared on startup by copying the contents of the original segment into a |
69 | temporary file, segdetach()ing it, segattach()ing a new shared segment |
70 | at the same place and copying the data back in from the file. |
71 | |
72 | with this memory layout, it is possible for the linux process to damage |
73 | data structures in the emulator. but we seem to be lucky for now :) |
74 | |
75 | |
76 | USER PROCESSES (UPROCS) |
77 | |
78 | linuxemu does not switch and schedule linux processes itself. every user |
79 | process has its own plan9 process. memory sharing semantics are translated |
80 | to rfork flags on fork/clone. |
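
as a rough sketch of that translation, assuming only the memory-sharing bit matters: the constant names are made up, the values are the ones from the linux headers (CLONE_VM) and plan9's libc.h (RFPROC, RFMEM), and clone2rfork is a hypothetical helper. whether linuxemu passes further rfork flags is not stated here.

```c
enum {
	/* linux clone(2) flag */
	LinuxCloneVm = 0x00000100,

	/* plan9 rfork(2) flags */
	P9Rfproc = 1<<4,
	P9Rfmem = 1<<5,
};

/* pick rfork flags for a linux fork (cloneflags == 0) or clone */
static int
clone2rfork(unsigned long cloneflags)
{
	int rf;

	rf = P9Rfproc;		/* always create a new plan9 process */
	if(cloneflags & LinuxCloneVm)
		rf |= P9Rfmem;	/* child shares memory with the parent */
	return rf;
}
```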
81 | |
82 | we have a global process table of Uproc structures to track states and |
83 | resources for all user processes: |
84 | |
85 | fs: filesystem mount table |
86 | fdtab: the file descriptor table |
87 | mem: memory mappings |
88 | signal: signal handler and queue |
89 | trace: debug trace buffer |
90 | |
91 | resources that can be shared are reference counted and get freed when |
92 | the last process referencing them exits. |
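
reference counting of a shared resource might look like the sketch below. Fdtab is reduced to a stub and newfdtab, incfdtab, putfdtab and nfreed are hypothetical names. because the emulator's data is shared between plan9 processes, real code would need a lock or an atomic counter, which this sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Fdtab Fdtab;
struct Fdtab {
	long ref;	/* number of processes using this table */
	int nfd;	/* placeholder for the real contents */
};

static int nfreed;	/* counts frees so the behaviour can be observed */

/* a new table starts with one reference, held by its creator */
static Fdtab*
newfdtab(void)
{
	Fdtab *t;

	t = calloc(1, sizeof *t);
	if(t != NULL)
		t->ref = 1;
	return t;
}

/* called when a child is created that shares the table */
static Fdtab*
incfdtab(Fdtab *t)
{
	assert(t != NULL && t->ref > 0);
	t->ref++;
	return t;
}

/* called on exit; the last process to drop its reference frees the table */
static void
putfdtab(Fdtab *t)
{
	if(t == NULL)
		return;
	assert(t->ref > 0);
	if(--t->ref == 0){
		free(t);
		nfreed++;
	}
}
```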
93 | |
94 | |
95 | KERNEL PROCESSES (KPROCS) |
96 | |
97 | if linuxemu needs to defer work or do asynchronous i/o, it can spawn a |
98 | kernel process with kprocfork. kernel processes don't have a Uproc |
99 | structure associated and have the userspace memory segments detached, |
100 | so they can't access userspace memory. |
101 | |
102 | bufprocs and timers are implemented with kernel processes. |
103 | |
104 | |
105 | DEVICES |
106 | |
107 | earlier versions mapped linux files directly to plan9 files. this made |
108 | the implementation of ioctls, symlinks, remove on close, and |
109 | select/poll hard and also had problems with implementing fork sharing |
110 | semantics. |
111 | |
112 | current linuxemu does it all by itself. there is a global device table |
113 | of Udev structures. devices can implement all i/o related syscalls by |
114 | providing function pointers in their Udev. when a device has to deal |
115 | with asynchronous io on real plan9 files, it uses bufprocs. |
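
the function pointer scheme could look roughly like this. the fields of Udev and Ufile, the zero device and the devread/devwrite/devioctl wrappers are all assumptions made for the sketch. only the idea that a device fills in the calls it supports comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
	Einval = 22,	/* linux EINVAL */
	Enotty = 25,	/* linux ENOTTY */
};

typedef struct Udev Udev;
typedef struct Ufile Ufile;

struct Udev {
	const char *name;
	/* a nil pointer means the device does not implement the call */
	long (*read)(Ufile *f, void *buf, long n);
	long (*write)(Ufile *f, const void *buf, long n);
	int (*ioctl)(Ufile *f, int cmd, void *arg);
};

struct Ufile {
	Udev *dev;	/* device that owns this open file */
};

/* toy device that reads as an endless stream of zero bytes */
static long
zeroread(Ufile *f, void *buf, long n)
{
	(void)f;
	memset(buf, 0, (size_t)n);
	return n;
}

static Udev zerodev = { "zero", zeroread, NULL, NULL };

/* generic syscall side: dispatch through the file's device */
static long
devread(Ufile *f, void *buf, long n)
{
	assert(f != NULL && f->dev != NULL);
	if(n < 0 || f->dev->read == NULL)
		return -Einval;
	return f->dev->read(f, buf, n);
}

static long
devwrite(Ufile *f, const void *buf, long n)
{
	assert(f != NULL && f->dev != NULL);
	if(n < 0 || f->dev->write == NULL)
		return -Einval;
	return f->dev->write(f, buf, n);
}

static int
devioctl(Ufile *f, int cmd, void *arg)
{
	assert(f != NULL && f->dev != NULL);
	if(f->dev->ioctl == NULL)
		return -Enotty;
	return f->dev->ioctl(f, cmd, arg);
}
```

with this shape, calls like ioctl stay inside the emulator and a device only pays for the operations it actually provides.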
116 | |
117 | |