Both glibc and Sun's libc require 64-bit atomic operations, first found in
the Pentium. The suggested method of compiling on 32-bit x86 is to set

In order to avoid duplicating OpenSolaris-specific headers, most extensions
define any constants/structs in accompanying private/"P" headers. For
example, zone_* is implemented in zone.c and constants/structs are defined in

auxiliary vector (auxv_t):

Proper OpenSolaris does not support statically-linked executables (i.e. via
gcc -static). glibc does, but with certain restrictions. The kernel
only builds the auxv_t if the ELF file is not of type ET_EXEC or if the
PT_INTERP program header exists. This means that dynamically-linked
executables and libraries get an auxv_t while statically-linked executables
don't. As a result, statically-linked executables won't see PT_TLS, which
is needed for __thread support. We can test for the SHARED macro for libc
library code, but in general, __thread will not work for statically-linked

To fix this, it should be a matter of changing the kernel to
unconditionally supply the auxv_t.
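
The rule above can be restated as a small predicate. This is only a sketch
of the condition as described in these notes, not kernel code: the function
names and the SKETCH_ constants are made up for illustration (the numeric
values are the standard ELF e_type values).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ELF object types, with the values from the ELF specification. */
enum { SKETCH_ET_EXEC = 2, SKETCH_ET_DYN = 3 };

/*
 * Hypothetical model of the kernel's decision: an auxv_t is built if the
 * file is not of type ET_EXEC, or if it carries a PT_INTERP header.
 */
bool kernel_builds_auxv(uint16_t e_type, bool has_pt_interp)
{
    return e_type != SKETCH_ET_EXEC || has_pt_interp;
}

/* Walk through the three cases discussed in the text. */
void check_auxv_cases(void)
{
    /* Shared library: ET_DYN, no interpreter needed. */
    assert(kernel_builds_auxv(SKETCH_ET_DYN, false));
    /* Dynamically-linked executable: ET_EXEC with PT_INTERP. */
    assert(kernel_builds_auxv(SKETCH_ET_EXEC, true));
    /* Statically-linked executable: ET_EXEC without PT_INTERP, so no
       auxv_t and therefore no PT_TLS for __thread. */
    assert(!kernel_builds_auxv(SKETCH_ET_EXEC, false));
}
```

Only the last case returns false, which is the case the proposed kernel
change would remove.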
The OpenSolaris kernel allows for loadable scheduling classes. A scheduling
class has an id (pc_cid), associated name (pc_clname), and class-specific
scheduling information. As new schedulers are loaded, they are incrementally

Since ids are assigned dynamically, there is no way to statically associate
a class id with a POSIX scheduler (i.e. SCHED_*). The only exception is
SCHED_SYS, which is guaranteed to have cid == 0.
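
Because of this, a class id has to be resolved from the class name at run
time. The following toy model shows the idea. Everything in it is
hypothetical: the table, the sketch_ function names, and the assumption that
ids are handed out in load order starting at 0. Only the field names pc_cid
and pc_clname come from the description above.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_CLASSES 8
#define SKETCH_CLNAME_LEN  16

/* Hypothetical stand-in for the kernel's table of loaded classes. */
struct sketch_class {
    int  pc_cid;                        /* id assigned at load time */
    char pc_clname[SKETCH_CLNAME_LEN];  /* class name */
};

static struct sketch_class sketch_classes[SKETCH_MAX_CLASSES];
static int sketch_nclasses;

/* Load a class and return its id, or -1 if it cannot be loaded. */
int sketch_load_class(const char *name)
{
    assert(name != NULL);
    if (sketch_nclasses == SKETCH_MAX_CLASSES ||
        strlen(name) >= SKETCH_CLNAME_LEN)
        return -1;

    struct sketch_class *c = &sketch_classes[sketch_nclasses];
    c->pc_cid = sketch_nclasses;        /* assumption: ids follow load order */
    strcpy(c->pc_clname, name);
    return sketch_nclasses++;
}

/* Resolve a class name to its id at run time; -1 if it is not loaded. */
int sketch_cid_by_name(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < sketch_nclasses; i++)
        if (strcmp(sketch_classes[i].pc_clname, name) == 0)
            return sketch_classes[i].pc_cid;
    return -1;
}
```

In this model the id of any class other than the first depends on what was
loaded before it, so a SCHED_* constant cannot be mapped to a fixed id.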
Each process has a set of privileges, represented by a prpriv_t. This struct
contains a header, followed by a number of priv_chunk_t blocks (the priv
sets), and finally by a number of priv_info_t blocks (per process

The Sun libpthread/libthread implementation assumes a 1:1 mapping between
pthread_t/thread_t and lwpid_t, while NPTL maps thread descriptors to
pthread_t. This behaviour was added to NPTL and may be enabled by defining

Recursive locks are represented by an 8-byte counter defined by the
mutex_rcount macro. The maximum number of recursive waiters is

Several fields are packed into a single 64-bit field. 32 of the bits are used
to hold the owner pid. 8 bits each are used for holding the lock byte, the
number of waiters, and the number of spinners. Solaris defines some macros
for accessing these (architecture-dependent, of course):

mutex_lockword (32 bits): This is used if only the lock bit needs to

mutex_lockword64 (64 bits): This is used if you need to atomically swap
both the lock bytes and the owner pid. Note that where the pid portion
is located is dependent on byte ordering.

mutex_lockbyte (8 bits): This is the actual lock byte. It is set to 1 when
the lock is locked, and 0 when unlocked.

mutex_waiters (8 bits): This is set to 1 when there is another thread
waiting and 0 when there are no other waiters.

mutex_spinners (8 bits): This byte is apparently unused.

mutex_ownerpid (32 bits): Set to the mutex owner's process pid when the

The data field (aka mutex_owner) is used by Sun libc to store a pointer to
the thread descriptor of the owning thread. We split this 64-bit field into

mutex_owner (32 bits): The lwpid of the owning thread.

mutex_cond (32 bits): An in-use counter that is incremented when waiting on
a condition and decremented when we return (or are cancelled).
The kernel only touches the data field when it is cleared during cleanup for

The kernel does not handle recursive or error-checking mutexes.

The kernel does not set mutex_lockbyte for mutexes with the
LOCK_PRIO_INHERIT bit set.

The cond_waiters_kernel byte is set to 1 if there are waiters on the
condition variable and 0 otherwise. The cond_waiters_user byte is not

The only clock types supported by Sun libc are CLOCK_REALTIME and

The data field is not used by the kernel.

The kernel only supports shared/process reader-writer locks; private
rwlocks must be implemented entirely in libc. For the shared
case, readercv and writercv are used to track the owner (thread and process).
The Sun docs also state that the Sun implementation favours writers over

There is no apparent advantage in using the rwlock syscalls since any
private implementation that used the embedded mutex and condition variables
would also work correctly in the shared case.

Our implementation adds three additional fields for tracking the owner (thread
and process) of a reader-writer lock.
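
As an illustration of the point about the embedded mutex and condition
variables, the following is a minimal writer-preferring rwlock built from
nothing else. It is a sketch and not the implementation used in this port:
the struct and function names are hypothetical, owner tracking is left out,
and the static initializers create process-private objects (a shared lock
would need attributes set to PTHREAD_PROCESS_SHARED).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical rwlock using only a mutex and two condition variables. */
struct sketch_rwlock {
    pthread_mutex_t mutex;
    pthread_cond_t  readercv;       /* blocked readers wait here */
    pthread_cond_t  writercv;       /* blocked writers wait here */
    int readers;                    /* number of active readers */
    int writer;                     /* 1 while a writer holds the lock */
    int waiting_writers;            /* used to favour writers */
};

#define SKETCH_RWLOCK_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, \
      PTHREAD_COND_INITIALIZER, 0, 0, 0 }

void sketch_rdlock(struct sketch_rwlock *rw)
{
    pthread_mutex_lock(&rw->mutex);
    /* New readers yield to any active or waiting writer. */
    while (rw->writer || rw->waiting_writers > 0)
        pthread_cond_wait(&rw->readercv, &rw->mutex);
    rw->readers++;
    pthread_mutex_unlock(&rw->mutex);
}

void sketch_wrlock(struct sketch_rwlock *rw)
{
    pthread_mutex_lock(&rw->mutex);
    rw->waiting_writers++;
    while (rw->writer || rw->readers > 0)
        pthread_cond_wait(&rw->writercv, &rw->mutex);
    rw->waiting_writers--;
    rw->writer = 1;
    pthread_mutex_unlock(&rw->mutex);
}

void sketch_rw_unlock(struct sketch_rwlock *rw)
{
    pthread_mutex_lock(&rw->mutex);
    if (rw->writer) {
        rw->writer = 0;
    } else {
        assert(rw->readers > 0);
        rw->readers--;
    }
    if (rw->waiting_writers > 0) {
        /* Hand over to one writer once the last reader has left. */
        if (rw->readers == 0)
            pthread_cond_signal(&rw->writercv);
    } else {
        pthread_cond_broadcast(&rw->readercv);
    }
    pthread_mutex_unlock(&rw->mutex);
}
```

All state changes happen under the mutex, so the same code works whether the
waiting threads belong to one process or several, provided the mutex and
condition variables themselves are process-shared.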

[0] http://docs.sun.com/app/docs/doc/819-2243/rwlock-init-3c?a=view

This is used to search a database given a key. Examples that use nss_search
include gethostbyname_r and _getauthattr.

These are used when iterating over a database. nss_getent, nss_setent,
and nss_endent are used in gethostent, sethostent, and endhostent,
respectively. nss_delete is used to free resources used by the iteration;
it usually directly follows a call to nss_endent.
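
From the caller's side, the iteration looks like the following, using the
standard netdb.h entry points named above. The nss_* calls themselves are
internal and not shown; the function name count_host_entries is made up for
this example.

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>

/* Walk the hosts database and return the number of entries seen. */
int count_host_entries(void)
{
    struct hostent *he;
    int n = 0;

    sethostent(0);                      /* start (or rewind) the iteration */
    while ((he = gethostent()) != NULL) {
        assert(he->h_name != NULL);     /* every entry has an official name */
        n++;
    }
    endhostent();                       /* end the iteration */
    return n;
}
```

The count depends on the host's configuration and may be 0.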

This function is used to parse a file directly, rather than going through
nsswitch.conf and its databases.

Dealing with 64-bit returns in 32-bit code is tricky. For 32-bit x86, %eax
and %edx are not saved across function calls. Since every syscall returns
at least a 32-bit integer, we always have to push/pop %eax across function
calls. However, since there are very few 64-bit returning syscalls, we don't
save %edx unless we have a 64-bit returning syscall. The following is a list
of 64-bit

getgid, getuid, getpid, forkx, pipe, lseek64

Currently, the only time we actually call functions is in the case of
cancellation points (we call pthread_async_enable/disable). lseek64 is the
only syscall listed above that is a cancellation point. To deal with this,
we define SYSCALL_64BIT_RETURN in lseek64.S, which triggers inclusion of

Additionally, 64-bit returning syscalls set both %eax and %edx to -1 on
error. Similarly, this behaviour is enabled by SYSCALL_64BIT_RETURN. Note
that getegid, geteuid, and getppid are special in that their libc
equivalents actually return 32-bit integers so we don't need to worry about
%edx on error. With forkx and pipe, it suffices to check only the lower
32 bits (one of the pids/fds returned) for -1. For lseek64 we do have to
check the full 64-bit return for -1.
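
The two kinds of error check can be written out in C. This is a sketch of
the logic only, since the real code is assembly; the function names are
made up, and the arguments stand for the raw %eax and %edx values after the
syscall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Combine the result registers: %edx is the high half, %eax the low. */
int64_t sketch_ret64(uint32_t eax, uint32_t edx)
{
    return (int64_t)(((uint64_t)edx << 32) | eax);
}

/* lseek64-style: a single 64-bit value, so all 64 bits must be checked. */
bool sketch_is_error64(uint32_t eax, uint32_t edx)
{
    return sketch_ret64(eax, edx) == -1;
}

/* forkx/pipe-style: two 32-bit results, so the low word is enough. */
bool sketch_is_error_pair(uint32_t eax)
{
    return (int32_t)eax == -1;
}

/* A file offset of 4 GiB - 1 has an all-ones low word but is not an error. */
void check_lseek64_corner_case(void)
{
    assert(!sketch_is_error64(0xffffffffu, 0x00000000u));
    assert(sketch_is_error64(0xffffffffu, 0xffffffffu));
}
```

The corner case in check_lseek64_corner_case is the reason the low-word
shortcut is not safe for lseek64.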

Many of the _SC_ sysconf values are obtained via the systemconf syscall. The
following is a table of mappings from _SC_ to _CONFIG_ values. The third
column lists the value returned by sysdeps/posix/sysconf.c, where there is
one.

_SC_CHILD_MAX           _CONFIG_CHILD_MAX           _get_child_max
_SC_CLK_TCK             _CONFIG_CLK_TCK             _getclktck
_SC_NGROUPS_MAX         _CONFIG_NGROUPS             NGROUPS_MAX
_SC_OPEN_MAX            _CONFIG_OPEN_FILES          __getdtablesize
_SC_PAGESIZE            _CONFIG_PAGESIZE            __getpagesize
_SC_XOPEN_VERSION       _CONFIG_XOPEN_VER           _XOPEN_VERSION
_SC_STREAM_MAX          _CONFIG_OPEN_FILES          STREAM_MAX
_SC_NPROCESSORS_CONF    _CONFIG_NPROC_CONF          __get_nprocs_conf
_SC_NPROCESSORS_ONLN    _CONFIG_NPROC_ONLN          __get_nprocs
_SC_NPROCESSORS_MAX     _CONFIG_NPROC_MAX
_SC_STACK_PROT          _CONFIG_STACK_PROT
_SC_AIO_LISTIO_MAX      _CONFIG_AIO_LISTIO_MAX      AIO_LISTIO_MAX
_SC_AIO_MAX             _CONFIG_AIO_MAX             AIO_MAX
_SC_AIO_PRIO_DELTA_MAX  _CONFIG_AIO_PRIO_DELTA_MAX  AIO_PRIO_DELTA_MAX
_SC_DELAYTIMER_MAX      _CONFIG_DELAYTIMER_MAX      DELAYTIMER_MAX
_SC_MQ_OPEN_MAX         _CONFIG_MQ_OPEN_MAX         MQ_OPEN_MAX
_SC_MQ_PRIO_MAX         _CONFIG_MQ_PRIO_MAX         MQ_PRIO_MAX
_SC_RTSIG_MAX           _CONFIG_RTSIG_MAX           RTSIG_MAX
_SC_SEM_NSEMS_MAX       _CONFIG_SEM_NSEMS_MAX       SEM_NSEMS_MAX
_SC_SEM_VALUE_MAX       _CONFIG_SEM_VALUE_MAX       SEM_VALUE_MAX
_SC_SIGQUEUE_MAX        _CONFIG_SIGQUEUE_MAX        SIGQUEUE_MAX
_SC_SIGRT_MAX           _CONFIG_SIGRT_MAX
_SC_SIGRT_MIN           _CONFIG_SIGRT_MIN
_SC_TIMER_MAX           _CONFIG_TIMER_MAX           TIMER_MAX
_SC_PHYS_PAGES          _CONFIG_PHYS_PAGES          __get_phys_pages
_SC_AVPHYS_PAGES        _CONFIG_AVPHYS_PAGES        __get_avphys_pages
_SC_COHER_BLKSZ         _CONFIG_COHERENCY
_SC_SPLIT_CACHE         _CONFIG_SPLIT_CACHE
_SC_ICACHE_SZ           _CONFIG_ICACHESZ
_SC_DCACHE_SZ           _CONFIG_DCACHESZ
_SC_ICACHE_LINESZ       _CONFIG_ICACHELINESZ
_SC_DCACHE_LINESZ       _CONFIG_DCACHELINESZ
_SC_ICACHE_BLKSZ        _CONFIG_ICACHEBLKSZ
_SC_DCACHE_BLKSZ        _CONFIG_DCACHEBLKSZ
_SC_DCACHE_TBLKSZ       _CONFIG_DCACHETBLKSZ
_SC_ICACHE_ASSOC        _CONFIG_ICACHE_ASSOC
_SC_DCACHE_ASSOC        _CONFIG_DCACHE_ASSOC
_SC_MAXPID              _CONFIG_MAXPID
_SC_CPUID_MAX           _CONFIG_CPUID_MAX
_SC_EPHID_MAX           _CONFIG_EPHID_MAX
_SC_SYMLOOP_MAX         _CONFIG_SYMLOOP_MAX         SYMLOOP_MAX
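
For reference, this is how a program queries a few of these values through
the public sysconf interface. Which path a given name takes inside libc is
as listed in the table; the helper names and the selection of names below
are only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

struct sketch_entry {
    const char *label;
    int         name;
};

/* A handful of names from the table that are widely available. */
static const struct sketch_entry sketch_entries[] = {
    { "_SC_PAGESIZE",         _SC_PAGESIZE },
    { "_SC_CLK_TCK",          _SC_CLK_TCK },
    { "_SC_OPEN_MAX",         _SC_OPEN_MAX },
    { "_SC_NPROCESSORS_ONLN", _SC_NPROCESSORS_ONLN },
};

/* Print each value; return how many names produced a usable result. */
int sketch_print_sysconf(void)
{
    int ok = 0;
    size_t count = sizeof(sketch_entries) / sizeof(sketch_entries[0]);

    for (size_t i = 0; i < count; i++) {
        long value = sysconf(sketch_entries[i].name);
        /* -1 means the name is unsupported or the limit is indeterminate. */
        if (value != -1) {
            printf("%-22s %ld\n", sketch_entries[i].label, value);
            ok++;
        }
    }
    return ok;
}

/* Page size in bytes. */
long sketch_page_size(void)
{
    long size = sysconf(_SC_PAGESIZE);
    assert(size > 0);
    return size;
}
```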