Linux Portability, Part 2: Exploring musl #ifdefs, or #define PDP_ENDIAN 3412

October 31st, 2018

Part 1 here.

First, we need to get a list of files with useful defines. Useful here means that the preprocessor condition does something of interest, rather than enabling or disabling visibility of functions (POSIX/BSD/GNU_SOURCE), living under /math/ or being math-related (LDBL_MANT_DIG; Linux does not care about floating point), or protecting headers from double inclusion (_H$). Stuff in arch/ we have already seen in Part 1. All other files more or less qualify.

With the list of files ready, let's read:

~/src/musl/ $ grep -r '#if' | grep -v -e '/math/' -e _GNU_SOURCE -e _POSIX_SOURCE -e _BSD_SOURCE \
   -e GNUC -e __cplusplus -e '/complex/' -e '_H$' -e '^arch/' -e 'LDBL_MANT_DIG' | awk -F: '{print $1}' | sort | uniq
configure
include/arpa/ftp.h [1]
include/arpa/nameser.h [2]
include/arpa/telnet.h [3]
include/assert.h [4]
include/complex.h [5]
include/endian.h [6]
include/fcntl.h [7]
include/features.h [8]
include/inttypes.h [9]
include/limits.h [10]
include/link.h [11]
include/netinet/icmp6.h [12]
include/netinet/ip.h [13]
include/netinet/ip6.h [14]
include/netinet/tcp.h [15]
include/poll.h [16]
include/resolv.h [17]
include/signal.h [18]
include/stdc-predef.h [19]
include/stdint.h [20]
include/stdio.h [21]
include/sys/epoll.h [22]
include/sys/mtio.h [23]
include/sys/procfs.h [24]
include/sys/resource.h [25]
include/sys/socket.h [26]
include/sys/stat.h [27]
include/sys/statvfs.h [28]
include/sys/ttydefaults.h [29]
include/syslog.h [30]
include/wchar.h [31]
ldso/dlstart.c [32]
ldso/dynlink.c [33]
src/crypt/crypt_blowfish.c [34]
src/crypt/crypt_sha512.c [35]
src/ctype/__ctype_b_loc.c [36]
src/env/__init_tls.c [37]
src/env/__libc_start_main.c [38]
src/fcntl/posix_fadvise.c [39]
src/fenv/__flt_rounds.c [40]
src/fenv/arm/fenv-hf.S
src/fenv/arm/fenv.c
src/fenv/fesetround.c
src/fenv/m68k/fenv.c
src/fenv/mips/fenv-sf.c
src/fenv/mips/fenv.S
src/fenv/mips64/fenv-sf.c
src/fenv/mips64/fenv.S
src/fenv/mipsn32/fenv-sf.c
src/fenv/mipsn32/fenv.S
src/fenv/powerpc/fenv-sf.c
src/fenv/powerpc/fenv.S
src/fenv/sh/fenv-nofpu.c
src/fenv/sh/fenv.S
src/internal/atomic.h [41]
src/internal/dynlink.h [42]
src/internal/libc.h [43]
src/internal/pthread_impl.h [44]
src/internal/syscall.h [45]
src/internal/vdso.c [46]
src/ipc/msgctl.c [47]
src/ipc/msgget.c [48]
src/ipc/msgrcv.c [49]
src/ipc/msgsnd.c [50]
src/ipc/semctl.c [51]
src/ipc/semget.c [52]
src/ipc/semop.c [53]
src/ipc/semtimedop.c [54]
src/ipc/shmat.c [55]
src/ipc/shmctl.c [56]
src/ipc/shmdt.c [57]
src/ipc/shmget.c [58]
src/ldso/arm/tlsdesc.S [59]
src/linux/arch_prctl.c [60]
src/linux/cache.c [61]
src/linux/epoll.c [62]
src/linux/eventfd.c [63]
src/linux/inotify.c [64]
src/linux/ioperm.c [65]
src/linux/iopl.c [66]
src/linux/personality.c [67]
src/linux/ptrace.c [68]
src/linux/signalfd.c [69]
src/linux/sync_file_range.c [70]
src/malloc/malloc.c [71]
src/mman/mmap.c [72]
src/network/lookup_name.c [73]
src/network/recvmmsg.c [74]
src/network/recvmsg.c [75]
src/network/sendmmsg.c [76]
src/network/sendmsg.c [77]
src/process/fork.c [78]
src/process/posix_spawn.c [79]
src/process/vfork.c [80]
src/regex/regcomp.c [81]
src/regex/regexec.c [82]
src/regex/tre.h [83]
src/sched/sched_getcpu.c [84]
src/select/poll.c [85]
src/select/select.c [86]
src/setjmp/mips/longjmp.S [87]
src/setjmp/mips/setjmp.S [88]
src/setjmp/mips64/longjmp.S [89]
src/setjmp/mips64/setjmp.S [90]
src/setjmp/mipsn32/longjmp.S [91]
src/setjmp/mipsn32/setjmp.S [92]
src/setjmp/powerpc/longjmp.S [93]
src/setjmp/powerpc/setjmp.S [94]
src/setjmp/sh/longjmp.S [95]
src/setjmp/sh/setjmp.S [96]
src/signal/block.c [97]
src/signal/sigfillset.c [98]
src/stat/chmod.c [99]
src/stat/fchmod.c [100]
src/stat/fstat.c [101]
src/stat/lstat.c [102]
src/stat/mkdir.c [103]
src/stat/mknod.c [104]
src/stat/stat.c [105]
src/stat/statvfs.c [106]
src/stat/utimensat.c [107]
src/stdio/__stdio_seek.c [108]
src/stdio/remove.c [109]
src/stdio/rename.c [110]
src/stdio/tempnam.c [111]
src/stdio/tmpfile.c [112]
src/stdio/tmpnam.c [113]
src/stdio/vfwscanf.c [114]
src/string/arm/memcpy.c [115]
src/string/arm/memcpy_le.S [116]
src/string/memcpy.c [117]
src/string/strsignal.c [118]
src/thread/__set_thread_area.c [119]
src/thread/arm/__set_thread_area.c [120]
src/thread/pthread_cancel.c [121]
src/thread/sh/__set_thread_area.c [122]
src/thread/sh/__unmapself.c [123]
src/time/clock_gettime.c [124]
src/unistd/access.c [125]
src/unistd/chown.c [126]
src/unistd/dup2.c [127]
src/unistd/dup3.c [128]
src/unistd/fchown.c [129]
src/unistd/lchown.c [130]
src/unistd/link.c [131]
src/unistd/lseek.c [132]
src/unistd/pause.c [133]
src/unistd/pipe.c [134]
src/unistd/readlink.c [135]
src/unistd/rmdir.c [136]
src/unistd/symlink.c [137]
src/unistd/unlink.c [138]
tools/mkalltypes.sed [139]

Now, what remains is to summarize (40-4)+139 footnotes.

  1. ifdef FTP_NAMES, not referenced anywhere.
  2. Endianness of DNS PDU header structure.
  3. String tables guarded by defines that are not used anywhere (AUTH_NAMES, etc.).
  4. NDEBUG.
  5. Complex definitions and operations fixups for clang/gcc/C++.
  6. Definitions of functions like be16toh, headers included conditionally; lulz: include/endian.h:#define __PDP_ENDIAN 3412. I had NFI.
  7. Largefile: versions of fs-related syscalls with explicitly 64-bit arguments are redirected to the default ones (IIRC this is done for architectures where defaults are 64-bit already); S_IRUSR* flags protected from double definition (the double definition is in stat.h).
  8. Nothing interesting, C/POSIX feature flags definitions.
  9. C type definitions.
  10. C type limits.
  11. Dynamic linker structure definitions: field sizes differ depending on pointer size, etc.
  12. Endianness fixups for flags.
  13. Endianness fixups for structures.
  14. Ditto.
  15. Ditto.
  16. Mips fixup using previous definition (see Part 1), unused fixup POLLMSG.
  17. Define for changing resolver config path.
  18. For mips: two fields in siginfo_t structure are swapped.
  19. C-specific define.
  20. wchar_t signedness detection, uintptr_t-related macro definitions dependent on pointer size.
  21. Largefile-related redefinitions.
  22. Structure epoll_data is packed on x86_64.
  23. Define for changing default path to tape device.
  24. Pointer size related changes in structure elf_prpsinfo field size.
  25. Fixups for mips, and for largefile support.
  26. SOCK_STREAM, SOCK_DGRAM redefinitions on mipses, setsockopt redefs for mipses.
  27. S_IRUSR* flags protected from double definition, largefile fixup for types/functions.
  28. Endianness related changes in structures, largefile rewiring.
  29. Nothing platform-specific, conditionally defines some synonym constants.
  30. Conditional definition of SYSLOG_NAMES structure.
  31. Wchar_t type and function definitions, platform-specific: test for wchar_t signedness.
  32. Nommu stuff.
  33. Nommu stuff, mips-specific dynamic linker function definitions, mmap2 over mmap difference (sixth argument of mmap2 is in pages, not in bytes), thread-local-storage setup is also platform specific.
  34. 'if 0' define, disables some dead code.
  35. 'if 1' define, rolls-in (opposite of unroll) a loop.
  36. Endianness related define.
  37. TLS, mmap2, pointer size.
  38. poll vs. ppoll system call usage (aarch64 and or1k).
  39. powerpc posix_fadvise takes 6 arguments, not 5 like the rest of the arches.
  40. Architecture-dependent floating-point unit stuff. Same for the rest of the files in src/fenv/.
  41. Definitions of atomic operations in terms of 'load linked'-'store conditional' primitives for platforms that have these primitives and lack direct atomic operations.
  42. Pointer size, fdpic aka nommu stuff, some mips GOT relocation stuff (not that we care about dynamic linking on mips) and debug symbol tables -- also for mips.
  43. Provides page size definition for platforms that have tweakable page size (arm, mips, powerpc).
  44. Some architectures (powerpc, x32) keep the stack check guard canary value in a different place in the thread control block.
  45. Important header:
      - mips uses a value for RLIM_INFINITY different from other architectures.
      - SuperH-specific pread fixup for 64-bit arguments (LL_O/LL_E macros are necessary to 'support' 64-bit arguments on architectures with 32-bit registers).
      - or1k uses a different value for the mmap2 unit in the offset field.
      - For x32 ABI, cast arguments to long long.
      - microblaze and or1k with clang, and 32-bit powerpc, invoke system calls not through C inline functions with inline assembly inside, but through a function written in assembly. My understanding is that all system call arguments are provided in this case.
      - socketcall: some architectures (i386, m68k, s390x) don't have dedicated syscall numbers for each network-related syscall; instead they use a single socketcall syscall, i.e. instead of syscall(sys_socket, a, b, c) they use syscall(sys_socketcall, sys_socket, a,b,c).
      - 32-bit arm, i386, m68k, microblaze, and superh have their UID-related system calls rewired to 32-bit versions (on these arches, uids are 16-bit in default versions): i.e., setuid(int16_t) gets rewired to setuid32(int32_t).
      - Usual 32-bit to 64-bit system call rewiring.
      - Some architectures (aarch64, or1k) have no open system call, and have to use openat instead.
  46. On some arches (aarch64, arm, i386, mips, mips64, mipsn32, x86_64), the kernel injects a vDSO -- a dynamic library -- into the address space of each application. The reasoning is that some system calls can be serviced without entering the kernel, and for these cases the dynamic library contains the syscall implementation (in practice, these syscalls are clock_gettime/gettimeofday). Also, on i386, the vDSO contains code for the preferred way to enter the kernel. Anyway, man vdso tells it all.
  47. On arm, m68k, microblaze, and sh (but only little-endian, where both can apply), the kernel structure must contain the mode field shifted left by 16 bits. Also, i386, m68k, microblaze, mips (but not mips64, mipsn32), powerpc, powerpc64, s390x, sh don't have the SYS_msgctl system call, and have to use the SYS_ipc syscall, similar to socketcall.
  48. Ditto for sys_ipc.
  49. Ditto for sys_ipc.
  50. Ditto.
  51. Ditto for sys_ipc and 16-bit-shifted mode on little-endian architectures.
  52. Ditto for sys_ipc.
  53. Ditto for sys_ipc.
  54. Ditto for sys_ipc.
  55. i386, m68k, mips, powerpc, powerpc64, s390x, sh don't have the shmat syscall; it goes through SYS_ipc.
  56. Ditto for shifted mode, ditto for SYS_ipc.
  57. Ditto for SYS_ipc.
  58. Ditto for SYS_ipc.
  59. Different arm versions use different instructions to get the thread-local-storage address for the provided descriptor.
  60. Only Intel architectures have this system call.
  61. cachectl is available only on mipses, cacheflush -- on arm, m68k, mips, mips64, mipsn32, sh.
  62. Fallback to older versions of epoll_* system call if possible. Not possible on aarch64 and or1k.
  63. Fallback to older versions of eventfd system call if possible. Not possible on aarch64 and or1k.
  64. Fallback to older versions of inotify_init system call if possible. Not possible on aarch64 and or1k.
  65. Syscall only available on i386, microblaze, mips, powerpc, powerpc64, x32, x86_64.
  66. Ditto.
  67. Has a define that allows disabling this system call, but it is enabled everywhere.
  68. SPARC-specific define. Musl does not support sparc.
  69. Fallback to older version of signalfd system call if possible. Not possible on aarch64 and or1k.
  70. arm, powerpc, powerpc64 use sync_file_range2 syscall, rest use sync_file_range.
  71. #if 0, #if 1 to disable chunks of code.
  72. Use mmap2 where possible/necessary (arm, i386, m68k, microblaze, mips, or1k, powerpc, sh).
  73. #if 0 define.
  74. Padding zeroed for arches where LONG_MAX > INT_MAX.
  75. Ditto.
  76. Bridging over structure differences between Linux and Posix specification: on arches where LONG_MAX > INT_MAX, use sendmsg instead of sendmmsg.
  77. Padding and consistency checking for arches where LONG_MAX > INT_MAX.
  78. aarch64 and or1k don't have fork syscall, have to use clone syscall instead.
  79. aarch64 and or1k don't have dup2 syscall, and use dup3 instead.
  80. Can not be done from C, so C code falls back to fork. Actual implementations available for arm, i386, s390x, sh, x32, x86_64.
  81. Nothing kernel-related. #if 0 and #if debug conditional compilation.
  82. Nothing kernel-related. #if 0, and conditional compilation of alloca-based allocator.
  83. Nothing kernel-related, alloca-based allocator definition.
  84. Uses vdso where supported.
  85. aarch64 and or1k don't have poll syscall, and use ppoll instead.
  86. i386, m68k, microblaze, powerpc, powerpc64, s390x, x32, x86_64 use select syscall; aarch64, arm, mips, mips64, mipsn32, or1k, sh don't have that and use pselect6.
  87. Should have ignored these. Softfloat-specific code.
  88. Ditto.
  89. Ditto.
  90. Ditto.
  91. Ditto.
  92. Ditto.
  93. Ditto.
  94. Ditto.
  95. Ditto.
  96. Ditto.
  97. Internal function to mask all signals. Selects right mask based on the number of signals and unsigned long type.
  98. Ditto.
  99. aarch64 and or1k don't have chmod, use fchmodat.
  100. Ditto.
  101. Ditto.
  102. Ditto.
  103. Ditto.
  104. Ditto.
  105. Ditto.
  106. 64-bit vs. 32-bit system call versions, fixup code is present to make up for Linux-POSIX structure differences.
  107. Fallback to futimesat for aarch64 and or1k.
  108. arm, i386, m68k, microblaze, mips, or1k, powerpc, powerpc64, sh use llseek instead of lseek -- allows using 64-bit offsets on platforms with 32-bit registers.
  109. The unlink syscall that remove() relies on is not present on aarch64 and or1k; those use unlinkat(2) instead.
  110. Ditto for *at version.
  111. Ditto: tempnam internally uses lstat on all arches but aarch64 and or1k.
  112. Ditto: unlink vs unlinkat.
  113. Ditto: tempnam internally uses lstat on all arches but aarch64 and or1k.
  114. Messes with internals of stdio implementation in platform-independent way.
  115. Imports platform-independent version.
  116. Conditional compilation for little endian arm.
  117. Endianness workarounds.
  118. Ifdefs are used here to map mips signal codes to the right indices in string table.
  119. A syscall for this functionality is present only on i386, m68k, microblaze, mips, mips64, mipsn32, x86_64; arm and sh have their own implementations.
  120. arm invokes this system call via a magical constant.
  121. Thread killing code on sh modifies the GOT (dynamic linking).
  122. On sh, this functionality is available via inline assembly.
  123. ifdef is used to check if MMU is present on the platform (TBH I haven't heard of SuperH with MMU).
  124. Implementation uses vDSO where present.
  125. Same or1k/aarch64 stuff.
  126. Ditto.
  127. dup2 unavailable on aarch64/or1k.
  128. Tries to use dup2 by default, unless it is not present.
  129. or1k/aarch64 lack chown syscall.
  130. Ditto.
  131. Ditto.
  132. Same as stdio_seek.
  133. aarch64/or1k lack pause system call, use ppoll instead.
  134. Ditto (pipe/pipe2).
  135. Ditto.
  136. Ditto.
  137. Ditto.
  138. Ditto.
  139. This script expands some macros that can't be implemented in C preprocessor.
