Lisp, documenting my encounter with, step #1

July 27th, 2019

Dear blog, yesterday I started to learn the LISP language. I do not remember how many times I have tried this before (probably about 5 times), but this time will be different. For one, I'm going to document the process. For two, I'm going to use the language. I tried some books before; this time I'll just use the Hyperspec1.

Let's just dive right in at DESTRUCTURING-BIND2. In the description we find 2 lines; the links in the first line refer to the page we are on3. Plus, the language is weird: it reads as if dead things (or ideal objects) somehow possess agency. The real information is probably in the link to the description of the lambda form, but that part I'll do later in another post. On this page we have an example to parse.

 (defun iota (n) (loop for i from 1 to n collect i))       ;helper
 (destructuring-bind ( (a &optional (b 'bee)) one two three)
     `((alpha) ,@(iota 3))
   (list a b three two one)) =>  (ALPHA BEE 3 2 1)

Let's explain something with the most convoluted example we can think of. First the helper function: I could not imagine this would be lisp, it reads like python with its "loop", "for", "from", "to" and "collect" syntax. Next, from the existence of this helper function in the documentation, can I conclude that no standard function has been specified to generate a sequence of consecutive integers? Finally, this function was not needed for the example, as in the next line it is only used to create a list of 3 items! Next, the actual use of destructuring-bind, with an attempt to put most of the special syntax into the call; all of this could be put in the documentation of the syntax itself. The whole line does provide a good exercise to learn and figure out a whole host of special common lisp syntax tricks (look at that, we have ', ` and ,@). In the next line the bound variables are used, this time not using special syntax but a call to the list function4. Finally the result of the last form is printed5.
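For comparison, roughly the same destructuring can be sketched in Python, the language the loop syntax reminded me of. This is only an analogy under assumptions: Python has no &optional, so the default 'bee is emulated with a starred target, and the name destructure is made up for this sketch.

```python
def destructure(data):
    """Rough Python analogue of the HyperSpec example (hypothetical name)."""
    # Nested unpacking plays the role of the lambda list
    # ((a &optional (b 'bee)) one two three).
    (a, *optional), one, two, three = data
    # Emulate &optional: fall back to the default when nothing was supplied.
    b = optional[0] if optional else "BEE"
    return [a, b, three, two, one]


# `((alpha) ,@(iota 3)) becomes a list literal with a splat.
print(destructure([["ALPHA"], *range(1, 4)]))
```

Like the Lisp version, the unpacking fails loudly when the shape of the data does not match the pattern.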

After I gave up on using the destructuring-bind function, I came to 'if' and so into progn. Both, to be done next.

  1. Let's read that first page

    The ANSI Common Lisp standard contains nearly 1100 pages describing nearly a thousand functions and variables in sufficient detail to accommodate hosting of the language on a wide variety of hardware and operating system platforms

    First, the nearly: these numbers should be easy enough to determine exactly, especially the number of functions and variables (one could even write a LISP program to do so). Next, somehow by describing functions it is possible to accommodate hosting? Or is it the containing of the pages that does this?

    Another page with this information can be found here, clearly an earlier version of the page. With dead links and reference to a Harlequin company, which seems to have disappeared into thin air.


  2. How I got there: I wanted to write a function to concatenate strings. A function exists to concatenate the contents of all kinds of sequences, but it's too generic for my taste, so let's write a wrapper function. Then I had a list in my caller where a &rest parameter is expected, and so I found DESTRUCTURING-BIND.
  3. How does that help? It seems that the links have been auto-generated.
  4. Could this also be done with special syntax? Note to self: this list function is a good candidate for a next post.
  5. No, '=>' is not special syntax.

GNAT SJLJ Build project

July 7th, 2019

Getting a SJLJ build working is not going well. Although success seemed just around the corner 2 weeks ago, the whole project is bogged down and seemingly going nowhere. To help myself, I'm making a list of the foreseeable steps to take next and an estimation on when I hope to have a build done.

First, the current status. The builds of the cross compiler for x86-64 and aarch64 seem to work. But this only seems so, as the test I am using to see if aborting a thread works does not work. Also, the next step, building the native compilers, crashes. The crash has to do with a jump instruction somewhere in a file called "haifa_sched.c". A first inspection of the "haifa_sched.c" file was depressing; it looks like spaghetti.

The plan (left is failure, right is success);

[figure: plandot]

Currently I expect each task to take about a day, with the hope this will improve.

Estimate to have this project done: 15-September-2019

Update (20-July-2019):

And so it goes, time flies while you are looking for ways of not working on a task you do not want to do yet.

[figure: plandot1]

I've updated the plan graph and added some more items, now I need to find a way to make clear that;

  1. I'm first going to re-try to get a working test-case with the AdaCore binaries. All the ingredients are already available and have been reported in the forum to work, so this should not be hard. The difference with the previous attempt is that I'll document it.
  2. Next, given the working test-case, I'm going to work on getting a working ave1 cross compiler for intel.
  3. Next, given the working test-case + cross-compiler, I'm going to work on getting a working native compiler for intel.
  4. Next (not in the graph yet), given the working test-case + cross-compiler + native compiler for intel, I'm going to do step 2 and 3 for ARM.

Of course, my previous attempt was to go for the last point in step 4. If that had worked then it would have provided me with the shortest path to the finish. But once on the path it is very hard to simply "core dump"1 the whole thing and start from scratch.

  1. Yes I know the term, what it means and that this is the wrong usage. I call this humor, my only excuse being that I'm boring.

ваше слово, товарищ маузер!

November 30th, 2018

Every now and then some russian is posted in the log. As I'm trying to learn russian, I wanted
to translate all those russian words and sentences and make a page out of it. This morning the last
line of the log read

"ваше слово, товарищ маузер!".

What could this possibly mean? I knew the first three words but had to look up the last.
This last word turned out to be german written in cyrillic: "Mauser". And
so we get

Your word, comrade mauser!

So there you have it! A fresh translation, but I have no idea of the meaning. Let's try to find
some meaning and search for the sentence. It turns out to be part of a poem (стихи), "Левый марш".
And so, "Левый" meaning "left wing", I assume the poem could be called in english:
"The March of the Socialist". From the name and the number of links you get when searching
for it, the poem was probably taught to all USSR children.1

The full poem:

Разворачивайтесь в марше!
Словесной не место кляузе.
Тише, ораторы!
Ваше
слово,
товарищ маузер.
Довольно жить законом,
данным Адамом и Евой.
Клячу историю загоним.
Левой!
Левой!
Левой!

Эй, синеблузые!
Рейте!
За океаны!
Или
у броненосцев на рейде
ступлены острые кили?!
Пусть,
оскалясь короной,
вздымает британский лев вой.
Коммуне не быть покорённой.
Левой!
Левой!
Левой!

Там
за горами го́ря
солнечный край непочатый.
За голод,
за мора море
шаг миллионный печатай!
Пусть бандой окружат на́нятой,
стальной изливаются ле́евой, —
России не быть под Антантой.
Левой!
Левой!
Левой!

Глаз ли померкнет орлий?
В старое ль станем пялиться?
Крепи
у мира на горле
пролетариата пальцы!
Грудью вперёд бравой!
Флагами небо оклеивай!
Кто там шагает правой?
Левой!
Левой!
Левой

Let's try to translate the first verse, "Разворачивайтесь в марше". The first
word is "Turn around", so probably "Turn around in the march". The image could be that I'm
marching towards an unknown goal and now have to decide to turn around on the basis of a point
that will be made in this poem. It could also be that it should be in english "Turn around and march",
stop whatever you are doing and go and march.

Next, "Словесной не место кляузе.", the first one means "verbal", the second and third can have multiple meanings
but probably stand for "no place", the last one is again some german "Klaus". This may mean "No more place for Klaus to speak"

Then, "Тише, ораторы!" which seems to be not so hard: "Quiet, orator!"

Then the sentence from the log "Your word, comrade mauser!"

"Довольно жить законом,". "Enough live of the statutes,". This makes no sense yet, so let's do the next sentence.

"данным Адамом и Евой". "Information/data of adam and eve". Biblical references; searching for the two first words in the previous sentence always results in the poem, so it's probably an uncommon construct. The previous sentence might be "The statutes/scriptures have lived long enough" or "Enough with a life of the scriptures". Next could come: "as has the knowledge of Adam and Eve".

"Клячу историю загоним.". Again that german "Klaus", then "history" , then a verb that means "drive/hammer/herd into" but also "to exhaust". So, this could be "Klaus history has been exhausted".

The verse ends with: "Left!", 3 times2

And we get this reconstruction in English

Turn around in the march
No more place for Klaus to speak
Quiet, orator!
Your word, comrade mauser!
The statutes/scriptures have lived long enough,
as has the knowledge of Adam and Eve
Klaus history has been exhausted
Left!
Left!
Left

Still, no obvious meaning, but we can take a guess. This word is the word of a German, but that same German is then compared to a mauser. You speak sweet words but will shoot in the back. How far off am I?

  1. I used to have this image that russians were
    barbarian people without any culture. As it turns out, it was the other way around, who could have known? Any random USSR
    trained man knows more poems and has deeper cultural references than me. Well, at least I read a lot of SF!
  2. This is how you march in russia, you keep shouting "Левой" and use it to pace your step.

UDP - No C

November 13th, 2018

Some time ago, I promised to replace the C code in the udp library of Stanislav (on loper-os.org) with assembly code.
I've finally done so for 64bit intel linux.

This release is divided in two parts, (a) to replace the string to ip address functions with pure ada functions and (b) to replace the C calls with assembly equivalents. For other platforms (b) will need to be changed and the tree forked at that point.

The replacement of the string to ip address functions only works for the most common way to write ip-addresses12.

Next, the actual replacement of the C functions. This adds an extra module to provide Ada versions of some of the linux syscalls. Note that some of the code is a bit non-Ada, I wanted to keep the interfaces to the original C functions intact.

Finally, my signatures for the earlier patches.

  1. As 4 decimal numbers, each separated by a dot: "127.0.0.1". Each decimal number may range from 0 to 255.
  2. The C-library versions of these functions also accept hexadecimal and octal numbers and can handle space characters at the start of each number.
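The restricted format described in these notes can be sketched in a few lines of Python. This is not the Ada code from the patch, just an illustration of the same rules (4 decimal numbers, each 0 to 255, no hex, octal or spaces); the name parse_ipv4 is my own.

```python
def parse_ipv4(text: str) -> int:
    """Parse a strict dotted-quad string into a 32-bit integer."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected 4 numbers separated by dots")
    value = 0
    for part in parts:
        # Only plain ASCII decimal digits: no hex, no sign, no spaces.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a decimal number: {part!r}")
        octet = int(part)
        if octet > 255:
            raise ValueError(f"out of range: {part!r}")
        # Shift the earlier octets up and append the new one.
        value = (value << 8) | octet
    return value


print(hex(parse_ipv4("127.0.0.1")))
```

Anything the C library would silently accept beyond this, such as "0x7f.0.0.1" or a leading space, is rejected with an error here.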

Building GNAT on MUSL, no more /usr/include/x86_64-linux-gnu

September 24th, 2018

An update on the previous version.

The produced gcc compiler builds static executables and no dynamically linked executables.

The compiler produced with the previous releases worked with several distributions but mysteriously failed for some. It seemed that the directory /usr/include/x86_64-linux-gnu was added by some developer to the include path on systems that support this directory. The files under that directory are specific for the GNU C Library and fail when included in MUSL C based builds. Of course, you wonder why this directory is always included and it turns out this is part of the default specfile for gcc1.

Before removing the line from the configuration, I wanted to know the history and possible usefulness of this item. The line can be found in the gcc/config/i386/gnu-user64.h file. My first step was the gcc git repository: this configuration item was not in the current source or in any previous version of gcc. Next up was the AdaCore release, and it did include the item. Could it be that this was copied from a distribution? None of debian, gentoo and redhat include a patch for gcc with this item. In short, it's a specific line added by a developer at or for AdaCore. If I were to take a guess at the usefulness of this item, I would propose this scheme: the AdaCore compilers can live in any directory and may be used to build code that includes system specific files. The compiler has tricks to find these files, but always in relation to the path of the compiler. The binary compiler package does not include those files (and as these have a very close relation to the version of the GNU C library on the system, it cannot contain those files and always work). To make gcc find the right path by default, this solution was found and implemented. This guess, plus the problem that a MUSL C based gcc compiler cannot use the files in /usr/include/x86_64-linux-gnu, plus the observation that the default gcc source code does not include the line, warrants removal of the line2.

Still undetermined, why does compilation sometimes work with the previous version of the code? Some systems do not include a /usr/include/x86_64-linux-gnu directory, but others do and still the compilation does not fail. I'll have to install more distributions to figure this one out, or if you have such a system, could you compile something with: gcc -v -Wmissing-include-dirs and report on the output?

For detailed instructions on how to run the script see the readme-2018-09-24.txt.

  1. In the past year, I've bumped against this specific configuration item before and I even changed the path for an AdaCore gcc installation. I was lazy and stupid and did not look into it any further.
  2. To see the result of the algorithm that gcc uses for the compile path do: gcc -v -Wmissing-include-dirs

GNAT Zero Foot Print - Take 5 - Assert and Aggregates

September 17th, 2018

Unfortunately, I've added more files to the ZFP runtime. These files are all needed to support the full Ada syntax;

Assert
The mechanism behind the Assert pragma depends on the Ada.Assertions module (implemented in the files adainclude/a-assert.ads and adainclude/a-assert.adb), see also the LRM. This module was added but no visible effect was found when compiling an Ada module with an assert pragma. The GNAT compiler instead uses the Raise_Assert_Failure procedure (in file adainclude/s-assert.adb).
Aggregates
Some operations on arrays will apply to every element of an array, for example clearing an array with something like A := (others => 0);. These operations are called aggregates and depend on memcpy and memcmp. These functions are not available when compiling without any C library. The GNAT source contains Ada implementations for both and I've included the memcmp function (in file adainclude/s-memcom.adb). Such a version is portable but not really optimal, so for memcpy I've included the code from the MUSL C library. So far I've only used memcmp, so more testing is needed.

To use this code, download the patch and signature;

And then press and play.

Annotated Assembly Code for a Boot Record

August 16th, 2018

Below, my notes to help me understand the boot code published here: http://btcbase.org/log/2018-07-06#1832315. The boot loader is the first code run after the BIOS (512 bytes long, and loaded by the BIOS); it in turn will load the rest of the OS / application, switch to 64bit mode and start to execute that code.

1
2
3
4
5
6
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Boot Loader - QEMU Variant
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
        payload_blocks   equ 14        ;; N * 512b blocks to load
        stack_top        equ 0x90000   ;; top of stack
	kernel_offset    equ 0x1000    ;; bottom of kernel

A number of constants is defined, the assembler will replace all occurrences of these names with the values after equ.

9
    	[BITS 16]

All lines after the '[BITS 16]' statement will be assembled for 16-bit intel. The boot process always starts with the processor in "real mode"; in this mode all code is supposed to follow the 8086, 16 bit instruction set.

10
        section .text

Code and data can be compiled into sections, the boot program will be contained in a single section which is labelled with ".text".

11
	jmp     init

First line of actual code, a jump instruction to the body of the code. Between the jmp and the body, some data and utility functions can be defined.

12
13
14
15
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
gdtr:
    	dw	gdt_end - gdt - 1 ; GDT limit
	dw	gdt		  ; GDT base

A definition for a Global Descriptor Table. This particular definition is for an empty table with just one entry. This GDT will not be used and can be removed from the file. A GDT is a simple vector of 64-bit (8 byte) elements. A register will contain the length of the table and a pointer to the table, first 2 bytes encode the length (in bytes, not in elements), second 2 bytes the position. The length in bytes must be decreased by 1.

17
18
19
20
21
22
23
gdt:	times 8 db 0		; null descriptor
gdt_end:
	gdt64		dq 0x0000000000000000
	.code 		equ $ - gdt64
	dq 		0x0020980000000000
	.data 		equ $ - gdt64
	dq 		0x0000900000000000

A definition for a GDT that will be used. The first element is zero (apparently bios programs may expect this), the second is for the code section. The statement on line 18 defines a constant (and is not the same as a .code section in assembly or object files); the constant will have a value of 8. The code segment element defines the offset in memory where that segment starts, its size and some flags. To decode the GDT, label the bytes from right to left starting at 0 and ending at 7. The base (start address position, in bytes or pages) is constructed from bytes 7, 4, 3, 2 and is a 32 bit value. The size (number of bytes or pages) is constructed from bytes 0 and 1 and half of byte 6. The other half of byte 6 defines the size flags. Byte 5 is used for flags. In the number 0x0020980000000000, base and limit are both zero. The size field is 0x2 or 0b0010, which means this is a 64bit descriptor. The flags field is 0x98, or 0b10011000; from high to low: high bit set == valid entry, 00 == privilege, ring 0, 1 == always set, 1 == executable, 0 == code can be run only in ring level 0, 0 == code segment cannot be read (can never be written to by definition), 0 == accessed bit, will be set by processor. Line 22 + 23 is for the description of a data block with the same base as the code block; this is not a 64bit segment. The flags are 0x90, or 0b10010000, which means a valid data entry, with ring 0 privilege, that grows up and is not writable. The last entry is not an entry in the table but the contents for the GDT register: first a 16bit length in bytes (minus the 1), next the 16bit position of the start of the table. How these flags, bases and lengths work out will hopefully become clear in the memory handling code.
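To check the byte juggling above, here is a small Python sketch that takes a descriptor apart the same way: bytes labelled 0 to 7 from right to left, base from bytes 7, 4, 3, 2, size from bytes 0, 1 and half of 6. The function and key names are my own invention and only the fields discussed above are decoded.

```python
def decode_gdt_entry(entry: int) -> dict:
    """Split a 64-bit GDT descriptor into the fields described in the notes."""
    raw = entry.to_bytes(8, "little")  # raw[0] is byte 0, raw[7] is byte 7
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16) | (raw[7] << 24)
    limit = raw[0] | (raw[1] << 8) | ((raw[6] & 0x0F) << 16)
    access = raw[5]             # the flags byte (0x98 / 0x90)
    size_flags = raw[6] >> 4    # the other half of byte 6
    return {
        "base": base,
        "limit": limit,
        "size_flags": size_flags,
        "long_mode": bool(size_flags & 0x2),  # 64bit descriptor
        "present": bool(access & 0x80),       # valid entry
        "ring": (access >> 5) & 0x3,          # privilege level
        "executable": bool(access & 0x08),    # code (True) or data (False)
    }


for descriptor in (0x0020980000000000, 0x0000900000000000):
    print(hex(descriptor), decode_gdt_entry(descriptor))
```

For the two descriptors in the boot code this confirms a base and limit of zero, with only the code segment marked as executable and 64bit.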

29
30
31
32
33
34
35
36
37
38
DiskPacket:
	db	0x10
	db	0
d_blk:	dw	payload_blocks	; int 13 resets this to # of blocks actually read/written

db_off:	dw	after_me	; memory buffer destination offset
db_seg:	dw	0	        ; memory buffer destination segment

d_lba:	dd	1		; put the lba to read in this spot
	dd	0		; more storage bytes only for big lba's ( > 4 bytes )

The BIOS provides services to the boot program; one of these services is reading sectors from the disk. The service needs a structure filled with the number of sectors to read from the disk (14 in this code), where to put the read data (just after the code that was loaded from the same disk and is now running) and the LBA address (1 is the block just after the boot block).
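The layout of this structure can be made visible by packing the same fields in Python. A sketch under assumptions: the offset 0x7E00 stands in for the after_me label (the first byte after a boot sector loaded at 0x7C00), and build_disk_packet is a name made up for this illustration.

```python
import struct


def build_disk_packet(blocks: int, offset: int, segment: int, lba: int) -> bytes:
    """Pack the fields of DiskPacket in the order they appear in the listing."""
    # "<" means little-endian without padding:
    # db 0x10, db 0, dw blocks, dw offset, dw segment, then the two dd's
    # of the LBA taken together as one 64-bit number.
    return struct.pack("<BBHHHQ", 0x10, 0, blocks, offset, segment, lba)


# Assumed values: 14 blocks, buffer at 0x7E00 (after_me), segment 0, LBA 1.
packet = build_disk_packet(blocks=14, offset=0x7E00, segment=0, lba=1)
print(len(packet), packet.hex())
```

The first byte, 0x10, is the size of the packet itself: 16 bytes, which matches the length of the packed result.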

40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
read_sector:
 	mov 	si, DiskPacket		; address of "disk address packet"
	mov 	ah, 0x42		; AL is unused
	mov	dl, [BootDrv]
	or 	dl, 0x80		; drive number 0 (OR the drive # with 0x80)
	int 	0x13
	jc 	bad_disk
	inc	dword [d_lba]
	ret
bad_disk:
        mov     si, disk_sad_msg
        call    print
halt:
        hlt
        jmp halt

The call to read blocks from the hard disk. The bios will load the first block and put this block at 0x7c00. The other blocks will need to be loaded by the boot code (and will be placed at 0x7e00). This is a standard implementation of how to call the bios and load the blocks. This service is activated by the 0x13 interrupt with the AH register set to 0x42 and the DL register set to the boot drive. The service will set the carry flag on any error, and the boot code will then print a message and halt the machine. As for line 47, I have no idea why the word at the d_lba address needs to be increased.

57
58
59
60
61
62
63
64
65
66
67
68
69
70
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Print string at si using bios console
print:
        mov    al, [si]
        inc    si
        or     al, al
        jz     end_print    ; end at NUL
        mov    ah, 0x0e     ; op 0x0e
        mov    bh, 0x00     ; page number
        mov    bl, 0x07     ; color
        int    0x10         ; INT 10 - BIOS print char
        jmp    print
end_print:
        ret

Print the characters in a zero delimited buffer one at a time using a bios service.

75
76
77
78
79
data:
        start_msg      db 13, 10, "Loading payload from disk...", 13, 10, 0
	end_msg        db "Running Payload...", 13, 10, 0
        disk_sad_msg   db "Disk Error!", 13, 10, 0
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Text strings to print, 13 == CR, 10 == LF, 0 is end of string byte

81
	BootDrv        db 0 ; drive that we booted from

Byte to store the number of the boot drive

85
86
87
88
89
init:
        xor     ax, ax
        mov     ds, ax
        mov     es, ax
        mov     ss, ax

Set ax to zero and copy this value into ds (data segment), es (extra segment), ss (stack segment).

90
91
       	mov	bp, 0x9c00  ; init realmode stack
	mov     sp, bp

Set up a stack location; note that this is 8k bytes removed from the start of the boot code. The current minimal OS code is 3.3k, so the stack is far enough away.

The stack is only used for a couple of calls in this boot rom and will not grow down by more than 1 word (the IP pointer will be pushed on the stack).

92
        mov	[BootDrv], dl  ; where we booted from

The bios will fill the lower part of the dx register with the index of the boot drive, store this index in memory

96
97
	mov     si, start_msg
        call    print

Print a start message to the boot screen

98
	call    read_sector

Read the rest of the rom

 99
100
	mov     si, end_msg
        call    print

Print an end message, rom has been read

101
        cli

Clear the interrupt flag, which disables maskable interrupts

102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
	;; enable a20
	call	a20_loop
	jnz	a20_done
	mov	al, 0xD1
	out	0x64, al
	call	a20_loop
	jnz	a20_done
	mov	al, 0xDF
	out	0x60, al
a20_loop:
	mov	ecx, 0x20000
a20_loop_2:
	jmp 	short a20_c
a20_c:
	in	al, 0x64
	test	al, 0x2
	loopne	a20_loop_2
a20_done:

An internet search for the A20 line in intel processors will inform you on some interesting properties of the intel processors. In short, address line 20 (A20) is disabled at boot and no memory above 1mb can be accessed; to get to 64bit mode the address line has to be enabled. The most standard method to enable the line is to send a message to the keyboard controller, and this is done in this code. Strangely, the a20_loop code is missing a 'ret' statement after line 118, and even if ret is added, the statements at 104 and 108 will normally do nothing, as loopne only leaves the loop when the Z-flag is set or when the ECX counter has run out. The jump at line 114 is there to get a small delay. At line 110 an extra call to the loop and an unconditional jump to a20_done should be added. The boot rom works, but only because the qemu bios already enables the a20 line.

123
124
	xor	bx, bx
	mov	es, bx

Build a PML4 page table; first set up the registers. I will need to look up how these page tables work. Set the BX register to 0 and copy this value to ES. ES should still be zero from the code at line 88, but it may be that the register was changed in the bios code.

125
	cld

Clear the direction flag, so the following string operations increase di.

126
127
128
	mov	di, 0xA000
	mov	ax, 0xB00F
	stosw

Store the value 0xB00F at address 0xA000 and increase di.

129
130
131
	xor	ax, ax
	mov	cx, 0x07FF
	rep 	stosw

Store the word (2 byte) value 0 for 2047 times; together with the word stored before, this fills 4k bytes, all zero except for the first word.

131
132
133
134
135
136
	rep 	stosw
	mov	ax, 0xC00F
	stosw
	xor	ax, ax
	mov	cx, 0x07FF
	rep 	stosw

The PDP table, start with 0xC00F, repeat zeros

137
138
139
140
141
	mov	ax, 0x018F
	stosw
	xor	ax, ax
	mov	cx, 0x07FF
	rep 	stosw

The PD table, start with 0x018F, repeat zeros

This ends the set-up of the paging tables.
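As a note to self on how these entries are built up: my reading of the x86-64 paging format is that the low 12 bits of an entry are flags and the rest is the address of the next table (or of the page itself). A small Python sketch to take the three values apart; the function and flag names are my own, not from the boot code.

```python
# Bit positions of the flags that occur in the three entries.
FLAG_NAMES = {
    0: "present",
    1: "writable",
    2: "user",
    3: "write-through",
    7: "page-size",  # the entry maps a large page instead of a next table
    8: "global",
}


def decode_entry(entry: int):
    """Split a page table entry into (address, list of flag names)."""
    address = entry & ~0xFFF  # drop the 12 flag bits
    flags = [name for bit, name in FLAG_NAMES.items() if entry & (1 << bit)]
    return address, flags


for name, entry in (("PML4", 0xB00F), ("PDP", 0xC00F), ("PD", 0x018F)):
    address, flags = decode_entry(entry)
    print(name, hex(address), flags)
```

Read this way, the PML4 at 0xA000 points to the PDP table at 0xB000, which points to the PD table at 0xC000, whose single entry maps a large global page starting at address 0. Each table is 4k bytes, which is why they sit exactly 0x1000 apart.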

143
144
	mov 	eax, 10100000b		; PAE, PGE
	mov 	cr4, eax

To enable 64bit, two bits in the CR4 control register need to be set: (1) the Physical Address Extension (bit 5), which when set will enable 36 bit instead of 32 bit addresses, and (2) Page Global Enable (bit 7), which when set will enable global pages that are maintained for all tasks. The Intel documentation notes that the PG flag (in CR0) must be set first; in this code it will be set after this statement, at line 151-153. Note that even in REAL mode the 32bit registers are available.

145
146
	mov 	edx, 0x0000A000		; PML4
	mov 	cr3, edx

The address of the paging table is stored in CR3 (and 0xA000 was used in the setup for the paging tables)

147
148
149
150
	mov 	ecx, 0xC0000080		; EFER.LME
	rdmsr				; long mode!
	or 	eax, 0x00000100
	wrmsr

Change a Model Specific Register: the address of the register must be put in ECX and the value of the register will be put in EAX and EDX. In this case a bit in the MSR IA32_EFER must be set; its address is 0xC0000080. The bit will enable the IA-32e mode; when no flag is set in the Code Segment descriptor bits, the mode will be the so called "compatibility mode". The actual mode (64bit or less) will then be determined from the GDT, and in the GDT the 64bit flag was set.

151
152
153
	mov	ebx, cr0		; long mode
	or	ebx, 0x80000001		; Paging and protection
	mov	cr0, ebx		; Skip pmode

Enable paging and protected mode
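The magic numbers in lines 143 to 153 can be cross-checked against the bit positions named in the text and comments. A small Python sketch; the constant names are mine, and the bit positions are as I read them from the Intel documentation.

```python
CR4_PAE = 1 << 5        # Physical Address Extension
CR4_PGE = 1 << 7        # Page Global Enable
EFER_MSR = 0xC0000080   # address of the IA32_EFER register
EFER_LME = 1 << 8       # Long Mode Enable
CR0_PE = 1 << 0         # Protection Enable
CR0_PG = 1 << 31        # Paging

# The values the boot code writes, rebuilt from the named bits.
assert (CR4_PAE | CR4_PGE) == 0b10100000
assert EFER_LME == 0x00000100
assert (CR0_PG | CR0_PE) == 0x80000001
print("all three constants match")
```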

154
	lgdt	[gdt64.pointer]

The GDT register is loaded, and CPU will use the GDT from now on

155
 	jmp	gdt64.code:longmode     ; CS, 64b seg

A mixed size jmp, nasm implements code for this. As gdt64.code points to a quad word (8 bytes, 64 bits) the jmp is into a 64 bit segment.

156
[BITS 64]

Generate 64 bit code starting from this point

158
159
160
161
162
	;; set up new code/data/stack segments
        mov     ebp, stack_top
	mov     esp, ebp
	extern main
        jmp main

Set up the C stack and jump to main

164
	times	510-($-$$) db 0

Fill up any leftover space with zero bytes but leave out the 2 last bytes

165
166
bootsig:
	dw 0xAA55

All boot sectors end with the two bytes 0x55 and 0xAA; the word 0xAA55 is stored low byte first
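The padding and the signature together can be mimicked in Python, which also shows the byte order of the signature on disk. A sketch only: make_boot_sector is a made-up helper and the two byte payload is a placeholder (an endless jump), not the real boot code.

```python
import struct

SECTOR_SIZE = 512


def make_boot_sector(code: bytes) -> bytes:
    """Pad code with zero bytes to 510 bytes and append the boot signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("boot code does not fit in 510 bytes")
    padding = bytes(SECTOR_SIZE - 2 - len(code))  # times 510-($-$$) db 0
    signature = struct.pack("<H", 0xAA55)         # dw 0xAA55, little-endian
    return code + padding + signature


sector = make_boot_sector(b"\xeb\xfe")  # placeholder payload
print(len(sector), sector[-2:].hex())
```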

168
after_me:

Label to use for loading the data from this disk into physical memory.

GNAT Zero Foot Print - Take 4 - Introduction of the platform

August 13th, 2018

An Ada runtime library is used to provide a standard interface to different operating systems and hardware. Already two different ways of compilation, (1) based on the C library and (2) based on assembly code, are supported in the ZFP library. Both versions can be had by pressing a different node of the v-tree. Although this works, it all becomes complicated when I want to add the same file to both systems and have to maintain multiple branches. Also, I want to add a version of the library with no OS support and one with 64-bit arm support and probably MIPS and so on and so forth.

I needed to do a major overhaul of the code to support different platforms. An option was added to the gprbuild project file and with this option different source directories are selected to compile the library. All the sources have been distributed over different directories, one directory adainclude for generic (non-platform specific) code and multiple directories under the platform directory for all those files that are different per system. Now that all the source files are in different directories, the only way the runtime can be used is once it is installed1.

To use this new code, download the patch and signature;

After pressing, you'll need to do the following magic commands in the zfp directory2;

make clean MODE=x86_64-asm

make MODE=x86_64-asm

make install MODE=x86_64-asm PREFIX=prefix-asm

To check;

cd examples

make clean

make RTS=../prefix-asm

This will build the assembly based gnat library, for the C based do in the zfp directory;

make clean

make MODE=x86_64-c

make install MODE=x86_64-c PREFIX=prefix-c

Again, to check

cd examples

make clean

make RTS=../prefix-c

Once built and installed into a prefix directory, the default GNAT, the C and the asm library can all be used to build the examples. The only thing to be set is the runtime directory, with the RTS variable.

  1. At installation time the source files will nicely be put into the target adainclude directory with the gprinstall command.
  2. make is necessary, the gprbuild is fine for building Ada libraries and executables but when it comes to a simple rule to copy a file to a new name (so that gprinstall can pick it up and install that file) you can forget about it.

GNAT Zero Foot Print - Take 3 - Regrind

August 7th, 2018

No new code in this installment. Instead, a regrind of all 3 patches, after a helpful suggestion to do so by Diana Coman. With this regrind, I updated the patches to follow the current thinking in vpatch management: the whole package under a common subdirectory, addition of a manifest and all files hashed with Keccak.

You can download and press the files with

v.pl init http://ave1.org/code/zfp

but you will have to comment out the hash checking code.

GNAT Zero Foot Print - Take 2 - No C

July 6th, 2018

"Libc gotta go."

—Stanislav Datskovskiy

And it will. In an, at this moment, unknown number of steps the C library can be ripped out from the Ada Runtime library and be replaced with Ada and assembly code. In the first step, all C calls need to be replaced with Ada code and possibly some assembly to perform system calls to the Linux kernel. The second step is then to replace the C library specific start-up code with code for Ada.

I start with the previous version of a minimal ZFP library for Linux. This library uses only two calls to the C library, one to output characters and the other to exit the code. Both are replaced with a direct system call1. The second change is to include a file with startup code2. The resulting code is published in the following vpatch (with signature).

Combine this patch with those from the previous installment, press and build it. Building the code needs to be done with the Makefile3.

<<create a directory and put a .wot directory in it with at least my key>>

v.pl init http://ave1.org/code/zfp

v.pl p a zfp_2_noc.vpatch

cd a

make

cd examples

make

All system calls can be found in the adainclude/s-syscal.adb file. The Write function (used for outputting characters) is implemented as a single assembly statement syscall with the parameter list specified to fill the right processor registers. The function starts with a conversion from characters to bytes4 and ends with a check of the return values. After a completed system call the 'RAX' register will be filled with a return code. If an error occurred during execution of the system call, the register will contain the error code as a negative number (always between -1 and -4096). If the execution was successful the register will contain 0 or any other 64bit number outside of the range -1 to -4096.

function Write (fd : in Int; S : in String; E : out ErrorCode) return Int is
    type byte is mod 2**8;
    B : array (S'Range) of byte;
    R : Int := 0;
 begin
    for I in S'Range loop
       B (I) := Character'Pos (S (I));
    end loop;
    Asm
      ("syscall",
       Outputs => (Int'Asm_Output ("=a", R)),
       Inputs  =>
         (Int'Asm_Input ("a", SYSCALL_WRITE),
          Int'Asm_Input ("D", fd),
          System.Address'Asm_Input ("S", B'Address),
          Int'Asm_Input ("d", B'Length)),
       Volatile => True);
    if R < 0 and R >= -(2**12) then
       E := ErrorCode'Val (-R);
       R := -1;
    else
       E := OK;
    end if;
    return R;
 end Write;
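The check at the end of Write can be tried out separately from the system call itself. Below a Python sketch of the same rule, using the range from the Ada code (-1 down to -4096); the name decode_syscall_result is made up, and 0 stands in for the OK error code.

```python
def decode_syscall_result(rax: int):
    """Mirror the check at the end of Write: returns (result, error code)."""
    # The register holds 64 raw bits; interpret them as a signed number.
    if rax >= 1 << 63:
        rax -= 1 << 64
    if -(2**12) <= rax < 0:
        # Failure: the error code is the negated return value.
        return -1, -rax
    # Success: pass the value on, 0 stands in for the OK error code.
    return rax, 0


print(decode_syscall_result(5))           # e.g. 5 bytes written
print(decode_syscall_result(2**64 - 9))   # raw register contents for -9
```

A raw register value of 2**64 - 9 is the two's complement form of -9, so it is reported as a failure with error code 9.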

The a-textio.adb and last_chance_handler.adb files have been updated to use the system calls instead of the C library. The s-maccod.ads was added from the GNAT runtime library to support the inline assembly code. The other addition is the startup.S file. In its simplest working form it just needs to contain one definition of a global (_start), a call to a main function and a syscall to exit the code;

.global _start

_start:
  call main

  /* exit code */
  mov $60, %rax
  mov $0, %rdi
  syscall

The version in the patch also stores the argument count and a pointer to the argument array in two globals. Both globals are unused for now but will be needed for future parsing of any command line arguments.

The final noteworthy change is the inclusion of a runtime.xml file. The gprbuild command will use this file to set flags for all projects that are built with the runtime library. For reasons, this file is written as an xml file containing gprbuild project statements;

<?xml version="1.0" ?>

<gprconfig>
  <configuration>
   <config>
   package Linker is
      for Required_Switches use Linker'Required_Switches &amp;
        ("${RUNTIME_DIR(ada)}/adalib/libgnat.a") &amp;
        ("-nostdlib", "-nodefaultlibs", "-lgcc");

      for Required_Switches use Linker'Required_Switches &amp;
          ("${RUNTIME_DIR(ada)}/adalib/start.o");
   end Linker;

   package Binder is
      for Required_Switches ("Ada") use Binder'Required_Switches ("Ada") &amp;
       ("-nostdlib") ;
   end Binder;
   </config>
  </configuration>
</gprconfig>

The linker flags are set so that no standard C library or startup code is included in the resulting binary. As we are then lacking the default startup code, an extra line is added to include the start.o code with every compile.

In the end, the fun part, a working binary. The hello world example from the previous installment can be built and its size inspected. It is now at 2.6k (down from 54k) on my computer5.

In the final end, I will include another reference to AdaCore's configurable runtime documentation. The GNAT documentation has been very helpful for learning the GNAT system and developing this library.

  1. The main difficulty in doing so is to learn how the Linux system calls work and get a better understanding of the inline assembly statements. Stan's demo.asm posted in the logs proved very helpful for this process.
  2. This file is now written in assembly, although (upon reflection) it should be possible to rewrite it in Ada.
  3. I did not find a method to compile one separate file into an object file with gprbuild.
  4. Which in practice will be a copy operation.
  5. Of course, this minimal library is too minimal. In some cases (for example when a string is concatenated) the compiler will generate memcpy or memset calls. We need to provide replacement Ada functions for each. This is not difficult as the ada 2017 code contains pure Ada versions for all of these.