On FFA Chapter 2

12/02/18, modified 12/02/18

To be specific: "Finite Field Arithmetic." Chapter 2: Logical and Bitwise Operations.

The first section contains a short (re-)introduction and a list of additional files to download. With these files, the code can be updated and build again. This section is easy to follow and finish.

Next a short exposition on the early feedback on chapter 1, with the conclusion that W_Swap is now removed.

A definition of four predicate functions on Word types is given first. These are all very minimal and well designed. I could construct an alternative version of W_NZeroP, but it was so much more ugly than the published version. My stance on removing all unused functions is quickly becoming unattainable, clearly a general purpose library is being build and not a RSA specific one1. A retreat is in order, from now on everything dubious will be removed. Where dubious is an arbitrarily defined quality.

Next up, are 5 basic functions on the FZ type.

As FZ_Clear "illustrates the use of Ada's array assignment", we'll need to investigate it's consequences. We do not have to worry that this statement can be dependent on the array contents. We do have to worry about optimizations the compiler might perform (the famous memset removal in gcc). Let's try to create a small example with the FZ_Clear procedure2. First, a patch on the chapter 2 code:

The inline_always pragma has been moved from the 'fz_basics.adb' to 'fz_basics.ads' in this patch.

The somewhat interesting code is in 'clear_mod.adb'.

with Ada.Text_IO; use Ada.Text_IO;

-- From FFA:
with Words;    use Words;
with FZ_Type;  use FZ_Type;
with FZ_Arith; use FZ_Arith;

-- FFA Ch. 2
with FZ_Basic; use FZ_Basic;
with FFA_IO;   use FFA_IO;

package body Clear_Mod is

   procedure Clear_Procedure is
      Z : FZ (0 .. 128);
   begin
      FZ_Set_Head (Z, 20);
      FZ_Clear (Z);
      Put_Line ("--- cleared: ---");
      Dump (Z);
      New_Line;
      FZ_Set_Head (Z, 22);
      Z (2)   := 16#DEAD#;
      Z (3)   := 16#FAAF#;
      Z (4)   := 16#1001#;
      Z (100) := 16#0110#;
      Z (128) := 16#F1F1#;
      Dump (Z);
      New_Line;
      FZ_Clear (Z);
   end Clear_Procedure;

   procedure Print_Procedure is
      Z : FZ (0 .. 256);
      Y : FZ (0 .. 132);
   begin
      --Dump(Z);
      New_Line;
      Dump (Y);
      Put_Line ("--- <<<< ---");
      New_Line;
   end Print_Procedure;

end Clear_Mod;

In clear_procedure we change a FZ, it is cleared, it's state is dumped on the terminal, some interesting words are put in the FZ and finally the FZ is cleared again. This last action is just to be sure that no information will be left on the stack after this procedure finishes. Without inlining, this function works perfectly3. With inlining, the code is different, the compiler will determine that the final clear is useless and remove it.


c/ffa/fzclear/bin/clear:     file format elf64-x86-64

Disassembly of section .text:

0000000000401530 <clear_mod__clear_procedure>:
  401530:       55                      push   %rbp
  401531:       48 89 e5                mov    %rsp,%rbp
  401534:       53                      push   %rbx
  401535:       48 81 ec 38 14 00 00    sub    $0x1438,%rsp
  40153c:       48 83 0c 24 00          orq    $0x0,(%rsp)
  401541:       48 81 c4 20 10 00 00    add    $0x1020,%rsp
  401548:       ba 14 00 00 00          mov    $0x14,%edx
  40154d:       be 30 c8 49 00          mov    $0x49c830,%esi
  401552:       48 8d 9d e0 fb ff ff    lea    -0x420(%rbp),%rbx
  401559:       48 89 df                mov    %rbx,%rdi
  40155c:       e8 ef 00 00 00          callq  401650 <fz_basic__fz_set_head>
  401561:       31 c0                   xor    %eax,%eax
  401563:       48 89 df                mov    %rbx,%rdi
  401566:       b9 81 00 00 00          mov    $0x81,%ecx
  40156b:       f3 48 ab                rep stos %rax,%es:(%rdi)
  40156e:       be 38 c8 49 00          mov    $0x49c838,%esi
  401573:       bf 20 c8 49 00          mov    $0x49c820,%edi
  401578:       e8 f3 2e 00 00          callq  404470 <ada__text_io__put_line__2>
  40157d:       48 89 df                mov    %rbx,%rdi
  401580:       be 30 c8 49 00          mov    $0x49c830,%esi
  401585:       e8 d6 fd ff ff          callq  401360 <ffa_io__dump__2>
  40158a:       bf 01 00 00 00          mov    $0x1,%edi
  40158f:       e8 bc 2a 00 00          callq  404050 <ada__text_io__new_line__2>
  401594:       ba 16 00 00 00          mov    $0x16,%edx
  401599:       48 89 df                mov    %rbx,%rdi
  40159c:       be 30 c8 49 00          mov    $0x49c830,%esi
  4015a1:       e8 aa 00 00 00          callq  401650 <fz_basic__fz_set_head>
  4015a6:       48 89 df                mov    %rbx,%rdi
  4015a9:       be 30 c8 49 00          mov    $0x49c830,%esi
  4015ae:       48 c7 85 f0 fb ff ff    movq   $0xdead,-0x410(%rbp)
  4015b5:       ad de 00 00
  4015b9:       48 c7 85 f8 fb ff ff    movq   $0xfaaf,-0x408(%rbp)
  4015c0:       af fa 00 00
  4015c4:       48 c7 85 00 fc ff ff    movq   $0x1001,-0x400(%rbp)
  4015cb:       01 10 00 00
  4015cf:       48 c7 85 00 ff ff ff    movq   $0x110,-0x100(%rbp)
  4015d6:       10 01 00 00
  4015da:       48 c7 45 e0 f1 f1 00    movq   $0xf1f1,-0x20(%rbp)
  4015e1:       00
  4015e2:       e8 79 fd ff ff          callq  401360 <ffa_io__dump__2>
  4015e7:       bf 01 00 00 00          mov    $0x1,%edi
  4015ec:       e8 5f 2a 00 00          callq  404050 <ada__text_io__new_line__2>
  4015f1:       48 81 c4 18 04 00 00    add    $0x418,%rsp
  4015f8:       5b                      pop    %rbx
  4015f9:       5d                      pop    %rbp
  4015fa:       c3                      retq
  4015fb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

Another fortunate side-effect of reading the FFA chapters, is that I now have a good opportunity to learn x86 assembly language4. The final FZ_Clear implementation is removed from the procedure, so it follows that the values should be still on the stack and it may be possible to read these. And that is why the clear_mod.ads module also contains Print_Procedure, with some padding FZ and a FZ to capture the stack. On my local gnat installation (the gnat 2016 binary from adacore), the dump of Y shows the contents of the local variable Z from Print_Procedure. This is probably not what you would expect, the best option to me seems to make sure that FZ_Clear is never inlined. Unfortunately, I've not found a pragma Never_Inline.

With FZ_Get_Head and FZ_Set_Head we get some confusion of terminology, suddenly we are talking about a head or first or youngest word of an FZ. And it becomes clearer that the words in an FZ have an order and we are missing an exact definition of FZ outside of the code itself. Here is one attempt to do so, an FZ is a number5 between 0 and 2 to a given power. This power is determined by the number of words in a FZ multiplied by the number of bits in a word (the bitness). An FZ can be seen as the summation over the array of words that make up the FZ where each word is multiplied by 2 to the power of the bitness times the position of that word in the array. For the math to be a correct, we define the first word in the array to have position 06. As for a visual representation, when the words (and the bytes inside the words) are written from left to right, the tail or last or oldest word is written first and the head or first or youngest is written last. Although the FZ_Get_Head and FZ_Set_Head procedures triggered a nice train of thought, I would remove these.

The best is left for last in this module, FZ_Mux is indeed crucial. It's implementation in terms of W_Mux is trivial, so if you understand W_Mux all is well. I found one alternative for this function, but it is wasteful compared to this one. Image if you define a 2d array with as many rows as there are words in X and each row having two columns. Put each word in X in the first column of the corresponding row in the 2d array. Put each word in Y in the second column. Now the result can be formed by iterating over the array and accessing the element from the right column. Unfortunately, this hypothetical code based on a 2d array, accesses memory ever so slightly differently based on the Sel bit, and thus might leak information.

Next up, two unary predicate operators on the FZ type are introduced. The FP_ZeroP procedure can be used to determine if every word in the FZ has the value 0. The FP_OddP can be used to check if the last bit of the head word is set, and here FZ_Get_Head is not used in the implementation.

Three binary predicate operators are defined in fz_cmd.adb. The FZ_EqP procedure follows the pattern first seen in FP_ZeroP. The FZ_LessThanP and FZ_GreaterThanP are implemented in terms of the subtraction operator. The result is determined by the borrow. Now, you just have to determine if the order of the arguments for the FZ_Sub call is correct.

With a set of 7 bitwise operators, the introduction of the ffa library in chapter 2 is finished. In fz_bitop.ads, binary operators are defined for two FZ operands, and separately for one FZ with one Word operand. These last versions work on the head word of an FZ, and we'll just have to see if there is any use for these procedures. Another remark, although these procedures could have been implemented in terms of the FZ_Get_Head / FZ_Set_Head procedures, again they have not. A final remark on the same procedures is that these modify the first input FZ operand where the full FZ procedures fill a third operand7.

We skip the demo and finish for the day, on to the next chapter!.

  1. This disclaimer was stated in channel and on the pages, so only my stupidity can be blamed []
  2. This is where I found a problem with the inlining on my GNAT installation. After resolving this we can continue. []
  3. If you turn off the inlining, you will see that the array assignment will be implemented as a memset []
  4. The first FZ_Clear call is inlined and implemented on the rep stos line. The xor %eax, %eax sets the 32bits %eax register to 0, which by extension also set the overlapping 64bits %rax to zero. The value in the first argument to the rep stos is therefore 0. []
  5. For a good book on numbers, read Frege []
  6. This in contrast to most of the actual ada code where the array indexing starts at 1. []
  7. I do not know the use for these functions. If we see a word as a FZ type of any length with that word as the head and all other words set to zero, then the implementation of FZ_And_W and FZ_Xor_W is wrong. []

6 Responses to “On FFA Chapter 2”

  1. > Unfortunately, this code accesses memory ever so slightly differently based on the Sel bit, and thus might leak information.

    This is nonsense, there is no change in the addressing.
    From where did you get the notion?

  2. or hmm, looks like text referred to a hypothetical alternative?

    • ave1 says:

      Yes, indeed it refers to the option described before with the 2d array. (I'll make a small update to the paragraph to make the sentence clearer!)

      The version implemented in FFA works perfectly and has exact the same code path and memory access pattern regardless of Sel.

  3. I must add, most of the "useless" knobs do in fact come into use in later chapters. (It is possible that some are left without any conceivable use; these will be removed.)

    Their presence is an artifact of the ongoing rewrite (the bulk of the material in fact existed, in a rougher draft form, since September.)

    • ave1 says:

      Well I did say "I see no use", which is a statement about my limited ability to foresee this field, not about the procedures.

      The only problem with "I see no use", is that it makes it harder for me to evaluate their correctness.

      I can see these correctly do their actions on the first element of the FZ, I cannot (i.e. my limit) see why not also for tail or anywhere else.

      One of the comments I see in channel about pgp / mpi, is that is has these functions that seem dead but are used and may contain a fatal error, which can be related to the difference between "forseen use" (20 or more years ago) and "actual use" (a couple of years ago when somebody just looked at the name of that function an thought it would fit or something).

  4. [...] very little towards no discussion published of *other people's code*. Ave1 has some on FFA (e.g. on chapter 2) and I have some on FFA as well (e.g. reading notes and an overview) but I can't say I saw much [...]

Leave a Reply to Stanislav Datskovskiy