v.sh parts 6&7: handling orphans and updated Keccak

April 10th, 2020

I hope this will be one of the last updates to v.sh: I will add two vpatches, one for handling orphan vpatches, and an update of bundled SMG.Keccak to the latest version.

An orphaned vpatch (or orphan) is a vpatch that is missing some of its antecedents, and therefore cannot be pressed. Current behavior in this case is to crash hard via an exception in vflow, which v.sh does not handle gracefully. This is not helpful at all, since the user is stuck figuring out what went wrong manually. The intention of this vpatch is to correct this oversight.

    Vpatch_Dependencies: Vpatch_Relationship := (others => (others => False));
    Vpatch_Conflicts   : Vpatch_Relationship := (others => (others => False));
    Vpatch_Order       : Vpatch_List         := (others => 0);
+   Vpatch_Orphans     : Vpatch_Set          := (others => False);

    function Get_Hunk return Hunk is
       H: Hunk;

Vpatch_Orphans represents a set of orphan vpatches, initially empty.

 	    -- Looped through all vpatches, there are hunks without matched inputs:
 	    if not Finished then
-	       raise Constraint_Error with "Unsatisfied hunks.";
+	       Put_Line("O " & Integer'Image(I));
+	       Vpatch_Orphans(I) := True;
 	    end if;
       end loop;

Upon detecting an orphan vpatch, it is added to the set, and reported to the output with "O" prefix.

       -- Check if there are still free vpatches (there is a loop):
       for I in Free_Vpatches'Range loop
-	 if Free_Vpatches(I) then
+	 if Free_Vpatches(I) and not Vpatch_Orphans(I) then
 	    Put_Line("L " & Integer'Image(I));
 	    Has_Loops := True;
 	 end if;

Orphan vpatches may end up outside of the normal vpatch flow, where they could be mixed up with vpatches that form a loop. Print the looping vpatches separately from the orphans.

       procedure Add_Dependencies is
-	 Finish : Boolean := False;
-	 while not Finish loop
-	    Finish := True;
-	    for I in Selected_Vpatches'Range loop
-	       if Selected_Vpatches(I) then
-		  for J in Vpatch_Dependencies'Range(2) loop
-		     if Vpatch_Dependencies(I,J) and not Selected_Vpatches(J) then
-			Finish := False;
-			Selected_Vpatches(J) := True;
-		     end if;
-		  end loop;
-	       end if;
-	    end loop;
+	 -- loop forward propagating orphans
+	 for I in Vpatch_Order'Range loop
+	    declare
+	       V : Vpatch_Index := Vpatch_Order(I);
+	    begin
+	       for J in Vpatch_Dependencies'Range(2) loop
+		  if Vpatch_Dependencies(V,J) and Vpatch_Orphans(J) then
+		     Vpatch_Orphans(V) := True;
+		  end if;
+	       end loop;
+	    end;
+	 end loop;
+	 for I in reverse Vpatch_Order'Range loop
+	    declare
+	       V : Vpatch_Index := Vpatch_Order(I);
+	    begin
+	       for J in Vpatch_Dependencies'Range(1) loop
+		  if Vpatch_Dependencies(J,V) and Selected_Vpatches(J) then
+		     Selected_Vpatches(V) := True;
+		  end if;
+	       end loop;
+	    end;
 	 end loop;
       end Add_Dependencies;

Add_Dependencies added the antecendents of vpatches manually selected for pressing to the set of vpatches that will be pressed. Here, we first "propagate" the membership in the ophaned set to the descendants, then membership in the selected set to the antecedents. Because the flow is avalailable at this point, I also changed the backward propagation algorithm for selection to a simpler one.

+      procedure Print_Heads is
+      begin
+	 for I in Vpatch_Dependencies'Range(2) loop
+	    declare
+	       Is_Head : Boolean := not Vpatch_Orphans(I);
+	    begin
+	       for J in Vpatch_Dependencies'Range(1) loop
+		  Is_Head := Is_Head and
+		    (not Vpatch_Dependencies(J,I) or Vpatch_Orphans(J));
+	       end loop;
+	       if Is_Head then
+		  Put_Line("H " & Integer'Image(I));
+	       end if;
+	    end;
+	 end loop;
+      end Print_Heads;

When orphaned vpatches are in the tree, the naive heads calculation previously present in v.sh does not work, because a vpatch without descendants (which awk code would mark as head) may be orphaned, and correctly handling this case in awk would require reconstructing dependency table and orphan set there. It is actually simpler just do the calculation in vflow.

@@ -399,18 +429,27 @@
 	 end loop;

 	 for I in Vpatch_Order'Range loop
-	    if Selected_Vpatches(Vpatch_Order(I)) then
-	       Put_Line("P " & Integer'Image(Vpatch_Order(I)));
-	    else
-	       Put_Line("p " & Integer'Image(Vpatch_Order(I)));
-	    end if;
+	    declare
+	       V : Vpatch_Index := Vpatch_Order(I);
+	    begin
+	       if        Selected_Vpatches(V) and not Vpatch_Orphans(V) then
+		  Put_Line("P " & Integer'Image(V));
+	       elsif     Selected_Vpatches(V) and     Vpatch_Orphans(V) then
+		  Put_Line("E " & Integer'Image(V));
+	       elsif not Selected_Vpatches(V) and not Vpatch_Orphans(V) then
+		  Put_Line("p " & Integer'Image(V));
+	       else
+		  Put_Line("e " & Integer'Image(V));
+	       end if;
+	    end;
 	 end loop;
-      end Print_Flow_Or_Conflicts;
+      end Print_Flow_And_Conflicts;

When reporting the press order, orphaned vpatches are marked with "E" or "e" depending if they are selected for pressing or not.

v.sh is also updated accordingly:

 set -e

+[ -n "$VERBOSE" ] && set -x


Setting VERBOSE environment variable enables output of executed commands to stderr. This is ~unrelated functionality, however it is useful for debugging unexpected crashes -- and crashes due to an orphan vpatch were unexpected to me.

+report_vpatches() {
+    >&2 printf "$1"
+    awk -v L="$2" '$1 == L {print $2}' < "$topo" | vpatchnames >&2

A new function for simplifying error reporting in vpatchflowerr.

 vpatchflowerr() {
     set +e
     awk 'BEGIN {err=0; ok=0;}
 err == 0 && $1 == "C" {err=1; ech=$1;}
 err == 0 && $1 == "L" {err=2; ech=$1;}
+err == 0 && $1 == "E" {err=3; ech=$1;}
 err == 0 && $1 == "P" {ok=1;}
-END {if (err != 0) exit err; if (ok != 1) exit 3; exit 0;}' <"$topo"
+END {if (err != 0) exit err; if (ok != 1) exit 4; exit 0;}' <"$topo"
     set -e
     if [ "$exitcode" -eq "1" ]; then
 	    >&2 printf "Conflicting leaves selected:\n"
 	    awk '$1 == "C" {print $2}' < "$topo" | vpatchnames | sort | uniq >&2
-	    >&2 printf "Available heads:\n"
-	    awk '
-($1 == "I" || $1 == "d" ) {non_terminals[$3]=1;}
-($1 == "P") {if (!($2 in non_terminals)) {print $2;}}' < "$topo" | vpatchnames >&2
+	    report_vpatches "Orphans:\n" "O"
+	    report_vpatches "Available heads:\n" "H"
     elif [ "$exitcode" -eq "2" ]; then
-	    >&2 printf "Loop among vpatches:\n"
-	    awk '{print $2}' < "$topo" | vpatchnames >&2
+	    report_vpatches "Loop among vpatches:\n" "L"
     elif [ "$exitcode" -eq "3" ]; then
+	    report_vpatches "Orphan vpatches selected:\n" "E"
+	    report_vpatches "Available heads:\n" "H"
+	    fail
+    elif [ "$exitcode" -eq "4" ]; then
 	    >&2 echo "Unknown error. Investigate in $vgpgdir:"
 	    >&2 cat "$topo"
 	    exit 1 # no cleanup

The error reporting code got updated, first to use the new report_vpatches function, second, to get heads using "H" prefixed lines instead of calculating them manually.

 	cat >"$dotfile" < \""a[$2]"\""}
 FILENAME=="-" && $1 == "d" {print "\""a[$3]"\" -> \""a[$2]"\""}
 ' "$vpatches" - <"$topo" >>"$dotfile"
 	echo "}" >> "$dotfile"

Also show orphans in the graph. They are shown as descendants of an invisible node.

 function max(a,b) { return (a > b) ? a : b;}
 function repeat(s,n) {i=0; while (i++ < n) {printf("%s",s);}}
 FILENAME==VF {names[$2]=$1}
-FILENAME=="-" && ($1 == "I" || $1 == "d" ) {non_terminals[$3]=1;}
+FILENAME=="-" && ($1 == "O") {print "Orphan:", names[$2]; orphans[$2]=1;}
+FILENAME=="-" && ($1 == "H" ) {heads[$2]=1;}
 FILENAME=="-" && ($1 == "d") {if (!($2 in n_ante)) n_ante[$2] = 0; dir_ante[$2][n_ante[$2]] = $3; n_ante[$2]++;}
-FILENAME=="-" && ($1 == "P" || $1 == "p") {
+FILENAME=="-" && ($1 == "P" || $1 == "p" || $1 == "E" || $1 == "e") {
    level[$2] = 0;
    if ($2 in n_ante) {
       for (i = 0; i < n_ante[$2]; i++) {
@@ -295,7 +307,8 @@
    repeat(" ", level[$2]);
-   printf("%s%s%s\n",names[$2], ($2 in non_terminals) ? "" : " (*)", ($1 == "P") ? " +" : "" );
+   if ($1 =="E" || $1 == "e") printf("O: ");
+   printf("%s%s%s\n",names[$2], ($2 in heads) ? " (*)" : "", ($1 == "P") ? " +" : "" );
 }' "$vpatches" - <"$topo"

And an update to the "vtree" command, which now also reports orphans and gets the list of heads the new way.

The update to the latest code SMG.Keccak is much simpler, so I won't show the code, the only interesting bit was that I got confused with all the endianness conversion in bitwise and bytewise Keccak. As the sponge function output was previously an array of bits, the user had to reconstruct the byte value himself. In the new code, the value is in bytes, but it is not directly usable on little-endian machines, and must be reversed. I followed the advise of Diana and just removed the conversion to bit reversal in the smg_keccak.adb.

Another change that I did was to switch all binary vtools to compile with optimizations, like so:

project Ksum is
  for Languages use ("Ada");
  for Source_Dirs use ("src");
  for Object_Dir use "obj";
  for Exec_Dir use ".";
  for Main use ("ksum.adb");

  type Mode_Type is ("debug", "release");
  Mode : Mode_Type := external ("mode", "release");

  package Compiler is
      case Mode is
         when "debug" =>
            for Switches ("Ada")
                use ("-O2", "-ggdb3");
         when "release" =>
            for Switches ("Ada")
                use ("-O2");
      end case;
  end Compiler;

  package Binder is
     for Switches ("Ada") use ("-static");
  end Binder;

  package Linker is
     for Switches ("Ada") use ("-static");
  end Linker;
end Ksum;

The effects of these changes are rather nice, I was in the error previously when I thought that "noone will notice", as 2.5x speedup from this change is quite easy to notice:

$ time ksum.old ksum
08ffc69748db90b48fc2813abf40b6123798f4e99f7cddbe02f5547ec2e0504b1af5e2e8d87b108e1e18b128c21fc67c40f34facf73314d5a86cea8b2ce7d762  ksum
ksum.old ksum  1.41s user 0.01s system 90% cpu 1.566 total
$ time ksum.new ksum
08ffc69748db90b48fc2813abf40b6123798f4e99f7cddbe02f5547ec2e0504b1af5e2e8d87b108e1e18b128c21fc67c40f34facf73314d5a86cea8b2ce7d762  ksum
ksum.new ksum  0.50s user 0.00s system 94% cpu 0.532 total

OTOH, vdiff still spends ~90% CPU time in the Keccak code, so perhaps some more performance can be extracted. This would help working with e.g. Linux kernel or MPWP even more.

# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  ........................................
    53.90%  vdiff    vdiff             [.] smg_keccak__rho
    18.08%  vdiff    vdiff             [.] smg_keccak__keccak_function
    12.65%  vdiff    vdiff             [.] smg_keccak__chi
    10.85%  vdiff    vdiff             [.] smg_keccak__theta
     1.81%  vdiff    vdiff             [.] keccak_hash
     0.90%  vdiff    vdiff             [.] bits__tobytestream
     0.90%  vdiff    [kernel.vmlinux]  [k] do_syscall_64
     0.90%  vdiff    [kernel.vmlinux]  [k] get_page_from_freelist
     0.00%  vdiff    vdiff             [.] read_files
     0.00%  vdiff    vdiff             [.] smg_keccak__absorbblock
     0.00%  vdiff    vdiff             [.] smg_keccak__keccakhash
     0.00%  vdiff    [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
     0.00%  vdiff    vdiff             [.] memcpy
     0.00%  vdiff    vdiff             [.] memcmp
     0.00%  vdiff    vdiff             [.] diff_2_files
     0.00%  vdiff    [kernel.vmlinux]  [k] syscall_return_via_sysret
     0.00%  vdiff    [kernel.vmlinux]  [k] entry_SYSCALL_64
     0.00%  vdiff    vdiff             [.] memchr
     0.00%  vdiff    vdiff             [.] smg_keccak__squeezeblock
     0.00%  vdiff    vdiff             [.] free
     0.00%  vdiff    [kernel.vmlinux]  [k] __d_lookup_rcu

Vpatches and signatures:

curl 'http://bvt-trace.net/vpatches/vtools_vsh_orphans.vpatch' > vtools_vsh_orphans.vpatch
curl 'http://bvt-trace.net/vpatches/vtools_vsh_orphans.vpatch.bvt.sig' > vtools_vsh_orphans.vpatch.bvt.sig
curl 'http://bvt-trace.net/vpatches/vtools_update_keccak.vpatch' > vtools_update_keccak.vpatch
curl 'http://bvt-trace.net/vpatches/vtools_update_keccak.vpatch.bvt.sig' > vtools_update_keccak.vpatch.bvt.sig

9 Responses to “v.sh parts 6&7: handling orphans and updated Keccak”

  1. Diana Coman says:

    Ha, going from 1.41s to 0.5 is indeed hard to not notice. Looks good at a first read, I'll have to give it another read and a test drive too!

  2. Diana Coman says:

    The .sig for vtools_update_keccak.vpatch (aka vtools_update_keccak.vpatch.bvt.sig) doesn't verify for me, not sure how or what might be wrong with my setup (I looked inside both the vpatch and the sig files and they both seem to match the ones linked above so it doesn't seem to be that I got either corrupted on upload):

    $ gpg --verify .seals/vtools_update_keccak.vpatch.bvt.sig patches/vtools_update_keccak.vpatch
    gpg: Signature made Fri 10 Apr 2020 05:03:48 PM BST using RSA key ID 4B962B68
    gpg: BAD signature from "bvt aka BT"

  3. bvt says:

    Thanks for spotting, I have reuploaded the signature. Seems that as I was writing the article, I uploaded the signature for older version of the vpatch.

  4. Diana Coman says:

    Thanks for the prompt update - confirmed, the new sig works. However, it turns out meanwhile the awk sequence at line 301 in v.sh barfs on my CentOS 6 running GNU Awk 3.1.7, here's the error thrown when running v vtree:

    awk: cmd. line:6: FILENAME=="-" && ($1 == "d") {if (!($2 in n_ante)) n_ante[$2] = 0; dir_ante[$2][n_ante[$2]] = $3; n_ante[$2]++;}
    awk: cmd. line:6: ^ syntax error
    awk: cmd. line:6: FILENAME=="-" && ($1 == "d") {if (!($2 in n_ante)) n_ante[$2] = 0; dir_ante[$2][n_ante[$2]] = $3; n_ante[$2]++;}
    awk: cmd. line:6: ^ syntax error
    awk: cmd. line:11: level[$2] = max(level[$2], level[dir_ante[$2][i]]+1);
    awk: cmd. line:11: ^ syntax error
    awk: cmd. line:11: fatal: invalid subscript expression

    The rest of commands (I tested briefly patches, wot, flow, antecedents, descendants, press) seem to work fine, at least when tested on vtools' own tree. At a quick first look in the code, I don't know if the trouble might be perhaps a case of that i variable failing to be expanded because of the single quotes for the awk, or similar.

    As a side note, now with all the .gpr files to build, I ended up making a "buildall.sh" that just calls gprbuild -P on each of them, as otherwise I *always* seem to forget one!

  5. Diana Coman says:

    Only now I had a second look at that line where awk complains and I realised that you are possibly trying to use array of arrays? Iirc this is supported only in some versions (gawk but also not older ones) so possibly that's why it fails on my OS. Nevertheless, awk itself allows multiple indices, it just goes array[i,j] (it concatenates them so a sort of inside trick but perfectly fine for needing "several dimensions") - and it usually ends up in shorter code from what I noticed. So looking there, it would be something like dir_ante[$2, n_ante[$2]], perhaps.

  6. bvt says:

    Hm, thanks for having a look at this. I will install the same awk version and test the code with it, then see how to best fix it -- most likely using multiple indexes will work out.

  7. bvt says:

    Indeed it seems to be gawk-only feature. I have tested your suggested fix with awk 3.1.7, mawk from ~2010, and busybox awk, no problems with any of them. The vpatch is here:

    curl 'http://bvt-trace.net/vpatches/vtools_vsh_fix_awk.vpatch' > vtools_vsh_fix_awk.vpatch
    curl 'http://bvt-trace.net/vpatches/vtools_vsh_fix_awk.vpatch.bvt.sig' > vtools_vsh_fix_awk.vpatch.bvt.sig

  8. Diana Coman says:

    Confirmed it works perfectly now on my machine too, thank you!

    I should get around to mirror the whole tree for vtools and publish my sigs for it too, sometime this week.

  9. [...] own public forum worked quite as intended. Nevertheless, as bvt has been dilligently working on owning and improving the vtools suite, clearly communicating on it and on his own history of V use that informs his choices of [...]

Leave a Reply