v.sh parts 2.5 and 3: one-binary Ada solver and Ada vfilter implementation

February 16th, 2020

This article starts where we have stopped last time - I have introduced a set of Ada utilities for computing the vpatch dependencies, performing a topological sort on them, and simplifying the flow for making a nicer graph - which made v.sh a complete tool. However, I was quite unsatisfied with the amount of code duplication between the utilities: first, the ~same set of type and variable declarations was used in all the tools; second, the fact that the variables were constantly serialized and deserialized made the code look too messy and unfinished to leave it like that. In the first vpatch (part 2.5), I will unify the code of all the v.sh utilities in one binary that does all computations end-to-end. As the algorithms/functions did not change much, I will not add a detailed description of them in this article, if you are interested, they are available in the previous article.

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
with Ada.Command_Line; use Ada.Command_Line;
with Ada.Containers.Generic_Sort;

procedure Vflow is
   type Hunk_State is (Unsatisfied, Satisfied);
   Empty_Hash : constant := 0;
   type Hunk is record
      Vpatch:   Natural;
      In_File:  Natural;
      Out_File: Natural;
      In_Hash:  Natural;
      Out_Hash: Natural;
      State:    Hunk_State;
   end record;

   Num_Hunks    : Integer := Integer'Value(Argument(1));
   Num_Vpatches : Integer := Integer'Value(Argument(2));

   subtype Hunk_Index is Natural range 1..Num_Hunks;
   type    Hunk_Array is array (Hunk_Index range <>) of Hunk;
   All_Hunks : Hunk_Array(Hunk_Index'Range);

   subtype Vpatch_Index is Natural range 0..Num_Vpatches - 1;
   type    Vpatch_Hunks is array (Vpatch_Index'Range) of Hunk_Index;
   VH_Low : Vpatch_Hunks := (others => Hunk_Index'First);
   VH_High: Vpatch_Hunks := (others => Hunk_Index'First);

   type Vpatch_Set          is array (Vpatch_Index'Range) of Boolean;
   type Vpatch_List         is array (Vpatch_Index'Range) of Vpatch_Index;
   type Vpatch_Relationship is array (Vpatch_Index'Range, Vpatch_Index'Range) of Boolean;

   Vpatch_Dependencies: Vpatch_Relationship := (others => (others => False));
   Vpatch_Conflicts   : Vpatch_Relationship := (others => (others => False));
   Vpatch_Order       : Vpatch_List         := (others => 0);

   function Get_Hunk return Hunk is
      H: Hunk;
   begin
      Get(H.Vpatch);
      Get(H.In_File);
      Get(H.Out_File);
      Get(H.In_Hash);
      Get(H.Out_Hash);
      if H.In_Hash = Empty_Hash then
	 H.State := Satisfied;
      else
	 H.State := Unsatisfied;
      end if;
      return H;
   end Get_Hunk;

   procedure Read_Hunks is
      I: Natural := All_Hunks'First;
   begin
  Read_Loop:
      for I in All_Hunks'Range loop
	 exit Read_Loop when End_Of_File;
	 All_Hunks(I) := Get_Hunk;
      end loop Read_Loop;
      if I /= All_Hunks'Last then
	 raise Constraint_Error with "Insufficient number of hunks provided.";
      end if;
   end Read_Hunks;

   procedure Init_Vpatch_Hunk_Bounds is
   begin
      for I in All_Hunks'Range loop
	 declare
	    V : Integer := All_Hunks(I).Vpatch;
	 begin
	    if VH_Low(V) > I or All_Hunks(VH_Low(V)).Vpatch /= V then
	       VH_Low(V) := I;
	    end if;
	    VH_High(V) := I;
	 end;
      end loop;
   end Init_Vpatch_Hunk_Bounds;

   procedure Populate_Conflicts is
      function Conflicts(I: Vpatch_Index; J: Vpatch_Index) return Boolean is
      begin
	 for A in VH_Low(I)..VH_High(I) loop
	    for B in VH_Low(J)..VH_High(J) loop
	       if (All_Hunks(A).In_File = All_Hunks(B).In_File) and
		 (All_Hunks(A).In_Hash = All_Hunks(B).In_Hash) then
		  return True;
	       end if;
	    end loop;
	 end loop;
	 return False;
      end Conflicts;
   begin
      for I in Vpatch_Index'Range loop
	 for J in Vpatch_Index'Range loop
	    if I < J and then Conflicts(I,J) then
	       Vpatch_Conflicts(I,J) := True;
	       Vpatch_Conflicts(J,I) := True;
	    end if;
	 end loop;
      end loop;
   end Populate_Conflicts;

   procedure Solve is
      Input_Hunks : Hunk_Array := All_Hunks;
      Output_Hunks: Hunk_Array renames All_Hunks;
      Finished    : Boolean    := False;

      -- Start of sorting boilerplate
      --
      function Input_Before(A: Positive; B: Positive) return Boolean is
      begin
	 return Input_Hunks(A).In_File < Input_Hunks(B).In_File;
      end Input_Before;

      function Output_Before(A: Positive; B: Positive) return Boolean is
      begin
	 return Output_Hunks(A).Out_File < Output_Hunks(B).Out_File;
      end Output_Before;

      procedure Input_Swap(A: Positive; B: Positive) is
	 Tmp : Hunk;
      begin
	 Tmp := Input_Hunks(B);
	 Input_Hunks(B) := Input_Hunks(A);
	 Input_Hunks(A) := Tmp;
      end Input_Swap;

      procedure Output_Swap(A: Positive; B: Positive) is
	 Tmp : Hunk;
      begin
	 Tmp := Output_Hunks(B);
	 Output_Hunks(B) := Output_Hunks(A);
	 Output_Hunks(A) := Tmp;
      end Output_Swap;

      procedure Input_Sort is
	 new Ada.Containers.Generic_Sort (Positive, Input_Before, Input_Swap);
      procedure Output_Sort is
	 new Ada.Containers.Generic_Sort (Positive, Output_Before, Output_Swap);
      --
      -- End of sorting boilerplate

      function Check_Genesis(Input: in out Hunk_Array) return Boolean is
	 Is_Genesis : Boolean := True;
	 Vpatch: Vpatch_Index := Input(Input'First).Vpatch;
      begin
	 for I in Input'Range loop
	    Is_Genesis := Is_Genesis and (Input(I).In_Hash = Empty_Hash);
	 end loop;
	 return Is_Genesis;
      end Check_Genesis;

      function Try_Connect(Input: in out Hunk_Array;
			   Output: in out Hunk_Array) return Boolean is
	 I         : Positive := Input'First;
	 J         : Positive := Output'First;
	 All_Marked: Boolean  := True;
      begin
	 while I <= Input'Last and J <= Output'Last loop
	    if Input(I).State /= Unsatisfied or
	      Input(I).In_File < Output(J).Out_File then
	       I := I + 1;
	    elsif Input(I).In_File > Output(J).Out_File then
	       J := J + 1;
	    else
	       if Input(I).In_Hash = Output(J).Out_Hash and
		 Input(I).In_Hash /= Empty_Hash then
		  Input(I).State := Satisfied;
		  Vpatch_Dependencies(Input(Input'First).Vpatch,
				      Output(Output'First).Vpatch) := True;
	       end if;
	       I := I + 1;
	    end if;
	 end loop;

	 for I in Input'Range loop
	    All_Marked := All_Marked and Input(I).State = Satisfied;
	 end loop;
	 return All_Marked;
      end Try_Connect;

   begin
      for I in Vpatch_Index'Range loop
	 Output_Sort(VH_Low(I), VH_High(I));
	 Input_Sort(VH_Low(I), VH_High(I));
      end loop;

      for I in Vpatch_Index'Range loop
	 declare
	    Hunks_In : Hunk_Array renames Input_Hunks(VH_Low(I)..VH_High(I));
	 begin
	    Finished := False;

	    for J in Vpatch_Index'Range loop
	       if J /= I then
		  declare
		     Hunks_Out : Hunk_Array renames
		       Output_Hunks(VH_Low(J)..VH_High(J));
		  begin
		     Finished := Try_Connect(Hunks_In, Hunks_Out);
		  end;
	       else
		  Finished := Check_Genesis(Hunks_In);
	       end if;
	       exit when Finished;
	    end loop;

	    -- Looped through all vpatches, there are hunks without matched inputs:
	    if not Finished then
	       raise Constraint_Error with "Unsatisfied hunks.";
	    end if;
	 end;
      end loop;
   end Solve;

   function Sort return Boolean is
      Free_Vpatches  : Vpatch_Set  := (others => True);
      N_Selected     : Natural     := 0;
      Finished       : Boolean     := False;
      Has_Loops      : Boolean     := False;
   begin
      while not Finished loop
	 Finished := True;
	 for I in Free_Vpatches'Range loop
	    if Free_Vpatches(I) then
	       declare
		  All_Satisfied : Boolean := True;
	       begin

		  -- Check if all dependencies are already in the press order
		  for J in Vpatch_Dependencies'Range(2) loop
		     if Vpatch_Dependencies(I,J) then
			All_Satisfied := All_Satisfied and not Free_Vpatches(J);
		     end if;
		  end loop;

		  -- All dependencies present, can add this vpatch:
		  if All_Satisfied then
		     Free_Vpatches(I)         := False;
		     Vpatch_Order(N_Selected) := I;
		     N_Selected               := N_Selected + 1;
		     Finished                 := False;
		  end if;
	       end;
	    end if;
	 end loop;
      end loop;

      -- Check if there are still free vpatches (there is a loop):
      for I in Free_Vpatches'Range loop
	 if Free_Vpatches(I) then
	    Put_Line("L " & Integer'Image(I));
	    Has_Loops := True;
	 end if;
      end loop;
      return Has_Loops;
   end Sort;

   procedure Print_Relationship(C: Character; R: Vpatch_Relationship) is
   begin
      for I in R'Range(1) loop
	 for J in R'Range(2) loop
	    if R(I,J) then
		  Put_Line(C & " " & Integer'Image(I) & " " & Integer'Image(J));
	    end if;
	 end loop;
      end loop;
   end Print_Relationship;

   procedure Simplify_Inner is
      Vpatch_TC_Indirect:  Vpatch_Relationship := (others => (others => False));
   begin
      -- Fill Vpatch_TC_Indirect
      for I in Vpatch_Order'Range loop
	 declare
	    N : Vpatch_Index := Vpatch_Order(I);
	 begin
	    for J in Vpatch_Dependencies'Range(2) loop
	       if Vpatch_Dependencies(N,J) then
		  for K in Vpatch_Dependencies'Range(2) loop
		     Vpatch_TC_Indirect(N,K) := Vpatch_TC_Indirect(N,K) or
		       Vpatch_Dependencies(J,K) or
		       Vpatch_TC_Indirect(J,K);
		  end loop;
	       end if;
	    end loop;
	 end;
      end loop;

      -- Output Vpatch_TC_Indirect
      Print_Relationship('I', Vpatch_TC_Indirect);

      -- Remove redundant connections from Vpatch_Dependencies
      for I in Vpatch_Dependencies'Range(1) loop
	 for J in Vpatch_TC_Indirect'Range(2) loop
	    Vpatch_Dependencies(I,J) := Vpatch_Dependencies(I,J) and
	      not Vpatch_TC_Indirect(I,J);
	 end loop;
      end loop;

      -- Output filtered out Vpatch_Dependencies
      Print_Relationship('d', Vpatch_Dependencies);
   end Simplify_Inner;

   procedure Print_Selected is
      Selected_Vpatches: Vpatch_Set := (others => False);

      procedure Read_Selection is
	 C: Character;
	 I: Vpatch_Index;
      begin
     Read_Loop:
	 loop
	    exit Read_Loop when End_Of_File;
	    Get(C);
	    if C = 's' then
	       Get(I);
	       Selected_Vpatches(I) := True;
	    elsif C = 'S' then
	       for J in Selected_Vpatches'Range loop
		  Selected_Vpatches(J) := True;
	       end loop;
	    end if;
	 end loop Read_Loop;
      end Read_Selection;

      procedure Add_Dependencies is
	 Finish : Boolean := False;
      begin
	 while not Finish loop
	    Finish := True;
	    for I in Selected_Vpatches'Range loop
	       if Selected_Vpatches(I) then
		  for J in Vpatch_Dependencies'Range(2) loop
		     if Vpatch_Dependencies(I,J) and not Selected_Vpatches(J) then
			Finish := False;
			Selected_Vpatches(J) := True;
		     end if;
		  end loop;
	       end if;
	    end loop;
	 end loop;
      end Add_Dependencies;

      procedure Print_Flow_Or_Conflicts is
	 Has_Conflicts : Boolean := False;
      begin
	 for I in Vpatch_Conflicts'Range(1) loop
	    for J in Vpatch_Conflicts'Range(2) loop
	       if Selected_Vpatches(I) and Selected_Vpatches(J)
		 and Vpatch_Conflicts(I,J) then
		  Put_Line("C " & Integer'Image(I));
		  Put_Line("C " & Integer'Image(J));
		  Has_Conflicts := True;
	       end if;
	    end loop;
	 end loop;

	 if Has_Conflicts then
	    return;
	 end if;

	 for I in Vpatch_Order'Range loop
	    if Selected_Vpatches(Vpatch_Order(I)) then
	       Put_Line("P " & Integer'Image(Vpatch_Order(I)));
	    end if;
	 end loop;
      end Print_Flow_Or_Conflicts;

   begin
      Read_Selection;
      Add_Dependencies;
      Print_Flow_Or_Conflicts;
   end Print_Selected;

begin
   Read_Hunks;
   Init_Vpatch_Hunk_Bounds;
   Populate_Conflicts;
   Solve;

   if Sort then
      Simplify_Inner;
      Print_Selected;
   end if;
end Vflow;

I used this opportunity to corrected a few sore spots in the algorithms:

  • Handling of file creations in vpatches is simplified.
  • "S" command selects all vpatches for toposort calculations. This simplifies awk part of v.sh a bit.
  • The dependency graph is always simplified.
  • Instead of toposorting only the selected vpatches and their antecedents, now the whole vtree is toposorted, and then user-chosen vpatches with their antecedents are selected from the flow.

The end result of these changes is significant economy of code (~390 instead of 499 LoC - around 10%), and less piping here-and-there on the file system.

The corresponding changes to v.sh:

--- a/vtools/v.sh 24defb473835e2980044a9bb526d2133f3056f62109b3a8136cb168019889f3c94bb721362cffcdb6df2ac52bbc6c111a680d9ba22b759ea676e7ddff7082fa1
+++ b/vtools/v.sh 18e17b243c223f806fa1d9462e096ee2f56d5bdcc2714fff374518bf1ee5fb0fe830720006739b1ed987beeaf4116fd079a7db6a3f900e4b0cba7b0e5847395c
@@ -4,10 +4,8 @@

 vgpgdir=""
 vpatches=""
-flow=""
 filtered=""
 selfile=""
-simplified=""
 topo=""

 usage() {
@@ -34,10 +32,8 @@
 loadwot() {
 	vgpgdir="$(mktemp -d v-gpg-XXXXXX)"
 	vpatches="$vgpgdir/vpatches.txt"
-	flow="$vgpgdir/flow.txt"
 	filtered="$vgpgdir/filtered.txt"
 	selfile="$vgpgdir/selected.txt"
-	simplified="$vgpgdir/simplified.txt"
 	topo="$vgpgdir/topo.txt"

 	checkdir wot

$flow and $simplified temporary files are eliminated, as now the dependency graph is always simplified.

@@ -130,30 +126,34 @@
 	filterall | internall >"$filtered"
 	selected=""
 	if [ "$*" == "-" ]; then
-		selected=$(awk '{print $2}' < "$vpatches")
+		echo "S" >"$selfile"
 	else
-		selected="$(echo $* | tr ' ' 'n' | vpatchnos)"
+		echo $* | tr ' ' 'n' | sed -e 's/^/s /g' - | vpatchnos >"$selfile"
 	fi
-	vflow $(wc -l <"$filtered") $(wc -l <"$vpatches") <"$filtered" >"$flow"
-	echo "$selected" | sed -e 's/^/s /g' - | sort | uniq >"$selfile"
-	cat "$flow" "$selfile" | vtoposort > "$topo"
+	cat "$filtered" "$selfile" |
+		vflow $(wc -l <"$filtered") $(wc -l <"$vpatches") >"$topo"
 }
 

When all leaves are selected for flow calculation, a new "S" command is emitted. Also, instead of vflow | vtoposort pipeline, only vflow is used.

 vpatchflowerr() {
-	ret=$(head -n1 "$topo" | awk '{print $1}')
-	if [ "$ret" = "C" ]; then
-		printf "nConflicting leaves selected:n"
-		awk '$1 == "C" {print $2}' < "$topo" | vpatchnames | sort | uniq
-		cleanup
-	elif [ $ret == "L" ]; then
-		echo "nLoop among vpatches:"
-		awk '{print $2}' < "$topo" | vpatchnames
-		cleanup
-	elif [ "$ret" != "P" ]; then
-		echo "Unknown error. Investigate in $vgpgdir:"
-		cat "$topo"
-		exit 0 # no cleanup
-	fi
+    awk 'BEGIN {err=0; ok=0;}
+err == 0 && $1 == "C" {err=1; ech=$1;}
+err == 0 && $1 == "L" {err=2; ech=$1;}
+err == 0 && $1 == "P" {ok=1;}
+END {if (err != 0) exit err; if (ok != 1) exit 3; exit 0;}' <"$topo"
+    exitcode="$?"
+    if [ "$exitcode" -eq "1" ]; then
+	    printf "nConflicting leaves selected:n"
+	    awk '$1 == "C" {print $2}' < "$topo" | vpatchnames | sort | uniq
+	    cleanup
+    elif [ "$exitcode" -eq "2" ]; then
+	    printf "nConflicting leaves selected:n"
+	    awk '{print $2}' < "$topo" | vpatchnames
+	    cleanup
+    elif [ "$exitcode" -eq "3" ]; then
+	    echo "Unknown error. Investigate in $vgpgdir:"
+	    cat "$topo"
+	    exit 0 # no cleanup
+    fi
 }

 [ -z "$1" ] && usage

The new error handling routine scans the result of vflow for error or success conditions: if conflicts or loops among vpatches are detected, those are reported. If no press order is calculated for whatever reason, this is also reported.

@@ -180,7 +180,6 @@
 	shift

 	vpatchflow -
-	cat "$flow" "$topo" | vsimplify > "$simplified"
 	cat >"$dotfile" < ""a[$2]"""}
-' "$vpatches" - <"$simplified" >>"$dotfile"
+' "$vpatches" - <"$topo" >>"$dotfile"
 	echo "}" >> "$dotfile"
 	dot -Tsvg $dotfile -o $svgfile
 	# dot -Tcmapx $dotfile -o $svgfile.html-map

There is no need to run vsimplify anymore, as output is always simplified.

@@ -202,11 +201,10 @@

 	vpatchflow -
 	vno=$(awk -v V="$vpatch" '$1==V {print $2}' < "$vpatches")
-	cat "$flow" "$topo" | vsimplify > "$simplified"
-	awk -v S="$simplified" -v T="$topo" -v N="$vno" '
-FILENAME==S && ($1 == "I" || $1 == "d" ) && $2 == N {a[$3]=1}
-FILENAME==T && ($1 == "P") {if ($2 in a) {print $2}}
-' "$simplified" "$topo" | vpatchnames
+	awk -v N="$vno" '
+($1 == "I" || $1 == "d" ) && $2 == N {a[$3]=1}
+($1 == "P") {if ($2 in a) {print $2}}
+' "$topo" | vpatchnames
 	cleanup
 fi

@@ -217,11 +215,10 @@

 	vpatchflow -
 	vno=$(awk -v V="$vpatch" '$1==V {print $2}' < "$vpatches")
-	cat "$flow" "$topo" | vsimplify > "$simplified"
-	awk -v S="$simplified" -v T="$topo" -v N="$vno" '
-FILENAME==S && ($1 == "I" || $1 == "d" ) && $3 == N {a[$2]=1}
-FILENAME==T && ($1 == "P") {if ($2 in a) {print $2}}
-' "$simplified" "$topo" | vpatchnames
+	awk -v N="$vno" '
+($1 == "I" || $1 == "d" ) && $3 == N {a[$2]=1}
+($1 == "P") {if ($2 in a) {print $2}}
+' "$topo" | vpatchnames
 	cleanup
 fi

"antecedents" and "descendants" commands now read a single file, so I made awk filters less complex as well.

@@ -258,7 +255,7 @@
 	vpatchflow $*
 	vpatchflowerr

-	awk '{print $2}' < "$topo" | vpatchnames | while read i; do
+	awk '$1 == "P" {print $2}' < "$topo" | vpatchnames | while read i; do
 		i=$(readlink -f $i)
 		cd "$dir"
 		vpatch <"$i" >/dev/null
@@ -267,4 +264,6 @@
 	cleanup
 fi
 

"press" now needs to select the press order out of general output.

+echo "Unknown command: $1"
+
 cleanup

One addition is also that any unknown command is reported.

The vpatch is available here:

curl 'http://bvt-trace.net/vpatches/vtools_vsh_utils_one_binary.vpatch' > vtools_vsh_utils_one_binary.vpatch
curl 'http://bvt-trace.net/vpatches/vtools_vsh_utils_one_binary.vpatch.bvt.sig' > vtools_vsh_utils_one_binary.vpatch.bvt.sig


Ada vfilter

This vpatch introduces the Ada version of vfilter program that was mentioned in the first article. It's awk version is presented below:

vfilter() {
	awk -v N="$1" '
BEGIN {r=0;}
$0 ~ /^diff -uNr/ {r=1;}
r == 1 && $1 == "---" {sub("[^/]*/", "", $2); ip=$2; ih=$3}
r == 1 && $1 == "+++" {sub("[^/]*/", "", $2); print N,ip,$2,ih,$3; r=0;}'
}

To build the Ada version, I have repurposed code from vpatch.adb. All the common code was moved to vpatch_utils.adb, with vpatch_utils.ads looking like this:

with Bits; use Bits;
with Ada.Text_IO;  use Ada.Text_IO;
with Ada.Integer_Text_IO; use  Ada.Integer_Text_IO;
with Character_IO; use Character_IO;
with Ada.Strings.Fixed;
with Ada.Directories;
with Ada.Characters;
with Ada.Characters.Handling;
with Ada.Characters.Latin_1;
with Ada.Sequential_IO;
with SMG_Keccak; use SMG_Keccak;
with Temporary_File; use Temporary_File;

package Vpatch_Utils is

   Hash_Length: constant Positive := 128;
   type Hash_Type is (Empty, Value);
   type Hash(The_Type: Hash_Type := Empty) is record
      case The_Type is
         when Value =>
            Value: String(1..Hash_Length);
         when Empty =>
            null;
      end case;
   end record;

   type Header (From_L, To_L: Natural) Is record
      From_Hash: Hash;
      From_File: String(1..From_L);
      To_Hash: Hash;
      To_File: String(1..To_L);
   end record;

   function Get_Header return Header;
   function Starts_With(S: String; Prefix: String) return Boolean;
   procedure Process_Hunks_For_Header(A_Header: Header);
   function Path_Without_Prefix(Pathname: String;
				Prefix: Positive) return String;
end Vpatch_Utils;

Then, vpatch.adb is changed to use the new module:

with Ada.Text_IO;  use Ada.Text_IO;
with Vpatch_Utils; use Vpatch_Utils;

procedure VPatch is
begin
Read_Loop:
    loop
       exit Read_Loop when End_Of_File;
       declare
          S: String := Get_Line;
       begin
          if Starts_With(S, "diff ") then
             declare
                H: Header := Get_Header;
             begin
                Process_Hunks_For_Header(H);
             end;
          else
             Put_Line("Prelude: " & S);
          end if;
       end;
    end loop Read_Loop;
end;

And vfilter.ads looks similar, but has different printing code:

with Ada.Text_IO;  use Ada.Text_IO;
with Vpatch_Utils; use Vpatch_Utils;
with Ada.Command_Line; use Ada.Command_Line;

procedure VFilter is
   procedure Put(A_Hash: Hash) is
   begin
      case A_Hash.The_Type is
         when Value =>
            Put(A_Hash.Value);
         when Empty =>
            Put("false");
      end case;
   end;

   procedure Put(A_Header: Header) is
   begin
      Put(Path_Without_Prefix(A_Header.From_File, 1) & " " &
	    Path_Without_Prefix(A_Header.To_File, 1) & " ");
      Put(A_Header.From_Hash);
      Put(" ");
      Put(A_Header.To_Hash);
      New_Line;
   end;

   Vpatch_No : String := Argument(1);
begin
Read_Loop:
    loop
       exit Read_Loop when End_Of_File;
       declare
          S: String := Get_Line;
       begin
          if Starts_With(S, "diff ") then
             declare
                H: Header := Get_Header;
             begin
		Put(Vpatch_No & " ");
                Put(H);
             end;
          end if;
       end;
    end loop Read_Loop;
end;

Nothing fancy, as you can see, but as I mentioned, there is a performance benefit to this change. The quoted benchmark was done on rather slow system with gawk. On a system with Gales Linux and busybox awk, but faster CPU1, the performance difference is even bigger:

$ time vfilter 0 < patches/linux-genesis.vpatch >/dev/null
    0m04.30s real     0m04.08s user     0m00.21s system
$ time awk -v N="0" '
> BEGIN {r=0;}
> $0 ~ /^diff -uNr/ {r=1;}
> r == 1 && $1 == "---" {sub("[^/]*/", "", $2); ip=$2; ih=$3}
> r == 1 && $1 == "+++" {sub("[^/]*/", "", $2); print N,ip,$2,ih,$3; r=0;}
> ' < patches/linux-genesis.vpatch >/dev/null
    0m20.89s real     0m20.81s user     0m00.07s system

So whether this speedup is worth the extra vpatch, judge yourself, but I intend to grow future vpatches on this leaf.

The vpatch is available at the following link:

curl 'http://bvt-trace.net/vpatches/vtools_vsh_ada_vfilter.vpatch' > vtools_vsh_ada_vfilter.vpatch
curl 'http://bvt-trace.net/vpatches/vtools_vsh_ada_vfilter.vpatch.bvt.sig' > vtools_vsh_ada_vfilter.vpatch.bvt.sig


With these changes in place, I'm pretty much satisfied with where v.sh is, so next item on my TODO list will be port of keccak RNG to linux-2.6.32. I will also add some sort of autosyncing from the reference code shelf to v.sh, but this will come a bit later in the future.

  1. Disk does not matter as in both cases the file was cached in RAM. []

12 Responses to “v.sh parts 2.5 and 3: one-binary Ada solver and Ada vfilter implementation”

  1. bvt says:
    1

    Meanwhile, these vpatches have undergone a regrind, to fix an issue. I made a mistake in changes to v.sh in the vtools_vsh_utils_one_binary.vpatch:

     	if [ "$*" == "-" ]; then
    		echo "S" >"$selfile"
     	else
    -		echo $* | tr ' ' 'n' | sed -e 's/^/s /g' - | vpatchnos >"$selfile"
    +		echo $* | tr ' ' 'n' | vpatchnos | sed -e 's/^/s /g' - >"$selfile"
    	fi
    

    Sorry for the trouble. Mea culpa.

  2. Diana Coman says:
    2

    I gave all this another read and spin - nice work on basically extracting the info strictly needed for the V-press/calculations out of the code diffs as such and I certainly prefer this integrated tools as opposed to the earlier separation with code duplication. I think the Ada version of vfilter is fine too (even though the shortness of awk still draws my eye to it, I'll admit it).

    A few questions/observations on my list after the read & press (in the order I jotted them down so they are not at all in order of importance):
    1. why not make the set-dir names (ie patches, wot, seals) also variables in the script at the beginning so they work essentially like knobs (if manual knobs) ? I don't see in the end why should everyone be stuck with either exactly those dirs or otherwise have to either change throughout the code or rename dirs (granted, minor grate/curiosity rather than any sort of problem at all).
    2. the current set of commands seems to force the graphviz thing because there's no way to see or reasonably explore the tree otherwise; tbh I never installed graphviz on any of my machines and can't say I ever found I couldn't do without it, it's not like it couldn't print the tree indented in text mode and provide perhaps a command to get the list of leaves at least. Why was that useful leaves command of vk.pl dropped?
    3. Trying it on eucrypt's vpatches worked fine and as expected for pressing to a given vpatch - but it was also NOT showing any feedback and I'd much rather it did show what it was doing. Do you prefer it without feedback or why so silent?
    4. Still on eucrypt's vpatches, I had trouble with the antecedents and descendants commands: running them for instance on eucrypt_oaep_fix_checks.vpatch returns empty and without any error or anything (and I tried it on a few other patches as well with same results). Is there something I don't fully get there?

  3. bvt says:
    3

    Thanks!

    1. Yes, this can be done rather easily. Would be an improvement in style with an added benefit of allowing easy fitting for ones taste.
    2. For no good reason really, see comment to your article (still in the queue). This command can be added, indented printing as well.
    3. Indeed, I like it silent more. Do you prefer to see invocations of vpatch and/or output of the vpatch as well?
    4. The breaking UI change is that I relied more on command line completion than on full manual retyping and copy-paste availability. So antecedents/descendants also require vpatch name input in the form of patches/eucrypt_oaep_fix_checks.vpatch, not just eucrypt_oaep_fix_checks.vpatch. This can be changed back to the standard way of passing argument, if there is a significant dislike of the new style. Alternatively, a bit if DWIMming can be added with a basename in the right place, but I am quite suspicious of DWIM, so not sure I'd like that myself.

  4. > 1. why not make the set-dir names (ie patches, wot, seals) also variables in the script at the beginning

    I believe this is the wrong kind of knob, like "redefining the value of 4". There's no possible legitimate use for it (besides, perhaps, some crude attempt at republic-with-serials-filed-off, and there's a very certain cost of sifting through options. Let the names be standardized, what's the big deal.

    Re 2, a purely ascii alternative would be great, esp for perceived portability. No ?

    Re the silence, it seems evident there should be a silent and a verbose mode. This is not something we've invented, it comes from good historical precedent (not just because different sorts of outputs are useful when the reader's another script further down the pipe versus when the reader's an actual human), and perhaps we should actually build around it ? Have a "human" mode where shit's spelled out, and a "machine" mode where it mostly stfu's ?

    DWIM seems a terrible idea in this context, and no doubt exactly the path to hell paved with good intentions people ended up traveling with eg gpg.

  5. Diana Coman says:
    5

    @Mircea Popescu

    I believe this is the wrong kind of knob, like "redefining the value of 4". There's no possible legitimate use for it (besides, perhaps, some crude attempt at republic-with-serials-filed-off, and there's a very certain cost of sifting through options. Let the names be standardized, what's the big deal.

    I don't agree here at all. For one thing, it's something on the *user's machine* so wtf and what sifting through options?? The point is not to give some "options" but simply to write the code so that if one wants to change a fucking name of a dir (and it IS just that, nothing more), it's only ONE place to do the change, not 100. And for that matter I see it quite conversely - basically writing "wot" and "patches" everywhere in the code sounds like some crude attempt at windows-style-copywriting-of-names or dunno. I just can't begin to see the case (or how it would actually even *work* because fuck, if I want to change it, I WILL CHANGE IT anyway) for having strings repeated in the code instead of having a variable with that text.

    @bvt
    Good that you mentioned your comment as it had gotten in the spam queue somehow, I just fished it out now. No idea how & why that happened exactly but possibly some IP mess, sorry about it.
    Re 3 - yes, I'd very much like to have the *option* of verbose.
    Re 4 - honestly, I tried *both* with and without patches/ because yes, indeed, I actually prefer it with patches exactly as you say because of auto-complete. But in both cases I got nothing out of it. Is it somehow working for you?

  6. bvt says:
    6

    Re 1, I don't think there was an argument for a knob, more of a code style change (i.e. $wotdir/*.asc instead of wot/*.asc).
    Re 2, something I came up with looks like:

    patches/eucrypt_genesis.vpatch
     patches/ch1_mpi.vpatch
      patches/ch2_truerandom.vpatch
     patches/eucrypt_ch11_serpent.vpatch
     patches/eucrypt_ch6_keccak_permutations.vpatch
      patches/eucrypt_ch7_keccak_sponge.vpatch
     patches/eucrypt_ch8_bit_keccak.vpatch
       patches/eucrypt_ch9_keccak_endianness.vpatch
      patches/eucrypt_mpi_fix_copy_incr.vpatch
        patches/eucrypt_ch10_oaep_tmsr.vpatch
        patches/eucrypt_ch3_miller_rabin.vpatch
         patches/eucrypt_ch4_rpng.vpatch
          patches/eucrypt_ch5_rsa_keys.vpatch
         patches/eucrypt_oaep_fix_checks.vpatch
                  patches/eucrypt_ch12_wrapper_rsa_oaep_c_ada.vpatch
                   patches/eucrypt_keccak_bitrate_fix.vpatch
                    patches/eucrypt_check_nread.vpatch
                     patches/eucrypt_ch13_smg_rng.vpatch
                      patches/eucrypt_manifest.vpatch
                       patches/eucrypt_fix_256.vpatch
                        patches/eucrypt_ch14_crc32.vpatch
                         patches/eucrypt_ch15_arbitrary_e.vpatch
                          patches/eucrypt_ch16_bytestream_keccak.vpatch
    
    patches/vtools_genesis.vpatch
     patches/vdiff_fixes_newline_gcc.vpatch
      patches/keccak.vpatch
       patches/vdiff_keccak.vpatch
        patches/vtools_fixes_bitrate_char_array.vpatch
         patches/vtools_vpatch.vpatch
          patches/vtools_fixes_static_tohex.vpatch
           patches/vtools_vpatch_newline.vpatch
            patches/vtools_ksum.vpatch
             patches/vtools_tempfile_standalone_notmp.vpatch
              patches/vdiff_blockwise_read-2.vpatch
               patches/vtools_fixes_rootdir_files.vpatch
                patches/vtools_small_fixes.vpatch
                 patches/vtools_add_vsh.vpatch
                  patches/vtools_add_vsh_utils.vpatch
                   patches/vtools_vsh_utils_one_binary.vpatch
                    patches/vtools_vsh_ada_vfilter.vpatch
    
    patches/genesis.vpatch
     patches/bitcoin-asciilifeform.1.vpatch
      patches/rm_rf_upnp.vpatch
       patches/bitcoin-asciilifeform.2-https_snipsnip.vpatch
       patches/bitcoin-asciilifeform.3-turdmeister-alert-snip.vpatch
         patches/bitcoin-asciilifeform.4-goodbye-win32.vpatch
          patches/bitcoin-v0_5_3_1-rev_bump.7.vpatch
        patches/bitcoin-v0_5_3_1-static_makefile_v002.8.vpatch
          patches/bitcoin-v0_5_3-db_config.6.vpatch
          patches/asciilifeform_dnsseed_snipsnip.vpatch
          patches/asciilifeform-kills-integer-retardation.vpatch
           patches/asciilifeform_maxint_locks_corrected.vpatch
        patches/asciilifeform_orphanage_thermonuke.vpatch
           patches/asciilifeform_tx-orphanage_amputation.vpatch
           patches/asciilifeform_zap_hardcoded_seeds.vpatch
            patches/asciilifeform_zap_showmyip_crud.vpatch
           patches/asciilifeform_and_now_we_have_block_dumper_corrected.vpatch
             patches/asciilifeform_dns_thermonyukyoolar_kleansing.vpatch
                patches/asciilifeform_ver_now_5_4_and_irc_is_gone_and_now_must_give_ip.vpatch
            patches/mod6_fix_dumpblock_params.vpatch
                    patches/asciilifeform_and_now_we_have_eatblock.vpatch
                       patches/asciilifeform_lets_lose_testnet.vpatch
                        patches/asciilifeform_add_verifyall_option.vpatch
                         patches/programmable-versionstring.vpatch
                          patches/malleus_mikehearnificarum.vpatch
                          patches/mod6_der_high_low_s.vpatch
                           patches/mod6_privkey_tools.vpatch
                              patches/makefiles.vpatch
    

    For Eucrypt, vtools, and trb correspondingly.
    Re 3, I'll think about where to best put this option. Is environment variable OK, or better have as yet another CLI option?
    Re 4, I wonder what went wrong there:

    $ v.sh antecedents patches/eucrypt_ch10_oaep_tmsr.vpatch
    patches/eucrypt_genesis.vpatch
    patches/eucrypt_ch6_keccak_permutations.vpatch
    patches/eucrypt_ch7_keccak_sponge.vpatch
    patches/eucrypt_ch9_keccak_endianness.vpatch
    $ v.sh antecedents patches/eucrypt_oaep_fix_checks.vpatch
    patches/eucrypt_genesis.vpatch
    patches/eucrypt_ch6_keccak_permutations.vpatch
    patches/eucrypt_ch7_keccak_sponge.vpatch
    patches/eucrypt_ch9_keccak_endianness.vpatch
    patches/eucrypt_ch10_oaep_tmsr.vpatch
    
  7. @Diana What is to be gained from the user puting his wot in .berniesandals2020 rather than .wot ? By anyone, I mean, what conceivable benefit could it ever deliver ?

    Sure, it's the user's machine. It's also your code. If he thinks "comments should be gender neutral" because "it's his machine" what, I'ma have a gender variable in the comment parser ?

    It seems to me however that I misunderstood what you were saying ; it had nothing to do with any of that, it was a simple "if you're going to use magic numbers, pass them by reference rather than directly please", which is a sound principle but substantially a different thing and, importantly, I don't think it applies here.

    Consider the following schematic program :

    a = read ('./wot');
    b = read ('./seals');
    c = read ('./patches');
    do things;

    Now we want to improve this by applying the above principle. First, we have to find names for the references ; and we want to make them good. How about "Name_Of_Directory_Holding_The_Web_Of_Trust = 'wot';" ? That should work ; our schematic program now becomes :

    Name_Of_Directory_Holding_The_Web_Of_Trust = 'wot';
    Name_Of_Directory_Holding_The_Seals = 'seals';
    Name_Of_Directory_Holding_The_Patchset = 'patches';
    a = read ('./'.Name_Of_Directory_Holding_The_Web_Of_Trust);
    b = read ('./'.Name_Of_Directory_Holding_The_Seals);
    c = read ('./'.Name_Of_Directory_Holding_The_Patchset);
    do things;

    We've gone from four lines to seven (+75%) ; from 76 bytes to 330 (+334%) and we're also using no less than THREE string concatenation operations between an absolute and a reference, which we're both likely to fuck up, and are likely to by their nature suck. All this so the putative user is saved the (apparently notable) inconvenience of sed 's^./wot^./berniesandalstotallycouldfixtheeconomy^g' ? Because it's his machine ? What sense does this make!

    At no point will the references be shorter than the directory names as they stand, in which case the replacement is an increase in both brute length and code complexity. There's just nothing gained here, I can appreciate that your experience with the planeshift/creepspace insane clown possee riled you up some, but we're using directories sanely (as we regard them as namespaces, ie gns subspaces in the first place), as opposed to haphazardly. What's so wrong with doing it simply ?

  8. Diana Coman says:
    8

    @bvt That looks fine to me, re 2.
    I still need to look into what is wrong with the script/output on my machine - I half suspect some specific version(s)/settings, hm.

    @Mircea Popescu

    It seems to me however that I misunderstood what you were saying ; it had nothing to do with any of that, it was a simple "if you're going to use magic numbers, pass them by reference rather than directly please",

    Passing them by reference rather than directly is what I was saying, indeed. This being said, it possibly grew into way more than it should have, to start with - it was more of a itch rather than anything that I can't live with,otherwise. The concrete use case I have for it is mainly when I have the patches for something in a dump rather than neatly in "patches" but at all times *renaming a dir* is anyway so easy as to make it a non-issue, indeed.

    Looking at it though, I think you are right in that I have some increased sensitivity to the pass by reference rather than directly, due to ...exposure, let's say. It's true.

    Now we want to improve this by applying the above principle. First, we have to find names for the references ; and we want to make them good. How about "Name_Of_Directory_Holding_The_Web_Of_Trust = 'wot';" ?

    No, not really and precisely for the reason you said initially - namely that "let the names be standardized". So there's no reason for that Name_Of_Directory_etc because that whole string is indeed fixed as wot, there is already a name for it. In my view however, this standard name means in code-terms the standard name of the *variable* rather than the contained magic-string. It's true that most programming languages are shit with strings so I can see the point that it adds unnecessary potential failure modes but note that having to manually write the strings everywhere is *also* prone to error, there's no ideal way around this sort of thing.

    Overall, I don't think there's a huge gain/loss either way really so if you consider it to be a very important thing, I'm fine with it either way really. I don't see a real gain in not-using the pass by reference (since all the "added" stuff is really just the $ sign instead of two " " and otherwise the initial declarations + init of those variables, not at all the full blown bureaucratic angle there, no) but again, I don't see a huge loss at this point either.

  9. bvt says:
    9

    @Diana Coman:
    The way I solve this issue -- it arises with vdiff as well, where you have a hard requirement for having "a" and "b" directories (not in the tool, but in the work process) -- is with symbolic links. I.e. there is a directory future_vpatch_name, which gets symlinked to "b", and intended antecedent, symlinked to "a". If switching the symlinks requires too much time, a set of scripts can make this rather pleasant.

  10. I don't think it's a very important thing tbh ; I have a slight preference for literalism as an approach to life, but in the end it costs more to try and distinguish the costs/benefits of the two approaches than the difference could ever justify. I suppose bvt's symlink solution is actually why symlinks even exist in the first place.

  11. Diana Coman says:

    @bvt Right you are, that's probably the most logical solution there actually. For the press/vdiff since I made those scripts for code control, I just use them but your solution is even more straightforward in fact.

    Unrelated but since I finally remembered *before* sending the comment: the vtools still have that Makefile that just packs the gprbuild commands for the various tools - to my mind it should either be updated so at least one can build each and all of the new tools as well or possibly just ditched entirely if it's not used/useful.

    @Mircea Popescu Basically, I agree. So it's up to bvt now, whatever he prefers to have in there!

  12. [...] started as a comment1 on bvt's very useful work building up the vtools suite. As I really liked his approach2 and I would really prefer to have just one suite (vtools) with [...]

Leave a Reply