In the part one, we had a look at the v.sh script, which should give the reader an understanding of how this v-presser works. In the part two, I will add the utilities required by the v.sh:
- vflow
- vtoposort
- vsimplify
- vfilter (in awk).
I should mention that while writing this article, I really started to hate the duplication of input-output code, so I think I have figured out a way of restructuring the whole thing as a single application. But I did not want to delay the release any further, so I present what I currently have.
vflow
The goal is to determine descendant-antecedent relationships between vpatches based on the information about hunks. From data processing point of view, the goal is to transform the input:
vpatch-id in-file-id out-file-id in-hash-id out-hash-id
into output:
v vpatches-count d descendant-id antecedent-id ... c vpatch1-id vpatch2-id ...
by matching the hunk inputs (path, hash) in vpatch A to the hunk output with the same values in vpatch B: in this case A is the descendant of B.
The source:
with Ada.Text_IO; use Ada.Text_IO; with Ada.Integer_Text_IO; use Ada.Integer_Text_IO; with Ada.Command_Line; use Ada.Command_Line; with Ada.Containers.Generic_Sort; procedure Vflow is type Hunk_State is (Unsatisfied, Satisfied); Empty_Hash : constant := 0; type Hunk is record Vpatch: Natural; In_File: Natural; Out_File: Natural; In_Hash: Natural; Out_Hash: Natural; State: Hunk_State; end record;
Hunks are described by a tuple of (vpatch, input file path, output file path, input file hash, output file hash). State is added for internal bookkeeping.
Num_Hunks : Integer := Integer'Value(Argument(1)); Num_Vpatches : Integer := Integer'Value(Argument(2));
The number of hunks and vpatches are provided as command-line arguments. Rest of the types are elaborated based on this input.
subtype Hunk_Index is Natural range 1..Num_Hunks; type Hunk_Array is array (Hunk_Index range <>) of Hunk; All_Hunks : Hunk_Array(Hunk_Index'Range);
All_Hunks stores all the input to the application.
subtype Vpatch_Index is Natural range 0..Num_Vpatches - 1; type Vpatch_Hunks is array (Vpatch_Index'Range) of Hunk_Index; VH_Low : Vpatch_Hunks := (others => Hunk_Index'First); VH_High: Vpatch_Hunks := (others => Hunk_Index'First);
Input should be provided sorted by vpatch identifier. VH_Low(I) stores the first index into All_Hunks where hunks belonging to vpatch I are stored, VH_High(I) - the last.
One problem is that all these data structures are stored on stack, which has been fine so far even with the whole Linux tree, but may require either "ulimit -s unlimited" in v.sh, or dynamic allocation here, if in the future the vtree becomes too large.
type Vpatch_Relationship is array (Vpatch_Index'Range, Vpatch_Index'Range) of Boolean; Vpatch_Dependencies: Vpatch_Relationship := (others => (others => False)); Vpatch_Conflicts : Vpatch_Relationship := (others => (others => False));
Vpatch_Dependencies(I,J) is set to true if vpatch I is a descendant on vpatch J, the same goes for Vpatch_Conflicts if vpatches I and J are mutually incompatible descendants of the same vpatch.
function Get_Hunk return Hunk is H: Hunk; begin Get(H.Vpatch); Get(H.In_File); Get(H.Out_File); Get(H.In_Hash); Get(H.Out_Hash); H.State := Unsatisfied; return H; end Get_Hunk;
With simple numeric input, we don't have to rely on any hand-rolled parsers, and can just use Ada's built-in tools.
procedure Read_Hunks is I: Natural := All_Hunks'First; begin Read_Loop: loop exit Read_Loop when End_Of_File; All_Hunks(I) := Get_Hunk; I := I + 1; end loop Read_Loop; end Read_Hunks;
Here, we just read all hunks one-by-one.
procedure Init_Vpatch_Hunk_Bounds is begin for I in All_Hunks'Range loop declare V : Integer := All_Hunks(I).Vpatch; begin if VH_Low(V) > I or All_Hunks(VH_Low(V)).Vpatch /= V then VH_Low(V) := I; end if; VH_High(V) := I; end; end loop; end Init_Vpatch_Hunk_Bounds;
We do not keep track of where each vpatch starts and ends in All_Hunks when reading, so this information is recovered by Init_Vpatch_Hunk_Bounds. One linear scan is enough.
procedure Populate_Conflicts is function Conflicts(I: Vpatch_Index; J: Vpatch_Index) return Boolean is begin for A in VH_Low(I)..VH_High(I) loop for B in VH_Low(J)..VH_High(J) loop if (All_Hunks(A).In_File = All_Hunks(B).In_File) and (All_Hunks(A).In_Hash = All_Hunks(B).In_Hash) then return True; end if; end loop; end loop; return False; end Conflicts; begin for I in Vpatch_Index'Range loop for J in Vpatch_Index'Range loop if I < J and then Conflicts(I,J) then Vpatch_Conflicts(I,J) := True; Vpatch_Conflicts(J,I) := True; end if; end loop; end loop; end Populate_Conflicts;
The definition of conflict between two vpatches is rather naive: two vpatches are conflicting when they both contain hunks have the same (path,hash) as inputs. We check this property among all pairs of vpatches.
Next comes the main dependency resolution function, Solve.
procedure Solve is Input_Hunks : Hunk_Array := All_Hunks; Output_Hunks: Hunk_Array renames All_Hunks; Finished : Boolean := False;
The way vflow detects if vpatch I is a descendant of vpatch J is by checking, if any (output path, output hash) created by vpatch J is used as (input path, input hash) in vpatch I. To reduce the algorithmic complexity, we sort All_Hunks by join keys (because that's what this procedure is essentially doing) - input or output path ids. We have to sort using different keys because we cannot assume that the input path is always the same as the output path. So hunks sorted by input path are stored in Input_Hunks, for the hunks sorted by output path (Output_Hunks) we reuse the storage of All_Hunks.
-- Start of sorting boilerplate -- function Input_Before(A: Positive; B: Positive) return Boolean is begin return Input_Hunks(A).In_File < Input_Hunks(B).In_File; end Input_Before; function Output_Before(A: Positive; B: Positive) return Boolean is begin return Output_Hunks(A).Out_File < Output_Hunks(B).Out_File; end Output_Before; procedure Input_Swap(A: Positive; B: Positive) is Tmp : Hunk; begin Tmp := Input_Hunks(B); Input_Hunks(B) := Input_Hunks(A); Input_Hunks(A) := Tmp; end Input_Swap; procedure Output_Swap(A: Positive; B: Positive) is Tmp : Hunk; begin Tmp := Output_Hunks(B); Output_Hunks(B) := Output_Hunks(A); Output_Hunks(A) := Tmp; end Output_Swap; procedure Input_Sort is new Ada.Containers.Generic_Sort (Positive, Input_Before, Input_Swap); procedure Output_Sort is new Ada.Containers.Generic_Sort (Positive, Output_Before, Output_Swap); -- -- End of sorting boilerplate
This boilerplate is required to use Ada's built-in generic sorting facilities.
function Check_Genesis(Input: in out Hunk_Array) return Boolean is Is_Genesis : Boolean := True; Vpatch: Vpatch_Index := Input(Input'First).Vpatch; begin for I in Input'Range loop Is_Genesis := Is_Genesis and (Input(I).In_Hash = Empty_Hash); end loop; return Is_Genesis; end Check_Genesis;
A vpatch is genesis if all its hunks are file creations.
function Try_Connect(Input: in out Hunk_Array; Output: in out Hunk_Array) return Boolean is I : Positive := Input'First; J : Positive := Output'First; All_Marked: Boolean := True; begin while I <= Input'Last and J <= Output'Last loop if Input(I).State /= Unsatisfied or Input(I).In_File < Output(J).Out_File then I := I + 1; elsif Input(I).In_File > Output(J).Out_File then J := J + 1; else if Input(I).In_Hash = Output(J).Out_Hash and Input(I).In_Hash /= Empty_Hash then Input(I).State := Satisfied; Vpatch_Dependencies(Input(Input'First).Vpatch, Output(Output'First).Vpatch) := True; end if; I := I + 1; end if; end loop; for I in Input'Range loop All_Marked := All_Marked and Input(I).State = Satisfied; end loop; return All_Marked; end Try_Connect;
Try matching the hunks belonging of candidate descendant (in array Input) with hunks of candidate antecedent (in array Output). The arrays are sorted, so a variant of a sorted join can be used. If there is a match, we assume that there is a dependency among the vpatches. The return value of the function is True is all hunks in the vpatch are satisfied (that is, have an antecedent); to determine the return value, we iterate over Input once again.
begin for I in Vpatch_Index'Range loop Output_Sort(VH_Low(I), VH_High(I)); Input_Sort(VH_Low(I), VH_High(I)); end loop; for I in Vpatch_Index'Range loop declare Hunks_In : Hunk_Array renames Input_Hunks(VH_Low(I)..VH_High(I)); begin Finished := False; for J in Vpatch_Index'Range loop if J /= I then declare Hunks_Out : Hunk_Array renames Output_Hunks(VH_Low(J)..VH_High(J)); begin Finished := Try_Connect(Hunks_In, Hunks_Out); exit when Finished; end; else Finished := Check_Genesis(Hunks_In); exit when Finished; end if; end loop; -- Looped through all vpatches, there are hunks without matched inputs: if not Finished then -- Check if the remaining hunks are file creations: declare OK : Boolean := True; begin for I in Hunks_In'Range loop if Hunks_In(I).In_Hash = Empty_Hash and Hunks_In(I).State = Unsatisfied then Hunks_In(I).State := Satisfied; end if; OK := OK and (Hunks_In(I).State = Satisfied); end loop; if not OK then raise Constraint_Error with "Unsatisfied hunks."; end if; end; end if; end; end loop; end Solve;
The body of the function, in a loop:
- First, sort hunks by input path and output path, in blocks belonging to different vpatches.
- Then, we try to find antecedents for each of the vpatches, also checking if a vpatch is a genesis.
- If after the looping through each potential antecedent there still are hunks without a matching input, we exit with an error.
procedure Produce_Output is begin Put_Line("v " & Integer'Image(Num_Vpatches)); for I in Vpatch_Dependencies'Range(1) loop for J in Vpatch_Dependencies'Range(2) loop if Vpatch_Dependencies(I,J) then Put_Line("d " & Integer'Image(I) & " " & Integer'Image(J)); end if; end loop; end loop; for I in Vpatch_Conflicts'Range(1) loop for J in Vpatch_Conflicts'Range(2) loop if Vpatch_Conflicts(I,J) then Put_Line("c " & Integer'Image(I) & " " & Integer'Image(J)); end if; end loop; end loop; end Produce_Output;
Produce_Output just prints out the number of vpatches, the antecedent-descendant relationships between vpatches stored in Vpatch_Dependencies, and the conflicts stored in Vpatch_Conflicts.
begin Read_Hunks; Init_Vpatch_Hunk_Bounds; Populate_Conflicts; Solve; Produce_Output; end Vflow;
vtoposort
The goal is to calculate the press order of vpatches based on the dependency and the conflict information produced by vflow concatenated with the set of vpatches selected by the operator for pressing.
with Ada.Text_IO; use Ada.Text_IO; with Ada.Integer_Text_IO; use Ada.Integer_Text_IO; procedure Vtoposort is Command : Character; Num_Vpatches : Integer;
Command is used for reading all input, Num_Vpatches is self-describing.
procedure Toposort(Num_Vpatches : Integer) is subtype Vpatch_Index is Natural range 0..Num_Vpatches - 1; type Vpatch_Set is array (Vpatch_Index'Range) of Boolean; type Vpatch_List is array (Vpatch_Index'Range) of Vpatch_Index; type Vpatch_Relationship is array (Vpatch_Index'Range, Vpatch_Index'Range) of Boolean;
Some new types are Vpatch_Set, which we simulate by an array of booleans, where Set(I) = True means that element I is present in the set, and Vpatch_List, which is used for storing the vpatch indices (the press order). Vpatch_Relationship we have seen before.
Vpatch_Dependencies: Vpatch_Relationship := (others => (others => False)); Vpatch_Conflicts : Vpatch_Relationship := (others => (others => False)); Free_Vpatches : Vpatch_Set := (others => False);
Vpatch_Dependencies and Vpatch_Conflicts are the same as in vflow, and their duplication suggest that perhaps uniting all utilities in one tool would be useful. Free_Vpatches initially contains the list of vpatches selected for pressing by user, but if Toposort detects that it is one of the dependencies of any of the vpatches, it will add this dependency to the Free_Vpatches set.
procedure Read_Commands is I: Vpatch_Index; J: Vpatch_Index; begin Read_Loop: loop exit Read_Loop when End_Of_File; Get(Command); if Command = 'd' then Get(I); Get(J); Vpatch_Dependencies(I,J) := True; elsif Command = 'c' then Get(I); Get(J); Vpatch_Conflicts(I,J) := True; elsif Command = 's' then Get(I); Free_Vpatches(I) := True; else raise Constraint_Error with "Unknown Command"; end if; end loop Read_Loop; end Read_Commands;
Straightforward loop for reading the input, not much to write about.
procedure Add_Dependencies is Finish : Boolean := False; begin while not Finish loop Finish := True; for I in Free_Vpatches'Range loop if Free_Vpatches(I) then for J in Vpatch_Dependencies'Range(2) loop if Vpatch_Dependencies(I,J) and not Free_Vpatches(J) then Finish := False; Free_Vpatches(J) := True; end if; end loop; end if; end loop; end loop; end Add_Dependencies;
This function loops through all vpatches in the Free_Vpatches set, and for each vpatch adds all the antecedents of the vpatch to Free_Vpatches set. Iteration stops when there are no more vpatches to add to the set.
procedure Report_Conflicts is begin for I in Free_Vpatches'Range loop if Free_Vpatches(I) then for J in Vpatch_Conflicts'Range(2) loop if Vpatch_Conflicts(I,J) and Free_Vpatches(J) then Put_Line("C " & Integer'Image(I)); Put_Line("C " & Integer'Image(J)); end if; end loop; end if; end loop; end Report_Conflicts;
Report_Conflicts checks if mutually exclusive vpatches are present among the selected.
procedure Sort is Added_Vpatches : Vpatch_Set := (others => False); Vpatch_Order : Vpatch_List := (others => 0); N_Selected : Natural := 0; Finished : Boolean := False; Has_Loops : Boolean := False; begin while not Finished loop Finished := True; for I in Free_Vpatches'Range loop if Free_Vpatches(I) then declare All_Satisfied : Boolean := True; begin -- Check if all dependencies are already in the press order for J in Vpatch_Dependencies'Range(2) loop if Vpatch_Dependencies(I,J) then All_Satisfied := All_Satisfied and Added_Vpatches(J); end if; end loop; -- All dependencies present, can add this vpatch: if All_Satisfied then Free_Vpatches(I) := False; Added_Vpatches(I) := True; Vpatch_Order(N_Selected) := I; N_Selected := N_Selected + 1; Finished := False; end if; end; end if; end loop; end loop; -- Check if there are still free vpatches (there is a loop): for I in Free_Vpatches'Range loop if Free_Vpatches(I) then Put_Line("L " & Integer'Image(I)); Has_Loops := True; end if; end loop; -- Print vpatches in press order: if not Has_Loops then for I in Vpatch_Order'First..N_Selected - 1 loop Put_Line("P " & Integer'Image(Vpatch_Order(I))); end loop; end if; end Sort;
Sort is the main, well, sorting function. The variables are:
- Vpatch_Order - the press order, which is supposed to be main output of this tool.
- N_Selected - the first free index in Vpatch_Order (point into which new vpatch is added).
- Added_Vpatches - a set of vpatches already in Vpatch_Order.
- Finished - the sorting loop converged on a solution.
- Has_Loops - flag for the case when a loop has been discovered.
First, loop through vpatches in the Free_Vpatches set, checking if all the antecedents of each vpatch are present. If yes, the vpatch is added to Vpatch_Order, updating all the related variable accordingly.
If after this loop finishes, there are still selected but not added vpatches, there is a loop which gets printed. Otherwise, everything is OK and the press order gets printed.
begin Read_Commands; Add_Dependencies; Report_Conflicts; Sort; end Toposort;
It should be clear what Toposort does, and why, at this point.
begin Get(Command); if Command /= 'v' then raise Constraint_Error with "Expected 'v' command."; end if; Get(Num_Vpatches); Toposort(Num_Vpatches); end Vtoposort;
The main function gets the first command, verifies that it is the "number of vpatches" command, and starts toposort program, which can now allocate all its data structures on stack.
vsimplify
with Ada.Text_IO; use Ada.Text_IO; with Ada.Integer_Text_IO; use Ada.Integer_Text_IO; procedure Vsimplify is Command : Character; Num_Vpatches : Integer; procedure Simplify(Num_Vpatches : Integer) is subtype Vpatch_Index is Natural range 0..Num_Vpatches - 1; type Vpatch_Set is array (Vpatch_Index'Range) of Boolean; type Vpatch_List is array (Vpatch_Index'Range) of Vpatch_Index; type Vpatch_Relationship is array (Vpatch_Index'Range, Vpatch_Index'Range) of Boolean; Vpatch_Dependencies: Vpatch_Relationship := (others => (others => False)); Vpatch_TC_Indirect : Vpatch_Relationship := (others => (others => False)); Vpatch_Order : Vpatch_List := (others => 0); Free_Vpatches : Vpatch_Set := (others => False); N_Selected : Natural := 0;
New item here is Vpatch_TC_Indirect, which holds the indirect transitive antecedent set of each vpatch.
procedure Read_Commands is I: Vpatch_Index; J: Vpatch_Index; begin Read_Loop: loop exit Read_Loop when End_Of_File; Get(Command); if Command = 'd' then Get(I); Get(J); Vpatch_Dependencies(I,J) := True; elsif Command = 'C' then -- Ignore conflicts in this tool, but consume the command Get(I); elsif Command = 'L' then raise Constraint_Error with "Cannot simplify loops"; elsif Command = 'P' then Get(I); Vpatch_Order(N_Selected) := I; N_Selected := N_Selected + 1; end if; end loop Read_Loop; if N_Selected - 1 /= Vpatch_Order'Last then raise Constraint_Error with "Not all vpatches present in the press order"; end if; end Read_Commands;
Up to this point, we have already seen everything, though Read_Commands is updated to be more in line with output of vtoposort.
procedure Simplify_Inner is begin -- Fill Vpatch_TC_Indirect for I in Vpatch_Order'Range loop declare N : Vpatch_Index := Vpatch_Order(I); begin for J in Vpatch_Dependencies'Range(2) loop if Vpatch_Dependencies(N,J) then for K in Vpatch_Dependencies'Range(2) loop Vpatch_TC_Indirect(N,K) := Vpatch_TC_Indirect(N,K) or Vpatch_Dependencies(J,K) or Vpatch_TC_Indirect(J,K); end loop; end if; end loop; end; end loop; -- Output Vpatch_TC_Indirect for I in Vpatch_TC_Indirect'Range(1) loop for J in Vpatch_TC_Indirect'Range(2) loop if Vpatch_TC_Indirect(I,J) then Put_Line("I " & Integer'Image(I) & " " & Integer'Image(J)); end if; end loop; end loop; -- Remove redundant connections from Vpatch_Dependencies for I in Vpatch_Dependencies'Range(1) loop for J in Vpatch_TC_Indirect'Range(2) loop Vpatch_Dependencies(I,J) := Vpatch_Dependencies(I,J) and not Vpatch_TC_Indirect(I,J); end loop; end loop; -- Output filtered out Vpatch_Dependencies for I in Vpatch_Dependencies'Range(1) loop for J in Vpatch_TC_Indirect'Range(2) loop if Vpatch_Dependencies(I,J) then Put_Line("d " & Integer'Image(I) & " " & Integer'Image(J)); end if; end loop; end loop; end Simplify_Inner;
Simplify_Inner is the core of vsimplify:
To fill Vpatch_TC_Indirect, for each vpatch N in the press order, loop through antecedents of N, summing their direct antecedent sets (from vflow), and indirect antecedent sets (the ones being calculated in this very loop) into the row N. Because iteration through the vpatches happens in the press order, the indirect antecedent sets are guaranteed to be completed at the time when they are accessed.
Then, output the indirect antecedent sets, and substract the indirect antecedent set from the antecedent sets that came as input from vflow. Finally, we output the cleaned up direct antecendent sets.
begin Read_Commands; Simplify_Inner; end Simplify;
begin Get(Command); if Command /= 'v' then raise Constraint_Error with "Expected 'v' command."; end if; Get(Num_Vpatches); Simplify(Num_Vpatches); end Vsimplify;
Main function of vsimplify follows the pattern of vtoposort.
vfilter (in awk this time)
This is added to v.sh:
vfilter() { awk -v N="$1" ' BEGIN {r=0;} $0 ~ /^diff -uNr/ {r=1;} r == 1 && $1 == "---" {sub("[^/]*/", "", $2); ip=$2; ih=$3} r == 1 && $1 == "+++" {sub("[^/]*/", "", $2); print N,ip,$2,ih,$3; r=0;}' }
Upon detecting the "diff -uNr" in the colunm 0 of a line, read input and output information of a hunk. The path has the initial directory stripped (because a/ and b/ prefixes inside vpatches do not matter for the purposes of v-presser).
curl 'http://bvt-trace.net/vpatches/vtools_add_vsh_utils.vpatch' > vtools_add_vsh_utils.vpatch
curl 'http://bvt-trace.net/vpatches/vtools_add_vsh_utils.vpatch.bvt.sig' > vtools_add_vsh_utils.vpatch.bvt.sig
Hmm, this risks to misfire for instance on older trees that have the manifest added at some point rather than baked in from the start. Arguably they are due for a regrind anyway but the above definition (and its implicit reliance on the manifest file mechanism) sits rather uncomfortably with me. To my mind a vpatch is genesis if it doesn't have any ancestors - would this be all that harder to use at this specific point?
@Diana Coman:
If you consider a case of two vpatches that only create files, a genesis by intention (which V can't know) and a vpatch that adds manifest, those two vpatches can either conflict (if they create files with the same name) or commute (if there is no intersection between the sets of created names). How would a ~stateless system of V know which one is the genesis, and which one is just a leaf with manifest? AFAIK in V as described by Ben Vulpes, these will be ordered in flow somehow, and you would be able to press genesis, genesis+manifest OR manifest, manifest+genesis, depending on the exact flow the tool will give you; in v.sh, pressing one of them *will not* pull in the other, however you can select multiple leafs for pressing, not just one, so you can press any combination of them.
If you consider a case of more vpatches, then genesis is not special vpatch in any way (so I guess Is_Genesis is rather misleading name) - it just happens that all other vpatches have this one vpatch - genesis - as their antecedent, while not all vpatches have the vpatch with manifest in their antecedent set.
I kept thinking this over and despite my initial uneasiness, I think you are actually right in that a vpatch with *all* hunks creating files is indeed a genesis since it does not/can not properly speaking have an antecedent as it doesn't require/rely on anything being present.
> Upon detecting the "diff -uNr" in the colunm 0 of a line, read input and output information of a hunk.
This wasn't previously a required part of the unified diff format, and at least busybox diff doesn't include it (see also). It also means one might need to be more careful in any commentary at the top of a patch file.
Still, I'm thinking this is the saner way to go since it simplifies any tool that needs to parse patches while complicating only theoretical vdiffs that don't conform; nicely anti-Postel if we can afford it.
Yes, it's nice, however it's phf who came up with this: vpatch utility won't detect a header unless it is preceded by a line that matches "^diff -uNr", and this is what I carried over to the rest of the v.sh components -- in the end, vpatch is the ultimate downstream consumer. Also, vdiff produces this header unconditionally.
IMO if you want to get rid of Ada dependency, it's better to move vdiff to keksum code instead of relying on whatever diff busybox has -- also because if you diff two trees, it's not guaranteed that phf's vdiff and busybox+wrapper will produce identical hunks inside the vpatch.
Oh I see. The exact pattern looks to be "^diff ".
I can't say I like the idea of needing a 2k+ line GNUistic vdiff compared to <100 lines of awk on top of a standard utility, or this bitcoin-like situation where "the code is the spec and realistically there can be only one implementation", or the idea that all vdiffs must produce the same output anyway as that closes off improvement of the algorithm. Do you have a ref handy for why identical output is important? I recall complaints about locale-dependent file sort order from the logs but not beyond that. Anyway, thank you for the caution and I will probably do as you suggest, as this doesn't seem like a battle I can or need to win.
I don't have a reference specifically, but the clue for me was reading somewhere that phf took care that vdiff would reproduce all TRB vpatches accurately from the pressed trees. If you think about it, if vdiffs are allowed to produce differing outputs, shipping vpatch signatures without a vpatch à-la trinque becomes impossible. To still have this possibility without cementing the diff algorithm, signature would have to cover only the vpatch headers, with filenames and input and output hashes, leaving the hunk bodies out.
And, well, the "standard utility" is also implemented in some ~1000 LoC even with busybox. This is not to say that vdiff could not use some improvement: it does have some crap that should go.
> shipping vpatch signatures without a vpatch à-la trinque becomes impossible
Although that was for a genesis, where the diff hunks are fully determined anyway.
I'd say it makes a good example of the importance of choosing optionality wisely, because putting it in one place closes it off in another.
> it does have some crap that should go.
Ah yes, the "quoting is good mmkay, so let's do it, sometimes, if we feel like it".
Any thoughts on what if anything it should do on encountering file names with embedded whitespace (which on unix can even include newlines)?
In the context of vpatches I don't see a good solution, but for the sake of honestly I considered aborting vdiff process or at least reporting that the vpatch is borked to stderr, as most likely no V presser/patcher would be able to do anything with it.
[...] his awk + posix shell + Ada Vtron in a 3 part series (1, 2, [...]