FG-fed Linux RNG

October 21st, 2019

In this post, I will explain the new code for Linux random.c (RNG interface for kernel and userspace), as well as provide a vpatch for it. The code is an implementation of MP's spec:

mircea_popescu: let me formalize this, 1sec.
mircea_popescu: ring buffers : Inner (small, 16kb to cpu-cache-size) ; Outer (large, 1 MB to swap partition size). each buffer has a writing head moving around it and a reading head moving around it, their position is W and R at any time.
mircea_popescu: hash functions : Fast (takes one byte, produces many bytes), Good (takes one byte, produces one byte).
mircea_popescu: the operation then consists of : 1. FG -> I.W 2. if I.W = I.R, I.R -> O.W, such that if O.R >= O.W/2, next O.W goes through HF filling many offset bytes ; if O.R <= O.W/2, next OW goes through HG, filling one offset byte.
mircea_popescu: this way - O is "always full" from the pov of userland ; I is protected from userland reading O.

bvt:
one q though, per my reading of the formalization, stretching happens only on I overflow? i.e. if there is a consumer reading from I, preventing it from overfilling, bytes would never fall into O, and stretching is not triggered?
mircea_popescu: bvt indeed, if there's no I to O overflow, then O just acts as a cprng, keeps hashing itself
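
Before diving into the kernel source, here is a minimal sketch of the overall flow as implemented below (illustrative C pseudocode only, not part of the vpatch; all helper names are made up):

/* Illustrative sketch only. FG bytes feed the inner ring; whatever does
 * not fit there is hashed into the outer ring, which userland reads. */
void feed(const unsigned char *fg_bytes, size_t len)
{
	size_t taken = push_inner(fg_bytes, len);          /* 1. FG -> I.W */

	if (taken < len) {                                 /* 2. inner ring full */
		if (outer_fill_level() >= outer_size() / 2)
			good_hash_into_outer(fg_bytes + taken, len - taken);
		else
			fast_hash_into_outer(fg_bytes + taken, len - taken);
	}
	/* With no I -> O overflow, O keeps hashing itself (acts as a CPRNG)
	 * whenever userland drains it. */
}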

Some of the items that may require special attention and perhaps revision are:

  • Implementation of the good and fast hash functions. Currently these are the same SHA1 and ChaCha20 (in RNG mode) that the original code used. I also had a version in testing with Serpent for both (again in hash and RNG modes), which I can restore.
  • The selection routine for good and fast hashes.
  • The sleep and wakeup conditions for writers to /dev/random: in this vpatch the writer fills the rings until both become full and then sleeps; it is woken up only when either of the rings becomes less than half full.

Let's have a look at the new source.

#define INNER_RING_SIZE (16*1024)
#define OUTER_RING_SIZE (2*1024*1024)

static DEFINE_KFIFO(inner_ring, unsigned char, INNER_RING_SIZE);
static DEFINE_KFIFO(outer_ring, unsigned char, OUTER_RING_SIZE);

static DECLARE_WAIT_QUEUE_HEAD(inner_read_wait);
static DECLARE_WAIT_QUEUE_HEAD(inner_write_wait);
static struct fasync_struct *fasync;
static DEFINE_SPINLOCK(inner_lock);
static DEFINE_SPINLOCK(outer_lock);

This chunk of code declares:

  • Sizes of the inner and outer rings - 16 kbytes and 2 Mbytes respectively.
  • The actual inner and outer rings. For these, I used a data structure already present in the kernel - kfifo, which is a ring buffer implementation.
  • Wait lists for readers and writers to /dev/random. /dev/urandom is fully non-blocking.
  • Signal-driven IO helper structure.
  • Locks for inner and outer rings.

union sha1state {
	u8 c[20];
	__u32 w[5];
};

static int
hash_good(const unsigned char __user *data, size_t data_len)
{
	union sha1state hash;
	__u32 workspace[SHA_WORKSPACE_WORDS];
	u8 s_block[SHA_MESSAGE_BYTES];
	size_t nbytes;
	size_t hashed = 0;

	sha_init(hash.w);
	while (hashed < data_len) {
		nbytes = data_len - hashed;
		nbytes = min_t(size_t, sizeof(s_block), nbytes);
		memset(s_block, 0, sizeof(s_block));
		if (copy_from_user(s_block, data + hashed, nbytes)) {
			return -EFAULT;
		}
		sha_transform(hash.w, s_block, workspace);
		kfifo_in(&outer_ring,  &hash.c[0], sizeof(hash));
		hashed += nbytes;
	}
	return 0;
}

The good hash function is specified to take an individual byte and hash it. Given that most of the time the data would be passed in large chunks, a block-based hash function looked like a good fit. I have used the same SHA1 that was used here before. I considered using Serpent or ChaCha20 in hash mode as an alternative (encrypting a data block with some pre-defined key): this would have the benefit of being a 32->32 byte hash rather than a 64->20 byte one. However, the SHA1 code has worked for me so far1.

static int
hash_fast(const unsigned char __user *data, size_t data_len, size_t output_len)
{
	u32 ctx[16] = {0};
	u8 s_block[CHACHA20_BLOCK_SIZE] = {0};
	size_t nbytes;
	size_t hashed = 0;

	if (copy_from_user(&ctx[0],
			   data,
			   min_t(size_t, data_len, sizeof(ctx)))) {
		return -EFAULT;
	}

	while (hashed < output_len) {
		nbytes = output_len - hashed;
		nbytes = min_t(size_t, CHACHA20_BLOCK_SIZE, nbytes);
		chacha20_block(&ctx[0], s_block);
		nbytes = kfifo_in(&outer_ring, s_block, nbytes);
		hashed += nbytes;
	}
	return 0;
}

For the fast hash function, I have used the same ChaCha20 as in the old source, using it in RNG mode2. Here, the hash function code repeatedly encrypts an initially zeroed block, using the user input as the ChaCha20 state.

static int
transfer_to_outer(const unsigned char __user *data, ssize_t len)
{
	int outer_ring_length;
	int r = 0;
	unsigned long flags;

	spin_lock_irqsave(&outer_lock, flags);
	outer_ring_length = kfifo_len(&outer_ring);
	if (outer_ring_length >= OUTER_RING_SIZE/2) {
		r = hash_good(data, len);
	} else {
		r = hash_fast(data, len, OUTER_RING_SIZE/2 - outer_ring_length);
	}
	spin_unlock_irqrestore(&outer_lock, flags);
	return r;
}

This function is called to transfer to the outer ring whatever data is left over after feeding the inner ring. The first half of the outer ring is filled using the fast hash, the second half using the good hash. This is a bit different from "if O.R >= O.W/2, next O.W goes through HF filling many offset bytes ; if O.R <= O.W/2, next OW goes through HG, filling one offset byte". The problem with a literal implementation is that:

  • If I consider *.R and *.W to be indices into the underlying array of the ring buffer, this behavior is implementable but seems totally devoid of physical sense.
  • If I consider *.R and *.W to be monotonically increasing indices that are mapped to in-array indices by a masking/modular operation, then after *.R > Size(*), *.R would always be > O.W/2 (given that for a ring buffer max(W-R) = Size(*)).

Thus I implemented the version that had some physical sense to me: (*.(W-R) >= Size(*)/2) -> HG; (*.(W-R) < Size(*)/2) -> HF. If I'm in the error here, I will fix this ASAP.
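
To make the threshold concrete with the constants used in this vpatch: OUTER_RING_SIZE is 2 Mb, so the dividing line is 1 Mb. If the outer ring currently holds, say, 1.5 Mb, the leftover writer data is run through the good hash and its output appended; if the ring holds only 256 Kb, the fast hash is used instead and keeps producing blocks until the ring is topped back up to the 1 Mb mark.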

static int should_wake_up_writer(void) {
	return (kfifo_len(&inner_ring) <= INNER_RING_SIZE/2 ||
		kfifo_len(&outer_ring) <= OUTER_RING_SIZE/2);
}

static void wake_up_writer(void)
{
	if (!should_wake_up_writer()) {
		return;
	}
	wake_up_interruptible(&inner_write_wait);
	kill_fasync(&fasync, SIGIO, POLL_OUT);
}

These are the condition for waking the feeder application and the actual waking function. The current condition for the writer to wake up is that either of the rings is at least half empty. The waking function does its job only when this condition holds. This may need revision, but I'd like to get more experimental data on its behavior first.

static ssize_t inner_read(char __user *buf, size_t nbytes)
{
	int to_read;
	ssize_t r = 0;
	unsigned int read = 0;
	unsigned long flags;

	spin_lock_irqsave(&inner_lock, flags);
	to_read = kfifo_len(&inner_ring);
	to_read = min_t(int, to_read, nbytes);
	if (to_read > 0) {
		r = kfifo_to_user(&inner_ring, buf, to_read, &read);
		wake_up_writer();
	}
	spin_unlock_irqrestore(&inner_lock, flags);

	return ( (r == 0) ? read :  r);
}

inner_read transfers some amount of data from the inner ring into the user-provided buffer, waking up the writer if necessary.

static ssize_t
_random_read(int nonblock, char __user *buf, size_t nbytes)
{
	ssize_t nread = 0;
	ssize_t r;
	while (1) {
		r = inner_read(buf + nread, nbytes - nread);
		if (r < 0) {
			return r;
		}

		nread += r;
		if (nread >= nbytes) {
			break;
		}
		if (nonblock) {
			return -EAGAIN;
		}
		wait_event_interruptible(inner_read_wait,
					 kfifo_len(&inner_ring) > 0);
		if (signal_pending(current)) {
			return -ERESTARTSYS;
		}
	}
	return nread;
}

_random_read repeatedly calls inner_read, blocking if the inner ring becomes empty.

static ssize_t
random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
{
	return _random_read(file->f_flags & O_NONBLOCK, buf, nbytes);
}

This function is just an adapter from _random_read to the interface mandated by struct file_operations. It is necessary to separate random_read and _random_read because there is an alternative interface to the same functionality - the getrandom system call.

static ssize_t
_random_write(const char __user *buffer, size_t count)
{
	unsigned int written;
	ssize_t r;
	unsigned long flags;

	if (kfifo_len(&inner_ring) == INNER_RING_SIZE &&
	    kfifo_len(&outer_ring) == OUTER_RING_SIZE) {
		wait_event_interruptible(inner_write_wait,
					 should_wake_up_writer());
		if (signal_pending(current)) {
			return -EINTR;
		}
	}

	spin_lock_irqsave(&inner_lock, flags);
	r = kfifo_from_user(&inner_ring, buffer, count, &written);
	spin_unlock_irqrestore(&inner_lock, flags);

	if (r != 0) {
		return r;
	}

	if (written > 0) {
		wake_up_interruptible(&inner_read_wait);
		kill_fasync(&fasync, SIGIO, POLL_IN);
	}

	/* Avoid situations where CPRNG generates outer ring content
	 * without complete seed */
	if (count - written < SHA_MESSAGE_BYTES) {
		return count;
	}
	r = transfer_to_outer(buffer + written, count - written);
	if (r != 0) {
		return r;
	}
	return count;
}

Writing data to /dev/[u]random: first, the writer checks whether it must sleep, which is the case when both rings are full. Once awake, it pushes as much of the data into the inner ring as fits there. The remaining bytes, if there are enough of them to seed the good and fast hashes, are transferred to the outer ring.

static ssize_t
random_write(struct file *file, const char __user *buffer,
	     size_t count, loff_t *ppos)
{
	return _random_write(buffer, count);
}

Just an adapter/wrapper. random_write and _random_write are separate because _random_write is also used by the ioctl call for feeding data into the RNG.

static void
_outer_self_hash(size_t to_hash)
{
	union sha1state hash;
	__u32 workspace[SHA_WORKSPACE_WORDS];
	u8 s_block[SHA_MESSAGE_BYTES];
	size_t nbytes, peeked;
	size_t hashed = 0;

	sha_init(hash.w);
	while (hashed < to_hash) {
		memset(s_block, 0, sizeof(s_block));
		nbytes = min_t(size_t, sizeof(s_block), (to_hash - hashed));
		peeked = kfifo_out_peek_forward(&outer_ring, s_block, nbytes);
		sha_transform(hash.w, s_block, workspace);
		nbytes = kfifo_in(&outer_ring, &hash.c[0], sizeof(hash));
		hashed += nbytes;
	}
}

This is the function that implements the self-hashing using HG. It repeatedly 'peeks' data from the ring buffer, hashes it, and pushes the hash output back into the outer ring. To implement this function, I have extended the kfifo functionality with an ability to read data past the write index (in lib/kfifo.c and include/linux/kfifo.h):

unsigned int __kfifo_out_peek_forward(struct __kfifo *fifo,
               void *buf, unsigned int len)
{
       unsigned int size = fifo->mask + 1;
       if (len > size) {
               len = size;
       }

       kfifo_copy_out(fifo, buf, len, fifo->out);
       return len;
}
EXPORT_SYMBOL(__kfifo_out_peek_forward);

/* In include/linux/kfifo.h */

#define kfifo_out_peek_forward(fifo, buf, n) \
__kfifo_uint_must_check_helper( \
({ \
        typeof( (fifo) + 1) __tmp = (fifo); \
        typeof(__tmp->ptr) __buf = (buf); \
        unsigned long __n = (n); \
        const size_t __recsize = sizeof(*__tmp->rectype); \
        struct __kfifo *__kfifo = &__tmp->kfifo; \
        (__recsize) ? \
        0 : \
        __kfifo_out_peek_forward(__kfifo, __buf, __n); \
}) \
)

extern unsigned int __kfifo_out_peek_forward(struct __kfifo *fifo,
        void *buf, unsigned int len);

Yes, kfifo_out_peek_forward really is that ugly a macro; the implementation is fully consistent in style with the rest of the header.
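
To illustrate how the new macro behaves, here is a hypothetical, self-contained usage example (not part of the vpatch; it assumes the patched include/linux/kfifo.h is in scope):

static DEFINE_KFIFO(demo_ring, unsigned char, 64);

static void demo_peek_forward(void)
{
	unsigned char tmp[16];
	unsigned int copied;

	/* A normal kfifo peek stops at the write index; this variant keeps
	 * copying from the underlying array past it (wrapping around), which
	 * is what the outer-ring self-hash needs when the ring is empty. */
	copied = kfifo_out_peek_forward(&demo_ring, tmp, sizeof(tmp));

	/* copied is capped at the ring size; the read index is left untouched. */
	(void)copied;
}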

static void
outer_self_hash(size_t to_hash)
{
	if (kfifo_len(&outer_ring) == 0) {
		_outer_self_hash(min_t(size_t,
				       to_hash,
				       kfifo_avail(&outer_ring)));
	}
}

The self-hashing functionality is triggered only when the outer ring buffer is empty.

static int
urandom_read_available(char __user *buf, size_t nbytes, unsigned int *written)
{
	unsigned long flags;
	int outer_ring_length;
	ssize_t cnt = 0;
	int r;

	spin_lock_irqsave(&outer_lock, flags);
	outer_ring_length = kfifo_len(&outer_ring);
	cnt = min_t(int, outer_ring_length, nbytes);
	outer_self_hash(nbytes);
	r = kfifo_to_user(&outer_ring, buf, cnt, written);
	spin_unlock_irqrestore(&outer_lock, flags);
	wake_up_writer();

	return r;
}

Reads the data currently available in the outer ring into the user-provided buffer. Self-hashes the ring buffer if necessary.

static ssize_t
urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
{
	int nread = 0;
	ssize_t r = 0;
	unsigned int written;

	while (nread < nbytes) {
		/* This preserves the semantics of urandom_read
		 * for legacy applications */
		if (nbytes > 256 && need_resched()) {
			if (signal_pending(current)) {
				if (r == 0) {
					r = -ERESTARTSYS;
				}
				break;
			}
			schedule();
		}

		r = urandom_read_available(buf + nread,
					   nbytes - nread,
					   &written);

		if (r != 0) {
			break;
		} else {
			nread += written;
		}
	}

	return ( (r == 0) ? nread : r);
}

This is the entry point for read operations on /dev/urandom. It just calls urandom_read_available in a loop until the buffer is filled, preserving the original behavior of keeping small reads atomic.

void get_random_bytes(void *buf, int nbytes)
{
	int outer_ring_length;
	unsigned long flags;
	int nread = 0;
	ssize_t to_read = 0;
	unsigned int l;

	while (nread < nbytes) {
		spin_lock_irqsave(&outer_lock, flags);
		outer_ring_length = kfifo_len(&outer_ring);
		to_read = min_t(int, outer_ring_length, (nbytes - nread));
		outer_self_hash(nbytes - nread);
		l = kfifo_out(&outer_ring, buf + nread, to_read);
		spin_unlock_irqrestore(&outer_lock, flags);
		nread += l;
		wake_up_writer();
	}
}
EXPORT_SYMBOL(get_random_bytes);

This function is the equivalent of urandom_read for in-kernel consumers. urandom_read cannot be reused as-is, because different kfifo APIs are used to copy data to userspace and to kernelspace memory. The old code avoids this duplication because it has additional buffering of data inside the kernel, and thus can make the distinction between copy destinations at a much higher level of the call stack (closer to the syscall/(get_random_bytes|get_random_int|get_random_long) invocation spot).
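
For reference, in-kernel users keep the familiar interface; a fragment along these lines (illustrative only, not from the vpatch) now draws directly from the outer ring:

#include <linux/random.h>

static void example_generate_cookie(void)
{
	u8 cookie[16];

	/* With this vpatch, this pulls bytes from the outer ring,
	 * self-hashing it first if it happens to be empty. */
	get_random_bytes(cookie, sizeof(cookie));
}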

static unsigned int
random_poll(struct file *file, poll_table * wait)
{
	unsigned int mask;

	poll_wait(file, &inner_read_wait, wait);
	poll_wait(file, &inner_write_wait, wait);
	mask = 0;
	if (kfifo_len(&inner_ring) > 0)
		mask |= POLLIN | POLLRDNORM;
	if (should_wake_up_writer()) {
		mask |= POLLOUT | POLLWRNORM;
	}
	return mask;
}

An implementation of poll(2) for /dev/random. It sets the flags according to the wakeup conditions for readers and writers.

static int random_fasync(int fd, struct file *filp, int on)
{
	return fasync_helper(fd, filp, on, &fasync);
}

This is a simple helper for signal-driven IO.

static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
{
	int size, ent_count;
	int __user *p = (int __user *)arg;
	int retval;

	switch (cmd) {
	case RNDGETENTCNT:
		/* inherently racy, no point locking */
		ent_count = kfifo_len(&inner_ring);
		if (put_user(ent_count, p))
			return -EFAULT;
		return 0;
	case RNDADDTOENTCNT:
		return 0;
	case RNDADDENTROPY:
		if (!capable(CAP_SYS_ADMIN))
			return -EPERM;
		if (get_user(ent_count, p++))
			return -EFAULT;
		if (ent_count < 0)
			return -EINVAL;
		if (get_user(size, p++))
			return -EFAULT;
		retval = _random_write( (const char __user *)p, size);
		if (retval < 0)
			return retval;
		return 0;
	case RNDZAPENTCNT:
	case RNDCLEARPOOL:
		/*
		 * Clear the entropy pool counters. We no longer clear
		 * the entropy pool, as that's silly.
		 */
		if (!capable(CAP_SYS_ADMIN))
			return -EPERM;
		kfifo_reset(&inner_ring);
		kfifo_reset(&outer_ring);
		return 0;
	default:
		return -EINVAL;
	}
}

An implementation of the same ioctl calls as before, shared by /dev/random and /dev/urandom.

const struct file_operations random_fops = {
	.read  = random_read,
	.write = random_write,
	.poll  = random_poll,
	.unlocked_ioctl = random_ioctl,
	.fasync = random_fasync,
	.llseek = noop_llseek,
};

const struct file_operations urandom_fops = {
	.read  = urandom_read,
	.write = random_write,
	.unlocked_ioctl = random_ioctl,
	.fasync = random_fasync,
	.llseek = noop_llseek,
};

The file operations structures are the same as before.

SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
		unsigned int, flags)
{
	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
		return -EINVAL;

	if (count > INT_MAX)
		count = INT_MAX;

	if (flags & GRND_RANDOM) {
		if (!capable(CAP_SYS_ADMIN))
			return -EPERM;
		return _random_read(flags & GRND_NONBLOCK, buf, count);
	} else {
		return urandom_read(NULL, buf, count, NULL);
	}
}

This is a system call for getting random bytes without opening any files. Here, I added a permission check for reading from the inner ring -- only root should be allowed to do that.
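
From userspace, the new restriction looks like this (an illustrative fragment, assuming glibc exposes the raw syscall number via <sys/syscall.h> and the GRND_* flags via <linux/random.h>):

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/random.h>

int main(void)
{
    unsigned char buf[32];
    long r;

    /* GRND_RANDOM reads the inner ring; with this vpatch it fails with
     * EPERM unless the caller has CAP_SYS_ADMIN. */
    r = syscall(SYS_getrandom, buf, sizeof(buf), GRND_RANDOM);
    if (r < 0)
        fprintf(stderr, "getrandom(GRND_RANDOM): %s\n", strerror(errno));

    /* Without GRND_RANDOM the request is served from the outer ring
     * and works for any user. */
    r = syscall(SYS_getrandom, buf, sizeof(buf), 0);
    return r < 0 ? 1 : 0;
}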

/* Callbacks for adding entropy from event timings are now unused: */
void add_device_randomness(const void *buf, unsigned int size) {}
EXPORT_SYMBOL(add_device_randomness);
void add_input_randomness(unsigned int type, unsigned int code,
			  unsigned int value) {}
EXPORT_SYMBOL_GPL(add_input_randomness);
void add_interrupt_randomness(int irq, int irq_flags) {}
EXPORT_SYMBOL_GPL(add_interrupt_randomness);

#ifdef CONFIG_BLOCK
void add_disk_randomness(struct gendisk *disk) {}
EXPORT_SYMBOL_GPL(add_disk_randomness);
#endif

int add_random_ready_callback(struct random_ready_callback *rdy)
{
	return -EALREADY;
}
EXPORT_SYMBOL(add_random_ready_callback);

void del_random_ready_callback(struct random_ready_callback *rdy) {}
EXPORT_SYMBOL(del_random_ready_callback);

The entropy input interfaces and CRNG-ready callbacks are now unused.

struct batched_entropy {
	union {
		unsigned long entropy_long[CHACHA20_BLOCK_SIZE / sizeof(unsigned long)];
		unsigned int entropy_int[CHACHA20_BLOCK_SIZE / sizeof(unsigned int)];
	};
	unsigned int position;
};

/*
 * Get a random word for internal kernel use only.
 */
static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_long);
unsigned long get_random_long(void)
{
	unsigned long ret;
	struct batched_entropy *batch;

	batch = &get_cpu_var(batched_entropy_long);
	if (batch->position % ARRAY_SIZE(batch->entropy_long) == 0) {
		get_random_bytes( (u8 *)batch->entropy_long,
			   sizeof(batch->entropy_long));
		batch->position = 0;
	}
	ret = batch->entropy_long[batch->position++];
	put_cpu_var(batched_entropy_long);
	return ret;
}
EXPORT_SYMBOL(get_random_long);

#if BITS_PER_LONG == 32
unsigned int get_random_int(void)
{
	return get_random_long();
}
#else
static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_int);
unsigned int get_random_int(void)
{
	unsigned int ret;
	struct batched_entropy *batch;

	batch = &get_cpu_var(batched_entropy_int);
	if (batch->position % ARRAY_SIZE(batch->entropy_int) == 0) {
		get_random_bytes( (u8 *)batch->entropy_int,
			   sizeof(batch->entropy_int));
		batch->position = 0;
	}
	ret = batch->entropy_int[batch->position++];
	put_cpu_var(batched_entropy_int);
	return ret;
}
#endif
EXPORT_SYMBOL(get_random_int);

/**
 * randomize_page - Generate a random, page aligned address
 * @start:	The smallest acceptable address the caller will take.
 * @range:	The size of the area, starting at @start, within which the
 *		random address must fall.
 *
 * If @start + @range would overflow, @range is capped.
 *
 * NOTE: Historical use of randomize_range, which this replaces, presumed that
 * @start was already page aligned.  We now align it regardless.
 *
 * Return: A page aligned address within [start, start + range).  On error,
 * @start is returned.
 */
unsigned long
randomize_page(unsigned long start, unsigned long range)
{
	if (!PAGE_ALIGNED(start)) {
		range -= PAGE_ALIGN(start) - start;
		start = PAGE_ALIGN(start);
	}

	if (start > ULONG_MAX - range)
		range = ULONG_MAX - start;

	range >>= PAGE_SHIFT;

	if (range == 0)
		return start;

	return start + (get_random_long() % range << PAGE_SHIFT);
}

This code remains ~unchanged.

void get_random_bytes_arch(void *buf, int nbytes)
{
	get_random_bytes(buf, nbytes);
}
EXPORT_SYMBOL(get_random_bytes_arch);

I removed RDRAND usage here, but the interface itself should remain.

#ifdef CONFIG_BLOCK
void rand_initialize_disk(struct gendisk *disk) {}
#endif

void add_hwgenerator_randomness(const char *buffer, size_t count,
				size_t entropy) {}
EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);

These calls are also unused now.

#ifdef CONFIG_SYSCTL

#include <linux/sysctl.h>
#include <linux/uuid.h>

static char sysctl_bootid[16];

/*
 * This function is used to return both the bootid UUID, and random
 * UUID.  The difference is in whether table->data is NULL; if it is,
 * then a new UUID is generated and returned to the user.
 *
 * If the user accesses this via the proc interface, the UUID will be
 * returned as an ASCII string in the standard UUID format; if via the
 * sysctl system call, as 16 bytes of binary data.
 */
static int proc_do_uuid(struct ctl_table *table, int write,
			void __user *buffer, size_t *lenp, loff_t *ppos)
{
	struct ctl_table fake_table;
	unsigned char buf[64], tmp_uuid[16], *uuid;

	uuid = table->data;
	if (!uuid) {
		uuid = tmp_uuid;
		generate_random_uuid(uuid);
	} else {
		static DEFINE_SPINLOCK(bootid_spinlock);

		spin_lock(&bootid_spinlock);
		if (!uuid[8])
			generate_random_uuid(uuid);
		spin_unlock(&bootid_spinlock);
	}

	sprintf(buf, "%pU", uuid);

	fake_table.data = buf;
	fake_table.maxlen = sizeof(buf);

	return proc_dostring(&fake_table, write, buffer, lenp, ppos);
}

static int _proc_do_entropy(struct ctl_table *table, int write,
			   void __user *buffer, size_t *lenp, loff_t *ppos,
			   int entropy)
{
	struct ctl_table fake_table;
	int entropy_count;

	entropy_count = entropy;

	fake_table.data = &entropy_count;
	fake_table.maxlen = sizeof(entropy_count);

	return proc_dointvec(&fake_table, write, buffer, lenp, ppos);
}

static int proc_do_entropy(struct ctl_table *table, int write,
			   void __user *buffer, size_t *lenp, loff_t *ppos)
{
	int entropy_count;

	entropy_count = kfifo_len(&inner_ring);

	return _proc_do_entropy(table, write, buffer, lenp, ppos, entropy_count);
}

static int proc_do_outer_entropy(struct ctl_table *table, int write,
			   void __user *buffer, size_t *lenp, loff_t *ppos)
{
	int entropy_count;

	entropy_count = kfifo_len(&outer_ring);

	return _proc_do_entropy(table, write, buffer, lenp, ppos, entropy_count);
}

extern struct ctl_table random_table[];
struct ctl_table random_table[] = {
	{
		.procname	= "entropy_avail",
		.maxlen			= sizeof(int),
		.mode		= 0444,
		.proc_handler	= proc_do_entropy,
		.data		= 0,
	},
	{
		.procname	= "outer_ring_avail",
		.maxlen			= sizeof(int),
		.mode		= 0444,
		.proc_handler	= proc_do_outer_entropy,
		.data		= 0,
	},
	{
		.procname	= "boot_id",
		.data		= &sysctl_bootid,
		.maxlen			= 16,
		.mode		= 0444,
		.proc_handler	= proc_do_uuid,
	},
	{
		.procname	= "uuid",
		.maxlen			= 16,
		.mode		= 0444,
		.proc_handler	= proc_do_uuid,
	},
	{ }
};
#endif		/* CONFIG_SYSCTL */

The sysctl interface is much smaller now. For now I wired only the inner and outer ring lengths to /proc/sys/kernel/random/entropy_avail and /proc/sys/kernel/random/outer_ring_avail, but potentially I can add monitoring of other driver aspects.
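
For a quick check of how full each ring is, reading these files directly works, e.g. cat /proc/sys/kernel/random/entropy_avail and cat /proc/sys/kernel/random/outer_ring_avail; note that both now report the current fill of the corresponding ring in bytes, not the old entropy-bit estimate.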


The vpatch and signature are available here:

curl 'http://bvt-trace.net/vpatches/linux-fuckgoats-rng.vpatch' > linux-fuckgoats-rng.vpatch
curl 'http://bvt-trace.net/vpatches/linux-fuckgoats-rng.vpatch.bvt.sig' > linux-fuckgoats-rng.vpatch.bvt.sig

Pressing using V may take quite some time and a significant amount of RAM (~4.6Gb) though:

$ time ~/v/vk.pl flow
linux-genesis.vpatch (bvt)

real    100m50.435s
user    100m43.297s
sys     0m7.090s

So pressing the vpatches manually according to the manifest may be an option for the impatient. Just make sure to verify the signatures in this case.


How do you feed the RNG? You can just use:

# stty -F "$FUCKGOATS_DEVICE" 115200 raw -echo -echoe -echok
# nohup dd if="$FUCKGOATS_DEVICE" of=/dev/random iflag=fullblock bs=256 &

somewhere in your startup scripts, or you can use this simple proggy (included in the genesis in linux/tools/rng):

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>
#include <errno.h>

int init_fg(const char *devpath)
{
    int r;
    int fd;
    struct termios term;

    fd = open(devpath, O_RDONLY);
    if (fd == -1) {
        fprintf(stderr, "Error opening file: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    r = tcgetattr(fd, &term);
    if (r == -1) {
        fprintf(stderr, "Error getting tty attributes: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    cfmakeraw(&term);
    r = cfsetspeed(&term, B115200);
    if (r == -1) {
        fprintf(stderr, "Error setting serial speed: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    r = tcsetattr(fd, TCSANOW, &term);
    if (r == -1) {
        fprintf(stderr, "Error setting tty attributes: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    return fd;
}

int main(int argc, char *argv[])
{
    int in, out;
    if (argc != 2) {
        fprintf(stderr, "usage: %s FUCKGOATS-tty\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    in = init_fg(argv[1]);
    out = open("/dev/random", O_WRONLY);
    while (1) {
        char buf[256];
        ssize_t nread = 0;
        while (nread < sizeof(buf)) {
            ssize_t l = read(in, &buf[nread], sizeof(buf) - nread);
            if (l == -1) {
                fprintf(stderr, "Read failed: %s\n", strerror(errno));
                exit(EXIT_FAILURE);
            }
            nread += l;
        }
        write(out, &buf[0], nread);
    }
    return 0;
}

Notably missing is any kind of disconnection detection and reporting. To implement something for this, I would need feedback from the users - how would you monitor these events?


To prevent non-root users from writing to /dev/random I also did a small change in drivers/char/mem.c:

 #endif
         [5] = { "zero", 0666, &zero_fops, 0 },
         [7] = { "full", 0666, &full_fops, 0 },
-        [8] = { "random", 0666, &random_fops, 0 },
+        [8] = { "random", 0600, &random_fops, 0 },
         [9] = { "urandom", 0666, &urandom_fops, 0 },
 #ifdef CONFIG_PRINTK
        [11] = { "kmsg", 0644, &kmsg_fops, 0 },

But on Cuntoo, the following change is also necessary: the last line of /lib/udev/rules.d/40-gentoo.rules should read:

SUBSYSTEM=="mem", KERNEL=="null|zero|full|urandom", MODE="0666"

(i.e., |random should be removed from the list).


Please scrutinize and test everything: this is a fairly large change to a crucial component, in C, so mistakes are probable, even if I'm not aware of any ATM.

  1. Mostly due to fast hashing and self-hashing, though. Given the FG rate of 7.2k/sec, reducing it to ~2.4k/sec makes filling the top 1Mb half of the outer ring using HG a rather long-term task.
  2. An RNG, a hash function, and a block cypher can be transformed into one another.

8 Responses to “FG-fed Linux RNG”

  1. > If I'm in the error here, I will fix this ASAP.

    I believe you aren't, but that my spec was hastily overshort. Suppose instead of the incorrectly fixedpoint O.R >= O.W/2 the criteria is properly written as (O.W - O.R) < O / 2.

    > To implement something for this

    I do not believe the kernel should watch the FG.

    In closing -- you did an impressive amt of work here ; it will take multiple digestion passes on my part.

  2. God damn the blog ate my thing.

    -the criteria is properly written as (O.W - O.R)
    +the criteria is properly written as (O.W - O.R) < O / 2.

  3. Possibly orthogonal to the immediate subj : what's the obstacle to writing kernel module in Ada? Has anyone tried yet?

  4. bvt says:

    @Mircea Popescu: (1) Good that I got it right.
    (2) Re blog eating parts of comments, it seems to be a never ending issue. Is it really worth having this functionality?

    @Stanislav: I am aware of this but never tried it in practice. This runtime thing seems to be designed specifically for writing modules, and it is not clear how it will work for non-module code (the code in this post is not a module, for example). Something would have to call adainit(), at the very least.

    I can give it a shot, I guess.

  5. bvt says:

    And the first spotted error: unprivileged users can write to the inner ring through /dev/urandom just fine.

  6. You realise I could turn comment #5 right back on #4, point out how teh kernel is certainly going to be a neverending source of issues, and inquire the same thing. What can we do, practically, turn off the machine ?

  7. bvt says:

    Yeah, I see: can't stop operation because of that.

    Thinking about the markup filtering problem a bit more, I guess some sort of preview functionality for comments would help?

  8. [...] Linux 4.9.95 was genesis'd and feeding the RNG with FG is [...]
