Friday, April 24, 2015

Streaming UUEncoder in .NET.

Flashback

The last time I used the Unix-to-Unix format (AKA UUEncoding) was when USENET was still the big thing and the Mosaic web browser was just coming out. That changed recently, when I had a requirement to encode and decode this file type.

Searching for an Implementation

Since Base64 has largely replaced this older format, it was hard to find a current implementation for the .NET platform. I did run across a port of KDE's kcodecs. However, the port wasn't a streaming solution in the sense of implementing the Stream class. It also allocated a lot of one-item byte arrays by using the ReadByte call for each character.

Creating an Implementation

Originally I tried to create my own solution by implementing the .NET Encoder class, but the interface didn't fit the requirements of UUEncoding. For example, the GetBytes call works on a per-character basis, whereas UUEncoding takes 3 bytes at a time and turns them into 4 characters. Also, a header and footer need to be written, and the encoded payload is segmented into lines, each prefixed by its encoded length.
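To make the format concrete, here is a minimal sketch of how a single line could be encoded. It is not the code from my library: the class name UUSketch and the helper EncodeSixBits are made up for illustration, and it assumes the traditional rules of at most 45 input bytes per line and a backtick in place of a space for the value zero.

```csharp
using System;
using System.Text;

// Hypothetical illustration only; not the library's UUEncoder.
public static class UUSketch
{
    // Maps a 6-bit value to a printable character; zero becomes '`' instead of a space.
    private static char EncodeSixBits(int value)
    {
        return value == 0 ? '`' : (char)(value + 32);
    }

    // Encodes up to 45 bytes into one uuencoded line, without the line terminator.
    public static string EncodeLine(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (data.Length > 45) throw new ArgumentException("A line holds at most 45 bytes.", nameof(data));

        var line = new StringBuilder();
        line.Append(EncodeSixBits(data.Length)); // Encoded line length prefix.

        for (int i = 0; i < data.Length; i += 3)
        {
            // A short final group is padded with zero bytes.
            int b0 = data[i];
            int b1 = i + 1 < data.Length ? data[i + 1] : 0;
            int b2 = i + 2 < data.Length ? data[i + 2] : 0;

            // Three bytes (24 bits) become four 6-bit values.
            line.Append(EncodeSixBits(b0 >> 2));
            line.Append(EncodeSixBits(((b0 & 0x03) << 4) | (b1 >> 4)));
            line.Append(EncodeSixBits(((b1 & 0x0F) << 2) | (b2 >> 6)));
            line.Append(EncodeSixBits(b2 & 0x3F));
        }

        return line.ToString();
    }
}
```

For example, UUSketch.EncodeLine(Encoding.ASCII.GetBytes("Cat")) yields #0V%T. In a complete file, lines like this sit between a "begin <mode> <name>" header and an "end" footer, which is the part the Encoder class has no natural place for.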

I ended up creating my own encoder class that was scoped to only handle data line by line.

public static class UUEncoder
{
  // Assumes the current position is at the start of a new line.
  public static byte[] DecodeLine(Stream buffer)
  {
    // ...
  }

  public static byte[] EncodeLine(byte[] buffer)
  {
    // ...
  }
}
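As a rough idea of what decoding a line involves, here is a sketch that works on a string instead of a Stream. The names UUDecodeSketch and DecodeSixBits are hypothetical, and the real DecodeLine above reads from the stream directly, so treat this as an illustration of the bit twiddling only.

```csharp
using System;

// Hypothetical illustration only; not the library's UUEncoder.
public static class UUDecodeSketch
{
    // Maps an encoded character back to its 6-bit value; both ' ' and '`' decode to zero.
    // Characters missing from a truncated line are treated as zero.
    private static int DecodeSixBits(string line, int index)
    {
        return index < line.Length ? (line[index] - 32) & 0x3F : 0;
    }

    // Decodes one uuencoded line (without its line terminator) into the original bytes.
    public static byte[] DecodeLine(string line)
    {
        if (string.IsNullOrEmpty(line)) return new byte[0];

        // The first character holds the number of decoded bytes on this line.
        int length = DecodeSixBits(line, 0);
        var result = new byte[length];
        int written = 0;

        for (int i = 1; written < length; i += 4)
        {
            int c0 = DecodeSixBits(line, i);
            int c1 = DecodeSixBits(line, i + 1);
            int c2 = DecodeSixBits(line, i + 2);
            int c3 = DecodeSixBits(line, i + 3);

            // Four 6-bit values become up to three bytes; padding bytes are dropped.
            result[written++] = (byte)((c0 << 2) | (c1 >> 4));
            if (written < length) result[written++] = (byte)(((c1 & 0x0F) << 4) | (c2 >> 2));
            if (written < length) result[written++] = (byte)(((c2 & 0x03) << 6) | c3);
        }

        return result;
    }
}
```

Decoding the line #0V%T this way gives back the three bytes of "Cat".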

I then created encode and decode Stream classes that depend on the encoder. Having the encoding and decoding happen in a Stream-based way was critical for my requirements, since I was lazily evaluating the data rather than reading it all up front. This mattered because some of the files were gigabytes in size, and an in-memory solution would have created an unacceptable memory footprint, along with the nastiness that potentially comes with it, like thrashing.
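The memory argument can be sketched with a small helper that pulls one line's worth of input at a time. This is not how my Stream classes are written; ChunkedReader and ReadChunks are hypothetical names, and the 45-byte default is simply the traditional uuencode line size.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical illustration only; not part of the library.
public static class ChunkedReader
{
    // Lazily yields chunks of up to chunkSize bytes, so memory use does not grow with the input size.
    public static IEnumerable<byte[]> ReadChunks(Stream input, int chunkSize = 45)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return ReadChunksIterator(input, chunkSize);
    }

    private static IEnumerable<byte[]> ReadChunksIterator(Stream input, int chunkSize)
    {
        var buffer = new byte[chunkSize];
        while (true)
        {
            int filled = 0;
            // Stream.Read may return fewer bytes than requested, so keep reading until full or end of stream.
            while (filled < chunkSize)
            {
                int read = input.Read(buffer, filled, chunkSize - filled);
                if (read == 0) break;
                filled += read;
            }

            if (filled == 0) yield break;

            // Hand out a copy so the caller can keep the chunk while the buffer is reused.
            var chunk = new byte[filled];
            Array.Copy(buffer, chunk, filled);
            yield return chunk;

            // A short chunk means the end of the stream was reached.
            if (filled < chunkSize) yield break;
        }
    }
}
```

Each chunk can be encoded and written out before the next one is read, so a multi-gigabyte file never has to sit in memory as a whole.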

Using the Code

You can find my implementation, with tests, on GitHub.

To decode any stream:

using (Stream encodedStream = /* Any readable stream. */)
using (Stream decodedStream = /* Any writeable stream. */)
using (var decodeStream = new UUDecodeStream(encodedStream))
{
  decodeStream.CopyTo(decodedStream);
  // Decoded contents are now in decodedStream.
}

To encode any stream:

bool unixLineEnding = /* True to encode with Unix line endings, otherwise false. */;
using (Stream decodedStream = /* Any readable stream. */)
using (Stream encodedStream = /* Any writeable stream. */)
using (var encodeStream = new UUEncodeStream(encodedStream, unixLineEnding))
{
  decodedStream.CopyTo(encodeStream);
  // Encoded contents are now in encodedStream.
}

Note on Licensing

I published the code under version 2 (not 2.1) of the LGPL since I took the bit twiddling and encoder maps from KDE's implementation.
