Friday, April 24, 2015

Streaming UUEncoder in .NET.

Flashback

The last time I used Unix-to-Unix format (AKA UUEncoding) was when USENET was still the big thing and the Mosaic web browser was just coming out. That held true until recently, when I had a requirement to encode and decode this file type.

Searching for an Implementation

Since Base64 has largely replaced this older format, it was hard to find a current implementation for the .NET platform. I did run across a port of KDE's kcodecs, but the port wasn't a streaming solution in the sense of implementing the Stream class, and it allocated a lot of one-item byte arrays by using the ReadByte call for each character.

Creating an Implementation

Originally I tried to create my own solution by implementing the .NET Encoder class, but the interface didn't fit the requirements of UUEncoding. For example, the GetBytes call works on a per-character basis, whereas UUEncoding works on 3 bytes at a time. Also, a header and footer need to be written, and the encoded payload is segmented into lines prefixed by encoded line lengths.
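
To see why a per-character interface is a poor fit, consider the core UUEncode transform: it maps 3 input bytes onto 4 printable characters by offsetting each 6-bit group into the printable ASCII range. Here is a minimal sketch (historical variants differ in how they encode a zero group):

static char[] EncodeTriple(byte b1, byte b2, byte b3)
{
  return new[]
  {
    (char)(((b1 >> 2) & 0x3F) + 0x20),               // Top 6 bits of byte 1.
    (char)((((b1 << 4) | (b2 >> 4)) & 0x3F) + 0x20), // Bottom 2 bits of byte 1, top 4 of byte 2.
    (char)((((b2 << 2) | (b3 >> 6)) & 0x3F) + 0x20), // Bottom 4 bits of byte 2, top 2 of byte 3.
    (char)((b3 & 0x3F) + 0x20),                      // Bottom 6 bits of byte 3.
  };
}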

I ended up creating my own encoder class that was scoped to only handle data line by line.

public static class UUEncoder
{
  // Assumes the current position is at the start of a new line.
  public static byte[] DecodeLine(Stream buffer)
  {
    // ...
  }  
  public static byte[] EncodeLine(byte[] buffer)
  {
    // ...
  }
}

I then created encode and decode Stream classes that depended on the encoder. Having the encoding and decoding happen in a Stream-based way was critical for my requirements, since I was lazily evaluating the data rather than reading it all up front. This was important because some of the files tended to be gigabytes in size, and an in-memory solution would have created an unacceptable memory footprint, along with the nastiness that potentially comes with it, like thrashing.

Using the Code

You can find my implementation, with tests, on GitHub.

To decode any stream:

using (Stream encodedStream = /* Any readable stream. */)
using (Stream decodedStream = /* Any writeable stream. */)
using (var decodeStream = new UUDecodeStream(encodedStream))
{ 
  decodeStream.CopyTo(decodedStream);
  // Decoded contents are now in decodedStream.
}

To encode any stream:

bool unixLineEnding = true; // True to encode with Unix line endings, false for Windows.
using (Stream encodedStream = /* Any writeable stream. */)
using (Stream decodedStream = /* Any readable stream. */)
using (var encodeStream = new UUEncodeStream(encodedStream, unixLineEnding))
{
  decodedStream.CopyTo(encodeStream);
  // Encoded contents are now in encodedStream.
}

Note on Licensing

I published the code under version 2 (not 2.1) of the LGPL since I took the bit twiddling and encoder maps from KDE's implementation.


Monday, April 20, 2015

Get Windows Service name from executable in PowerShell.

I was recently putting some PowerShell scripts together for deploying and maintaining software on our machine instances. One of the requirements was to be able to discover the service name from a Windows Service executable that uses ServiceInstaller. I needed to extract this value in a generic way in order to query the service and stop it if it was running. Here is what I was able to put together.

# Example usage (hypothetical path): Get-WindowsServiceName -exePath 'C:\Services\MyService.exe'
Function Get-WindowsServiceName([string]$exePath)
{
    # Load the service executable for reflection.
    $assembly = [System.Reflection.Assembly]::LoadFrom($exePath)
    # Find the first type marked with [RunInstaller], i.e. the project installer.
    $type = $assembly.GetTypes() | Where { $_.GetCustomAttributes([System.ComponentModel.RunInstallerAttribute], 0).Length -gt 0 } | Select -First 1
    # Instantiate it and read the service name from its nested ServiceInstaller.
    $installer = [System.Configuration.Install.Installer][Activator]::CreateInstance($type)
    $serviceInstaller = [System.ServiceProcess.ServiceInstaller]$installer.Installers[0]
    $serviceInstaller.ServiceName
}

Note that this doesn't support multiple installers in a single executable.

Sunday, March 29, 2015

Log per class pattern.

Rookie Moves

A while ago, I had originally created a single logger for each service and shared it statically across the application.

public class Log
{
  private readonly static Lazy<Log> instance = new Lazy<Log>(() => new Log(), true);
  public static Log Instance
  {
    get { return instance.Value; }
  }

  // Private constructor so the lazy singleton is the only way to get an instance.
  private Log() { }
}

It always felt strange doing this, since it violated encapsulation. I was referencing a static instance inside my objects without ensuring the existence of the log instance itself, while also making assumptions about the instance's state. Every class was essentially reaching outside itself to the singleton, which in my case amounts to global state.
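
In other words, every consumer looked something like this (an illustrative sketch; the Info method is hypothetical):

public class OrderProcessor
{
  public void Process()
  {
    // Reaches into global state; assumes the singleton exists, is configured,
    // and that Log exposes a hypothetical Info method.
    Log.Instance.Info("Processing order...");
  }
}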

Best Practices

As I researched logging best practices, logger-per-class appeared to be the best pattern, since it offers the most fine-grained control with respect to filtering and configuration.

When using logging in your class, you should separate the concern of how the log gets created from the use of the log. This can be achieved by having a factory accessor.

public class Log
{
  // Set your factory property to the actual log implementation you wish to use.
  public static Func<Type, Log> Factory { get; set; }

  // Instance based properties and methods.
}

You can also use the abstract factory pattern if you have to wrap your logging implementations.

public abstract class LogBase
{
  // Set your factory property to the actual log implementation you wish to use.
  public static Func<Type, LogBase> Factory { get; set; }

  // Abstract properties and methods.
}

Then just call the factory by passing in the class type it is for.

public class SomeObject
{
  private static readonly LogBase log = LogBase.Factory(typeof(SomeObject));
}
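
The factory itself is wired up once at startup, before any class initializes its static log field. A minimal sketch, where Log4NetLog is a hypothetical LogBase subclass wrapping your actual logging library:

// At application startup, before any logging consumers are created.
LogBase.Factory = type => new Log4NetLog(type); // Log4NetLog is illustrative.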

The Difference

It may not seem like much of a change but subtle differences are happening:

  • The calling class can now communicate its state to the log's creation.
  • Log creation is no longer the responsibility, implicitly or otherwise, of the class.

In my case, this was very liberating, since we had several log implementations in our large codebase. I no longer needed to worry about which log implementation was being used or to mess with the singleton's construction, and I could leverage filtering whenever I needed to isolate a single component or service that was causing trouble, something I couldn't do before.


Saturday, March 28, 2015

Creating XML based templates in log4net.

Motivation

I decided to use log4net for a recent project I was working on. log4net has the concepts of loggers and appenders: loggers are composed of appenders and a log threshold, while appenders are consumers of logging information and provide specific implementations (e.g. file, email, event log, database). You can configure the loggers and appenders either in the application configuration or at runtime.

Configuring logging in the application configuration provides the most flexibility. It is great being able to change settings on the fly, especially in a production environment where redeploying the build is out of the question. This approach does come at the expense of having a lot of logger and appender information in your application configuration, though that's no big deal if you only have to configure it once.

Why Templates?

I had quite a few projects with a lot of redundant logging configuration information in each one's application configuration file. Much of the information had a standard form that we wanted uniform across our different projects (e.g. log file name conventions, event log setup, email format). Also, if we updated the logging appender configuration to a new standard, we would need to do it in every project's application configuration file. This is where templates came into play.

Writing the Code

To cut down the amount of configuration needed to start a new project with logging, and to make the configuration more uniform where needed, we offloaded most of it into code and left the rest in the application configuration file, like so.

<log4net xsi:noNamespaceSchemaLocation="log4net.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<logger name="LoggerTemplate">
  <appender name="SmtpAppenderTemplate" type="log4net.Appender.SmtpAppender">
    <to value="peter@initech.com" />
    <from value="system@initech.com" />
    <smtpHost value="mail.initech.com" />
    <username value="peter" />
    <password value="abc123" />
  </appender>
</logger>

<root>
  <level value="INFO" />
</root>

</log4net>

We don't use the logger directly but rather as a template for our root logger. Now we just need to craft a method to consume the template and create the root appenders at runtime.

/// <summary>
/// Get appenders matching the logger template name and use them to populate the root appenders at runtime.
/// </summary>
/// <param name="loggerTemplateName">The logger template name found in the application configuration.</param>
public static void ConfigureRootFromTemplate(string loggerTemplateName)
{
  ILog logTemplate = LogManager.GetLogger(loggerTemplateName);

  if (logTemplate == null)
  {
    throw new ArgumentException(
      String.Format(
        "Logger template {0} not found in log4net configuration. Make sure there is an " +
        "logger in the log4net configuration with the name {0}.",
        loggerTemplateName),
        "loggerTemplateName");
  }

  IAppender[] appenderTemplates = logTemplate.Logger.Repository.GetAppenders();
  var smtpAppenderTemplate = appenderTemplates.FirstOrDefault(a => a is SmtpAppender) as SmtpAppender;

  if (smtpAppenderTemplate == null)
  {
    throw new ArgumentException(
      String.Format(
        "SmtpAppender template not found in log4net configuration. Make sure there is an " +
        "SmtpAppender in the log4net {0} logger.",
        loggerTemplateName),
        "loggerTemplateName");
  }

  // Can repeat the above pattern with other appenders as well.
  // Create appenders using the template information from above.
  
  AddAppendersToRootAndConfigure(
    new AppenderCollection
    {
      // Put your created appenders here.
    });
}

private static void AddAppendersToRootAndConfigure(AppenderCollection appenders)
{
  // Get the log repository.
  var hierarchy = (Hierarchy)log4net.LogManager.GetRepository();
  // Get the root logger.
  Logger rootLogger = hierarchy.Root;
  foreach (var appender in appenders)
  {
    // Add all the appenders and activate.
    rootLogger.AddAppender(appender);
    ((AppenderSkeleton)appender).ActivateOptions();
  }
  // Flag root logger as configured with new appender information.
  rootLogger.Hierarchy.Configured = true;
}
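
For completeness, creating a concrete appender from the template values might look like the sketch below (this helper and its subject parameter are illustrative, not part of log4net):

private static SmtpAppender CreateSmtpAppenderFromTemplate(SmtpAppender template, string subject)
{
  // Copy the uniform settings from the template; vary only what is app-specific.
  return new SmtpAppender
  {
    Name = "RootSmtpAppender",
    To = template.To,
    From = template.From,
    SmtpHost = template.SmtpHost,
    Username = template.Username,
    Password = template.Password,
    Authentication = template.Authentication,
    Subject = subject,
    Layout = new log4net.Layout.PatternLayout("%date [%thread] %-5level %logger - %message%newline")
  };
}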

Then just call the configuration method at the application's startup.

class Program
{
  /// <summary>
  /// The main entry point for the application.
  /// </summary>
  static void Main()
  {
    // Other startup configuration code.

    log4net.Config.XmlConfigurator.Configure(); // Load the application configuration information.
    Log.ConfigureRootFromTemplate("LoggerTemplate");

    // More startup configuration code.
  }
}

Considerations

I don't recommend using this approach in all cases. It definitely cuts down the amount of application configuration needed, but at the cost of information hiding, since the details have been moved into code. Also, it may not be obvious to an uninitiated developer what your application configuration is doing, especially since this template approach is not encoded into the structure of log4net's XML. However, if you have many projects and need to effect changes to logging across all of them, this may be a good solution for you.


Using type converters for JSON.NET.

Motivation

I had these flyweights that added a lot of overhead to the serialization process. They weren't really needed in the serialized payload either. In fact, I could recreate the flyweight in memory from just a single property on the object.

public class FlyweightObject
{
   public string Key { get; private set; }
   public string AProperty { get; private set; }
   public string AnotherProperty { get; private set; }
   public string YetAnotherProperty { get; private set; }
   // ... Lots of properties.

   // Overridden GetHashCode and Equals methods to make equality by Key.
}
public class FlyweightObjectFactory
{
   public static FlyweightObjectFactory Instance { get; private set; }

   // Singleton initialization code.

   public FlyweightObject GetObject(string key)
   {
      // Get from dictionary, or create object and add to dictionary.
      // Return object from dictionary.
   }
}

Nothing out of the ordinary here. However, these flyweights were also used as keys in a dictionary I was serializing. This was a problem, because only scalar values can be used as dictionary keys when serializing with JSON.NET. The serialization guide does mention that a type converter can be used, though.

Serializing

Let's focus on serializing this object to a scalar first. The JSON.NET serialization guide mentions that I can override the Object.ToString method, so let's do that.

public class FlyweightObject
{
   public string Key { get; private set; }
   public string AProperty { get; private set; }
   public string AnotherProperty { get; private set; }
   public string YetAnotherProperty { get; private set; }
   // ... Lots of properties.

   // Overridden GetHashCode and Equals methods to make equality by Key.

   public override string ToString()
   {
      return this.Key;
   }
}

I'm done, right? Who needs type converters? Well, overriding ToString doesn't help when I need to deserialize. This is where type converters come into play.

Implementing a Type Converter

First we have to provide the concrete implementation of our type converter class for the flyweight.

internal class FlyweightObjectConverter
   : TypeConverter
{      
   /// <summary>
   /// Returns whether this converter can convert an object of the given type to the type of this converter, using the specified context.
   /// </summary>
   /// <returns>
   /// true if this converter can perform the conversion; otherwise, false.
   /// </returns>
   /// <param name="context">An  that provides a format context. </param>
   /// <param name="sourceType">A <see cref="T:System.Type"/> that represents the type you want to convert from. </param>
   public override bool CanConvertFrom(ITypeDescriptorContext context, Type sourceType)
   {
      return sourceType == typeof(string) || base.CanConvertFrom(context, sourceType);
   }

   /// <summary>
   /// Converts the given object to the type of this converter, using the specified context and culture information.
   /// </summary>
   /// <returns>
   /// An <see cref="T:System.Object"/> that represents the converted value.
   /// </returns>
   /// <param name="context">An <see cref="T:System.ComponentModel.ITypeDescriptorContext"/> that provides a format context. </param>
   /// <param name="culture">The <see cref="T:System.Globalization.CultureInfo"/> to use as the current culture. </param>
   /// <param name="value">The <see cref="T:System.Object"/> to convert. </param><exception cref="T:System.NotSupportedException">The conversion cannot be performed. </exception>
   public override object ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, object value)
   {
      return FlyweightObjectFactory.Instance.GetObject((string)value);
   }
}

Then attribute the convertible class with the implementation.

[TypeConverter(typeof(FlyweightObjectConverter))]
public class FlyweightObject
{
   public string Key { get; private set; }
   public string AProperty { get; private set; }
   public string AnotherProperty { get; private set; }
   public string YetAnotherProperty { get; private set; }
   // ... Lots of properties.

   // Overridden GetHashCode and Equals methods to make equality by Key.
}
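
With the attribute in place, a dictionary keyed by the flyweight round-trips. A quick sketch, assuming the types above and a Newtonsoft.Json reference:

var flyweight = FlyweightObjectFactory.Instance.GetObject("some-key");
var counts = new Dictionary<FlyweightObject, int> { { flyweight, 42 } };

// Keys are written out as "some-key" (via ToString).
string json = JsonConvert.SerializeObject(counts);

// Keys are recreated through FlyweightObjectConverter.ConvertFrom.
var roundTripped = JsonConvert.DeserializeObject<Dictionary<FlyweightObject, int>>(json);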

Good to go. Now I can deserialize the flyweights. The JSON.NET secret sauce can be found in the static method Newtonsoft.Json.Utilities.ConvertUtils.TryConvertInternal. This method is used by the internal reader class for its own deserialization and, consequently, by the final Newtonsoft.Json.JsonSerializer class, which is the lynchpin for all the serialization helper methods in the JSON.NET ecosystem.

Considerations

Instead of overriding the Object.ToString method, I could have implemented the TypeConverter.CanConvertTo and TypeConverter.ConvertTo methods. It is really up to you. In my case, I already had the Object.ToString method overridden, since the key property uniquely identifies the object without having to create an implicitly structured string (e.g. delimited properties as a single string: "Prop1|Prop2|Prop3").

If you only need a type conversion for JSON, you can also use JSON.NET's own custom converters (the JsonConverter class), although those don't work in the case of dictionary keys.
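
Here is a rough sketch of what that JSON.NET-specific route could look like, reusing the flyweight types from above:

public class FlyweightJsonConverter
   : JsonConverter
{
   public override bool CanConvert(Type objectType)
   {
      return objectType == typeof(FlyweightObject);
   }

   public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
   {
      // Recreate the flyweight from its key.
      return FlyweightObjectFactory.Instance.GetObject((string)reader.Value);
   }

   public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
   {
      // Write only the key; the rest is recoverable from the factory.
      writer.WriteValue(((FlyweightObject)value).Key);
   }
}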

Lastly, if you have a complex type as your dictionary key, you may want to consider just serializing it as a collection and then deserializing it using a JSON.NET custom converter.


Friday, October 17, 2014

Using CC.NET & Gallio for priority-based smoke testing.

Pitfalls in Production

Being able to monitor production services for potential errors is critical, especially if the services are dependent on external resources which may become unavailable or exhibit unexpected behavior. Even if you follow good software development discipline, this is always a source of concern: think network outages, unannounced third-party API changes, hosted services becoming unavailable, etc.

For large software projects, creating a testing strategy that involves unit and integration tests is helpful for managing the complexity of the commit-to-deployment workflow. Functional/smoke tests are also good for ensuring critical functionality works as expected. In an environment where your running software depends on external resources, though, you need a system of continuous monitoring that runs these smoke tests.

Monitoring Confusion

At Xignite, we use Gallio as our testing framework and CC.NET for continuous integration. I used these tools for my production smoke tests but soon realized that not all tests are equal. Getting paged at 2am for a test failure that is not mission critical sucks. Even worse, these lower-priority failures can mask high-priority ones, since any of them will fail the entire test fixture, and it takes extra scrutiny from the dev-ops person to make sure a high-priority failure doesn't fall through the cracks.

Consider the following test fixture and how it lumps all the different tests together.

namespace MyServices.Tests.Smoke
{
   using Gallio.Framework;
   using Gallio.Model;
   using MbUnit.Framework;

   [TestFixture]
   public class OneOfMyServicesTests
   {
      [SetUp]
      public void Setup()
      {
         // Setup an individual test to run.
      }

      [FixtureSetUp]
      public void TestFixtureSetUp()
      {
         // Configure fixture for testing.
      }

      [FixtureTearDown]
      public void TestFixtureTearDown()
      {
         if (TestContext.CurrentContext.Outcome == TestOutcome.Failed)
         {
            // Send signal to monitoring system that test fixture has failed.
         }
         else if (TestContext.CurrentContext.Outcome == TestOutcome.Passed)
         {
            // Send signal to monitoring system that test fixture has succeeded.
         }
         else
         {
            // Handle some other outcome.
         }
      }     

      [Test]
      public void MissionCritical()
      {
         // ...
      }

      [Test]
      public void Important()
      {
         // ...
      }

      [Test]
      public void MinorFunctionality()
      {
         // ...
      }     
   }
}

Any test failure causes the entire context outcome to be failed, whether it was the mission-critical test or the one that affects minor functionality. I tried looking through the Gallio/MbUnit API, even its source, but couldn't find a way to determine which tests failed within a fixture. If anyone knows how to determine this, please let me know.

Prioritized Testing

What I do know, though, is that you can inherit from the TestAttribute class and override its Execute method. I created a required parameter to specify the priority of the test and then used a LoggedTestingContext class to store all the results.

namespace MyServices.Tests
{
   using System;
   using Gallio.Framework.Pattern;
   using MbUnit.Framework;

   public class LoggedTestAttribute
      : TestAttribute
   {
      public const int MinPriority = 1;
      public const int MaxPriority = 3;

      private readonly int priority;
      public int Priority { get { return this.priority; } }

      public LoggedTestAttribute(int priority)
      {
         if (priority < MinPriority || priority > MaxPriority)
         {
            throw new ArgumentException("Priority must be 1, 2, or 3.", "priority");
         }
         this.priority = priority;
      }

      protected override void Execute(PatternTestInstanceState state)
      {
         try
         {
            base.Execute(state);
            LoggedTestingContext.AddTest(this, state.Test, true);
         }
         catch (Exception)
         {            
            LoggedTestingContext.AddTest(this, state.Test, false);
            throw;
         }
      }
   }
}
namespace MyServices.Tests
{
   using System;
   using System.Collections.Generic;
   using System.Linq;
   using Gallio.Framework;
   using Gallio.Model.Tree;

   public static class LoggedTestingContext
   {
      private class TestFailure
      {
         public string FullName { get; private set; }

         public LoggedTestAttribute TestAttribute { get; private set; }

         public TestFailure(LoggedTestAttribute testAttribute, Test test)
         {
            this.FullName = test.FullName;
            this.TestAttribute = testAttribute;
         }
      }

      private const int PriorityCount = LoggedTestAttribute.MaxPriority - LoggedTestAttribute.MinPriority + 1;
      
      private static readonly Dictionary<string, TestFailure> nameToFailure = new Dictionary<string, TestFailure>();

      internal static void AddTest(LoggedTestAttribute testAttribute, Test test, bool passed)
      {         
         if (passed)
         {
            return;
         }
         var failure = new TestFailure(testAttribute, test);
         if (!nameToFailure.ContainsKey(failure.FullName))
         {
            nameToFailure.Add(failure.FullName, failure);
         }
      }

      private static bool HasFailed(Test fixtureTest, int priority)
      {
         return fixtureTest.Children
            .Any(c =>
               nameToFailure.ContainsKey(c.FullName) &&
               nameToFailure[c.FullName].TestAttribute.Priority == priority);
      }

      public static void LogSmokeTests(Test fixtureTest, string serviceName)
      {     
         foreach (var priority in Enumerable.Range(LoggedTestAttribute.MinPriority, PriorityCount))
         {
            if (HasFailed(fixtureTest, priority))
            {
               // Send signal to monitoring system that test fixture has failed for priority # tests.
            }
            else
            {
               // Send signal to monitoring system that test fixture has succeeded for priority # tests.
            }
         }
      }
   }
}

Finally, putting it together, I replaced the TestAttribute with the new LoggedTestAttribute and then process the results in the test fixture teardown.

namespace MyServices.Tests.Smoke
{
   using Gallio.Framework;
   using Gallio.Model;
   using MbUnit.Framework;

   [TestFixture]
   public class OneOfMyServicesTests
   {
      [SetUp]
      public void Setup()
      {
         // Setup an individual test to run.
      }

      [FixtureSetUp]
      public void TestFixtureSetUp()
      {
         // Configure fixture for testing.
      }

      [FixtureTearDown]
      public void TestFixtureTearDown()
      {
         LoggedTestingContext.LogSmokeTests(TestContext.CurrentContext.Test, "OneOfMyServices");
      }     

      [LoggedTest(1)]
      public void MissionCritical()
      {
         // ...
      }

      [LoggedTest(2)]
      public void Important()
      {
         // ...
      }

      [LoggedTest(3)]
      public void AffectsFunctionalityButDoesntRequireImmediateAttention()
      {
         // ...
      }     
   }
}


Monday, March 17, 2014

State behavioral pattern to the rescue.

The Creeping Problem

I recently found myself developing a request-response style system where the lifetime of a request could be interrupted at any moment. For most same-process execution, like your average desktop application, this is less of a concern, but it arises more often when dealing with multiple coordinated processes. My case ended up being the latter.

One of the ways to ensure redundancy is to break the steps of the request-response workflow into isolated, atomic units, or states. That way, if a step fails, it can always be re-executed without having to redo the work that came before it. This is especially helpful when the total resources required are large and there is a higher probability of failure. We can just divvy up the work into states that act like idempotent functions. Below is a great simplification of the actual project I worked on, boiled down to its simplest form by collapsing the excess states into the single CreateResponse state.

In my original implementation, I modeled the requests as queued items (UserRequest) that I would dequeue and start work on.

foreach (var request in requests) // requests is a queue of UserRequest items being drained.
{
   switch (request.State)
   {
      case State.ReceiveRequest:
         if (TryReceiveRequest(request)) request.State = State.CreateResponse;
         break;
         
      case State.CreateResponse:
         if (TryCreateResponse(request)) request.State = State.SendResponse;
         break;

      case State.SendResponse:
         if (TrySendResponse(request)) request.State = State.ResponseSent;
         break;

      case State.ResponseSent:
         break;

      case State.Faulted:
         break;

      default:
         throw new ArgumentOutOfRangeException("request.State");
   }
   if (request.State != State.ResponseSent && request.State != State.Faulted)
      requests.Enqueue(request);
}

Seems simple enough, but in my case the CreateResponse state ended up being fairly computationally intensive and could take anywhere from a few seconds to several minutes. These long delays could be due to the workload of remote processes it was waiting on, transient failure points like the network, or even the system the process was running on. Another added complexity was that these requests were being serviced in parallel, by multiple processes that could be on the same system or not. Lastly, actual production-level code never ends up being this simple; you quickly find yourself adding a lot of instrumentation and covering edge cases.

foreach (var request in requests)
{
   logger.LogDebug("request.Id = {0}: Request dequeued in state {1}.", request.Id, request.State);
   switch (request.State)
   {
      case State.ReceiveRequest:
         logger.LogDebug("request.Id = {0}: Trying to receive request.", request.Id);
         if (TryReceiveRequest(request)) request.State = State.CreateResponsePart1;
         break;
         
      case State.CreateResponsePart1:
         logger.LogDebug("request.Id = {0}: Trying to create response for part 1.", request.Id);
         if (TryCreateResponsePart1(request)) request.State = State.CreateResponsePart2;
         break;

      case State.CreateResponsePart2:
         logger.LogDebug("request.Id = {0}: Trying to create response for part 2.", request.Id);
         if (TryCreateResponsePart2(request))
         {
            request.State = State.CreateResponsePart3;
         }
         else
         {
            request.State = State.CreateResponsePart1;
            ExecuteCreateResponsePart2Cleanup();
            logger.LogError("request.Id = {0}: Unexpected failure while evaluating create response part 2.", request.Id);
         }
         break;

      case State.CreateResponsePart3:
         logger.LogDebug("request.Id = {0}: Trying to create response for part 3.", request.Id);
         bool unrecoverable;
         if (TryCreateResponsePart3(request, out unrecoverable))
         {
            request.State = State.SendResponse;
         }
         else
         {
           if (unrecoverable)
           {
               logger.LogError("request.Id = {0}: Failure is unrecoverable, faulting request.", request.Id);
               request.State = State.Faulted;
           }
           else
           {
               request.State = State.CreateResponsePart2;
           }
         }
         break;

      case State.SendResponse:
         logger.LogDebug("request.Id = {0}: Trying to send response.", request.Id);
         if (TrySendResponse(request)) request.State = State.ResponseSent;
         break;

      case State.ResponseSent:
         break;

      case State.Faulted:
         logger.LogCritical("request.Id = {0}: Request faulted.", request.Id);
         break;

      default:
         throw new ArgumentOutOfRangeException("request.State");
   }
   logger.LogDebug("request.Id = {0}: Request transitioned to state {1}.", request.Id, request.State);
   if (request.State != State.ResponseSent && request.State != State.Faulted)
   {
      logger.LogDebug("request.Id = {0}: Re-enqueuing request for further evaluation.", request.Id);
      requests.Enqueue(request);
   }
   else
   {
      logger.LogDebug("request.Id = {0}: Request evaluation is complete, not re-enqueuing.", request.Id);
   }
}

What a mess! This code quickly starts getting bloated. In addition, not every failed state evaluation is exceptional; maybe a state is polling another process and can't transition until that process is ready. As the result of each state evaluation grows beyond a simple yes/no (true/false), we end up with a state machine where a state can have multiple outbound transitions. This makes for ugly code and too much coupling: all the state evaluation logic lives in the same class, inside one huge switch statement. We could get around the extensive logging by using dependency injection, but what do we inject? There is no consistent call site signature to inject into. The ever-growing case statements could be extracted into their own methods, but then readability suffers. This sucks.

You may be saying, "Well, you obviously could solve it by ..." and I would agree with you. This code ugliness could be solved many different ways, and it is intentionally crap for the purpose of this post. The major problems I faced were:

  • A large state machine object and large code blocks.
  • Lack of symmetry in state handling.
  • Multiple method postconditions that couldn't be expressed by the boolean return result alone.
  • Coupling of state transition logic, business logic and diagnostics.

I knew something was wrong, but I wasn't quite sure how to solve it without adding more complexity to the system and letting readability suffer. As someone who has had to spend hours reading other people's unreadable code, I didn't want to commit the same sin.

Looking For A Solution

In university, they teach you how to be a good computer scientist; you learn complexity analysis, synthetic languages and the theoretical underpinnings of computation. None of this, though, really prepares you to be a software engineer. I could concoct my own system, but why do that when I can stand on the shoulders of giants?

I had always read or heard references to the Gang of Four book, listened to talks by the original authors, and become familiar with some of the more famous patterns (Factory and Singleton come to mind). Maybe there was a solution in there; I couldn't be the first one to come across this simple design problem. And there I found it: the State design pattern.

The design is pretty simple. You have a context that is used by the end user, and the states themselves, wrapped by the context. The context can have a range of methods that behave differently based on the concrete state type in use at that moment (e.g. the behavior of a cursor click in a graphics editor). I modified this design, using a single method to abstract the workflow and act as a procedural agent for processing multiple state machines.

The Code

The first step was to construct a state object that will be the super type to all of my concrete states.

public abstract class StateBase   
{
   // Let the concrete type decide what the next transition state will be.
   protected abstract StateBase OnExecute();  

   public StateBase Execute()
   {
      // Can add diagnostic information here.
      return this.OnExecute();   
   }   
}

Next I need a context class that can create and run the state machine.

public abstract class StateContextBase
{
   private StateBase state;   

   protected abstract StateBase OnCreate();
   protected abstract StateBase OnExecuted(StateBase nextState);
   protected abstract bool OnIsRunning(StateBase state);

   public StateContextBase(StateBase state)
   {
      this.state = state;
   }

   public StateContextBase Execute()
   {
      // Need to create the state machine from something.
      if (this.state == null)
      {
         // We will get to this later.
         this.state = this.OnCreate();
      }
      // Let the concrete context decide what to do after a state transition.
      this.state = this.OnExecuted(state.Execute());
      return this;
   } 
 
   public bool IsRunning()
   {
      // Have the concrete type tell us when it is in the final state.
      return this.OnIsRunning(this.state);
   }
}

While glossing over the details, here is what this will look like at the application's entry point.

class Program
{
   static void Main(string[] args)
   {
      // Will need to get it from somewhere but won't worry about this for now.
      var requests = Enumerable.Empty<StateContextBase>();

      // Can be changed to false on an exit call.
      var running = true;
      while (running)
      {    
         requests = requests
            .Where(r => r.IsRunning())
            .Select(r => r.Execute());
      }
   }
}

That is beautiful! All I see is the state machine decider logic and I don't even need to be concerned with what type of state machines are running.

So let's dive into the details. First, there is the creation of the state machine in memory. We have to get this from somewhere, so let's add another abstraction on top of our StateBase super type: something that can be persisted in case the process crashes, and that can be accessed across many systems (e.g. a database).

In my case, I used the Entity Framework ORM, which is based on the unit of work and repository design patterns. There is a context (DataContext) from which I will get my model object (UserRequest) to figure out the current state. A unique key (UserRequest.Id : Guid) will be used to identify the persisted object. We won't concern ourselves with why this is just unique and not an identity key (that could be another post), but it basically comes down to the object's initial creation at runtime not relying on any persistence store for uniqueness.

public class DataContext
   : System.Data.Entity.DbContext
{
   public DbSet<UserRequest> UserRequests { get; set; }

   public DataContext()
      : base("name=DataContext")
   {
   }
}
public abstract class PersistedStateBase<TEntity>
   : StateBase
   where TEntity : class
{
   private Guid id;

   protected abstract StateBase OnExecuteCommit(DataContext context, Guid id, TEntity entity);
   protected abstract TEntity OnExecuteCreate(DataContext context, Guid id);
   protected abstract StateBase OnExecuteRollback(DataContext context, Guid id, TEntity entity);

   public PersistedStateBase(Guid id)
   {
      this.id = id;
   }

   protected override StateBase OnExecute()
   {
      // Also consider exceptions thrown by DataContext.
      StateBase nextState = this;
      using (var context = new DataContext())
      {
         TEntity entity = null;
         try
         {
            entity = this.OnExecuteCreate(context, this.id);
            nextState = this.OnExecuteCommit(context, this.id, entity);
            context.SaveChanges();
         }
         catch (Exception ex)
         {
            // Handle exception.
            nextState = this.OnExecuteRollback(context, this.id, entity);
         }
      }
      return nextState;
   }
}

The model object (UserRequest, our entity type) will hold the state as an enumeration (UserRequest.State) and contain all the data needed for processing through the state machine.

public enum UserRequestState
{
   None = 0,  
   Receive = 1,  
   CreateResponse = 3,
   SendResponse = 4,
   ResponseSent = 5,
   Faulted = -1,  
}
[DataContract]
public class UserRequest
{
   [DataMember]
   public Guid Id { get; private set; }
   [DataMember]
   public UserRequestState State { get; private set; }

   // Other properties here like the location of the user request and other metadata.

   private UserRequest()  // Required by EF to create the POCO proxy.
   {}

   public UserRequest(Guid id, UserRequestState state)
   {
      this.Id = id;
      this.State = state;
   }
}

Now let's implement our first state using the types we have created.

public class ReceiveState
   : PersistedStateBase<UserRequest>
{
   public ReceiveState(Guid id)
      : base(id)
   {}
  
   protected override StateBase OnExecuteCommit(DataContext context, Guid id, UserRequest entity)
   {      
      var successful = false;
      var faulted = false;
      // Receive user request and decide whether successful, unsuccessful with retry or
      // unrecoverable/faulted. 
      if (successful)
      {
         return new CreateResponseState(id);
      }
      else
      {
         return faulted ? new FaultedState(id) : this;
      }
   }

   protected override UserRequest OnExecuteCreate(DataContext context, Guid id)
   {
      // Get model object
      return context.UserRequests.Find(id);
   }

   protected override StateBase OnExecuteRollback(DataContext context, Guid id, UserRequest entity)
   {
      // Rollback any changes possibly made in the OnExecuteCommit method and attempt recovery,
      // if possible, in this method. For this example, we will just return the current state.
      return this;
   }
}
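
Terminal states can be trivial by comparison. Here is a minimal sketch of what FaultedState might look like (illustrative; a real implementation could also log or raise an alert):

public class FaultedState
   : StateBase
{
   private readonly Guid id;

   public FaultedState(Guid id)
   {
      this.id = id;
   }

   protected override StateBase OnExecute()
   {
      // Terminal state: perform no work and never transition away.
      return this;
   }
}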

We also need to make our state context concrete with the type below. This tends to involve more wiring, since type-per-state doesn't really translate well to an ORM. This class could be greatly simplified with attributes on the state types designating the enumeration value each maps to.

public class UserRequestContext
   : StateContextBase
{
   private static Dictionary<Type, UserRequestState> typeToDbState;
   private static bool databaseRead = false;

   public Guid Id { get; private set; }

   static UserRequestContext()
   {
      databaseRead = false;
      typeToDbState = new Dictionary<Type, UserRequestState>()
      {
         { typeof(ReceiveState), UserRequestState.Receive },
         { typeof(CreateResponseState), UserRequestState.CreateResponse},
         { typeof(SendResponseState), UserRequestState.SendResponse },
         { typeof(ResponseSentState), UserRequestState.ResponseSent },
         { typeof(FaultedState), UserRequestState.Faulted },
      };
   }

   public UserRequestContext(Guid id)
      : base(null)
   {
      this.Id = id;
   }

   public static IEnumerable<Guid> GetRunningIds()
   {
      if (UserRequestContext.databaseRead)
      {
         var ids = Enumerable.Empty<Guid>(); // Get from message queue.    
         return ids;
      }
      else
      {
         using (var dataContext = new DataContext())
         {
            var ids = dataContext.UserRequests
               .Where(u => 
                  u.State != UserRequestState.ResponseSent &&
                  u.State != UserRequestState.Faulted)
               .Select(u => u.Id)
               .ToArray(); // Force evaluation.

            UserRequestContext.databaseRead = true;

            return ids;
         }
      }
   }

   protected override bool OnIsRunning(StateBase state)
   {
      return !(state is ResponseSentState || state is FaultedState);
   }

   protected override StateBase OnCreate()
   {
      using (var dataContext = new DataContext())
      {
         // Maps persisted state enumeration to runtime types.
         var entity = dataContext.UserRequests.Find(this.Id);
         switch (entity.State)
         {
            case UserRequestState.Receive:
               return new ReceiveState(this.Id);

            case UserRequestState.CreateResponse:
               return new CreateResponseState(this.Id);

            case UserRequestState.SendResponse:
               return new SendResponseState(this.Id);

            case UserRequestState.ResponseSent:
               return new ResponseSentState(this.Id);

            case UserRequestState.Faulted:
               return new FaultedState(this.Id);

            default:
               throw new ArgumentOutOfRangeException();
         }
      }
   }

   protected override StateBase OnExecuted(StateBase nextState)
   {
      // Run any other deciding logic in here that is independent 
      // of the states themselves (eg. logging, perf counters).

      return nextState;      
   }
}

Finally, let's come full circle and show what the application entry point will look like once all is said and done.

class Program
{
   static void Main(string[] args)
   {
      var requests = Enumerable.Empty<StateContextBase>();

      var running = true;
      while (running)
      {         
         requests = requests
            .Where(r => r.IsRunning())
            // Append new user requests found.
            .Concat(UserRequestContext
               .GetRunningIds()
               .Select(i => new UserRequestContext(i)))
            .Select(r => r.Execute());
      }
   }
}

Ahhhh Yeah...

After fleshing it all out, I really got that satisfying feeling you get as a software engineer when you know you made the right design decisions. I isolated my business logic into its own types (e.g. ReceiveState), separated it from the state machine transition logic, added symmetrical handling of persistence logic by layering it on top of the state type (PersistedStateBase), and contained the persistence-runtime bridge (from UserRequest to the PersistedStateBase subtypes) in its own type (UserRequestContext). If I want to add more states, I simply add to the model's state enumeration (UserRequest.State) and update the state context (UserRequestContext). If I want to change the transition logic, all I need to do is go to the concrete state type itself (e.g. ReceiveState) and feel confident that my variables are all scoped correctly. No coupling, no excessive mutations and no excessive side effects.

Using The Right Tool

This design pattern isn't for every state machine problem. In simple cases it can definitely be overkill; you can see that a fair bit of starter code is needed. However, if you find yourself designing a state machine with multiple outbound transitions and final states, this modified pattern could be the right fit for you.
