Posts Tagged: C#

A .NET coding puzzle: Can strings change?

The holiday season is upon us, and what better way to celebrate than with a festive coding challenge?
As part of this year’s “C# Advent” event, I’m sharing a tricky little puzzle that combines holiday cheer with a .NET and C# challenge. If you are a seasoned .NET developer who enjoys solving unique challenges, this one is for you.

Disclaimer
This coding challenge is designed purely for fun and to explore some of .NET’s internal mechanics. It showcases advanced techniques that can be harmful if not handled carefully, and you should probably never use this example in real-world production code.

The Challenge

In the spirit of the season, let’s take a look at the problem:

void Main()
{
    Initialize();

    string happyHolidays = "Merry Christmas";
 
    System.Console.WriteLine(happyHolidays); // should output: Advent of C#

    static void Initialize()
    {
        // Your implementation here
    }
}

Your task is to modify the “Initialize” method so that the program outputs exactly: Advent of C#

Here are the constraints:

  • You are only allowed to modify the implementation of the Initialize method.
  • You cannot change the method name, parameters, or signature.
  • You cannot alter any other parts of the program.
  • You cannot use System.Console.WriteLine (or similar) in the Initialize method.
  • You cannot change the default output stream.

This is one of my favorite challenges: it requires some out-of-the-box thinking and a deep knowledge of how strings work in the .NET runtime.
Before diving into the solution, give yourself a minute to try to solve it!

Solution Breakdown

When solving this challenge, we need to remember that the primary task is to modify the Initialize method in a way that allows the happyHolidays variable to reflect the string “Advent of C#” instead of “Merry Christmas.” Since we’re restricted to modifying only the Initialize method, the solution lies in:

  1. Exploring how strings work in C#: Strings in .NET are immutable, which means they cannot be changed after they are created. However, their references can be manipulated under certain conditions.
  2. Using an internal implementation detail of the runtime and the C# compiler: Since the “Initialize” method is a static local function, we can’t directly access the “happyHolidays” variable; we have to be clever and use knowledge of how both the C# compiler and the .NET runtime treat strings.

Solution

There are multiple ways to code the solution, some more elegant than others, but all of them rely on the same tricks:

  1. The string “Merry Christmas” is defined as a constant, and the compiler optimizes for efficiency by eliminating duplicate string constants. To achieve this, it ensures that each constant string is interned.
    In .NET, the runtime uses a String Intern Pool, which acts as a cache to store a single instance of each string. This allows multiple references to point to the same string, saving memory. Since strings are immutable, this sharing doesn’t cause any issues.
    While strings created at runtime are not automatically added to the intern pool and require an explicit call to “String.Intern”, constant strings are interned by the compiler by default (see the short demo after this list).
  2. In .NET, strings are immutable, but this immutability is enforced by the provided APIs. While there are no official public APIs to modify the value of a string instance, this doesn’t mean a string’s value is entirely unchangeable. Using C# features like unsafe code and pointers, it is possible to directly access and modify the string’s memory, bypassing the usual immutability constraints.
  3. Since “Merry Christmas” is longer than “Advent of C#”, we can reuse the existing memory allocated for “Merry Christmas” without needing to allocate new memory. Making a string shorter is easier than making it longer.
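
To see interning in action, here is a minimal sketch (not part of the challenge itself) showing that identical string literals share one interned instance, while a string built at runtime does not:

string a = "Merry Christmas";
string b = "Merry Christmas";
string c = new string("Merry Christmas".ToCharArray());

System.Console.WriteLine(object.ReferenceEquals(a, b));                // True  - both literals point to the same interned instance
System.Console.WriteLine(object.ReferenceEquals(a, c));                // False - built at runtime, not interned automatically
System.Console.WriteLine(object.ReferenceEquals(a, string.Intern(c))); // True  - String.Intern returns the pooled instance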

Solution #1 – not the most elegant solution

static void Initialize()
{
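    // Note: compiling this requires enabling unsafe code (e.g. <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the project file)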
    string adventOfCsharp = "Advent of C#";
    string happyHolidays = "Merry Christmas";

    unsafe
    {
        fixed (char* happyHolidaysPointer = happyHolidays)
        {
            for (int i = 0; i < adventOfCsharp.Length; i++)
            {
                happyHolidaysPointer[i] = adventOfCsharp[i];
            }
        }
    }

    typeof(string)
        .GetField("_stringLength", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)
        .SetValue(happyHolidays, adventOfCsharp.Length);
}

Solution explanation

This solution works by directly modifying the memory and metadata of the interned string “Merry Christmas”, replacing its content with “Advent of C#”.

  1. String Memory Modification:
    • The unsafe block and the fixed keyword pin the memory of the happyHolidays string, allowing direct pointer access.
    • A loop overwrites the characters in happyHolidays with the corresponding characters from adventOfCsharp
  2. Updating String Metadata:
      • The typeof(string).GetField() retrieves the private _stringLength field, which stores the length of the string.
      • Using reflection, the length of happyHolidays is updated to match the new content, ensuring that subsequent operations recognize the string’s new length.

     

Solution #2 – the elegant solution

static void Initialize()
{ 
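    // Note: this assumes using System.Runtime.InteropServices (MemoryMarshal) and System.Runtime.CompilerServices (Unsafe)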
    var source = "Merry Christmas";
    var target = "Advent of C#";
    ref var sourceRef = ref MemoryMarshal.GetReference(source.AsSpan());
    var sourceSpan = MemoryMarshal.CreateSpan(ref sourceRef, source.Length);
    target.AsSpan().CopyTo(sourceSpan);
    Unsafe.Subtract(ref Unsafe.As<char, int>(ref sourceRef), 1) = target.Length;
}

Solution explanation

This solution feels safer since it doesn’t use the “unsafe” keyword or direct pointer access; instead it uses the MemoryMarshal and Unsafe APIs, which are less error-prone than raw pointer manipulation.

  1. Memory Modification:
    • Uses MemoryMarshal.GetReference to retrieve a reference to the string’s memory.
    • Creates a mutable Span from the reference, allowing efficient memory modification without explicit pointer arithmetic.
  2. Length Update:
    • Uses Unsafe.As<char, int> and Unsafe.Subtract to directly modify the string’s length, since the length is stored in memory just before the start of the string’s content.
    • Avoids reflection entirely, making the solution faster and more resilient to changes in .NET internals.

Important
Although this solution seems safer since it avoids direct pointer manipulation and pointer arithmetic, these APIs can still perform operations that violate the runtime’s assumptions, making it inherently risky despite being arguably more elegant.
Neither solution is really safe to use if you are not careful; it is easy to make mistakes, introduce hard-to-track bugs, or even interfere with GC/JIT optimizations.
Use these APIs only in the scenarios they were intended for, and be very careful when you do.

Key Takeaways

This challenge highlights the importance of understanding how strings and memory work under the hood. While direct memory manipulation is rarely needed in day-to-day programming, it’s a valuable tool for understanding how things work behind the scenes.

And most importantly, this challenge showcases advanced .NET techniques for learning purposes only. You probably should not use these methods in production code 🙂

Boost your productivity with Visual Studio’s DebuggerAttributes

Visual Studio is a great IDE with a great Debugger. It currently provides one of the best debugging experiences out there. The extensibility and the customizability of Visual Studio make it even better. For example, it provides Attributes for customizing the display of objects during debugging sessions. By utilizing these attributes we can decrease debugging time.

Assume we need to examine a collection of objects of type Scientist during a debugging session. By hovering over the collection, Visual Studio will list the objects using DataTips in the following manner:
List of objects

These DataTips aren’t much help since we can’t see any information about the objects without expanding each one of them. This could increase debugging time if the collection is long.

One way to improve these DataTips is by overriding the ToString method of the Scientist class. For the following implementation of ToString:

public override string ToString() 
    => $"Name: {Name}, Birthday: {Birthday}";

This is how the collection will be displayed:

List of objects with ToString implementation

This is much better, but it isn’t always a viable option; for example, when we don’t want to override the ToString method just for debugging purposes, or when we simply can’t, such as when the class is defined in a third-party assembly.

Fortunately, Visual Studio provides ways to customize how objects are displayed during debug sessions by using Debug Attributes.

With DebuggerDisplayAttribute we can define how objects are displayed when hovering over a variable. The attribute can be set either on the class level (above a class) or on the assembly level for when we can’t modify the class.
The attribute expects a string similar to C# 6’s string interpolation format and an optional target Type that is only necessary when we define the attribute on the assembly level.

For our example, we can define the attribute directly over the class:
DebuggerDisplayAttribute on a class level

Or on an assembly level:
DebuggerDisplayAttribute on an assembly level
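
The screenshots above correspond roughly to declarations like the following sketch (the Scientist members are taken from the earlier examples; pick whichever of the two forms fits your situation):

using System;
using System.Diagnostics;

// Assembly level, e.g. when Scientist lives in an assembly we can't modify
// (assembly attributes go at the top of a file in our own project):
[assembly: DebuggerDisplay("Name: {Name}, Birthday: {Birthday}", Target = typeof(Scientist))]

// Or directly on the class when we own it:
[DebuggerDisplay("Name: {Name}, Birthday: {Birthday}")]
public class Scientist
{
    public string Name { get; set; }
    public DateTime Birthday { get; set; }
    public string[] Fields { get; set; }
    public string[] Awards { get; set; }
}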

Note: Instead of using the Target property, which accepts a Type, we could use TargetTypeName, which accepts the fully qualified name of a class. This is handy when the attribute is applied at the assembly level and that assembly doesn’t reference the assembly defining the type.

In both cases, the result will be the same:
List of objects with a DebuggerDisplay attribute

Defining attributes for each type we might encounter while debugging isn’t too hard, but sometimes it can harm productivity. You’ll have to stop a debug session just to place a new attribute or to change how DebuggerDisplayAttribute displays a type. In addition, this requires a code modification that you’ll have to decide whether or not to commit to source control.

Fortunately, OzCode has a neat feature called Reveal that solves this problem. It can change how objects are displayed at runtime without interrupting the debug session and without modifying your code. Just click the ‘Star’ icon of each property you want to display.
List of objects with OzCode's Reveal feature

Now, what about how each object is displayed when expanding it? By default Visual Studio will display all of its properties and fields:
How objects are displayed by default when expanded

This might be fine most of the time, but sometimes the object has too many fields, or it lacks fields or properties that would be helpful for debugging.

Luckily, Visual Studio provides a way to completely change how each object is displayed when it is expanded. This is done by using the DebuggerTypeProxyAttribute with a class that we write to expose only the necessary properties and fields.

Assuming the Awards of each scientist aren’t necessary for our current debugging session, and that we usually need to calculate how many years have passed since the scientist was born, we implement the following class:

class ScientistTypeProxy
{
    private readonly Scientist _scientist;

    public ScientistTypeProxy(Scientist scientist)
    {
        _scientist = scientist;
    }

    public string Name => _scientist.Name;
    public string[] Fields => _scientist.Fields;
    public int YearsPassed => _scientist.Birthday.YearsPassedSince();
}

We can apply the DebuggerTypeProxyAttribute directly over the Scientist class or on the assembly level, similar to the DebuggerDisplayAttribute.
DebuggerTypeProxyAttribute Class Level

Or on the assembly level:
DebuggerTypeProxy on the Assembly Level
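
Again, as a sketch, the declarations behind those screenshots look something like this:

using System.Diagnostics;

// Assembly level:
[assembly: DebuggerTypeProxy(typeof(ScientistTypeProxy), Target = typeof(Scientist))]

// Or class level, directly on the type we own:
[DebuggerTypeProxy(typeof(ScientistTypeProxy))]
public class Scientist { /* ... */ }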

This also might interrupt the debugging session and require code modifications. Fortunately, OzCode’s Custom Expression to the rescue! It provides the ability to add custom properties to be displayed when expanding an object during debugging. These Custom Expressions won’t modify the actual classes, so no code modifications are necessary.
OzCode's Custom Expression Feature

This is already better, but we still need to expand the array in order to see its values, or at least add a property to the DebuggerTypeProxy class (or a Custom Expression in OzCode) that aggregates the array into a string. This is acceptable, but Visual Studio provides the DebuggerBrowsableAttribute specifically for this. Using this attribute, we can hide members of a class or even expand arrays by default using the RootHidden argument.

By applying the following attribute with RootHidden over the Fields member in ScientistTypeProxy
DebuggerBrowsable over Fields with RootHidden
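
In code, this is simply the attribute over the proxy’s Fields property (a sketch):

class ScientistTypeProxy
{
    // ... other members unchanged ...

    // RootHidden hides the "Fields" node itself and shows the array's
    // elements directly under the expanded Scientist
    [DebuggerBrowsable(DebuggerBrowsableState.RootHidden)]
    public string[] Fields => _scientist.Fields;
}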

We receive the following result:
Result of DebuggerBrowsableAttribute with RootHidden over Fields

This is even better, but we can go further. For complex data, such as collections, values for plotting graphs, or any complex model, properties and fields might not be the best way to debug an object’s values.
With the DebuggerVisualizerAttribute we can get total control over how objects are displayed. Using a WinForms/WPF window, we can even plot graphs, display images and do whatever is needed to better debug such objects.

For example, the following class is a custom visualizer that can show the scientist’s picture if it is available:

public class ScientistVisualizer : DialogDebuggerVisualizer
{
    public ScientistVisualizer()
    {
    }

    protected override void Show(IDialogVisualizerService windowService, IVisualizerObjectProvider objectProvider)
    {
        var scientist = (Scientist)objectProvider.GetObject();

        Window window = new Window()
        {
            Title = scientist.Name,
            Width = 400,
            Height = 300
        };

        var images = new Dictionary<string, string>
        {
            ["Marie Curie"] = "Marie_Curie.jpg",
            ["Maria Goeppert-Mayer"] = "Maria_Goeppert-Mayer.jpg",
            ["Rosalind Franklin"] = "Rosalind_Franklin.jpg",
            ["Barbara McClintock"] = "Barbara_McClintock.jpg",
        };

        if (images.ContainsKey(scientist.Name))
        {
            string imageName = images[scientist.Name];

            window.Background = new ImageBrush(new BitmapImage(new Uri(string.Format("pack://application:,,,/{0};component/Images/{1}", typeof(ScientistVisualizer).Assembly.GetName().Name, imageName))));

            window.WindowStartupLocation = WindowStartupLocation.CenterScreen;
            window.ShowDialog();
        }
    }
}

We have to add the DebuggerVisualizerAttribute for the class we want to visualize. Similarly to the rest of the attributes, we can add it over the class or on the assembly level.

In addition, we have to add the SerializableAttribute over the class being visualized.
DebuggerVisualizerAttribute over class

Or on the assembly level:
DebuggerVisualizerAttribute on the assembly level
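
As a rough sketch, the attribute declarations look something like this (the SerializableAttribute goes over the visualized class in both cases):

using System;
using System.Diagnostics;

// Assembly level:
[assembly: DebuggerVisualizer(typeof(ScientistVisualizer), Target = typeof(Scientist))]

// Or directly over the class:
[Serializable]
[DebuggerVisualizer(typeof(ScientistVisualizer))]
public class Scientist { /* ... */ }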

As a result, by pressing on the “Magnify” icon next to each item, an image of the scientist appears:
DebuggerVisualizer Magnify icon

Examples:
Custom Visualizer visualizing Marie Curie

Custom Visualizer visualizing Barbara Mclintock

Because the custom visualizer has to extend the type DialogDebuggerVisualizer and because it opens a WPF window, we had to add references for:

  • Microsoft.VisualStudio.DebuggerVisualizers.dll
  • PresentationCore.dll
  • PresentationFramework.dll
  • System.Xaml.dll
  • and WindowsBase.dll

Because of this, it is usually better to define the Visualizer in a separate assembly.

This is great, but in the case of a long collection, it might still be hard to find the items we are interested in. Luckily, OzCode has a collection filtering feature and a Search feature that are really handy in these situations.

You can find all of the classes above in this GitHub Gist or in this GitHub Repository.

When disaster strikes: the complete guide to Failover Appenders in Log4net

Log4Net is a cool, stable, fully featured, highly configurable, highly customizable and open source logging framework for .Net.

One of its powerful features is that it can be used to write logs to multiple targets, by using “Appenders”.
An Appender is a Log4Net component that handles log events: it receives a log event each time a new message is logged and ‘handles’ it. For example, a simple file appender will write the new log event to a local file.

Although a lot of Appenders are included in the Log4Net framework, occasionally we won’t find one that fully satisfies our needs. During a project I was working on, I had to implement a failover mechanism for logging: the app had to start by logging to a remote service, and then fall back to the local file system if that remote service wasn’t reachable anymore.

Fortunately, Log4Net allows us to implement our own custom Appenders.
The Appender had to start by writing logs to a remote service, and fall back to a local disk file after the first failed attempt to send a log message to that service.

Implementing the Appender

To create a custom Appender we have to implement the IAppender interface. Although easy to implement, Log4Net makes it even simpler by providing the AppenderSkeleton abstract class, which implements IAppender and adds common functionalities on top of it.

public class FailoverAppender : AppenderSkeleton
{
    private AppenderSkeleton _primaryAppender;
    private AppenderSkeleton _failOverAppender;

    //Public setters are necessary for configuring
    //the appender using a config file
    public AppenderSkeleton PrimaryAppender 
    { 
        get { return _primaryAppender;} 
        set 
        { 
             _primaryAppender = value; 
             SetAppenderErrorHandler(value); 
        } 
    }

    public AppenderSkeleton FailOverAppender 
    { 
        get { return _failOverAppender; } 
        set 
        { 
            _failOverAppender = value; 
            SetAppenderErrorHandler(value); 
        } 
    }

    public IErrorHandler DefaultErrorHandler { get; set; }

    //Whether to use the failover Appender or not
    public bool LogToFailOverAppender { get; private set; }

    public FailoverAppender()
    {
        //The ErrorHandler property is defined in
        //AppenderSkeleton
        DefaultErrorHandler = ErrorHandler;
        ErrorHandler = new FailOverErrorHandler(this);
    }

    protected override void Append(LoggingEvent loggingEvent)
    {
        if (LogToFailOverAppender)
        {
            _failOverAppender?.DoAppend(loggingEvent);
        }
        else
        {
            try
            {
                _primaryAppender?.DoAppend(loggingEvent);
            }
            catch
            {
                ActivateFailOverMode();
                Append(loggingEvent);
            }
        }
    }

    private void SetAppenderErrorHandler(AppenderSkeleton appender)
        => appender.ErrorHandler = new PropogateErrorHandler();

    internal void ActivateFailOverMode()
    {
        ErrorHandler = DefaultErrorHandler;
        LogToFailOverAppender = true;
    }
}

The FailoverAppender above accepts two appenders: a primary appender and a failover appender.

By default it propagates log events only to the primary appender, but if an exception is thrown from the primary appender while logging an event, it stops sending log events to that appender and instead starts propagating log events only to the failover appender.

I’ve used AppenderSkeleton as the type of both the primary and the failover appenders in order to utilize functionality in the AppenderSkeleton class, in this case the ability to handle errors (i.e. exceptions) that are thrown while an appender attempts to log an event.
We can do so by assigning an IErrorHandler object to the ErrorHandler property defined in AppenderSkeleton.

I use the LogToFailOverAppender flag to determine whether we are in ‘normal’ mode or in ‘FailOver’ mode.

The actual logging logic exists in the overridden ‘Append’ method:

protected override void Append(LoggingEvent loggingEvent)
{
    if (LogToFailOverAppender)
    {
        _failOverAppender?.DoAppend(loggingEvent);
    }
    else
    {
        try
        {
            _primaryAppender?.DoAppend(loggingEvent);
        }
        catch
        {
            ActivateFailOverMode();
            Append(loggingEvent);
        }
    }
}

If the LogToFailOverAppender flag is set, it logs events using the failover appender, since an exception has already been thrown. Otherwise, it logs events using the primary appender, and activates failover mode if an exception is thrown while doing so.

The following are the IErrorHandler implementations that I defined and used:

/*
This is important. 
By default the AppenderSkeleton's ErrorHandler doesn't
propagate exceptions
*/
class PropogateErrorHandler : IErrorHandler
{
    public void Error(string message, Exception e, ErrorCode errorCode)
    {
        throw new AggregateException(message, e);
    }

    public void Error(string message, Exception e)
    {
        throw new AggregateException(message, e);
    }

    public void Error(string message)
    {
        throw new LogException($"Error logging an event: {message}");
    }
}
/*
This is just in case something bad happens. It signals 
the FailoverAppender to use the failback appender.
*/
class FailOverErrorHandler : IErrorHandler
{
    public FailOverAppender FailOverAppender { get; set; }
        
    public FailOverErrorHandler(FailOverAppender failOverAppender)
    {
        FailOverAppender = failOverAppender;
    }

    public void Error(string message, Exception e, ErrorCode errorCode)
        => FailOverAppender.ActivateFailOverMode();

    public void Error(string message, Exception e)
        => FailOverAppender.ActivateFailOverMode();

    public void Error(string message)
        => FailOverAppender.ActivateFailOverMode();
}

Testing the Appender

I’ve created a config file you can use to test the appender. These are the important bits:

<!--This custom appender handles failovers. If the first appender fails, it'll delegate the message to the backup appender-->
<appender name="FailoverAppender" type="MoreAppenders.FailoverAppender">
    <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%date [%thread] %-5level %logger - %message%newline"/>
    </layout>

    <!--This is a custom test appender that will always throw an exception -->
    <!--The first and the default appender that will be used.-->
    <PrimaryAppender type="MoreAppenders.ExceptionThrowerAppender" >
        <ThrowExceptionForCount value="1" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%date [%thread] %-5level %logger - %message%newline"/>
        </layout>        
    </PrimaryAppender>

    <!--This appender will be used only if the PrimaryAppender has failed-->
    <FailOverAppender type="log4net.Appender.RollingFileAppender">
        <file value="log.txt"/>
        <rollingStyle value="Size"/>
        <maxSizeRollBackups value="10"/>
        <maximumFileSize value="100mb"/>
        <appendToFile value="true"/>
        <staticLogFileName value="true"/>
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%date [%thread] %-5level %logger - %message%newline"/>
        </layout>
    </FailOverAppender>
</appender>

In this post I’ll only discuss the parts that are relevant to the appender. You can find the full config file here. The rest of the config file is regular Log4Net configuration, which you can read more about here and here.

Log4Net has a feature that gives us the ability to instantiate and assign values to public properties of appenders from the config file using XML. I’m using this feature to instantiate and assign values to both the PrimaryAppender and the FailOverAppender properties.

In this section I’m instantiating the PrimaryAppender:

<PrimaryAppender type="MoreAppenders.ExceptionThrowerAppender" >
    <ThrowExceptionForCount value="1" />
    <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%date [%thread] %-5level %logger - %message%newline"/>
    </layout>        
</PrimaryAppender>

The type attribute’s value is the fully qualified name of the appender’s class.
For our example, I’ve created the ExceptionThrowerAppender for testing purposes. It can be configured to throw an exception once per a certain number of log events.
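
The real ExceptionThrowerAppender lives in the linked repository; a minimal sketch of such a test appender, assuming only the ThrowExceptionForCount property set from the config above, could look like this:

using System;
using log4net.Appender;
using log4net.Core;

public class ExceptionThrowerAppender : AppenderSkeleton
{
    private int _count;

    // Throw an exception once every N appended log events
    // (set from the <ThrowExceptionForCount value="1" /> element in the config)
    public int ThrowExceptionForCount { get; set; } = 1;

    protected override void Append(LoggingEvent loggingEvent)
    {
        if (ThrowExceptionForCount > 0 && ++_count % ThrowExceptionForCount == 0)
        {
            // The exception reaches the FailoverAppender's catch-clause because the
            // PropogateErrorHandler assigned to this appender rethrows it
            throw new InvalidOperationException($"Simulated appender failure on event #{_count}");
        }
    }
}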

In a similar manner, in the following XML I’ve instantiated and configured the FailOverAppender to be a regular RollingFileAppender:

<FailOverAppender type="log4net.Appender.RollingFileAppender">
    <file value="log.txt"/>
    <rollingStyle value="Size"/>
    <maxSizeRollBackups value="10"/>
    <maximumFileSize value="100mb"/>
    <appendToFile value="true"/>
    <staticLogFileName value="true"/>
    <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%date [%thread] %-5level %logger - %message%newline"/>
    </layout>
</FailOverAppender>

I used the following code to create log events:

class Program
{
    static void Main(string[] args)
    {
        XmlConfigurator.Configure();

        var logger = LogManager.GetLogger(typeof(Program));

        for (var index = 0; index < int.MaxValue; ++index)
        {
            logger.Debug($"This is a debug message number {index}");
        }

        Console.ReadLine();
    }
}

I started the program in debug mode and placed a breakpoint inside the ‘Append’ method:

First logging message - goes to the primary appender

Notice how OzCode’s Predict the Future feature marks the if-statement with an X and a red background, telling us that the condition evaluates to false. That means an exception hasn’t been thrown yet by the primary appender.

In order to make figuring out the loggingEvent message value easier, I’ve used OzCode’s Magic Glance feature to view the necessary information in every LoggingEvent object.

Selecting the properties to show

The result:

Magic Glance feature

When we continue the program, the primary appender handles the logging event and throws an exception:

Exception is thrown.

After that exception is propagated by the ErrorHandler, it is handled by the catch clause, which activates failover mode (notice how the current log event is sent to the FailOverAppender as well), so future logging events will be sent only to the FailOverAppender.

FailOverAppender mode is active

This time the if-statement is marked with a green ‘V’. This tells us that the condition evaluates to true and that the if-statement body will execute (sending the logging event to the failover appender).

You can view and download the code by visiting this GitHub Repository.

Summary

Log4Net is a well-known logging framework for .Net. The framework comes with a list of out-of-the-box Appenders that we can use in our programs.
Although these Appenders are usually sufficient for most of us, sometimes you’ll need Appenders that are more customized for your needs.

We saw how we can use Log4Net’s extensibility features to implement our own custom Appenders. In this example, we created a fail-over mechanism for Appenders that we can use to switch the active Appender when it fails to append log messages.

Log4Net is highly extensible and it has many more extensibility features that I encourage you to explore.

Note: this post is published also at OzCode’s blog.

Using LINQ to abstract Bitwise operations

There are 10 types of developers; those who work in a higher level and depend on abstractions, and those who work with the bits and bytes alongside the bare metal.

As a C# developer, I am one of the former, and I usually try to abstract everything I do. A few weeks ago, I figured out that I can use some of these abstraction techniques to solve “low-level” challenges too. To make a long story short, I was able to utilize LINQ to eliminate a rather complicated set of bitwise operations.

The mission

Build a framework for communicating with FPGA (Field-Programmable Gate Array) cards, sending and receiving messages to and from those cards.

One of the framework’s abilities is to serialize messages to byte arrays and deserialize byte arrays received from an FPGA card back into messages, according to an application-specific protocol.

The structure and the parts each message is composed of vary, and are only determined at runtime.

This looks easy enough

Each message is composed of different parts with different lengths.
By serializing a message, we concatenate each part’s value according to its length in bits.

The following is an example of a message that is composed of 3 parts.

Message:

            Part 0    Part 1    Part 2
  Bits      001       1010      00111110000
  Length    3 bits    4 bits    11 bits

In this example, “Part 0” has the value of 0x1, “Part 1” has the value of “0xA” and “Part 2” has the value of “0x1F0”.

The framework doesn’t really care about the content of the messages, but rather cares for their structure.

When deserializing a message, we only need the structure or the schema of the message. For each message-part we need to specify its name, index (its order in the message) and length in bits.

I have therefore introduced the following class:

public class LayoutPart
{
	public string Name { get; set; }
	public int Index { get; set; }
	public int BitCount { get; set; }

	public LayoutPart(string name, int index, int bitCount)
	{
		Name = name;
		Index = index;
		BitCount = bitCount;
	}
}

When serializing a message, each message-part has to specify four properties. The three properties of LayoutPart, in addition to the Value of that part.

Here are the classes I have added:

public class MessagePart : LayoutPart
{
     public int Value { get; set; }

     public MessagePart(string name, int index, int bitCount, int value)
          : base(name, index, bitCount)
     {
          Value = value;
     }
}

In reality, the content’s type wasn’t known beforehand, and it surely wasn’t always an int, but I decided to use an int in all of my examples to keep them simple.

The relevant API is really simple, and it has two methods; Serialize and Deserialize:

public interface IMessageSerializer
{
    byte[] Serialize(IEnumerable<MessagePart> messageValues);
    IEnumerable<MessagePart> Deserialize(IEnumerable<LayoutPart> layout,
                                         byte[] bytes);
}

Bitwise operations: This is really ugly

So far so good 😀

After defining the API, I went on and started implementing it.

The following is an example of serializing the message from the example above

  Name      BitCount   Value   Value in Bits   Value in Bytes
  Part 0    3          1       001             00000001
  Part 1    4          0xA     1010            00001010
  Part 2    11         0x1F0   00111110000     00000001 11110000

The byte array should look like this

  Byte 2     Byte 1     Byte 0
  00000000   11111000   01010001

The first byte is composed of Part 0 (3 bits), Part 1 (4 bits) and the first bit of Part 2 (11 bits). The second byte is composed of bits 1-8 of Part 2, and the third byte is composed of bits 9-10 of Part 2 with 6 more trailing zeros to complete a byte of 8 bits.
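
To double-check that layout, here is a quick sanity check (not from the original post) that assembles the same three bytes with plain bit shifts:

int part0 = 0x1, part1 = 0xA, part2 = 0x1F0;

// Byte 0: Part 0's 3 bits, then Part 1's 4 bits, then the lowest bit of Part 2
byte byte0 = (byte)(part0 | (part1 << 3) | ((part2 & 1) << 7));
// Byte 1: bits 1-8 of Part 2
byte byte1 = (byte)((part2 >> 1) & 0xFF);
// Byte 2: bits 9-10 of Part 2, padded with trailing zeros
byte byte2 = (byte)(part2 >> 9);

System.Console.WriteLine(System.Convert.ToString(byte0, 2).PadLeft(8, '0')); // 01010001
System.Console.WriteLine(System.Convert.ToString(byte1, 2).PadLeft(8, '0')); // 11111000
System.Console.WriteLine(System.Convert.ToString(byte2, 2).PadLeft(8, '0')); // 00000000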

I approached it using the standard-naive way: bitwise operations.
I needed the ability to extract bits from different bytes and bit indexes, then concatenate them together for composing a byte array.

Assuming we are working with System.Int32, how would I extract the first n bits? Easy enough!

value & ((1 << n) - 1)

And how would we extract the last n bits?

(value >> (32 - n)) & ((1 << n) - 1)

What about extracting n bits that are in the “middle”? This is starting to be more complicated.

((value >> startIndex) & ((1 << n) - 1));

I did a little refactoring and defined them in methods.

long ExtractBits(uint value, int n, int startIndex) =>
                ((value >> startIndex) & ((1 << n) - 1));

long ExtractFirstBits(uint value, int n) => ExtractBits(value, n, 0);

long ExtractLastBits(uint value, int n) => ExtractBits(value, n, 32 - n);

This looks a little bit better. I had to use System.Int64 and System.UInt32 for overflow/underflow safety when performing bitwise operations.

Most importantly, this is an over simplified version of what I needed to do. I also had to consider:

  • Concatenating the results
  • Message parts with lengths exceeding 8 bytes (sizeof(long)); the current implementation won’t work for those
  • Transfer the results into a byte array
  • Implementing deserialization, which is even harder

I went on and finished the rest of the implementation. It took me a full day of work (implementing + testing), and it wasn’t as easy as I thought it would be. I kept telling myself that there has to be a better way.

The code wasn’t maintainable, even with 90% test coverage. Not by the version of me from next month or even from next week, and certainly not by another dev from my team. This code would become unmaintainable, ‘don’t touch’ legacy code as soon as I pushed the changes to the repo and marked the task as ‘Done’.

This code is ugly.

legacy code

Oh, much better

I took a step back and tackled the problem from a different angle. I figured that my goal was pretty simple; I have a collection of values, and all I have to do for the Serialize method is to:

  1. Project each message part into bits
  2. Concatenate the bits of all values
  3. Project the result into a byte array

And all I have to do for the deserialize method is to:

  1. Project the byte array into bits.
  2. Slice them according to the length of each message part
  3. Project each slice into the corresponding message part object

I quickly thought about LINQ! Isn’t this one of the scenarios that are easily solved by LINQ? I went on and explored this idea further.

I decided to use the Type-Safe Enum Pattern to represent a Bit (Bit.cs). I also used some LINQ operations from Jon Skeet’s MoreLinq library.
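
A minimal sketch of what such a Bit type could look like (an assumption on my part; the real Bit.cs is the one linked above), with the implicit conversions the extension methods below rely on:

public sealed class Bit
{
    public static readonly Bit On = new Bit(1);
    public static readonly Bit Off = new Bit(0);

    private readonly int _value;

    private Bit(int value) => _value = value;

    // Lets ToBits write "yield return 1 == (@byte & 1)"
    public static implicit operator Bit(bool value) => value ? On : Off;

    // Lets ToByte write "currentValue << 1 | bit"
    public static implicit operator int(Bit bit) => bit._value;

    public override string ToString() => _value.ToString();
}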

For the Serialize method I started by writing the following extension methods:

//Projects a byte into IEnumerable<Bit>
public static IEnumerable<Bit> ToBits(this byte @byte)
{
    for(var index = 0; index < 8; ++index, @byte >>= 1)
    {
        yield return 1 == (@byte & 1);
    }
}

//Projects IEnumerable<Bit> into a byte
public static byte ToByte(this IEnumerable<Bit> bits) =>
    bits.Reverse().Aggregate((byte)0, (currentValue, bit) => 
        (byte)((currentValue << 1 | bit) ));

//Projects a byte[] into IEnumerable<Bit>
public static IEnumerable<Bit> ToBits(this IEnumerable<byte> bytes) 
    => bytes.SelectMany(ToBits);

The following are the steps for the Serialize method:

  1. Project each value into bits, according to the BitCount property of its message-part object
    1. Project the value into bytes
    2. Project the bytes into bits
  2. Group the bits into a group of 8-bit each
  3. Project each group into a byte
  4. Project the IEnumerable of bytes into a byte array

byte[] Serialize(IEnumerable<MessagePart> messageParts)
{
    return messageParts.OrderBy(part => part.Index)
        .SelectMany(part => 
            BitConverter.GetBytes(part.Value)
            .ToBits())
        .Batch(8)
        .Select(slice => slice.ToByte())
        .ToArray();
}

(Did you notice the bug? I’ll discuss it later)

For implementing the deserialize method, I created the following extension methods:

//Projects an array of 4 bytes into an int
public static int ToInt(this byte[] bytes) => 
                            BitConverter.ToInt32(bytes, 0);

//Projects a bunch of 8-bit IEnumerables into an IEnumerable of bytes
public static IEnumerable<byte> 
    ToBytes(this IEnumerable<IEnumerable<Bit>> bits) => bits.Select(ToByte);

The following are the steps for the deserialize method:

  1. Project the byte array into IEnumerable of bits
  2. Slice the bits according to each message-part length and order
  3. Project each slice into 4 bytes (sizeof(System.Int32))
  4. Project each 4 bytes into a System.Int32

IEnumerable<MessagePart> Deserialize(IEnumerable<LayoutPart> layout,
                                     byte[] bytes)
{
    var bits = bytes.ToBits();
    
    return layout.OrderBy(part => part.Index).Select(part =>
    {
        var slice = 
            bits
            .Take(part.BitCount)
            .Batch(8)
            .ToBytes()
            .Pad(sizeof(int), (byte)0)
            .Take(sizeof(int))
            .ToArray();

        bits = bits.Skip(part.BitCount);

        int value = slice.ToInt();

        return new MessagePart(part.Name, part.Index, part.BitCount, value);
    }).ToArray();
}

Looks better, Eh?

Testing the implementation

It isn’t always easy to debug LINQ operations, thus in order to debug, test and visualize my examples, I used OzCode which is an awesome debugging extension for Visual Studio. I installed the EAP (Early Access Program) edition, which includes amazing LINQ debugging capabilities, and put it to work.

This is the code according to the example above

var layout = new[]
{
    new LayoutPart("Part 0", 0, 3),
    new LayoutPart("Part 1", 1, 4),
    new LayoutPart("Part 2", 2, 11),
};

var message = new[]
{
    new MessagePart("Part 0", 0, 3, 1),
    new MessagePart("Part 1", 1, 4, 0xA),
    new MessagePart("Part 2", 2, 11, 0x1F0),
};

var serializer = new MessageSerializer();

var bytes = serializer.Serialize(message);
var messageParts = serializer.Deserialize(layout, bytes);

I used OzCode’s LINQ analyser for visualizing and tracking the LINQ operations.

Here we can see what the actual bits are, and their order in the concatenated message-parts.

OzCode LINQ analysis of the bits produced for each message part
Each value is projected into all 32 of its bits, but according to the BitCount property of Part 0, its value should be projected into only 3 bits.

Fixing the bug:

byte[] Serialize(IEnumerable<MessagePart> messageParts)
{
    return messageParts.OrderBy(part => part.Index)
        .SelectMany(part => 
            BitConverter.GetBytes(part.Value)
            .ToBits()
            .Pad(part.BitCount, Bit.Off) //Added
            .Take(part.BitCount))  //Added
        .Batch(8)
        .Select(slice => slice.ToByte())
        .ToArray();
}

I added Pad and Take to make sure I project exactly the needed number of bits.

OzCode LINQ analysis of the bits after the fix

This looks much better.

Here we can see the value of each byte:
The value of each byte in the serialized array
Testing the Deserialize method, we can see the value of each message part:
The value of each deserialized message part

Summary

Even though actions that require bitwise operations are usually considered “low-level”, I was able to use LINQ, which is considered a “high-level” API, to achieve the same thing. This kind of abstraction saved me time and effort, and made my code more maintainable, readable and clean. I suggest using abstractions whenever you can, and whenever it makes sense.

What I really learned, though, is that we can use skills we acquire in one domain to solve challenges in another.
There is a saying that polyglot programmers write better code than those who specialize in only one language. I think this is yet another example of the same idea: each skill you acquire will make you better at what you do, regardless of how unrelated that skill might seem at the time.

Note: this post is published also at OzCode’s blog.