Skip to main content

Memory efficient queuing of variable length elements

In embedded environments memory can be a critical driver of the design of data structures and containers. Computing resources have been expanding steadily each year but there are still a wide range of systems with far less than a megabyte of memory.

On systems with tens of kilobytes of memory, structures are often designed to be compact to maximize data density. Rather than splurging on memory aligned elements that would be faster for the processor to access, a developer will typically use types with minimal sizes based on the known range of values that the element is intending to hold.

Fixed sized buffers

At my day job a fixed size pool of messages was implemented to hold message data. While this achieved one design goal of using statically allocated buffers, avoiding dynamic allocations that might fail at runtime, it isn't efficient if there is a wide range of message sizes. It isn't efficient because each message uses a message buffer. With small message sizes the buffers are mostly unused.

Illustration of the efficiency of storing data in fixed sized buffers
Illustration of the efficiency of storing data in fixed sized buffers


It may be possible to better tune the sizes of the fixed message buffers to the sizes of the data messages. The difficulty is that the expected range of data sizes in the system can run from ~10 bytes to as much as 2k. Whatever buffer size is picked is effectively the upper limit of the size of a single message under the simple one message-one buffer approach being used today. Rather than trying to size the buffers appropriately I've been looking for a more efficient data structure that will handle the wide range of message sizes without manual tuning.

Circular buffer queue

For the initial replacement of this fixed pool I decided to try out a circular buffer of messages with a length header in front of each message. To implement this I'm using two classes, one class to implement the circular buffer with methods for reading and writing data, CircularBufferMemoryStream, and another to implement the circular queue on top of the circular buffer, called CircularQueue, to layer the message header handling on top of the circular buffer.

The circular queue is a circular buffer with headers that indicate the length of the data to follow as shown in the illustration below.

Circular buffer queue illustration showing data headers and queued data
Circular buffer queue illustration showing data headers and queued data


Some of the additional complexities of circular buffer implementations involve wrapping data at the buffer boundaries and detecting whether the buffer is full or empty. Wikipedia's article on circular buffers does a great job of explaining how circular buffers work and various ways to determine how much data is in the buffer.

Your implementation may have different needs but I chose to use an additional boolean flag to indicate full vs. empty because I wanted the buffer to be able to hold the number of bytes that were passed in during its construction. The always keep one slot open approach would have meant that the buffer could have held (size - 1) number of elements. Using the flag approach of indicating full vs. empty means the user of the class can store as many bytes as they made available to the circular buffer instance when it was created. Using the "keep one slot open approach" would have meant the user would have had to either allocate a buffer one byte larger than they needed, or store one less byte in the queue. This seemed more complex than having the class keep an internal "full" flag.

Zero copy

I've always liked the idea of zero copy. Its tough to beat the efficiency and performance of not doing something.

Avoiding memory copying almost always improves performance. A queue with zero copy support means there is no need for a local temporary buffer.

As an additional feature, the CircularQueue can provide zero copy access to its data. The CircularQueue can provide a way to peek into the next message to be dequeued by returning a read only instance of CircularMemoryStream. This enables the user of the CircularQueue to be able to process message data in a stream fashion without having to copy it out of the CircularQueue into another buffer. This saves either stack or static/heap memory space and saves the CPU cycles that would have gone to performing the additional copy.

C++ class interface

Here are some of the headers for the queue and circular stream classes to show the queue and circular buffer might be implemented. These class interfaces show the kinds of access that could be provided to users. ICircularQueue::PeekMessage() provides support for a zero copy approach. This method returns a CircularMemoryStream of the next queued message. This enables the data to be processed directly out of the ICircularQueue without having to copy it into a temporary buffer for processing. ICircularQueue::ReadAndDiscard() is then used to discard the message after processing, again without having to copy it into a temporary buffer.

#pragma once

#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <algorithm>
#include <assert.h>

/**
 * Circular memory stream
 *
 * Data is read from the tail and written to the head
 *
 * A boolean is used to track whether the buffer is full in the case where
 * head == tail to avoid using a byte of the input buffer. This makes the
 * available space more natural for the end user, what they pass in is
 * what they can store.
 */
class CircularMemoryStream
{
public:
    /**
     * Creates a null stream configured for readonly access.
     *
     * Use 'SetBufferAndLength()' to assign a buffer and control
     * whether the stream is read only or support writes.
     */
    CircularMemoryStream();

    CircularMemoryStream(void *buffer, size_t bufferSize);

    CircularMemoryStream(const void *buffer, size_t bufferSize);

    /**
     * Configures the stream for reading and writing from/to the buffer given by pBuffer
     */
    void SetBufferAndLength(void *pNewBuffer, size_t newBufferSize);

    /**
     * Configures the stream for read only access into the buffer given by pBuffer
     */
    void SetBufferAndLength(const void *pNewBuffer, size_t newBufferSize);

    size_t PeekUint16(uint16_t &val) const;

    /**
     * Configure an instance that represents a sub-section of the current stream,
     * starting with the current position and going forward 'bytesToRead'
     * @return true if successful, false if there aren't enough bytes available
     */
    bool Peek(CircularMemoryStream &circularMemoryStream, size_t bytesToRead) const;

    /**
     * Configure an instance that represents a sub-section of the current stream,
     * starting with the current position + offsetFromTail, and going forward 'bytesToRead'
     * @return true if successful, false if there aren't enough bytes available
     */
    bool Peek(CircularMemoryStream &circularMemoryStream, size_t offsetFromTail, size_t bytesToRead) const;

    /**
     * An optimized version of the routine for writing 8 bit values
     */
    size_t WriteUint8(uint8_t val);

    /**
     * Write data into the circular buffer. Writes in excess of SpaceAvailable(),
     * eg. partial writes, will be rejected.
     */
    size_t Write(const void *source, size_t bytesToWrite);

    size_t Read(void *destination, size_t bytesToRead);

    /**
     * Read data but discard it
     */
    size_t ReadAndDiscard(size_t bytesToRead);

    /**
     * @return Number of bytes stored in the stream,
     * available for reading
     */
    size_t SpaceUsed() const;

    /**
     * @return Number of bytes available for writing into the stream
     */
    size_t SpaceAvailable() const;

    size_t GetCapacity() const;

    bool IsReadOnly() const;
};


#pragma once

#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <algorithm>
#include <assert.h>

/**
 * Linear memory circular queue. 
 */
class ICircularQueue
{
public:
    /**
     * Writes either complete successfully or fail
     */
    virtual WriteStatus Write(const void *source, size_t bytesToWrite) = 0;

    /**
     * Read the next message from the queue
     * @param destination where the message is stored
     * @param destinationSize the largest message that can be stored in destination
     * @param bytesRead the number of bytes in the message stored in destination
     * @return ReadStatusSuccess if the message was read and bytesRead was updated, otherwise
     * destination and bytesRead are untouched and an error status is returned
     */
    virtual ReadStatus Read(void *destination, size_t destinationSize, size_t &bytesRead) = 0;

    /**
     * Read and discard the next message from the queue.
     * @retval ReadStatusSuccess if there was a message to read and discard in which
     *   case bytesRead is updated, otherwise bytesRead is untouched.
     */
    virtual ReadStatus ReadAndDiscard(size_t &bytesRead) = 0;
    
    /**
     * @return true if there is a message, false if not
     */
    virtual bool PeekNextMessageSize(size_t &nextMessageSize) const = 0;

    /**
     * Set circularMemoryStream to point at the next message in the queue
     */
    virtual bool PeekMessage(CircularMemoryStream &circularMemoryStream) const = 0;

    /**
     * @return the largest message that can be queued
     */
    virtual size_t GetCapacity() const = 0;

    /**
     * @return the amount of data in the queue
     */
    virtual size_t SpaceUsed() const = 0;

    /**
     * @return Largest number of bytes that can be written via Write()
     */
    virtual size_t SpaceAvailable() const = 0;
};

Comments

  1. This comment has been removed by a blog administrator.

    ReplyDelete
  2. As an ultra-low-cost carrier, Spirit Airlines focuses on providing low fares by offering a basic level of service and charging additional fees for optional services and amenities.
    https://touvarism.com/is-spirit-airlines-safe/

    ReplyDelete

Post a Comment

Popular posts from this blog

Debugging an imprecise bus access fault on a Cortex-M3

This information may apply to other cortex series processors but is written from practical experience with the Cortex-M3. Imprecise bus access faults are ambiguous, as noted by the term "imprecise". Compared to precise bus errors, imprecise errors are much trickier to debug and especially so without a deep understanding of arm processors and assembly language. Imprecise and precise flags are found in the BusFault status register, a byte in the CFSR (Configurable Fault Status Register). BusFault status register bits The definition for imprecise and precise bits is: [2] IMPRECISERR Imprecise data bus error: 0 = no imprecise data bus error 1 = a data bus error has occurred, but the return address in the stack frame is not related to the instruction that caused the error. When the processor sets this bit to 1, it does not write a fault address to the BFAR. This is an asynchronous fault. Therefore, if it is detected when the priority of the current pr

Graco Swing By Me - Battery to AC wall adapter modification

If you have one of these Graco battery powered swings you are probably familiar with the cost of C batteries! The swing takes four of them and they only last a handful of days. I'm not sure if the newer models support being plugged into the wall but ours didn't. If you are a little familiar with electronics and soldering, here is a rough guide on how you can modify yours to plug in! I wasn't sure how exactly to disassemble the swing side where the batteries were. I was able to open up the clamshell a bit but throughout this mod I was unable to determine how to fully separate the pieces. I suspect that there is some kind of a slip plate on the moving arm portion. The two parts of the plastic are assembled and the moving arm portion with the slip plate is slid onto the shaft. Because of the tension in that slip plate it doesn't want to back away, and because of the mechanicals that portion of the assembly doesn't appear accessible in order to free it. I was