
Debugging an imprecise bus access fault on a Cortex-M3

This information may apply to other Cortex-series processors, but it is written from practical experience with the Cortex-M3.

Imprecise bus access faults are ambiguous, as the term "imprecise" suggests. Compared to precise bus errors, they are much trickier to debug, especially without a deep understanding of Arm processors and assembly language.

The imprecise and precise flags are found in the BusFault Status Register (BFSR), which is one byte (bits [15:8]) of the CFSR (Configurable Fault Status Register).

BusFault status register bits


The definition for imprecise and precise bits is:

[2] IMPRECISERR
Imprecise data bus error:
0 = no imprecise data bus error
1 = a data bus error has occurred, but the return address in the stack frame is not related to the instruction that caused the error.
When the processor sets this bit to 1, it does not write a fault address to the BFAR.
This is an asynchronous fault. Therefore, if it is detected when the priority of the current process is higher than the BusFault priority, the BusFault becomes pending and becomes active only when the processor returns from all higher priority processes. If a precise fault occurs before the processor enters the handler for the imprecise BusFault, the handler detects both IMPRECISERR set to 1 and one of the precise fault status bits set to 1.
[1] PRECISERR
Precise data bus error:
0 = no precise data bus error
1 = a data bus error has occurred, and the PC value stacked for the exception return points to the instruction that caused the fault.
When the processor sets this bit to 1, it writes the faulting address to the BFAR.
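
As a minimal sketch of how these bits can be checked in practice, the C below decodes a raw CFSR value into the BusFault flags described above. The register addresses and bit positions follow the ARMv7-M architecture; the `bus_fault_info` type and `bus_fault_decode` function are hypothetical names made up for this illustration. The decoder takes the CFSR value as a parameter, so on the target you would pass it the word read from `SCB_CFSR_ADDR`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural addresses on ARMv7-M (only meaningful on the target). */
#define SCB_CFSR_ADDR 0xE000ED28u /* Configurable Fault Status Register */
#define SCB_BFAR_ADDR 0xE000ED38u /* BusFault Address Register */

/* The BusFault status byte occupies CFSR bits [15:8]. */
#define BFSR_SHIFT       8u
#define BFSR_PRECISERR   (1u << 1)
#define BFSR_IMPRECISERR (1u << 2)
#define BFSR_BFARVALID   (1u << 7)

/* Hypothetical summary type for this example. */
typedef struct {
    bool precise;    /* PRECISERR: stacked PC points at the faulting instruction */
    bool imprecise;  /* IMPRECISERR: stacked PC is unrelated to the cause */
    bool bfar_valid; /* BFARVALID: BFAR holds the faulting data address */
} bus_fault_info;

/* Decode the BusFault flags from a raw 32-bit CFSR value. */
static void bus_fault_decode(uint32_t cfsr, bus_fault_info *out)
{
    assert(out != NULL);
    uint32_t bfsr = (cfsr >> BFSR_SHIFT) & 0xFFu;
    out->precise    = (bfsr & BFSR_PRECISERR) != 0u;
    out->imprecise  = (bfsr & BFSR_IMPRECISERR) != 0u;
    out->bfar_valid = (bfsr & BFSR_BFARVALID) != 0u;
}
```

On the target, a fault handler might call `bus_fault_decode(*(volatile uint32_t *)SCB_CFSR_ADDR, &info)` and only read the BFAR when `info.bfar_valid` is set.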

An imprecise error is most often caused by a write to an invalid address. Because writes can be buffered, the actual bus write can complete one or more instructions after the store instruction that issued it. This delay is what makes the fault imprecise: by the time the error is reported, the current instruction is no longer the instruction that caused it.

A good first debugging step is to determine the revision of the Cortex-M3 core. Revision 2 cores have the Auxiliary Control Register (ACTLR); older revision 1 cores lack this register. If you are using an r2 core, disable write buffering at startup by setting the DISDEFWBUF bit in the ACTLR to 1. This slows the execution of your application code, but it converts difficult-to-locate imprecise faults into precise faults. With a precise fault, the stacked PC points to the instruction that caused the fault, and the BFAR holds the address that instruction tried to access. From there you should be able to debug the issue by looking at the assembly code and call chain, and by putting a breakpoint on the offending line of code to examine the cause.
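
The revision check and the DISDEFWBUF write can be sketched in C roughly as follows. The addresses (CPUID at 0xE000ED00, ACTLR at 0xE000E008), the Cortex-M3 part number 0xC23, and DISDEFWBUF being bit 1 come from Arm's documentation; the function names are hypothetical, and the ACTLR is passed in as a pointer purely so the logic can be exercised off-target. Treat it as a starting point under those assumptions, not a drop-in startup routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural addresses (only meaningful on the target). */
#define SCB_CPUID_ADDR   0xE000ED00u /* CPUID base register */
#define SCS_ACTLR_ADDR   0xE000E008u /* Auxiliary Control Register */
#define ACTLR_DISDEFWBUF (1u << 1)   /* disable default write buffering */

/* True when the CPUID value identifies a Cortex-M3 at r2p0 or later. */
static bool cpuid_is_m3_r2_or_later(uint32_t cpuid)
{
    uint32_t partno  = (cpuid >> 4) & 0xFFFu; /* 0xC23 = Cortex-M3 */
    uint32_t variant = (cpuid >> 20) & 0xFu;  /* the "r" in rNpM   */
    return partno == 0xC23u && variant >= 2u;
}

/*
 * Set DISDEFWBUF if the core revision has an ACTLR.
 * Returns true only if the bit reads back as set.
 */
static bool disable_write_buffer(uint32_t cpuid, volatile uint32_t *actlr)
{
    assert(actlr != NULL);
    if (!cpuid_is_m3_r2_or_later(cpuid)) {
        return false; /* r0/r1 cores: no ACTLR to write */
    }
    *actlr |= ACTLR_DISDEFWBUF;
    return (*actlr & ACTLR_DISDEFWBUF) != 0u;
}
```

On the target, the call would look like `disable_write_buffer(*(volatile uint32_t *)SCB_CPUID_ADDR, (volatile uint32_t *)SCS_ACTLR_ADDR)`, made as early as possible in startup. Checking the return value is a cheap way to confirm the write actually took effect on your part.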

If, like me, you are using a revision of the Cortex-M3 that lacks the ACTLR and the ability to disable write buffering, such as the one in the STM32F10x series, you'll have to move to a much more time-consuming approach.

Start by determining under what conditions your system sees imprecise faults. Reproducible faults are debuggable faults. Bisect the code with prints or breakpoints until you've narrowed down where the fault occurs, then switch to single-stepping through each instruction. As you step through the code, record the instruction addresses. At some point you'll step and the processor will jump to the HardFault exception handler. When that happens, restart the system, reproduce the error, and start single-stepping from the last valid address you recorded. Eventually you should determine precisely where the fault happens. The offending instruction will be within a few instructions of the point at which the processor jumps to the HardFault exception.
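
One way to shorten the bisection, which is my addition and not part of the approach described above, is to have the fault handler save the stacked PC and LR so you have a starting address without stepping all the way there. The sketch below assumes the basic ARMv7-M exception frame layout (r0, r1, r2, r3, r12, lr, pc, xPSR) and that your handler already passes in the correct stacked stack pointer (MSP or PSP, chosen from bit 2 of EXC_RETURN); that assembly shim is not shown. The names `fault_site`, `last_fault`, and `fault_record` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word offsets within the basic ARMv7-M exception stack frame. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR,
    FRAME_WORDS
};

/* Hypothetical record to inspect from the debugger after a fault. */
typedef struct {
    uint32_t pc; /* where the fault was taken, not necessarily the cause */
    uint32_t lr; /* return address of the interrupted function */
} fault_site;

static volatile fault_site last_fault;

/* Copy the stacked PC and LR out of the exception frame. */
static void fault_record(const uint32_t *frame)
{
    assert(frame != NULL);
    last_fault.pc = frame[FRAME_PC];
    last_fault.lr = frame[FRAME_LR];
}
```

For an imprecise fault, the recorded PC is only where the exception was taken, so the store that caused it should be looked for among the instructions shortly before that address.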

Debugging imprecise faults isn't easy. I'd recommend using a Cortex-M3 processor that supports disabling write buffering via the ACTLR. I like the STM32F10x series processors, but the hundred or more hours I've spent debugging imprecise bus faults over the past few years haven't been fun. Hope this helps.

Comments

  1. Thank you! On my M4, this allowed me to find my issue. May we all share knowledge that helps folks.

    Replies
    1. I'm glad you found it helpful. If you have any improvements in the approach feel free to post them here in the comments for others to learn from.

  2. the first instruction after main()
    *(uint8_t *)0xe000ed08 |= 2; //setting the DISDEFWBUF bit to 1
    doesn't do anything

  3. Another thumbs up here! Great help, solved my issue in no time.
    Thanks a lot!
