
Bus contention on a 1-wire bus

Background

I've been playing around with the Maxim DS18B20 digital temperature sensor. These are neat devices that report temperature directly in digital form, as an actual Celsius reading, with a pretty high level of accuracy: +/- 0.5C from -10C to 85C. They are also quite compact. Here is the typical TO92-3 form factor that the DS18B20 comes in.



Why not a thermistor connected to an ADC? Your microprocessor may not have an accurate enough or linear enough ADC to let you attain the desired temperature accuracy. In my case I'm using the Espressif ESP32.


The ESP32 is a great processor but its ADC still needs some work. Code to linearize the ADC results was added earlier this year, but the biggest missing piece is proper calibration of the ADC's Vref. Without this the scaling of ADC values can vary by several percent. The good news is that factory calibration is possible and should be coming.

Thermistors also have a non-linear resistance over temperature.



From my research it looks like the Steinhart-Hart equation is a typical way to correct for this non-linearity. It may also be possible to use a lookup table with points at a number of temperatures.
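As a rough illustration of the lookup table approach, here is a minimal sketch in C that linearly interpolates between table points. The table values are placeholders for a hypothetical 10k NTC thermistor and are not taken from any datasheet, and the names ntc_point_t, ntc_table and ntc_lookup_celsius are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* One calibration point: thermistor resistance at a known temperature. */
typedef struct {
    double r_ohms;
    double temp_c;
} ntc_point_t;

/* ASSUMPTION: illustrative values for a hypothetical 10k NTC, not from a
 * datasheet. Entries are ordered by falling resistance (rising temperature). */
static const ntc_point_t ntc_table[] = {
    { 32650.0,  0.0 },
    { 19900.0, 10.0 },
    { 12490.0, 20.0 },
    { 10000.0, 25.0 },
    {  8057.0, 30.0 },
    {  5327.0, 40.0 },
    {  3603.0, 50.0 },
};

/* Convert a measured resistance to degrees Celsius by linear interpolation
 * between the two surrounding table points. Returns false when the
 * resistance falls outside the range covered by the table. */
bool ntc_lookup_celsius(double r_ohms, double *temp_c)
{
    const size_t n = sizeof(ntc_table) / sizeof(ntc_table[0]);

    for (size_t i = 0; i + 1 < n; i++) {
        const ntc_point_t *hi = &ntc_table[i];     /* higher resistance, colder */
        const ntc_point_t *lo = &ntc_table[i + 1]; /* lower resistance, warmer */

        if (r_ohms <= hi->r_ohms && r_ohms >= lo->r_ohms) {
            double frac = (hi->r_ohms - r_ohms) / (hi->r_ohms - lo->r_ohms);
            *temp_c = hi->temp_c + frac * (lo->temp_c - hi->temp_c);
            return true;
        }
    }
    return false;
}
```

More table points reduce the interpolation error, since the real curve is exponential rather than piecewise linear. You would still need an accurate ADC reading to get the resistance in the first place.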

With a digital temperature sensor you directly receive a temperature value. No ADC or ADC accuracy to worry about, no adjustment for non-linear resistance.
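To show how little work is left on the processor side, here is a minimal sketch of turning the two temperature bytes from the DS18B20 into Celsius. It assumes the default 12-bit resolution, where the reading is a signed 16-bit value in units of 1/16 degree. The function name ds18b20_raw_to_celsius is made up for this example and is not part of any library.

```c
#include <stdint.h>

/* Combine the temperature LSB and MSB into a signed 16-bit value and scale
 * it. ASSUMPTION: the sensor is at its default 12-bit resolution, where one
 * count equals 1/16 degree Celsius. */
double ds18b20_raw_to_celsius(uint8_t lsb, uint8_t msb)
{
    int16_t raw = (int16_t)(((uint16_t)msb << 8) | lsb);
    return raw / 16.0;
}
```

For example, the bytes 0x91 and 0x01 (raw value 0x0191) give 25.0625C, and 0x5E and 0xFF (raw value 0xFF5E) give -10.125C.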

One downside is that the DS18B20 is a bit pricey. On Digikey they are $2.72 for a single piece and $2.10 at quantity 2000.

I'm using this assembly from DFROBOT (https://www.dfrobot.com/product-689.html). Having the sensor at the end of the cable means you can place the sensors in locations where you might not be able to reach if you put the TO92-3 component on a breadboard.




ESP32 1-wire support

David Antliff created a great library for interfacing to the DS18B20 on the ESP32. You can find it in the esp32-ds18b20 repository on GitHub.

1-wire

The DS18B20 communicates using a 1-wire bus. 1-wire buses use a single wire for both power and communications. The "bus" in 1-wire buses indicates that more than one device can be on the same signal/power wire. The single signal/power wire is used for communication to/from the master and to/from the devices.

The master, typically a processor, is in charge of communications and has to grant time slices for devices to respond. Because the bus is shared between the master and a number of devices, timing is important. Timing determines who, either the master or a device, has control of the bus.

During pauses in communication the signal/power wire provides power for devices to charge internal capacitors. When communicating, the devices use this stored energy. Timing is important here as well because each device has a limited amount of stored energy. Depleting a device's energy would cause it to reset and/or not behave correctly.

The term '1-wire' is a bit misleading. You do typically need an additional wire for ground, so at a minimum you need two wires. I say 'typically' because it is possible to use an equipment chassis for the ground path.

Microprocessor ------- 1-wire bus ------- Device signal/power
       Ground --------------------------- Device ground


The DS18B20 also provides an optional dedicated power pin. Providing power to this pin means there isn't a concern about excessive communication depleting the energy stored in the device. In my configuration I had an extra wire available so I'm providing external power and using a three wire configuration.

Microprocessor ------ 1-wire bus ------ Device signal/power
  Power (3.3V) ------------------------ Device power
        Ground ------------------------ Device ground

CRC errors

With David's 1-wire library I was able to get the sensor working and reporting values in about an hour, including wiring the temperature cable. I started seeing these error messages pop up periodically, maybe once or twice a minute, when issuing temperature reads from the DS18B20 device every few seconds:

E (10812382) ds18b20: CRC failed
E (10815772) ds18b20: CRC failed
E (10819162) ds18b20: CRC failed

The CRC mismatch error indicates that something went wrong with the data during communications. Most temperature read operations are working correctly so I'm able to get confirmed correct temperature readings. The frequency of the CRC errors points to something wrong with the wiring, DS18B20 device, or the software that is performing communications. That got me looking at the physical signals used to communicate with the device.
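For reference, the check behind that error message is the 8-bit Dallas/Maxim CRC (polynomial x^8 + x^5 + x^4 + 1) that 1-wire devices append to their ROM code and scratchpad data. Below is a minimal bitwise sketch of that CRC. It is not the code from the esp32-ds18b20 library, and the function name ow_crc8 is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Dallas/Maxim 1-wire CRC8: polynomial x^8 + x^5 + x^4 + 1, processed
 * least significant bit first (reflected polynomial 0x8C), initial value 0. */
uint8_t ow_crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x01) {
                crc = (uint8_t)((crc >> 1) ^ 0x8C);
            } else {
                crc >>= 1;
            }
        }
    }
    return crc;
}
```

Running the CRC over the data bytes followed by the received CRC byte yields 0 when nothing was corrupted, which is a convenient way to check a received message. A single flipped bit anywhere in the message produces a non-zero result.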

I recently purchased a Rigol DS1054Z digital oscilloscope. It's an excellent oscilloscope for hobby use and quite powerful. Low cost digital oscilloscopes have improved dramatically over the past several years. Features that were normally available only in expensive commercial equipment are now available to the hobbyist for a few hundred dollars.


Reset and presence detection

After reading up on the 1-wire protocol and reviewing the scope traces I spotted something odd.

Here is a capture of the 1-wire presence detection. Presence detection is where the master is attempting to see if there are any 1-wire devices on the bus. A 'reset and presence detection' occurs before each communication from the master to DS18B20 so incorrect communications during this operation could be related to the CRC errors the library is reporting.

Circled in white is a strange voltage level seen on the bus. Typically you'd expect the digital line to be either 0V (ground), 3.3V (the power supply voltage), or transitioning between the two. But in the circled area the voltage is ~0.4V. It's a logical low value, as it's less than 0.8V, but it isn't 0V.


Here I've applied some colors. Green represents areas where the bus master (the ESP32 processor) is in control of the bus, and purple where the DS18B20 is in control of the bus. Note that they overlap in the area circled in white.

In this case the ESP32 is attempting to drive the bus to a high state, 3.3V, while the DS18B20 is attempting to pull it down to 0V.
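One way to reason about this is a small truth table of who is driving the bus. The sketch below is a simplified logical model, assuming an external pull-up and a device that can only pull the line low or let go of it. The type and function names are made up for this example and it ignores the actual voltages involved.

```c
#include <stdbool.h>

typedef enum { MASTER_RELEASED, MASTER_DRIVE_LOW, MASTER_DRIVE_HIGH } master_drive_t;
typedef enum { BUS_LOW, BUS_HIGH, BUS_CONTENTION } bus_state_t;

/* Resolve the logical state of the bus. ASSUMPTION: an external pull-up
 * takes the line high whenever nobody is pulling it low. */
bus_state_t resolve_bus(master_drive_t master, bool device_pulls_low)
{
    if (device_pulls_low) {
        /* The device wins a released bus and agrees with a low master,
         * but fights a master that is actively driving high. */
        return (master == MASTER_DRIVE_HIGH) ? BUS_CONTENTION : BUS_LOW;
    }
    return (master == MASTER_DRIVE_LOW) ? BUS_LOW : BUS_HIGH;
}
```

The only combination that results in contention is the master actively driving high while a device pulls low. If the master releases the bus instead, the pull-up does the job of bringing the line high and a device can pull it low without a fight.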

This condition isn't fatal to either the device (the DS18B20) or the master (the ESP32), as evidenced by it occurring on my setup for a number of days. The DS18B20 is designed to handle bus contention, as it is possible to have a dozen 1-wire devices on the same bus and they sometimes do step on each other. The ESP32 likewise isn't affected here. In this case I have a series resistor limiting its current, effectively limiting how hard it can push/pull the voltage of that signal.

But what is going on there? Is this normal 1-wire behavior to have that contention? It turns out that it isn't required and the DS18B20 documents suggest that the master should release the bus.

Looking at the 1-wire ESP32 library, esp32-owb, I found this code in _reset():

        gpio_set_direction(bus->gpio, GPIO_MODE_OUTPUT);
        _us_delay(bus->timing->G);
        gpio_set_level(bus->gpio, 0);  // Drive DQ low
        _us_delay(bus->timing->H);
        gpio_set_level(bus->gpio, 1);  // Release the bus (NOTE: This does not release the bus)
        _us_delay(bus->timing->I);

        gpio_set_direction(bus->gpio, GPIO_MODE_INPUT);  // NOTE: This is where the bus is actually released
        int level1 = gpio_get_level(bus->gpio);
        _us_delay(bus->timing->J);   // Complete the reset sequence recovery
        int level2 = gpio_get_level(bus->gpio);

The orange line, the gpio_set_level(bus->gpio, 1) call, is where the master is trying to pull the bus high. The green line, the gpio_set_direction(bus->gpio, GPIO_MODE_INPUT) call, is where the master switches to input mode and stops driving the bus. This is where the bus is actually released.

If we alter this code to release the bus at the correct location we get this code:

        gpio_set_direction(bus->gpio, GPIO_MODE_OUTPUT);
        _us_delay(bus->timing->G);
        gpio_set_level(bus->gpio, 0);  // Drive DQ low
        _us_delay(bus->timing->H);
        gpio_set_direction(bus->gpio, GPIO_MODE_INPUT); // Release the bus
        gpio_set_level(bus->gpio, 1);  // Reset the output level for the next output
        _us_delay(bus->timing->I);

        int level1 = gpio_get_level(bus->gpio);
        _us_delay(bus->timing->J);   // Complete the reset sequence recovery
        int level2 = gpio_get_level(bus->gpio);

Note that the green line (the switch to input mode) was moved up above the orange line (the write of a high level). We can see the effect this has from another scope trace:

Note the smooth area after the short pulse in the center of the capture and compare that to the lumpy area in the original implementation.

After highlighting the master controlled area in green and the DS18B20 area in purple you can see that the two areas no longer overlap.
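The overlap can also be estimated on paper. The sketch below computes how long the master drives high while the DS18B20 is sending its presence pulse. The numbers used in the example afterwards are assumptions: 480us low and 70us before sampling are the commonly published standard-speed values for the H and I delays, and the DS18B20 datasheet states that the device waits 15us to 60us after the rising edge before pulling low for 60us to 240us. The names here are made up for this example.

```c
#include <stdint.h>

/* Half-open time window [start_us, end_us) in which one party drives the bus. */
typedef struct {
    uint32_t start_us;
    uint32_t end_us;
} drive_window_t;

/* Length of time two drive windows overlap, in microseconds. */
uint32_t overlap_us(drive_window_t a, drive_window_t b)
{
    uint32_t start = (a.start_us > b.start_us) ? a.start_us : b.start_us;
    uint32_t end = (a.end_us < b.end_us) ? a.end_us : b.end_us;
    return (end > start) ? (end - start) : 0;
}

/* Contention during reset: the master drives low for low_us, then drives
 * high for master_high_us before releasing. The device waits
 * device_wait_us after the rising edge, then pulls low for presence_us. */
uint32_t reset_contention_us(uint32_t low_us, uint32_t master_high_us,
                             uint32_t device_wait_us, uint32_t presence_us)
{
    drive_window_t master_high = { low_us, low_us + master_high_us };
    drive_window_t presence = { low_us + device_wait_us,
                                low_us + device_wait_us + presence_us };
    return overlap_us(master_high, presence);
}
```

With a hypothetical device that waits 30us and pulls low for 120us, reset_contention_us(480, 70, 30, 120) gives 40us of contention for the original sequence. Releasing the bus right after the low period corresponds to a master_high_us of 0, and reset_contention_us(480, 0, 30, 120) gives 0.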

We've fixed the bus contention during the presence detection!

But I'm still seeing those CRC errors...

Read Time Slot

Let's look at the communication when the data and CRC are being sent. Devices on a 1-wire bus send data during a 'Read Time Slot' where the master initiates the transfer and the device fills in its data value.

Here is a capture during data communication:
Read time slot with bus contention
You can see that not-quite-0V level showing up again, circled in white. This is just like the issue seen during the 'Reset and presence detection' operation.


Colored version of read time slot with bus contention

And after applying color per the 1-wire protocol specification you can see that the master and DS18B20 are both driving the bus at the same time for quite a while. Even though the DS18B20 is able to pull the line to a logical low, < 0.8V, it isn't a good idea to have the master and DS18B20 working against each other.

Note that this issue is not resulting in data corruption. The master releases the line in advance of the time where the bus value is sampled. The little dip at the right hand side of the white circle is where the master has stopped driving the line. In addition the DS18B20 is able to pull the line low even as the master is attempting to drive it high. The voltage is consistently a logical low throughout the communication of the bit.

Read time slot with minimal bus contention
And here is a capture after correcting the same contention issue in the _read_bit() function.

I've highlighted in white the fact that there is still a very small glitch. The level here, and the location in terms of time, aren't a concern, as the master isn't going to read the value from the bus until much later in time.

The cause of this glitch is unavoidable. The bus master pulls the line low to tell the DS18B20 that it should respond with a 0 or 1 value. The DS18B20 responds by either holding the line low for a 0 or leaving it alone for a 1, in which case the external pull-up brings the line high. When the master releases the line the external pull-up starts pulling it high until the DS18B20 takes over and pulls it low. Because the timing cannot be precisely coordinated there will either be a gap or some overlap. The 1-wire protocol accounts for this by having the master read the bus value after a delay, during which the master has released the bus and the DS18B20 can cleanly signal a low or high value.

Conclusion

Bus contention should be avoided. Fortunately none of the issues found here should have affected 1-wire communications but unfortunately that means that these changes aren't likely to fix the CRC errors I was seeing. This is supported by the fact that I'm still seeing CRC errors at the same rate even after these fixes have been applied. I'm planning to continue debugging the source of the CRC issues.

I've passed the bus contention improvements back to David Antliff so the 1-wire library can be improved. His libraries are well written and easy to use. This relatively minor bus contention issue is something easy to miss and it's present in the example pseudo-code provided by Maxim. I'm hoping it will be corrected in these libraries in the very near future.

The scope traces here are indicative of typical signal/bus contention issues. If you see a voltage that isn't logic high or logic low check to see that contention between devices isn't the cause. In some cases contention can cause devices to burn out due to excessive current sunk or sourced as the devices work against each other.
