The C.R.I.S.P.Y. Checklist
Debugging digital hardware design is not the same as debugging software.
You may have very strong software debugging skills, but I have found that those skills assume the design is behaving as a digital system. If your hardware design has bugs at the analog level, the digital behavior can appear unpredictable. This can be very frustrating to debug.
I want to share a method that I teach and use to reduce this frustration. This is a checklist that you can remember using the cutesy mnemonic device 'CRISPY'.
The CRISPY checklist is:
- Clocks
- Reset
- Integrity of Signals
- Power
- You
As the mnemonic does not list the items in the best order, I will explain each item below in the order I have found most useful. I will also be relatively brief in this post. If you have questions about terms I use or tools needed, just ask @duppy on Twitter with the #crispyhw tag. If you have better solutions or a problem whose fix did not fit into this checklist, please share on Twitter @duppy #crispyhw. I'll bet somebody out there has an even better checklist than this one.
- Power
- Are power and ground shorted?
- I had to ask. It is a good habit to "buzz out" all power and ground signals after every circuit mod before applying power.
- Is your design plugged in?
- Seriously. Double check with a voltmeter or LED.
- If battery powered, is the battery near full? I recommend replacing the battery with a bench supply while debugging.
- Are you using too much power?
- A good bench supply will tell you how much average current is being drawn. The average will catch some but not all issues.
- Often your micro will boot fine and only fail when you start turning on more modules or clocks, or driving too many LEDs or motors. The failure level can vary greatly from chip to chip. Use a bench supply when possible during debugging, add extra LEDs on purpose to stress power, or write specific firmware that intentionally turns on as many of the micro's internal modules and clocks as possible to stress power.
- Is the resistance of your power cable too high (often because it is too long)?
- This is more common when breadboarding. Check power at the power pin of each IC, not at the power supply.
- If you are using a motor, add a flyback diode across it!
- Are you using bypass capacitors of the correct value and, more importantly, in the correct place?
- Bypass capacitors can be especially tricky because often modern ICs will work fine without them. Until they don't.
- Also, larger is not better with bypass caps and too small doesn't work either. The value and type needed can also change when going from breadboard to PCB.
- When in doubt, you can almost never go wrong with using 0.1uF caps near each IC power pin and 10uF at the main power source.
- Very rarely will you need to delve deeper than the above recommendation. If you do, start with these links.
- Do you have more than one power domain?
- Remember that a debug or programming cable or connection to a PC may count as your second power domain.
- Are all the power domains on?
- What order are the power domains turning on and off?
- Check for "backdrive". If an output from a powered IC or a pull-up resistor is driving a high input to an IC that is not powered then you are backdriving that IC. The backdriven IC's behavior can range from appearing to work as if powered properly to releasing smoke. The releasing smoke behavior is the easy bug to find. The other behaviors are harder.
- Can someone recommend a good reference for more on backdrive?
- Are you interfacing 3.3V logic to 5V logic?
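The cable-resistance question in the Power list above comes down to Ohm's law: the voltage lost in the cable is the current drawn times the cable resistance. Here is a minimal sketch of that estimate. The function name `voltage_at_load` and all the numbers are hypothetical illustration values, not measurements from a real board.

```python
def voltage_at_load(supply_v, current_a, cable_ohms):
    """Voltage left at the IC after the drop across the cable (V = I * R).

    cable_ohms is the total round-trip resistance: power wire plus
    ground return plus any breadboard contacts in the path.
    """
    return supply_v - current_a * cable_ohms


# Hypothetical example values, chosen only for illustration.
supply_v = 5.0    # bench supply setting, in volts
cable_ohms = 0.5  # assumed round-trip cable resistance, in ohms

for current_a in (0.05, 0.2, 0.5):  # e.g. idle, a few LEDs on, motor running
    v = voltage_at_load(supply_v, current_a, cable_ohms)
    print(f"{current_a * 1000:.0f} mA -> {v:.3f} V at the IC")
```

The point of the loop is that the same cable can look fine at idle and drop the rail noticeably once the load current goes up, which is why measuring at the IC power pin under load matters more than measuring at the supply.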
- Reset
- Once your digital ICs have stable power, they still need a good Reset to start working.
- Many ICs have internal Power On Resets (POR) so you may have been lured into skipping this step. Don't ignore it, because your debug cable needs reset to work well, and it is also often very convenient to reset your system during testing without having to cycle power.
- Check that any external reset signals are meeting the timing and voltage level requirements.
- Check that all ICs that take a reset (whether explicit signal or implicit POR) are getting reset at the same time. If not, understand the consequences. Recently I have been working a lot with the Electric Imp. In these designs it is tempting to 'reset' by removing and re-inserting the Electric Imp card. The problem is that this does not reset the other digital devices that the Imp talks to. Are you having this problem?
- Clocks
- Even if Power and Reset are good, your ICs still won't work without a stable clock. Like reset, many ICs these days have internal oscillators so you might have also been lured into skipping this step.
- Read the datasheet and make sure you are providing everything necessary for the internal clock to work or your external crystal or oscillator to start up.
- If you have multiple ICs with high speed clocks then you need to read up on jitter, skew, setup, hold, metastability, transmission line termination techniques, crosstalk, duty cycle, and probably a few other topics.
- I don't mean to scare you. If you are using Arduino or similar class micro based designs then high-speed clocks are probably not your problem. Learn enough to rule them out as a possible cause and look everywhere else first.
- Integrity of Signals
- This is more commonly called "Signal Integrity" when you aren't trying to make your checklist have a cutesy mnemonic.
- Know the speed of all your external signals and start with anything faster than 1MHz.
- Know which signals are edge vs. level sensitive. Level sensitive signals can look ugly as they change but must be stable around the setup and hold time of the receiving chip. Edge sensitive signals need to have clean changes and may need to be treated as Clocks.
- Look up VIH, VIL, VOH, and VOL for your chips and understand what they mean.
- Check both your internal and external pull-up/down resistors. Use a 100K ohm resistor connected to VCC or GND and poke it at your IOs to see whether they are being actively driven, floating, or pulled up/down.
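Once you have looked up VIH, VIL, VOH, and VOL, you can turn them into a quick compatibility check: the driver's worst-case output levels need to clear the receiver's worst-case input thresholds, and the difference is the noise margin. The sketch below assumes made-up datasheet numbers for a 3.3V output driving a 5V input. The `LogicLevels` class, the `noise_margins` function, and every value are hypothetical, so substitute the figures from your own datasheets.

```python
from dataclasses import dataclass


@dataclass
class LogicLevels:
    """Worst-case DC levels from a datasheet, in volts."""
    voh_min: float  # lowest voltage the output guarantees when driving high
    vol_max: float  # highest voltage the output guarantees when driving low
    vih_min: float  # lowest input voltage guaranteed to read as high
    vil_max: float  # highest input voltage guaranteed to read as low


def noise_margins(driver, receiver):
    """Return (high_margin, low_margin) in volts.

    A negative margin means the driver is not guaranteed to reach a level
    the receiver is guaranteed to recognize.
    """
    high_margin = driver.voh_min - receiver.vih_min
    low_margin = receiver.vil_max - driver.vol_max
    return high_margin, low_margin


# Hypothetical parts: a 3.3 V output driving a 5 V CMOS-style input.
driver_3v3 = LogicLevels(voh_min=2.4, vol_max=0.4, vih_min=2.0, vil_max=0.8)
receiver_5v = LogicLevels(voh_min=4.4, vol_max=0.5, vih_min=3.5, vil_max=1.5)

high, low = noise_margins(driver_3v3, receiver_5v)
print(f"high margin {high:+.2f} V, low margin {low:+.2f} V")
```

With these assumed numbers the high margin comes out negative, which is the "works on my bench, fails on yours" case: a typical part may drive high enough, but nothing guarantees it. This check covers only DC thresholds. It does not tell you whether a 3.3V input can survive being driven by a 5V output.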
- You
- Are you reading *it* right!?
- Do your pin numbers defined in your firmware match the schematic?
- Are you using the right register addresses? Are you accessing them as 16-bit when they are only 8-bit?
- Does the PCB as built match the schematic? Version control is not used as well or as often in hobby hardware development as it is in software development. It should be.
- Is your scope or multimeter connected to the right pin? Pin one labeling can often be confusing and it can be hard to count fine pitch pins. Use labeled test points. Don't skimp on annotating your silk screen, and use different color breadboard wires in some consistent manner such as red for power, black for ground, blue for IOs, green for clocks.
- Are you assuming a pin is active high when it is really active low?
- Are your clocks rising or falling edge?
- If you have a serial bus, is it sending LSB or MSB first?
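Two of the questions above, bit order and active-high vs. active-low, leave a recognizable fingerprint in the data you read back. As a rough sketch, you can compare a value you expect (say, a device ID from the datasheet) against what you actually captured and see whether it is bit-reversed or inverted. The helper names `reverse_bits` and `explain_mismatch` and the example bytes are hypothetical.

```python
def reverse_bits(value, width=8):
    """Return value with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def explain_mismatch(expected, observed, width=8):
    """Guess why a value read back does not match what was expected."""
    mask = (1 << width) - 1
    if observed == expected:
        return "match"
    if observed == reverse_bits(expected, width):
        return "bit order reversed (LSB vs. MSB first)"
    if observed == (~expected & mask):
        return "all bits inverted (active high vs. active low)"
    return "unexplained"


# Hypothetical values: suppose the datasheet promises an ID byte of 0x1D.
print(explain_mismatch(0x1D, 0xB8))  # 0x1D is 00011101; reversed is 10111000
print(explain_mismatch(0x1D, 0xE2))  # 0xE2 is 0x1D with every bit flipped
```

This only catches the two simplest mistakes, and a value that reads the same in both bit orders will not reveal anything, so pick a test value that is asymmetric.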
I hope you find this checklist helpful. Share a story if it helps you find a bug. If you find a bug that doesn't fit into one of these categories, share that also.
To discuss: Reply on Twitter @duppy with #crispyhw tag.