
Embedded Firmware Debugging Techniques: From Printf to JTAG

By Jithin Tom
July 01, 2026

Table Of Contents

  1. The Debugging Spectrum
  2. Printf and UART Logging
  3. ITM and SWO Trace
  4. JTAG and SWD Hardware Debug
  5. RTOS-Aware Debugging
  6. Post-Mortem and Crash Analysis
  7. Debugging Checklist for New Bring-Up
  8. Summary
  9. Related Reading
  10. References
  11. Frequently Asked Questions

Debugging embedded firmware is fundamentally different from application debugging on a host OS. There is no console, no process isolation, and often no operating system at all. The target runs bare metal or under an RTOS, interrupts fire asynchronously, and timing is everything. This article surveys the debugging toolbox available to embedded engineers — from the humble printf to full JTAG/SWD hardware debug — and explains when each technique shines.

The Debugging Spectrum

Embedded debugging techniques form a spectrum of intrusiveness and capability:

+--------------------------------------------------------------------+
|                    DEBUGGING TECHNIQUE SPECTRUM                    |
+--------------------------------------------------------------------+
| LOW CAPABILITY                                     HIGH CAPABILITY |
|                                                                    |
|    printf / UART         ITM/SWO Trace       JTAG / SWD Debug      |
|   (blocking I/O)        (non-blocking)        (hardware halt)      |
|                                                                    |
| +----------------+    +----------------+    +----------------+     |
| |  Code change   |    |   HW support   |    |  Debug probe   |     |
| |    required    |    |    required    |    |    required    |     |
| +----------------+    +----------------+    +----------------+     |
|          |                     |                     |             |
|          v                     v                     v             |
|    Variable CPU            Near-zero           Full control        |
|      overhead              overhead             over target        |
|                                                                    |
+--------------------------------------------------------------------+

Each technique has its place. The skilled engineer chooses based on the problem at hand, not habit.

Printf and UART Logging

The oldest technique remains surprisingly effective for certain problems. Adding printf statements (or a lightweight logging macro) over UART gives visibility into code flow, variable values, and execution paths.

#include <stdio.h>

// ##__VA_ARGS__ is a GNU extension (GCC/Clang): it drops the trailing comma when no arguments follow fmt
#define LOG_DEBUG(fmt, ...) printf("[DBG %s:%d] " fmt "\n", __func__, __LINE__, ##__VA_ARGS__)
#define LOG_INFO(fmt, ...)  printf("[INF %s:%d] " fmt "\n", __func__, __LINE__, ##__VA_ARGS__)
#define LOG_ERROR(fmt, ...) printf("[ERR %s:%d] " fmt "\n", __func__, __LINE__, ##__VA_ARGS__)

Strengths:

  • Zero hardware cost beyond a UART peripheral
  • Works on any target with a serial port
  • Simple to implement and understand

Weaknesses:

  • Blocking I/O — each printf halts the CPU until the UART transmitter drains. At 115200 baud, printing 80 characters takes ~7 ms — an eternity in a 1 kHz control loop.
  • Heisenbug risk — the added latency changes timing, potentially masking or creating race conditions.
  • Code modification required — you must recompile and reflash to add new log points.
  • No state inspection — you see what you logged, not what you forgot to log.
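
The blocking cost quoted above can be checked with a little arithmetic. The sketch below assumes 8N1 framing (10 bit times per character) and a polled transmitter; `uart_tx_time_us` is a hypothetical helper written for illustration, not part of any vendor HAL.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per character for 8N1 framing (assumed format):
 * 1 start bit + 8 data bits + 1 stop bit. */
#define UART_BITS_PER_CHAR_8N1 10u

/* Time in microseconds needed to shift `chars` characters out at
 * `baud` bits per second, rounded down. Illustrative helper only. */
static uint32_t uart_tx_time_us(uint32_t chars, uint32_t baud)
{
    assert(baud > 0u);
    uint64_t bits = (uint64_t)chars * UART_BITS_PER_CHAR_8N1;
    return (uint32_t)((bits * 1000000u) / baud);
}
```

For 80 characters at 115200 baud this gives 6944 µs, so one log line spans almost seven periods of a 1 kHz loop. Raising the baud rate to 921600 brings the same line down to 868 µs, which is still most of one period.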

Best for: Boot sequence verification, configuration validation, low-frequency state reporting, and targets without debug probe access.

ITM and SWO Trace

ARM Cortex-M cores include the Instrumentation Trace Macrocell (ITM) and optional Serial Wire Output (SWO). ITM provides 32 stimulus ports that software can write to; SWO streams this data out on a single pin at high speed (typically 1-4 MHz).

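#include <stdint.h>

// ITM stimulus port 0 (same address, accessed as 32-bit or 8-bit)
#define ITM_PORT0_U32 (*(volatile uint32_t*)0xE0000000UL)
#define ITM_PORT0_U8  (*(volatile uint8_t*)0xE0000000UL)
// ITM Trace Enable and Trace Control registers
#define ITM_TER       (*(volatile uint32_t*)0xE0000E00UL)
#define ITM_TCR       (*(volatile uint32_t*)0xE0000E80UL)

void itm_print(const char *str) {
    // Skip output unless ITM and stimulus port 0 are enabled; otherwise the
    // wait loop below may spin forever when no debugger has set up trace
    if (!(ITM_TCR & 1UL) || !(ITM_TER & 1UL)) {
        return;
    }
    while (*str) {
        // Bit 0 reads 1 when the FIFO can accept data (spins only if full)
        while ((ITM_PORT0_U32 & 1UL) == 0) {
            __asm volatile ("nop");
        }
        // Write 8-bit packet to save trace bandwidth
        ITM_PORT0_U8 = (uint8_t)*str++;
    }
}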

Strengths:

  • Non-blocking — writes to ITM FIFO return immediately if space is available
  • Timestamped — trace packets include cycle-accurate timestamps
  • High bandwidth — SWO at 2 MHz provides ~250 KB/s raw throughput (actual payload lower due to ITM packet framing)
  • No CPU halt — target runs at full speed
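
To see how packet framing eats into the raw SWO rate, here is a rough estimate. It assumes SWO runs in NRZ (UART-style) mode at 10 bit times per byte, and that each stimulus write becomes one packet with a 1-byte header. Timestamp and synchronization packets are ignored, and `swo_payload_bytes_per_sec` is a hypothetical helper, so treat the numbers as ballpark figures only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: SWO in NRZ (UART-style) mode, so each byte costs
 * 1 start + 8 data + 1 stop = 10 bit times on the pin. */
#define SWO_NRZ_BITS_PER_BYTE 10u

/* Assumption: each ITM stimulus write produces one packet made of a
 * 1-byte header followed by a 1-, 2-, or 4-byte payload. */
#define ITM_HEADER_BYTES 1u

/* Estimated payload bytes per second for a given SWO bit rate and
 * stimulus write width. Ignores timestamp and sync packets. */
static uint32_t swo_payload_bytes_per_sec(uint32_t swo_hz, uint32_t payload_bytes)
{
    assert(payload_bytes == 1u || payload_bytes == 2u || payload_bytes == 4u);
    uint64_t wire_bytes = swo_hz / SWO_NRZ_BITS_PER_BYTE;
    return (uint32_t)((wire_bytes * payload_bytes) /
                      (payload_bytes + ITM_HEADER_BYTES));
}
```

Under these assumptions a 2 MHz SWO link carries about 100,000 payload bytes per second with 8-bit writes and about 160,000 with 32-bit writes. Packing event data into 32-bit words is therefore noticeably more efficient than streaming characters one at a time.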

Weaknesses:

  • Requires Cortex-M3/M4/M7/M33 with ITM/SWO support
  • SWO pin must be routed on PCB (often shared with other functions)
  • Requires trace-capable probe (J-Link, ST-Link V2/V3, CMSIS-DAP with SWO)
  • Host-side tooling needed (OpenOCD, J-Link SWO viewer, IDE integration)

Best for: Real-time event tracing, performance profiling, interrupt latency analysis, RTOS task switching visualization.

JTAG and SWD Hardware Debug

JTAG (Joint Test Action Group, IEEE 1149.1) and SWD (Serial Wire Debug) provide the gold standard: full control over the target CPU via a debug probe.

JTAG

  • 4-5 signals: TCK, TMS, TDI, TDO, (TRST)
  • Daisy-chain multiple devices on one bus
  • Boundary scan for board-level interconnect testing
  • Standard on most MCUs and SoCs

SWD (ARM)

  • 2 signals: SWCLK, SWDIO (bidirectional)
  • ARM-specific, not a general IEEE standard
  • Same debug access as JTAG with fewer pins
  • Supports SWO for trace on a dedicated pin (typically shared with JTAG TDO)

Debug Probe Options

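| Probe                      | JTAG | SWD | SWO         | Price Tier |
|----------------------------|------|-----|-------------|------------|
| Segger J-Link              | Yes  | Yes | Yes         | Premium    |
| ST-Link V2/V3              | Yes  | Yes | Yes         | Low        |
| CMSIS-DAP (generic)        | Yes  | Yes | Sometimes   | Low        |
| FTDI-based (FT2232/FT4232) | Yes  | Yes | Via UART RX | Mid        |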

OpenOCD and GDB Workflow

OpenOCD (Open On-Chip Debugger) is the bridge between GDB and the debug probe:

# Terminal 1: Start OpenOCD server
openocd -f interface/stlink.cfg -f target/stm32f4x.cfg
# Terminal 2: Connect GDB
arm-none-eabi-gdb build/firmware.elf
(gdb) target extended-remote localhost:3333
(gdb) monitor reset halt
(gdb) load
(gdb) monitor reset halt
(gdb) break main
(gdb) continue

Common GDB Commands for Embedded:

break function_name # Set breakpoint
break *0x08001234 # Break at address
watch *(uint32_t*)0x20000000 # Watch memory location
info registers # Show CPU registers
bt # Backtrace
frame 3 # Switch stack frame
print /x my_var # Print in hex
set var my_var = 0xDEADBEEF # Modify a variable (use $r0 etc. for registers)
monitor reset # Reset target via OpenOCD

Strengths:

  • Full control — halt, single-step, run to breakpoint
  • State inspection — registers, memory, stack, variables
  • No code changes — debug the exact binary that runs in production
  • Breakpoints in ROM/Flash — hardware breakpoints (typically 2-8 on Cortex-M, depending on variant)
  • Watchpoints — data breakpoints on memory access

Weaknesses:

  • Intrusive — halting the CPU stops real-time peripherals (timers, PWM, communication)
  • Requires probe and PCB access — debug header must be populated
  • Limited hardware breakpoints — typically 2-8 on Cortex-M
  • Flash wear — repeated load commands erase/program flash sectors

Best for: Logic bugs, crash analysis, memory corruption, register-level peripheral debugging, bootloader bring-up.

RTOS-Aware Debugging

When running FreeRTOS, Zephyr, or ThreadX, standard GDB sees only the currently running task. RTOS-aware debugging adds task list, stack usage, queue/semaphore state, and task-specific backtraces.

OpenOCD RTOS support:

# In your OpenOCD target configuration script:
$_TARGETNAME configure -rtos FreeRTOS
# Then in GDB:
info threads # List FreeRTOS tasks
thread 3 # Switch to task 3 context
bt # Backtrace for that task

IDE Integration: VS Code with Cortex-Debug, Eclipse, IAR, Keil, and CLion all support RTOS-aware views showing task state (Ready/Running/Blocked/Suspended), stack watermarks, and kernel object status.
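
Stack watermarks are commonly derived by painting the stack with a known byte before the task starts and later counting how much of the pattern survives. The sketch below shows that idea in isolation. The fill value and function names are assumptions for illustration; this is similar in spirit to the fill-pattern approach FreeRTOS uses, but it is not any kernel's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u  /* assumed fill pattern for this sketch */

/* Paint the whole stack region with the fill pattern before the task runs. */
static void stack_paint(uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    memset(stack, STACK_FILL_BYTE, size);
}

/* Count untouched bytes from the low address upward. On a descending
 * stack (as on Cortex-M) the low end is used last, so this count is the
 * smallest amount of free stack the task has ever had. */
static size_t stack_high_water_mark(const uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    size_t unused = 0;
    while (unused < size && stack[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

A watermark near zero means the task has come close to overflowing its stack at some point, even if it is not deep in a call chain when the debugger halts it. The scan can under-report usage if the task happens to write the fill byte itself, which is an accepted limitation of the technique.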

Post-Mortem and Crash Analysis

When a HardFault, MemManage, BusFault, or UsageFault occurs, the CPU saves context to the stack. A fault handler can capture this for later analysis.

#include <stdint.h>

void hard_fault_handler_c(uint32_t *stacked);

// Minimal HardFault handler capturing stacked registers
__attribute__((naked)) void HardFault_Handler(void) {
    __asm volatile (
        "tst lr, #4\n"       // EXC_RETURN bit 2 selects the stack in use
        "ite eq\n"
        "mrseq r0, msp\n"    // bit clear: frame is on the main stack
        "mrsne r0, psp\n"    // bit set: frame is on the process stack
        "b hard_fault_handler_c\n"
    );
}

void hard_fault_handler_c(uint32_t *stacked) {
    // stacked[0]=R0, [1]=R1, [2]=R2, [3]=R3, [4]=R12, [5]=LR, [6]=PC, [7]=xPSR
    volatile uint32_t pc = stacked[6];
    volatile uint32_t lr = stacked[5];
    // Log to backup RAM, flash, or transmit via ITM
    (void)pc; (void)lr;  // Retain in memory for debugger inspection
    while (1) { }        // Halt for probe attachment
}

Cortex-M Fault Status Registers:

  • HFSR (HardFault Status) — indicates forced hard fault, vector table read fault, etc.
  • CFSR (Configurable Fault Status) — MemManage (MMFSR), BusFault (BFSR), UsageFault (UFSR)
  • MMFAR / BFAR — faulting memory addresses
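
The registers listed above can be split into their parts with a few shifts and masks. The sketch below follows the ARMv7-M bit layout and takes the raw register values as parameters so that it can be exercised off-target. The type and function names are hypothetical; on hardware you would pass in the values read from the System Control Block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CFSR packs three sub-registers into one 32-bit word (ARMv7-M layout). */
typedef struct {
    uint8_t  mmfsr;        /* bits [7:0]   MemManage fault status */
    uint8_t  bfsr;         /* bits [15:8]  BusFault status */
    uint16_t ufsr;         /* bits [31:16] UsageFault status */
    bool     mmfar_valid;  /* MMARVALID, CFSR bit 7: MMFAR holds the address */
    bool     bfar_valid;   /* BFARVALID, CFSR bit 15: BFAR holds the address */
} cfsr_fields_t;

/* Split a raw CFSR value into its sub-registers and address-valid flags. */
static void cfsr_decode(uint32_t cfsr, cfsr_fields_t *out)
{
    assert(out != NULL);
    out->mmfsr = (uint8_t)(cfsr & 0xFFu);
    out->bfsr = (uint8_t)((cfsr >> 8) & 0xFFu);
    out->ufsr = (uint16_t)(cfsr >> 16);
    out->mmfar_valid = (cfsr & (1u << 7)) != 0u;
    out->bfar_valid = (cfsr & (1u << 15)) != 0u;
}

/* HFSR.FORCED (bit 30): a configurable fault escalated to HardFault. */
static bool hfsr_is_forced(uint32_t hfsr)
{
    return (hfsr & (1u << 30)) != 0u;
}
```

When `hfsr_is_forced` returns true, the real cause is in CFSR rather than HFSR. MMFAR and BFAR should be trusted only when the matching valid flag is set.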

With these registers and the stacked PC/LR, you can often pinpoint the exact instruction that caused the fault without a probe attached at crash time.
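
For that to work without a probe, the captured values have to survive the reset. One common pattern is a small record in retained RAM, guarded by a magic number and a checksum so that random power-up contents are not mistaken for a crash. The sketch below is one possible shape for it. The record layout, magic value, and function names are all assumptions, and the linker placement that keeps the variable out of zero-initialization is toolchain-specific and left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CRASH_MAGIC 0xC0DEFA17u  /* arbitrary marker chosen for this sketch */

typedef struct {
    uint32_t magic;  /* CRASH_MAGIC when the record holds a crash */
    uint32_t pc;     /* stacked PC at fault time */
    uint32_t lr;     /* stacked LR at fault time */
    uint32_t cfsr;   /* snapshot of CFSR */
    uint32_t check;  /* XOR of the fields above */
} crash_record_t;

/* On target this would live in a no-init or backup RAM section so it
 * survives reset; here it is an ordinary static for illustration. */
static crash_record_t g_crash;

static uint32_t crash_checksum(const crash_record_t *r)
{
    return r->magic ^ r->pc ^ r->lr ^ r->cfsr;
}

/* Call from the fault handler with the stacked registers. */
static void crash_save(uint32_t pc, uint32_t lr, uint32_t cfsr)
{
    g_crash.magic = CRASH_MAGIC;
    g_crash.pc = pc;
    g_crash.lr = lr;
    g_crash.cfsr = cfsr;
    g_crash.check = crash_checksum(&g_crash);
}

/* Call early at boot: returns true and copies the record out if it is
 * valid, then clears it so the same crash is not reported twice. */
static bool crash_load(crash_record_t *out)
{
    assert(out != NULL);
    if (g_crash.magic != CRASH_MAGIC ||
        g_crash.check != crash_checksum(&g_crash)) {
        return false;
    }
    *out = g_crash;
    g_crash.magic = 0u;
    return true;
}
```

At the next boot the firmware can report the saved PC over UART or store it in flash, and the address can then be resolved against the ELF file on the host.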

Debugging Checklist for New Bring-Up

  1. Verify debug header: check SWDIO/SWCLK continuity, pull-up on SWDIO, pull-down on SWCLK, target voltage on VREF
  2. Confirm probe connection: openocd -f interface/stlink.cfg -f target/stm32f4x.cfg should connect without errors and report hardware breakpoints
  3. Check IDCODE: confirms correct device and scan chain
  4. Test flash operations: flash probe 0, flash info 0, flash write_image erase build/firmware.elf
  5. Verify reset behavior: monitor reset halt should stop at reset vector
  6. Validate breakpoints: set breakpoint at main, continue, verify stop
  7. Test watchpoints: watch a global variable, write to it, verify trigger
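
For the IDCODE step, it helps to know how the 32-bit value breaks down. The sketch below decodes the standard IEEE 1149.1 fields. The type and function names are hypothetical, and the example value in the note afterwards is one often reported for ARM Cortex-M JTAG debug ports, so check your device's reference manual for the value you should expect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IEEE 1149.1 IDCODE fields (bit 0 is always 1 and is not stored here). */
typedef struct {
    uint8_t  version;       /* bits [31:28] */
    uint16_t part_number;   /* bits [27:12] */
    uint16_t manufacturer;  /* bits [11:1], JEP106 code */
} jtag_idcode_t;

/* Split a raw IDCODE into version, part number, and manufacturer. */
static void jtag_idcode_decode(uint32_t idcode, jtag_idcode_t *out)
{
    assert(out != NULL);
    out->version = (uint8_t)(idcode >> 28);
    out->part_number = (uint16_t)((idcode >> 12) & 0xFFFFu);
    out->manufacturer = (uint16_t)((idcode >> 1) & 0x7FFu);
}
```

Decoding 0x4BA00477 yields version 0x4, part number 0xBA00, and manufacturer 0x23B, the JEP106 code assigned to ARM. An all-zeros or all-ones IDCODE usually points to a wiring or power problem rather than a wrong device.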

Summary

| Technique      | Intrusiveness      | Hardware Needed            | Best For                                      |
|----------------|--------------------|----------------------------|-----------------------------------------------|
| printf/UART    | High (blocking)    | UART only                  | Boot logs, config checks                      |
| ITM/SWO Trace  | Near-zero          | Cortex-M + SWO pin + probe | Real-time tracing, profiling                  |
| JTAG/SWD + GDB | High (halts CPU)   | Debug probe + header       | Logic bugs, crashes, register inspection      |
| RTOS-aware     | High (halts CPU)   | Probe + RTOS plugin        | Task deadlocks, stack overflows, queue states |
| Post-mortem    | None (after crash) | Backup RAM / flash         | Field failures without probe                  |

The most effective debugging strategy combines techniques: use ITM/SWO for continuous low-overhead tracing while a probe is attached, printf for quick sanity checks during development, and JTAG/SWD for deep investigation when things go wrong. Master the toolchain (OpenOCD, GDB, and your probe’s capabilities) and you’ll spend less time guessing and more time fixing.

Related Reading

  • JTAG and SWD Debugging Strategies for Embedded Systems
  • Interrupt Handling and ISRs in Embedded Systems
  • RTOS Concepts: Tasks, Semaphores, and Mutexes

References

  1. ARM Limited. “ARMv7-M Architecture Reference Manual.” ARM DDI 0403E, 2014. https://developer.arm.com/documentation/ddi0403/latest
  2. Segger Microcontroller Systems. “J-Link / J-Trace User Guide.” UM08001, 2024. https://www.segger.com/downloads/jlink/
  3. OpenOCD Contributors. “OpenOCD User’s Guide.” https://openocd.org/doc/html/
  4. Joseph Yiu. “The Definitive Guide to ARM Cortex-M3 and Cortex-M4 Processors.” 3rd ed., Newnes, 2014.
  5. Jack Ganssle. “Proactive Debugging.” Embedded Systems Programming, 2001. https://www.ganssle.com/articles/proactive.htm

Frequently Asked Questions

What is the difference between JTAG and SWD for embedded debugging?

JTAG (IEEE 1149.1) uses 4-5 pins (TDI, TDO, TCK, TMS, TRST) and supports boundary scan for board-level testing. SWD (Serial Wire Debug) is ARM's 2-pin alternative (SWDIO, SWCLK) that provides the same debug access with fewer pins, making it more suitable for pin-constrained microcontrollers.

When should I use printf debugging vs a hardware debugger?

Printf debugging works for simple logic flow verification and when hardware debug probes are unavailable. Use a hardware debugger (JTAG/SWD) for real-time inspection, breakpoint debugging, memory/register examination, and when timing-sensitive code cannot tolerate printf overhead.

How does ITM/SWO tracing differ from traditional printf?

ITM (Instrumentation Trace Macrocell) with SWO (Serial Wire Output) provides non-blocking, timestamped trace data over a single pin. Unlike printf which halts the CPU for UART transmission, ITM/SWO streams data in real-time with minimal intrusion, enabling high-frequency event logging without affecting system timing.

What is the role of GDB and OpenOCD in the debug workflow?

OpenOCD acts as the debug server that translates GDB's remote protocol commands into JTAG/SWD transactions with the target. GDB runs on the host as the client, providing the user interface for breakpoints, watchpoints, stack inspection, and variable examination.

Can I debug optimized release builds effectively?

Yes, but with caveats. Compiler optimizations (O2/O3) may reorder instructions, inline functions, and eliminate variables, making source-level debugging confusing. Use -Og for debug-friendly optimization, keep debug symbols (-g), and be aware that single-stepping may jump non-linearly through optimized code.
