
Every embedded C programmer has encountered them — the cryptic __attribute__((packed)) annotations, the #pragma pack directives, and the interrupt function attributes scattered through vendor startup files. Yet many engineers treat these as magic incantations without understanding what they actually do.
Compiler attributes and pragma directives are the primary mechanism through which embedded developers communicate intent to the compiler — intent that cannot be expressed in standard C. They control memory layout, code placement, optimization behavior, and interrupt handling. Misusing them leads to subtle bugs: struct layouts that differ between builds, ISRs optimized away, or performance that degrades after a compiler upgrade.
This article provides a practical guide to the most important compiler attributes and pragmas in embedded C, with real-world examples and common pitfalls.
#pragma pack vs __attribute__((packed))
The most frequently encountered pragma in embedded code is #pragma pack, used to control the alignment of struct members. By default, the compiler inserts padding bytes between struct members to satisfy alignment requirements of the target architecture. For example, on a 32-bit ARM Cortex-M, a uint32_t is typically 4-byte aligned.
Consider this struct:
struct sensor_data {
    uint8_t  status;     // 1 byte
    uint32_t timestamp;  // 4 bytes
    uint16_t value;      // 2 bytes
};
Without any packing directive, the compiler inserts 3 bytes of padding after status (to align timestamp to a 4-byte boundary) and 2 bytes of trailing padding (to make the struct size a multiple of 4 for array alignment). The total size is 12 bytes, but only 7 bytes carry actual data.
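The layout described above can be checked at compile time. The following is a minimal sketch, assuming a GCC or Clang toolchain with C11 support and a target ABI where uint32_t is 4-byte aligned; the helper name sensor_data_padding_bytes is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct sensor_data {
    uint8_t  status;     /* offset 0, followed by 3 padding bytes */
    uint32_t timestamp;  /* offset 4 */
    uint16_t value;      /* offset 8, followed by 2 trailing padding bytes */
};

/* These fail the build if the ABI lays the struct out differently. */
_Static_assert(offsetof(struct sensor_data, timestamp) == 4, "timestamp offset");
_Static_assert(offsetof(struct sensor_data, value) == 8, "value offset");
_Static_assert(sizeof(struct sensor_data) == 12, "struct size");

/* Hypothetical helper: number of bytes in the struct that carry no data. */
static size_t sensor_data_padding_bytes(void)
{
    return sizeof(struct sensor_data)
         - (sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint16_t));
}
```

Keeping such assertions next to the struct definition turns a silent layout change after a compiler or flag change into a build error.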
#pragma pack
#pragma pack(push, n) sets the maximum alignment for subsequently defined structs to n bytes:
#pragma pack(push, 1)
struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
};
#pragma pack(pop)
This struct is 7 bytes — no padding at all. The push/pop pattern ensures the packing only affects the intended struct, not everything that follows in the translation unit.
Critical caveat: #pragma pack(1) lowers the type's alignment requirement to 1, so the compiler must assume that an instance of this struct can sit at any byte address, including misaligned ones. On ARM Cortex-M0, which faults on unaligned word access instead of handling it in hardware, every 32-bit field access is therefore decomposed into byte loads plus shifts. This can multiply the code size of each access several times over.
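Both effects, the scoped packing and the reduced alignment, can be verified with static assertions. This is a sketch assuming GCC or Clang in C11 mode on a target where uint32_t is normally 4-byte aligned; the second struct, sensor_data_unpacked, is a hypothetical name added only to show that the pop restores the default rules.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;  /* offset 1: no longer naturally aligned */
    uint16_t value;      /* offset 5 */
};
#pragma pack(pop)

/* Defined after the pop, so the default alignment rules apply again. */
struct sensor_data_unpacked {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
};

_Static_assert(sizeof(struct sensor_data_packed) == 7, "packed size");
_Static_assert(_Alignof(struct sensor_data_packed) == 1, "packed alignment");
_Static_assert(offsetof(struct sensor_data_packed, timestamp) == 1, "packed offset");
_Static_assert(sizeof(struct sensor_data_unpacked) == 12, "pop restored padding");
_Static_assert(_Alignof(struct sensor_data_unpacked) == 4, "pop restored alignment");
```

If the pop were missing, the last two assertions would fail, which is a cheap way to catch a leaked packing directive in a header.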
__attribute__((packed))
GCC and Clang provide a per-struct alternative:
struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
} __attribute__((packed));
This achieves the same zero-padding layout but applies only to this specific struct type. However, it still sets the type alignment to 1, so the same performance penalty applies.
__attribute__((packed, aligned(N)))
For new code where you control the data format, combining packed with an explicit alignment is often the better choice:
struct __attribute__((packed, aligned(4))) sensor_data_efficient {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
};
This struct is 8 bytes (7 bytes of data + 1 byte tail padding to make the size a multiple of 4). The internal layout has no padding between fields, but the struct itself is guaranteed to start at a 4-byte boundary. This improves array stride alignment and avoids misaligned access to the first field. However, internal fields like timestamp (at offset 1) remain misaligned — on Cortex-M3+ the hardware handles this transparently, but on Cortex-M0 it still requires byte-by-byte access. This pattern is recommended when you need compact layout with predictable struct alignment.
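The size, alignment, and array stride stated above can be checked with a short sketch. It assumes GCC or Clang with C11 support; the helper sensor_array_stride is a hypothetical name introduced for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct __attribute__((packed, aligned(4))) sensor_data_efficient {
    uint8_t  status;
    uint32_t timestamp;  /* still at offset 1, so still misaligned */
    uint16_t value;
};

_Static_assert(sizeof(struct sensor_data_efficient) == 8, "7 data bytes + 1 tail pad");
_Static_assert(_Alignof(struct sensor_data_efficient) == 4, "struct alignment");
_Static_assert(offsetof(struct sensor_data_efficient, timestamp) == 1, "no inner padding");
_Static_assert(offsetof(struct sensor_data_efficient, value) == 5, "no inner padding");

/* Hypothetical helper: distance in bytes between consecutive array elements. */
static size_t sensor_array_stride(void)
{
    static struct sensor_data_efficient records[2];
    return (size_t)((const uint8_t *)&records[1] - (const uint8_t *)&records[0]);
}
```

The stride of 8 rather than 7 is what keeps every element of an array on a 4-byte boundary.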
__attribute__((section))
The section attribute places a variable or function into a named ELF section, which the linker script maps to a specific memory region. This is essential when different memory regions have different properties (flash vs RAM, fast SRAM vs slow external memory).
The most common use case is placing the interrupt vector table at the start of flash:
extern uint32_t _estack;  /* linker-defined symbol: its address is the top of the stack */

const void *g_vector_table[] __attribute__((section(".isr_vector"))) = {
    (void *)&_estack,       /* Initial stack pointer */
    (void *)Reset_Handler,  /* Reset handler */
    (void *)NMI_Handler,    /* NMI handler */
};
The linker script maps .isr_vector to the beginning of flash:
.isr_vector : {
    KEEP(*(.isr_vector))
} >FLASH
Time-critical functions can be placed in faster RAM sections:
void __attribute__((section(".ramfunc"))) process_audio_sample(int16_t sample)
{
    // Executes from RAM, not flash
    // Useful when flash wait states would cause missed deadlines
}
The startup code must copy .ramfunc from flash to RAM before main() is called, similar to how .data is initialized.
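The copy step mentioned above might look roughly like the following sketch. The function name copy_section and the linker symbol names in the comment are assumptions for illustration; real startup files and linker scripts use their own names, and the sketch assumes the section is 4-byte aligned and a multiple of 4 bytes long.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy a section image word by word from its load address (flash)
 * to its run address (RAM). On a real target the three pointers would
 * come from linker-script symbols, for example (hypothetical names):
 *   extern uint32_t _siramfunc;   load address in flash
 *   extern uint32_t _sramfunc;    start of the run-address range in RAM
 *   extern uint32_t _eramfunc;    end of the run-address range in RAM
 */
static void copy_section(const uint32_t *load_addr,
                         uint32_t *run_start,
                         const uint32_t *run_end)
{
    uint32_t *dst = run_start;

    while (dst < run_end) {
        *dst++ = *load_addr++;
    }
}
```

With the hypothetical symbols above, the reset handler would call copy_section(&_siramfunc, &_sramfunc, &_eramfunc) before jumping to main(), next to the equivalent copy for .data.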
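When building with -ffunction-sections -fdata-sections and linking with --gc-sections, the linker removes unreferenced sections. The used attribute tells the compiler to emit a symbol even when nothing in the translation unit references it:
static const uint8_t firmware_version[4] __attribute__((used, section(".rodata.version"))) = {1, 0, 0, 3};
Without __attribute__((used)), the compiler can drop this static object since no C code references it, even though a bootloader or diagnostic tool reads it from the binary. The attribute by itself does not stop --gc-sections from discarding the section, so the linker script should also wrap it in KEEP(), for example KEEP(*(.rodata.version)); GCC 11 and later additionally offer a retain attribute for this purpose on toolchains whose binutils support it.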
noinline, always_inline, and noreturn
The noinline attribute prevents the compiler from inlining a function, regardless of optimization level:
void __attribute__((noinline)) timer_init(void)
{
    // Must not be inlined: strict timing between register writes
    TIM2->CR1 = 0;
    TIM2->PSC = SystemCoreClock / 1000000 - 1;
    TIM2->ARR = 0xFFFF;
}
This is also useful when debugging — inlined functions are harder to set breakpoints on.
Conversely, always_inline forces inlining even at -O0:
static inline __attribute__((always_inline)) void set_bit(volatile uint32_t *reg, uint8_t bit)
{
    *reg |= (1UL << bit);
}
For small accessor functions used in tight loops, forcing inlining eliminates call overhead entirely.
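As a usage sketch, the accessor can be exercised on a host by pointing it at an ordinary volatile variable instead of a peripheral register. The names clear_bit, fake_ctrl_reg, and configure_fake_peripheral are hypothetical and exist only for this illustration; GCC or Clang is assumed.

```c
#include <stdint.h>

static inline __attribute__((always_inline))
void set_bit(volatile uint32_t *reg, uint8_t bit)
{
    *reg |= (1UL << bit);
}

static inline __attribute__((always_inline))
void clear_bit(volatile uint32_t *reg, uint8_t bit)
{
    *reg &= ~(1UL << bit);
}

/* Stand-in for a memory-mapped register so the sketch runs on a host. */
static volatile uint32_t fake_ctrl_reg;

static uint32_t configure_fake_peripheral(void)
{
    fake_ctrl_reg = 0;
    set_bit(&fake_ctrl_reg, 0);    /* set bit 0 */
    set_bit(&fake_ctrl_reg, 4);    /* set bit 4 */
    clear_bit(&fake_ctrl_reg, 0);  /* clear bit 0 again */
    return fake_ctrl_reg;
}
```

Each call site compiles to an inline read-modify-write of the register with no branch to a helper, which is the point of forcing inlining here.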
The noreturn attribute tells the compiler that a function does not return to its caller:
void __attribute__((noreturn)) error_handler(uint32_t error_code)
{
    log_error(error_code);
    NVIC_SystemReset();
}
This enables the compiler to eliminate dead code after the call and produce better warnings about missing return statements.
__attribute__((interrupt))
Important Cortex-M note: On ARM Cortex-M processors, the exception entry hardware automatically stacks R0-R3, R12, LR, PC, and xPSR (the AAPCS caller-saved registers plus the return state) and handles the return sequence through the EXC_RETURN mechanism. ISRs on Cortex-M are therefore written as plain C functions, and no special attribute is needed:
// Cortex-M: no attribute needed, this is a regular C function
void USART2_IRQHandler(void)
{
    if (USART2->SR & USART_SR_RXNE) {
        uint8_t data = USART2->DR;
        ring_buffer_push(&rx_buf, data);
    }
}
Using __attribute__((interrupt)) on Cortex-M can actually generate unnecessary prologue/epilogue code, increasing code size without any benefit.
However, on classic ARM targets (ARM7TDMI, ARM9) and other architectures (MIPS, RISC-V, some AVR), the interrupt attribute is essential. It tells the compiler to save and restore the full register context and use the architecture-specific return instruction (e.g., SUBS PC, LR, #4 on ARM7):
// ARM7/ARM9: attribute IS required
void __attribute__((interrupt("IRQ"))) UART_IRQHandler(void)
{
    // Compiler generates proper IRQ entry/exit sequence
    uint8_t data = UART0->DR;
    ring_buffer_push(&rx_buf, data);
}
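Without this attribute on these targets, the compiler treats the handler as an ordinary function: it preserves only the callee-saved registers, freely clobbers the caller-saved ones (R0-R3, R12), and returns with a normal function return instead of the exception return instruction. Any of these can corrupt the interrupted context.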
The weak attribute creates a symbol that can be overridden by a strong definition elsewhere:
void __attribute__((weak)) Default_Handler(void)
{
    while (1) { }
}

void __attribute__((weak, alias("Default_Handler"))) NMI_Handler(void);
void __attribute__((weak, alias("Default_Handler"))) HardFault_Handler(void);
void __attribute__((weak, alias("Default_Handler"))) USART1_IRQHandler(void);
This pattern, used extensively in CMSIS startup files, provides default handlers for all interrupts while allowing the application to override any handler simply by defining a function with the same name.
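The aliasing behavior can be observed on a host with a small sketch. It assumes GCC or Clang targeting ELF (the alias attribute is not available on every object format), and it replaces the infinite loop with a counter so the calls return; default_handler_calls and raise_fake_exceptions are hypothetical names for this illustration.

```c
#include <stdint.h>

static volatile uint32_t default_handler_calls;

/* Fallback handler; a real startup file would typically spin forever here. */
void Default_Handler(void)
{
    default_handler_calls++;
}

/* Weak aliases: these names resolve to Default_Handler unless another
   object file provides a strong definition with the same name. */
void NMI_Handler(void)       __attribute__((weak, alias("Default_Handler")));
void HardFault_Handler(void) __attribute__((weak, alias("Default_Handler")));

/* Hypothetical helper standing in for the hardware vector fetch. */
static uint32_t raise_fake_exceptions(void)
{
    default_handler_calls = 0;
    NMI_Handler();
    HardFault_Handler();
    return default_handler_calls;
}
```

Linking in a second source file that defines a normal (strong) NMI_Handler would make the first call land there instead, without any change to this file.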
Pitfall 1: Packed structs in arrays. Using __attribute__((packed, aligned(4))) changes the array stride compared to __attribute__((packed)). If you have existing binary data (flash records, protocol packets) using the old layout, you cannot switch without migrating the data.
Pitfall 2: Forgetting #pragma pack(pop). If you forget the pop, all subsequent struct definitions in the translation unit inherit the packing alignment, causing subtle layout changes in unrelated code.
Pitfall 3: Taking the address of packed fields. On GCC, &packed_struct.field yields an ordinary pointer such as uint32_t *, even though the address it holds may not be 4-byte aligned. Passing such a pointer to a function expecting a normally aligned uint32_t * violates the alignment requirement of the pointed-to type and causes undefined behavior. On architectures that require aligned access (e.g., Cortex-M0), dereferencing it results in a HardFault. GCC warns about this via -Waddress-of-packed-member.
Pitfall 4: LTO discarding ISRs. With link-time optimization (-flto), the compiler may discard interrupt handlers it considers unreferenced. The __attribute__((used)) attribute on the handler function prevents this.
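For Pitfall 3, two access patterns that avoid a misaligned pointer are reading the field through the packed type and copying raw bytes with memcpy. The sketch below assumes GCC or Clang; sensor_timestamp and sensor_from_bytes are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
} __attribute__((packed));

/* Risky pattern (left as a comment on purpose): the resulting uint32_t *
   may hold a misaligned address.
       uint32_t *p = &record->timestamp;                                  */

/* Safe: access through the packed type, so the compiler knows the field
   may be misaligned and emits suitable loads. */
static uint32_t sensor_timestamp(const struct sensor_data_packed *record)
{
    return record->timestamp;
}

/* Safe: copy raw bytes (for example from a receive buffer at any address)
   into a properly typed object with memcpy. */
static struct sensor_data_packed sensor_from_bytes(const uint8_t *bytes)
{
    struct sensor_data_packed record;

    memcpy(&record, bytes, sizeof record);
    return record;
}
```

Both helpers return the value by copy, so callers never hold a pointer into the middle of the packed struct.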
Compiler attributes and pragma directives are not optional decorations — they are essential tools for controlling how the compiler generates code for resource-constrained, hardware-coupled embedded systems. The key takeaways:
- Prefer __attribute__((packed, aligned(N))) over #pragma pack(1) for new data structures to get a compact layout with a predictable struct alignment; fields at misaligned offsets still cost extra on cores without hardware unaligned access.
- Use __attribute__((section)) for memory-mapped data structures and time-critical functions that must reside in specific memory regions.
- Add __attribute__((used)) to symbols that the toolchain would otherwise discard as unreferenced, especially ISRs and version markers.
- Use __attribute__((interrupt)) for interrupt handlers on classic ARM (ARM7/ARM9) and other targets that need it to ensure correct context save/restore; Cortex-M handlers are plain C functions.
- Use __attribute__((weak)) with alias to create overridable default handlers.
Understanding these directives transforms you from someone who copies startup files into someone who can write them, a distinction that embedded interviewers notice immediately.




