
Compiler Attributes and Pragma Directives in Embedded C

By embeddedSoft
Published in Embedded C/C++
June 04, 2026
4 min read

Introduction

Every embedded C programmer has encountered them — the cryptic __attribute__((packed)) annotations, the #pragma pack directives, and the interrupt function attributes scattered through vendor startup files. Yet many engineers treat these as magic incantations without understanding what they actually do.

Compiler attributes and pragma directives are the primary mechanism through which embedded developers communicate intent to the compiler — intent that cannot be expressed in standard C. They control memory layout, code placement, optimization behavior, and interrupt handling. Misusing them leads to subtle bugs: struct layouts that differ between builds, ISRs optimized away, or performance that degrades after a compiler upgrade.

This article provides a practical guide to the most important compiler attributes and pragmas in embedded C, with real-world examples and common pitfalls.

Structure Packing: #pragma pack vs __attribute__((packed))

The most frequently encountered pragma in embedded code is #pragma pack, used to control the alignment of struct members. By default, the compiler inserts padding bytes between struct members to satisfy alignment requirements of the target architecture. For example, on a 32-bit ARM Cortex-M, a uint32_t is typically 4-byte aligned.

Consider this struct:

struct sensor_data {
    uint8_t  status;     // 1 byte
    uint32_t timestamp;  // 4 bytes
    uint16_t value;      // 2 bytes
};

Without any packing directive, the compiler inserts 3 bytes of padding after status (to align timestamp to a 4-byte boundary) and 2 bytes of trailing padding (to make the struct size a multiple of 4 for array alignment). The total size is 12 bytes, but only 7 bytes carry actual data.
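
The layout described above can be checked at compile time rather than taken on faith. The sketch below assumes a C11 compiler and a target where uint32_t is 4-byte aligned (true of Cortex-M and of typical desktop hosts); the assertions are illustrative additions, not part of the original struct.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct sensor_data {
    uint8_t  status;     /* offset 0, followed by 3 padding bytes */
    uint32_t timestamp;  /* offset 4 */
    uint16_t value;      /* offset 8, followed by 2 trailing padding bytes */
};

/* Each check fails the build if the compiler lays the struct out differently. */
static_assert(offsetof(struct sensor_data, timestamp) == 4, "3 padding bytes after status");
static_assert(offsetof(struct sensor_data, value) == 8, "value follows timestamp directly");
static_assert(sizeof(struct sensor_data) == 12, "size rounded up to a multiple of 4");
```

Keeping such checks next to the type definition turns a silent layout change (new compiler, new flags) into a build error.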

Using #pragma pack

#pragma pack(push, n) saves the current packing setting and caps the alignment of members in subsequently defined structs at n bytes:

#pragma pack(push, 1)
struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
};
#pragma pack(pop)

This struct is 7 bytes — no padding at all. The push/pop pattern ensures the packing only affects the intended struct, not everything that follows in the translation unit.
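
A small sketch of the push/pop scoping, assuming a C11 compiler and 4-byte alignment for uint32_t. The two struct names are hypothetical and exist only to show that a definition after the pop gets its natural layout back.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#pragma pack(push, 1)
struct wire_header {        /* hypothetical on-the-wire record */
    uint8_t  type;
    uint32_t length;
};
#pragma pack(pop)

struct app_state {          /* hypothetical struct defined after the pop */
    uint8_t  mode;
    uint32_t counter;
};

static_assert(sizeof(struct wire_header) == 5, "packed: no padding");
static_assert(sizeof(struct app_state) == 8, "natural layout restored after pop");
```

If the pop were missing, the second assertion would fail, which is a cheap way to catch a stray packing directive leaking out of a header.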

Critical caveat: #pragma pack(1) changes the type’s alignment requirement to 1. This means the compiler must assume instances of this struct can appear at any byte address, including misaligned ones. On ARM Cortex-M0, which has no hardware support for unaligned access and faults on a misaligned word load or store, the compiler therefore decomposes every 32-bit field access into multiple byte operations plus shifts. The code for each such access can grow several-fold (up to roughly 7× on some targets).
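
A typical use of a packed struct is overlaying a byte stream. The sketch below is one way to do that without assuming anything about the buffer's alignment. The helper name sensor_data_from_bytes is hypothetical, and the example assumes that both the wire format and the CPU are little-endian.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>   /* memcpy */

#pragma pack(push, 1)
struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
};
#pragma pack(pop)

static_assert(sizeof(struct sensor_data_packed) == 7, "must match the 7-byte wire format");

/* Hypothetical helper: copy a received 7-byte record into a packed struct.
 * memcpy makes no alignment assumption about buf.
 * Assumes the wire format and the CPU are both little-endian. */
struct sensor_data_packed sensor_data_from_bytes(const uint8_t buf[7])
{
    struct sensor_data_packed rec;
    memcpy(&rec, buf, sizeof rec);
    return rec;
}
```

Fields are then read through the packed type (rec.timestamp), which lets the compiler pick an access sequence that is safe on the target.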

Using __attribute__((packed))

GCC and Clang provide a per-struct alternative:

struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
} __attribute__((packed));

This achieves the same zero-padding layout but applies only to this specific struct type. However, it still sets the type alignment to 1, so the same performance penalty applies.

The Better Approach: __attribute__((packed, aligned(N)))

For new code where you control the data format, combining packed with an explicit alignment is superior:

struct __attribute__((packed, aligned(4))) sensor_data_efficient {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
};

This struct is 8 bytes (7 bytes of data + 1 byte of tail padding to make the size a multiple of 4). The internal layout has no padding between fields, but the struct itself is guaranteed to start at a 4-byte boundary. This keeps array elements on a 4-byte stride and avoids misaligned access to the first field. However, internal fields like timestamp (at offset 1) remain misaligned: on Cortex-M3 and later the hardware handles single unaligned loads and stores transparently (at the cost of extra bus cycles, and only while unaligned-access trapping is disabled), but on Cortex-M0 the compiler must still access them byte by byte. This pattern is recommended when you need a compact layout with predictable struct alignment.
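
The size, alignment, and offset claims can be pinned down with compile-time checks. This is a sketch assuming GCC or Clang with C11 support; sample_log is a hypothetical array added only to illustrate the stride.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct __attribute__((packed, aligned(4))) sensor_data_efficient {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
};

static_assert(sizeof(struct sensor_data_efficient) == 8, "7 data bytes + 1 tail byte");
static_assert(_Alignof(struct sensor_data_efficient) == 4, "struct starts on a 4-byte boundary");
static_assert(offsetof(struct sensor_data_efficient, timestamp) == 1, "timestamp is still misaligned");

/* Hypothetical log buffer: elements sit 8 bytes apart, so each one starts 4-byte aligned. */
struct sensor_data_efficient sample_log[4];
```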

Section Placement: __attribute__((section))

The section attribute places a variable or function into a named ELF section, which the linker script maps to a specific memory region. This is essential when different memory regions have different properties (flash vs RAM, fast SRAM vs slow external memory).

Interrupt Vector Table

The most common use case is placing the interrupt vector table at the start of flash:

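extern uint32_t _estack;    /* Top-of-stack symbol defined in the linker script */
void Reset_Handler(void);
void NMI_Handler(void);

/* The second const makes the array itself read-only so it can live in flash */
const void * const g_vector_table[] __attribute__((section(".isr_vector"))) = {
    (void *)&_estack,       /* Initial stack pointer */
    (void *)Reset_Handler,  /* Reset handler */
    (void *)NMI_Handler,    /* NMI handler */
};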

The linker script maps .isr_vector to the beginning of flash:

.isr_vector : {
    KEEP(*(.isr_vector))
} >FLASH

Fast-Critical Functions in RAM

Time-critical functions can be placed in faster RAM sections:

void __attribute__((section(".ramfunc"))) process_audio_sample(int16_t sample)
{
    // Executes from RAM, not flash
    // Useful when flash wait states would cause missed deadlines
}

The startup code must copy .ramfunc from flash to RAM before main() is called, similar to how .data is initialized.
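
The copy step can be sketched as a small word-wise loop. The function name copy_section and the three linker symbols shown in the comment are hypothetical; real projects use whatever names their linker script exports, and the sketch assumes the section bounds are 4-byte aligned.

```c
#include <stdint.h>

/* Copy a load image word by word; dst..dst_end is the run-time (RAM) range. */
void copy_section(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/*
 * In a real startup file the call would look roughly like this, where the
 * three symbols are HYPOTHETICAL names exported by the linker script:
 *
 *   extern uint32_t _siramfunc;  // load address of .ramfunc in flash
 *   extern uint32_t _sramfunc;   // start of .ramfunc in RAM
 *   extern uint32_t _eramfunc;   // end of .ramfunc in RAM
 *
 *   copy_section(&_siramfunc, &_sramfunc, &_eramfunc);  // before main()
 */
```

The same helper can serve for .data, which is why many startup files share one loop for both sections.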

Preventing Linker Garbage Collection

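When using -ffunction-sections -fdata-sections with --gc-sections, the linker removes sections that nothing references. Two mechanisms work together to keep such a symbol. The used attribute tells the compiler to emit the symbol even though no C code references it, which matters for static objects like the one below. The linker then has to be told to keep the section, either with KEEP(*(.rodata.version)) in the linker script or, on GCC 11 and later with a recent binutils, with the retain attribute:

static const uint8_t firmware_version[4] __attribute__((used, section(".rodata.version"))) = {
    1, 0, 0, 3
};

Without __attribute__((used)), the compiler may drop this static object before the linker ever sees it, and without KEEP or retain, --gc-sections may still discard its section, even though a bootloader or diagnostic tool reads it from the binary.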
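
One point worth verifying on your own toolchain: GCC documents used as forcing the compiler to emit the symbol, while protection from the linker's --gc-sections pass comes from a KEEP() entry in the linker script or, on GCC 11 and later with a recent binutils, from the separate retain attribute. The sketch below selects retain when the compiler reports support for it; KEEP_IN_IMAGE is a hypothetical macro name, not something the toolchain provides.

```c
#include <stdint.h>

/* KEEP_IN_IMAGE is a hypothetical helper macro, not a toolchain-provided name. */
#if defined(__has_attribute)
#  if __has_attribute(retain)
#    define KEEP_IN_IMAGE __attribute__((used, retain))
#  endif
#endif
#ifndef KEEP_IN_IMAGE
   /* Older toolchains: pair this with KEEP(*(.rodata.version)) in the linker script. */
#  define KEEP_IN_IMAGE __attribute__((used))
#endif

static const uint8_t firmware_version[4]
    KEEP_IN_IMAGE __attribute__((section(".rodata.version"))) = { 1, 0, 0, 3 };
```

Checking the map file (or running nm on the final image) is the reliable way to confirm the symbol actually survived.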

Function Control: noinline, always_inline, and noreturn

Preventing Inlining

The noinline attribute prevents the compiler from inlining a function, regardless of optimization level:

void __attribute__((noinline)) timer_init(void)
{
    // Not inlined: keeps a single copy of the setup sequence in flash
    TIM2->CR1 = 0;
    TIM2->PSC = SystemCoreClock / 1000000 - 1;
    TIM2->ARR = 0xFFFF;
}

This is also useful when debugging — inlined functions are harder to set breakpoints on.

Forcing Inlining

Conversely, always_inline forces inlining even at -O0:

static inline __attribute__((always_inline)) void set_bit(volatile uint32_t *reg, uint8_t bit)
{
    *reg |= (1UL << bit);
}

For small accessor functions used in tight loops, forcing inlining eliminates call overhead entirely.
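
As a sketch of how the two attributes combine in practice, the helpers below are forced inline while the function that uses them is kept out of line. FORCE_INLINE, clear_bit, bit_is_set, and gpio_pulse are hypothetical names introduced for illustration, and bit is assumed to be below 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shorthand for the attribute combination used in this article. */
#define FORCE_INLINE static inline __attribute__((always_inline))

FORCE_INLINE void set_bit(volatile uint32_t *reg, uint8_t bit)
{
    *reg |= (UINT32_C(1) << bit);
}

FORCE_INLINE void clear_bit(volatile uint32_t *reg, uint8_t bit)
{
    *reg &= ~(UINT32_C(1) << bit);
}

FORCE_INLINE bool bit_is_set(const volatile uint32_t *reg, uint8_t bit)
{
    return ((*reg >> bit) & UINT32_C(1)) != 0u;
}

/* noinline keeps one out-of-line copy; the helpers above are expanded inside it. */
void __attribute__((noinline)) gpio_pulse(volatile uint32_t *odr, uint8_t pin)
{
    set_bit(odr, pin);
    clear_bit(odr, pin);
}
```

Note that these read-modify-write helpers are not atomic; if an ISR touches the same register, the caller still needs its own protection.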

Functions That Never Return

The noreturn attribute tells the compiler that a function does not return to its caller:

void __attribute__((noreturn)) error_handler(uint32_t error_code)
{
    log_error(error_code);
    NVIC_SystemReset();
    for (;;) { }    // Not reached; ensures the function cannot return even if the reset is delayed
}

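This lets the compiler eliminate dead code after the call, suppresses spurious warnings in callers (for example, about a missing return statement on a path that ends in error_handler()), and makes GCC warn if the noreturn function itself could return.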
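
The effect on callers can be seen in a small sketch. Everything here is hypothetical (the enum, the frequencies, and fatal_error), and abort() stands in for the reset call so the example also builds on a host.

```c
#include <stdint.h>
#include <stdlib.h>   /* abort */

enum clock_source { CLOCK_HSI, CLOCK_HSE };   /* hypothetical clock selector */

/* Stand-in for a firmware error handler; abort() replaces the reset call. */
static void __attribute__((noreturn)) fatal_error(uint32_t error_code)
{
    (void)error_code;
    abort();
}

uint32_t clock_frequency_hz(enum clock_source src)
{
    switch (src) {
    case CLOCK_HSI: return 16000000u;   /* hypothetical values */
    case CLOCK_HSE: return 8000000u;
    }
    fatal_error(1u);
    /* No return statement is needed here: the compiler knows control never gets this far. */
}
```

Without the attribute on fatal_error, GCC would typically report that control reaches the end of a non-void function.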

Interrupt Handlers: __attribute__((interrupt))

Important Cortex-M note: On ARM Cortex-M processors, the hardware automatically stacks R0-R3, R12, LR, PC, and xPSR on exception entry (the AAPCS caller-saved registers plus the return state) and handles the correct return sequence via the EXC_RETURN mechanism. This means ISRs on Cortex-M are written as plain C functions, with no special attribute needed:

// Cortex-M: no attribute needed, this is a regular C function
void USART2_IRQHandler(void)
{
    if (USART2->SR & USART_SR_RXNE) {
        uint8_t data = USART2->DR;
        ring_buffer_push(&rx_buf, data);
    }
}

Using __attribute__((interrupt)) on Cortex-M brings no benefit: GCC ignores the interrupt type on M-profile targets and only adds stack-realignment code to the prologue and epilogue, which increases code size.

However, on classic ARM targets (ARM7TDMI, ARM9) and other architectures (MIPS, RISC-V, some AVR), the interrupt attribute is essential. It tells the compiler to save and restore the full register context and use the architecture-specific return instruction (e.g., SUBS PC, LR, #4 on ARM7):

// ARM7/ARM9: attribute IS required
void __attribute__((interrupt("IRQ"))) UART_IRQHandler(void)
{
    // Compiler generates proper IRQ entry/exit sequence
    uint8_t data = UART0->DR;
    ring_buffer_push(&rx_buf, data);
}

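Without this attribute on such targets, the compiler treats the handler as an ordinary function: it preserves only the callee-saved registers (R4-R11 under AAPCS), freely clobbers the caller-saved ones (R0-R3, R12), and returns with a plain branch that does not restore CPSR. Any of these corrupts the interrupted context.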
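
Code shared between Cortex-M and classic ARM targets can hide the difference behind a macro. The following is a sketch under stated assumptions: IRQ_HANDLER and uart_rx_count are hypothetical names, and the check relies on the ACLE predefined macro __ARM_ARCH_PROFILE, which GCC and Clang set to 'M' for M-profile cores.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
#  define IRQ_HANDLER                                    /* Cortex-M: plain C function */
#elif defined(__arm__)
#  define IRQ_HANDLER __attribute__((interrupt("IRQ")))  /* ARM7/ARM9-style IRQ entry/exit */
#else
#  define IRQ_HANDLER                                    /* host build, e.g. unit tests */
#endif

static volatile uint32_t uart_rx_count;   /* hypothetical state touched by the handler */

IRQ_HANDLER void UART_IRQHandler(void)
{
    uart_rx_count++;
}
```

The empty host branch lets the handler body be exercised in unit tests, where it is called like any other function.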

Weak Symbols and Overridable Handlers

The weak attribute creates a symbol that can be overridden by a strong definition elsewhere:

void __attribute__((weak)) Default_Handler(void)
{
    while (1) { }
}

void __attribute__((weak, alias("Default_Handler"))) NMI_Handler(void);
void __attribute__((weak, alias("Default_Handler"))) HardFault_Handler(void);
void __attribute__((weak, alias("Default_Handler"))) USART1_IRQHandler(void);

This pattern, used extensively in CMSIS startup files, provides default handlers for all interrupts while allowing the application to override any handler simply by defining a function with the same name.
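
The sketch below shows how the aliases resolve when nothing overrides them, assuming an ELF toolchain (the alias attribute is not available on every object format). For illustration the default handler counts and returns so the behavior can be observed; handler_table and unhandled_irq_count are hypothetical names.

```c
#include <stdint.h>

typedef void (*irq_handler_t)(void);

static volatile uint32_t unhandled_irq_count;   /* hypothetical diagnostic counter */

/* For illustration only: counts and returns. A real default handler would typically trap. */
void Default_Handler(void)
{
    unhandled_irq_count++;
}

void NMI_Handler(void)       __attribute__((weak, alias("Default_Handler")));
void USART1_IRQHandler(void) __attribute__((weak, alias("Default_Handler")));

/* Each slot resolves at link time: the application's strong definition
 * if one exists, otherwise Default_Handler. */
const irq_handler_t handler_table[] = { NMI_Handler, USART1_IRQHandler };

/*
 * Overriding from application code (in a different source file) needs no attribute:
 *
 *   void USART1_IRQHandler(void) { ... }
 */
```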

Practical Pitfalls

Pitfall 1: Packed structs in arrays. Using __attribute__((packed, aligned(4))) changes the array stride compared to __attribute__((packed)). If you have existing binary data (flash records, protocol packets) using the old layout, you cannot switch without migrating the data.

Pitfall 2: Forgetting #pragma pack(pop). If you forget the pop, all subsequent struct definitions in the translation unit inherit the packing alignment, causing subtle layout changes in unrelated code.

Pitfall 3: Taking the address of packed fields. On GCC, &packed_struct.field yields an ordinary pointer (for example, uint32_t *) whose value may be misaligned, because the field itself only has alignment 1. Passing such a pointer to a function expecting a normally aligned uint32_t * violates the alignment requirement of the pointed-to type and causes undefined behavior. On cores that do not support unaligned access (e.g., Cortex-M0), dereferencing it results in a HardFault. GCC 9 and later warn about this via -Waddress-of-packed-member.
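
A common way around Pitfall 3 is to copy the field into an aligned local and pass the address of the copy. This is a minimal sketch; ticks_to_ms and sensor_age_ms are hypothetical functions, and the divide-by-1000 conversion is an arbitrary example.

```c
#include <stdint.h>

struct sensor_data_packed {
    uint8_t  status;
    uint32_t timestamp;
    uint16_t value;
} __attribute__((packed));

/* Hypothetical consumer that, like most code, expects an aligned uint32_t. */
static uint32_t ticks_to_ms(const uint32_t *ticks)
{
    return *ticks / 1000u;
}

uint32_t sensor_age_ms(const struct sensor_data_packed *rec)
{
    /* Reading through the packed type lets the compiler emit a safe access. */
    uint32_t ts = rec->timestamp;   /* aligned local copy */
    return ticks_to_ms(&ts);        /* instead of ticks_to_ms(&rec->timestamp) */
}
```

If the callee has to modify the field, the same idea applies in reverse: update the local, then assign it back through the packed struct.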

Pitfall 4: LTO discarding ISRs. With link-time optimization (-flto), the compiler may discard interrupt handlers it considers unreferenced. The __attribute__((used)) attribute on the handler function prevents this.

Summary

Compiler attributes and pragma directives are not optional decorations — they are essential tools for controlling how the compiler generates code for resource-constrained, hardware-coupled embedded systems. The key takeaways:

  • Use __attribute__((packed, aligned(N))) instead of #pragma pack(1) for new data structures to get a compact layout with a predictable base alignment; interior fields can still be misaligned
  • Use __attribute__((section)) for memory-mapped data structures and time-critical functions that must reside in specific memory regions
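  • Apply __attribute__((used)) to symbols that nothing references directly, such as ISRs and version markers, and keep their sections with KEEP() in the linker script (or the retain attribute) so they survive linker garbage collection
  • Use __attribute__((interrupt)) on classic ARM (ARM7TDMI, ARM9) and other architectures that need it for correct context save/restore; on Cortex-M, write ISRs as plain C functions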
  • Use __attribute__((weak)) with alias to create overridable default handlers
  • Be aware of the performance implications of packing on your target architecture — what saves RAM may cost significant code size and execution time

Understanding these directives transforms you from someone who copies startup files to someone who can write them — a distinction that embedded interviewers notice immediately.

Embedded Systems Articles by Jithin Tom & Hermes (AI Agent)

Related Posts

Memory Alignment and Padding in Embedded C Demystified
Memory Alignment and Padding in Embedded C Demystified
June 02, 2026
4 min
© 2026, All Rights Reserved.
Powered By Netlyft

Quick Links

Advertise with usAbout UsContact Us

Social Media