Monitoring Stack and Heap Usage in Zephyr: How to Detect Memory Leaks

Memory management is one of the most critical aspects of embedded systems development. In resource-constrained environments like microcontrollers, running out of memory can cause system crashes, unpredictable behavior, or complete failure. This tutorial will show you how to monitor both heap and stack memory usage in Zephyr RTOS to detect memory leaks before they become problematic.

Understanding Memory in Embedded Systems

Before diving into monitoring techniques, it’s important to understand the two main types of memory we’ll be tracking:

The Heap

The heap is a region of memory used for dynamic allocation during runtime. When you call functions like malloc() or Zephyr’s k_malloc(), memory is allocated from the heap. The amount of heap in use grows and shrinks as you allocate and free memory throughout your program’s execution, while the heap’s total size stays fixed.

Why monitor the heap?

  • Memory leaks: When you allocate memory but forget to free it, the heap gradually fills up
  • Fragmentation: Even with proper freeing, the heap can become fragmented over time
  • Resource exhaustion: Running out of heap memory will cause allocation failures

The Stack

The stack is used for local variables, function parameters, and return addresses. Each thread in Zephyr has its own stack that grows and shrinks as functions are called and return.

Why monitor the stack?

  • Stack overflow: Deep recursion or large local variables can exhaust stack space
  • Thread-specific issues: Each thread has a fixed stack size that can be exceeded
  • System crashes: Stack overflow often causes immediate system failure

Why Running Out of Memory is Really Bad

When either heap or stack memory is exhausted:

  • Heap exhaustion: New allocations fail, potentially causing application logic errors
  • Stack overflow: Can corrupt memory and cause immediate system crashes
  • Unpredictable behavior: Memory corruption can lead to hard-to-debug intermittent failures
  • System instability: Critical system functions may fail to operate correctly

Many embedded systems are designed to run for months or years at a time without restarting, or restarting them is a lengthy, risky process (think satellites!). As a result, monitoring for memory leaks is absolutely critical before deploying your application.

Setting Up the Demo

For this demonstration, I’ll use an ESP32S3 development board, which provides a good balance of performance and memory resources for testing.

Prerequisites

First, set up your Zephyr development environment using Docker by following the directions in this tutorial:

This Docker-based setup provides a consistent, reproducible environment for Zephyr development without the complexity of manual toolchain installation.

Project Structure

Create a new project directory with the following structure:

/workspace/apps/mem_test
├── boards
├── CMakeLists.txt
├── prj.conf
└── src/
    └── main.c

This follows Zephyr’s standard project layout, making it easy to build and maintain.

Main Application Code

Let’s create an application that deliberately introduces memory leaks while monitoring both heap and stack usage. This will help us understand how memory leaks manifest in real-time.

Here’s the complete main.c file:

#include <stdio.h>
#include <zephyr/kernel.h>
#include <zephyr/sys/sys_heap.h>

// Settings
static const int32_t sleep_time_ms = 1000;

// System heap (defined inside the Zephyr kernel)
extern struct sys_heap _system_heap;

int main(void)
{
	int ret;
	struct sys_memory_stats heap_stats;
	struct k_thread *current_thread = k_current_get();
	size_t free_stack;

	// Do forever
	while (1) {

		// Inject memory leak (using Zephyr's k_malloc): the block is never freed
		void *leak = k_malloc(100);
		if (leak) {
			printk("Allocated 100 bytes (leaked)\r\n");
		} else {
			printk("Allocation failed\r\n");
		}

		// Print stack usage of the current thread (size_t is printed with %zu)
		ret = k_thread_stack_space_get(current_thread, &free_stack);
		if (ret < 0) {
			printk("Failed to get stack stats\r\n");
		} else {
			printk("Stack | Free: %zu bytes\r\n", free_stack);
		}

		// Print system heap usage
		ret = sys_heap_runtime_stats_get(&_system_heap, &heap_stats);
		if (ret < 0) {
			printk("Failed to get heap stats\r\n");
		} else {
			printk("Heap  | Free: %zu bytes, Allocated: %zu bytes\r\n",
				heap_stats.free_bytes,
				heap_stats.allocated_bytes);
		}

		// Sleep
		k_msleep(sleep_time_ms);
	}

	return 0;
}

Code Explanation

System Heap: The extern struct sys_heap _system_heap declaration gives you direct access to Zephyr’s global system heap structure (the actual heap that k_malloc() and other dynamic allocation functions use internally). By declaring it as extern, you’re telling the compiler to reference the existing heap structure defined within the Zephyr kernel rather than creating a new one.

Get Current Thread: We use the function k_current_get() to get the ID of the currently running thread.

Memory Leak Injection: The line void *leak = k_malloc(100); deliberately allocates 100 bytes of memory without ever calling k_free(). This simulates a common programming error where allocated memory is never released.

Stack Monitoring: We use k_thread_stack_space_get() to check how much stack space remains for the current thread. Since we’re running in the main thread, this shows us the main thread’s stack usage. You can see an official example of this function in use here: https://docs.zephyrproject.org/apidoc/latest/stack_8h_source.html.

Heap Monitoring: The sys_heap_runtime_stats_get() function provides real-time statistics about the system heap, including both free and allocated bytes. You can see an official example of this function in use here: https://github.com/zephyrproject-rtos/zephyr/blob/main/samples/basic/sys_heap/src/main.c.

CMake Configuration

Add this to your CMakeLists.txt:

# Minimum CMake version
cmake_minimum_required(VERSION 3.22.0)

# Locate the Zephyr RTOS source
find_package(Zephyr REQUIRED HINTS $ENV{ZEPHYR_BASE})

# Name the project
project(app)

# Locate the source code for the application
target_sources(app PRIVATE src/main.c)

This is a standard Zephyr CMake configuration that sets up the build system and links your application with the Zephyr kernel.

Kernel Configuration

Add these essential settings to your prj.conf:

CONFIG_THREAD_STACK_INFO=y
CONFIG_INIT_STACKS=y
CONFIG_SYS_HEAP_RUNTIME_STATS=y

Alternative Configuration Method

You can also configure these settings using Zephyr’s menuconfig system, which provides an interactive, menu-driven interface for kernel configuration. For detailed information on using Kconfig, see this helpful guide: Kconfig Configuration Tutorial

Configuration Explanation:

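  • THREAD_STACK_INFO: Records each thread’s stack location and size so that stack usage can be queried at runtime
  • INIT_STACKS: Fills each thread’s stack with a known pattern when it is created, so the untouched portion can be measured later
  • SYS_HEAP_RUNTIME_STATS: Enables heap statistics collection (free and allocated bytes)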
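
To get a feel for why initializing the stacks matters, here is a small host-side sketch of the general "paint and scan" idea: fill a buffer with a known byte, let some of it get overwritten, then count how much of the pattern survives. The names paint_stack() and unused_stack_bytes() and the fill value are my own placeholders for illustration; this is not Zephyr’s implementation, which differs in detail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fill pattern, chosen for this sketch. */
#define STACK_FILL_BYTE 0xAAu

/* Paint the whole buffer with the fill pattern before it is used. */
void paint_stack(uint8_t *stack, size_t size)
{
	memset(stack, STACK_FILL_BYTE, size);
}

/*
 * Count the bytes at the low end of the buffer that still hold the pattern.
 * Assumes a descending stack: usage starts at the highest address and grows
 * toward index 0, so untouched bytes sit at the start of the buffer.
 */
size_t unused_stack_bytes(const uint8_t *stack, size_t size)
{
	size_t unused = 0;

	while (unused < size && stack[unused] == STACK_FILL_BYTE) {
		unused++;
	}
	return unused;
}
```

Without the initial fill, the scan would have nothing reliable to compare against, because the memory would hold whatever values happened to be there at boot.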

Building and Running

Build the Application

In the container, navigate to your project directory and build:

cd /workspace/apps/mem_test
west build -p always -b esp32s3_devkitc/esp32s3/procpu

The -p always flag ensures a clean build, while -b esp32s3_devkitc/esp32s3/procpu specifies the target board configuration.

Flash to Hardware

From your host computer, navigate to the introduction-to-zephyr repository directory and flash the application to your ESP32:

cd <path-to-introduction-to-zephyr>
python -m esptool --port "<PORT>" --chip auto --baud 921600 --before default_reset --after hard_reset write_flash -u --flash_mode keep --flash_freq 40m --flash_size detect 0x0 workspace/apps/mem_test/build/zephyr/zephyr.bin

Replace <PORT> with your serial port (e.g., /dev/ttyS0 for Linux, /dev/tty.usbserial-1420 for macOS, COM7 for Windows).

Alternative: QEMU Testing

If you don’t have physical hardware available, you can also run this on QEMU. See this guide for details: How to Run an ESP32 Zephyr Application on Espressif’s QEMU

Monitor the Output

Use a serial terminal to watch the real-time memory usage. For example:

python -m serial.tools.miniterm "<PORT>" 115200

Observing Memory Behavior

As the application runs, you’ll see output similar to this:

[Image: Memory leak in Zephyr on ESP32]

What You’re Seeing

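Heap Behavior: Notice how the “Free” bytes decrease by roughly 100 each iteration while “Allocated” bytes increase by about the same amount. The exact step can be slightly larger than 100, since each allocation carries some bookkeeping overhead. This clearly shows the memory leak in action.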

Stack Behavior: The stack usage should remain relatively stable since our main loop doesn’t use significant local variables or deep function calls.

Leak Detection: The steady decrease in free heap memory makes it easy to identify that a memory leak is occurring, even without knowing exactly where in the code it’s happening. 
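
If you would rather not watch the numbers by eye, the same "steady growth" check can be automated. The sketch below is plain C with hypothetical names (struct leak_tracker, leak_tracker_update(), samples_until_exhausted()); it assumes you feed it one allocated-bytes reading per loop iteration and treats a configurable number of consecutive increases as suspicious.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tracker for a series of "allocated bytes" samples. */
struct leak_tracker {
	size_t last_allocated;  /* previous sample */
	unsigned int rises;     /* consecutive samples that grew */
	unsigned int limit;     /* rises needed before flagging a leak */
	bool primed;            /* false until the first sample arrives */
};

void leak_tracker_init(struct leak_tracker *t, unsigned int limit)
{
	t->last_allocated = 0;
	t->rises = 0;
	t->limit = limit;
	t->primed = false;
}

/* Feed one sample; returns true once usage has grown `limit` times in a row. */
bool leak_tracker_update(struct leak_tracker *t, size_t allocated)
{
	if (t->primed && allocated > t->last_allocated) {
		t->rises++;
	} else {
		t->rises = 0;  /* first sample, or usage stayed flat or dropped */
	}
	t->last_allocated = allocated;
	t->primed = true;
	return t->rises >= t->limit;
}

/* Rough samples-until-exhaustion estimate; 0 means "no leak rate given". */
size_t samples_until_exhausted(size_t free_bytes, size_t leak_per_sample)
{
	return leak_per_sample ? free_bytes / leak_per_sample : 0;
}
```

In this tutorial’s loop you would pass heap_stats.allocated_bytes on each pass. A real application with legitimate bursts of allocation would likely need a larger limit, or a longer sampling interval, to avoid false alarms.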

Why k_malloc() is Better Than malloc()

This example uses k_malloc() instead of the standard C library’s malloc(). In a Zephyr application, k_malloc() has several useful properties:

  • Returns NULL on failure: If there’s insufficient memory, k_malloc() returns NULL, so your code can detect the failure and react to it
  • Heap consistency checks: The underlying sys_heap implementation rejects requests it cannot satisfy and, when assertions are enabled, can flag some kinds of corruption (such as a double free) as memory is released. It does not stop your own code from writing past the end of a buffer
  • Metadata protection: Heap operations are serialized with a lock, so concurrent allocations from different threads do not corrupt the heap’s bookkeeping

k_malloc() also integrates directly with Zephyr’s memory management system:

  • Works directly with sys_heap_runtime_stats_get() for monitoring
  • No additional configuration beyond enabling the system heap, which some platforms do by default
  • Predictable behavior across all supported platforms

When the heap is exhausted, k_malloc() fails gracefully:

void *ptr = k_malloc(1000);
if (!ptr) {
    printk("Allocation failed - heap exhausted\n");
    // System continues running, can handle the error appropriately
}

The system remains stable even when memory allocation fails, allowing your application to handle the situation gracefully rather than crashing.

Going Further

Once you understand the basics of memory monitoring, you can extend this knowledge in several ways:

Advanced Monitoring Techniques

  • Multiple thread monitoring: Track stack usage across different threads
  • Memory pool monitoring: Monitor specialized memory pools for specific use cases
  • Periodic health checks: Implement watchdog functions that alert when memory usage exceeds thresholds
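
As one way to picture the threshold idea, here is a minimal sketch in plain C. The function names heap_used_percent() and heap_over_threshold() are hypothetical; the inputs are assumed to be the free and allocated byte counts you already read from the heap statistics.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Percentage of the heap in use, from free and allocated byte counts. */
unsigned int heap_used_percent(size_t free_bytes, size_t allocated_bytes)
{
	/* Widen to 64 bits so the multiplication cannot overflow on a 32-bit target. */
	uint64_t total = (uint64_t)free_bytes + allocated_bytes;

	if (total == 0) {
		return 0;  /* nothing to measure; avoid dividing by zero */
	}
	return (unsigned int)(((uint64_t)allocated_bytes * 100u) / total);
}

/* True when heap usage has reached or passed the given percentage. */
bool heap_over_threshold(size_t free_bytes, size_t allocated_bytes,
			 unsigned int limit_percent)
{
	return heap_used_percent(free_bytes, allocated_bytes) >= limit_percent;
}
```

What you do when the check trips (log a warning, raise an event, refuse new work) is an application decision that this sketch leaves open.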

Integration with Debugging Tools

  • Memory debugging: Use tools like AddressSanitizer when available
  • Static analysis: Incorporate tools that can detect potential memory leaks at compile time
  • Runtime profiling: Add timestamps to track allocation patterns over time
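
A lightweight form of runtime profiling is to wrap the allocator and keep running totals, so a leak shows up as a live-block count that never returns to its baseline. This sketch records counts rather than timestamps and uses the standard malloc() and free() so it can run on a desktop; swapping in k_malloc() and k_free() on a Zephyr target is an assumption I have not shown here. The tracked_* names are hypothetical, and the counters are not thread-safe.

```c
#include <stddef.h>
#include <stdlib.h>

/* Running totals maintained by the wrappers below (not thread-safe). */
static size_t live_blocks;
static size_t live_bytes;

/* Header stored in front of each block so its size is known on free. */
union block_header {
	size_t size;
	max_align_t align;  /* C11: keeps the payload suitably aligned */
};

/* Allocate `size` bytes and record the block in the running totals. */
void *tracked_alloc(size_t size)
{
	union block_header *hdr = malloc(sizeof(*hdr) + size);

	if (hdr == NULL) {
		return NULL;
	}
	hdr->size = size;
	live_blocks++;
	live_bytes += size;
	return hdr + 1;  /* payload starts right after the header */
}

/* Release a block from tracked_alloc() and update the totals. */
void tracked_free(void *ptr)
{
	if (ptr == NULL) {
		return;
	}
	union block_header *hdr = (union block_header *)ptr - 1;

	live_blocks--;
	live_bytes -= hdr->size;
	free(hdr);
}

size_t tracked_live_blocks(void) { return live_blocks; }
size_t tracked_live_bytes(void) { return live_bytes; }
```

The extra header costs a few bytes per allocation, so this kind of wrapper is better suited to debug builds than to production firmware.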

Production Considerations

  • Telemetry: Send memory usage statistics to remote monitoring systems
  • Adaptive behavior: Implement strategies to free non-essential memory when heap usage gets high
  • Error recovery: Design systems that can recover gracefully from memory exhaustion
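
For telemetry, it helps to pack the readings into one compact line before handing them to whatever transport you use. The sketch below formats the three values from this tutorial as comma-separated text. The function name format_mem_report() and the field order are my own choices for illustration, and the transport itself is out of scope.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format one report line as "heap_free,heap_allocated,stack_free".
 * Returns the number of characters written, or -1 if the buffer is too small.
 */
int format_mem_report(char *buf, size_t len, size_t heap_free,
		      size_t heap_allocated, size_t stack_free)
{
	int n = snprintf(buf, len, "%zu,%zu,%zu",
			 heap_free, heap_allocated, stack_free);

	if (n < 0 || (size_t)n >= len) {
		return -1;  /* encoding error or truncated output */
	}
	return n;
}
```

Checking the return value against the buffer length matters here: snprintf() reports how long the output would have been, so a truncated line is easy to miss otherwise.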

If you’d like to learn more, definitely check out Zephyr’s official Memory Management guides.

Conclusion

Memory monitoring is a crucial skill for embedded systems developers. By understanding how to track both heap and stack usage in Zephyr, you can:

  • Detect memory leaks early in development
  • Optimize memory usage for resource-constrained devices
  • Build more robust and reliable embedded applications
  • Prevent system crashes due to memory exhaustion

The techniques shown in this tutorial provide a solid foundation for implementing comprehensive memory monitoring in your Zephyr applications. Regular memory monitoring during development will save you from difficult debugging sessions later. This is especially true if you’re writing complex drivers or using libraries (and you’re not sure if memory is being allocated without being freed!).

Start with this basic monitoring approach, then gradually add more sophisticated memory management strategies as your applications become more complex. Your future self (and your users) will thank you for the proactive approach to memory management.
