Today I learned: initializing a Rust static with None does NOT guarantee a zero-initialized variable!
Static data intro
I encourage the use of static storage for all of your embedded application data storage. The reason is that then you know pretty well at build time what your RAM usage is and if it will fit, and you have less risk of stack overflow when the stack is used only for small local stuff.
So, a common pattern is allocating large buffers of data in static variables. However, Rust limits what const code can do during initialization, and it has strict rules about uninitialized variables, so a common pattern is to store your static variables in an Option which starts at None, and gets initialized at startup:
static DATA: Mutex<Option<MyType>> = Mutex::new(None);
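To make the "initialized at startup" part concrete, here is a minimal sketch of the pattern. It assumes std::sync::Mutex so that it runs on a desktop; an embedded project would more likely use a critical-section style mutex. MyType, its field, and the helper functions are placeholders, not code from a real project.

```rust
use std::sync::Mutex;

// Placeholder for a large application type.
struct MyType {
    frames: [u32; 256],
}

// Starts out as None; the real value is written once at startup.
static DATA: Mutex<Option<MyType>> = Mutex::new(None);

// Hypothetical startup hook: fills in the static before it is used.
fn init_data() {
    *DATA.lock().unwrap() = Some(MyType { frames: [0; 256] });
}

// Returns the buffer length, or None if init_data() has not run yet.
fn frame_count() -> Option<usize> {
    DATA.lock().unwrap().as_ref().map(|d| d.frames.len())
}

fn main() {
    init_data();
    println!("frames available: {:?}", frame_count());
}
```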
BSS vs DATA
A key thing to know is that your static variables can end up in one of two sections: BSS or DATA.
BSS is a section of memory which is zeroed during the runtime init -- before your application hits main. DATA is a section of memory whose initial values get copied from flash into RAM during the runtime init. The key differentiator is that DATA takes up space in your flash -- it's part of your code size -- while BSS does not.
Variables which are initialized with ALL zeros go into BSS. Variables which are not go into DATA. And there's no splitting up of data, or compression, here. If a struct has a size of 1kB, and one byte in the initial value is non-zero, all 1kB will be stored in your flash! This can lead to some surprising code bloat, so it's generally a good idea to make sure that your large data types can be zero-initialized where possible.
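A small sketch of the difference, under some assumptions: RamCell is a hypothetical wrapper that gives the statics interior mutability so they are placed in writable memory, and the section placement described in the comments is what toolchains typically do, not something the program itself can check. The program only verifies the initial values.

```rust
use std::cell::UnsafeCell;

/// Hypothetical wrapper: interior mutability keeps the static out of
/// read-only memory, so it lands in a writable section.
struct RamCell<T>(UnsafeCell<T>);

// Demo only: acceptable here because the example is single-threaded
// and never writes to the statics.
unsafe impl<T> Sync for RamCell<T> {}

const SIZE: usize = 1024;

// Every byte of the initial value is zero: a candidate for .bss.
static ALL_ZERO: RamCell<[u8; SIZE]> = RamCell(UnsafeCell::new([0; SIZE]));

// One non-zero byte: the whole 1 kB initial image typically goes to .data.
static ONE_BYTE_SET: RamCell<[u8; SIZE]> = RamCell(UnsafeCell::new(one_byte_set()));

const fn one_byte_set() -> [u8; SIZE] {
    let mut buf = [0u8; SIZE];
    buf[0] = 1;
    buf
}

// Counts the non-zero bytes in the initial value of a static.
fn nonzero_bytes(cell: &RamCell<[u8; SIZE]>) -> usize {
    // SAFETY: nothing in this example ever writes to the statics.
    let bytes = unsafe { &*cell.0.get() };
    bytes.iter().filter(|&&b| b != 0).count()
}

fn main() {
    println!("ALL_ZERO non-zero bytes: {}", nonzero_bytes(&ALL_ZERO));
    println!("ONE_BYTE_SET non-zero bytes: {}", nonzero_bytes(&ONE_BYTE_SET));
}
```

To see where each symbol actually ended up, inspect the built binary with a tool such as size or nm; the exact result depends on your target and linker script.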
Is None encoded with zeros?
If you'd asked me a couple days ago, I'd have said: "yes, I think so!" Now, I know the answer is: "Usually!"
But today, when I was investigating my code size, I found a surprisingly large symbol in the data section. The size made sense -- the struct contained about 5kB worth of CAN message buffering -- but it was being initialized with None, so why was it going into the .data section?!
The answer appears to be that None is usually encoded with zeros, but not always. For example, there is an exception when wrapping bool types, which the compiler knows can only have the value of 0 or 1. Since it knows the bool can never be 2, it is free to use the value of 2 as the discriminant for None, and combine the discriminant into the value byte, so it does. This has the benefit that size_of::<Option<bool>>() is 1, just like the bool. Otherwise, it would have to prepend an extra discriminant in front of the bool value, making the size of the data type larger in memory.
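A quick way to see this niche optimization is to compare the size of Option<T> with the size of T. The sketch below uses a helper, option_overhead, which is a name made up for this example. The numbers in the comments are what current compilers produce; apart from a few documented cases (such as references), the language does not promise them.

```rust
use std::mem::size_of;

// Extra bytes that wrapping T in Option costs (illustrative helper).
fn option_overhead<T>() -> usize {
    size_of::<Option<T>>() - size_of::<T>()
}

fn main() {
    // bool has spare bit patterns (2..=255), so None can live in the same byte.
    println!("Option<bool> overhead: {}", option_overhead::<bool>());
    // u8 uses all 256 bit patterns, so a separate discriminant is needed.
    println!("Option<u8> overhead: {}", option_overhead::<u8>());
    // u32 also has no spare patterns; the discriminant is padded to 4 bytes.
    println!("Option<u32> overhead: {}", option_overhead::<u32>());
    // References are never null, so None can be the null pointer.
    println!("Option<&u8> overhead: {}", option_overhead::<&u8>());
}
```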
My struct, in addition to the CAN message buffers, had an array of bools. It seems that the compiler re-arranged the struct in memory so that the array of bools came first, and then used a value of 2 in the first bool as the discriminant for the Option enum. As a result, my entire 5kB struct got initialized from .data.
Demo Code
I wrote a quick example to demonstrate the behavior:
use core::mem::{size_of, transmute};

struct Foo {
    buf: [u32; 256],
}

struct FooWithBool {
    value: bool,
    buf: [u32; 256],
}

struct FooWithBoolArray {
    value: [bool; 4],
    buf: [u32; 256],
}

static FOO: Option<Foo> = None;
static FOO_WITH_BOOL: Option<FooWithBool> = None;
static FOO_WITH_BOOL_ARRAY: Option<FooWithBoolArray> = None;
static BOOL: Option<bool> = None;

fn main() {
    // Sizes of the inner struct types (not of the Options wrapping them).
    const FOO_SIZE: usize = size_of::<Foo>();
    const FWB_SIZE: usize = size_of::<FooWithBool>();
    const FWBA_SIZE: usize = size_of::<FooWithBoolArray>();
    const BOOL_SIZE: usize = size_of::<Option<bool>>();

    // View the leading bytes of each static as raw bytes.
    let foo_data: &[u8; FOO_SIZE] = unsafe { transmute(&FOO) };
    let fwb_data: &[u8; FWB_SIZE] = unsafe { transmute(&FOO_WITH_BOOL) };
    let fwba_data: &[u8; FWBA_SIZE] = unsafe { transmute(&FOO_WITH_BOOL_ARRAY) };
    let bool_data: &[u8; BOOL_SIZE] = unsafe { transmute(&BOOL) };

    println!("size_of Foo: {}", FOO_SIZE);
    println!("size_of FooWithBool: {}", FWB_SIZE);
    println!("size_of FooWithBoolArray: {}", FWBA_SIZE);
    println!("size_of Option<bool>: {}", BOOL_SIZE);

    // Show the first 12 bytes of each
    println!("FOO: {:?}", &foo_data[..12]);
    println!("FOO_WITH_BOOL: {:?}", &fwb_data[..12]);
    println!("FOO_WITH_BOOL_ARRAY: {:?}", &fwba_data[..12]);
    println!("BOOL: {:?}", bool_data);
}
You can also run this yourself on the Rust Playground
The output?
size_of Foo: 1024
size_of FooWithBool: 1028
size_of FooWithBoolArray: 1028
size_of Option<bool>: 1
FOO: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
FOO_WITH_BOOL: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
FOO_WITH_BOOL_ARRAY: [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
BOOL: [2]
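As you can see, a plain Option<bool> gets a non-zero None encoding, and so does the struct containing an array of bools. The struct with a single bool shows only zeros in its first 12 bytes. My guess is that the bool had to be padded anyway, so the 3 extra padding bytes can be used to store the discriminant. Note, though, that the demo prints only the first 12 bytes, so it cannot rule out the compiler placing the bool after the buffer and storing a non-zero niche value there.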
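One way to check where the compiler actually put the bool fields is to ask for their offsets. This is a minimal sketch that reuses the struct definitions from the demo and assumes a toolchain recent enough to provide the offset_of! macro. The default struct layout is unspecified and may change between compiler versions, so no expected offsets are given here.

```rust
use std::mem::{offset_of, size_of};

#[allow(dead_code)]
struct FooWithBool {
    value: bool,
    buf: [u32; 256],
}

#[allow(dead_code)]
struct FooWithBoolArray {
    value: [bool; 4],
    buf: [u32; 256],
}

// Byte offsets of the `value` field in each struct, as laid out by the compiler.
fn bool_offsets() -> (usize, usize) {
    (
        offset_of!(FooWithBool, value),
        offset_of!(FooWithBoolArray, value),
    )
}

fn main() {
    let (single, array) = bool_offsets();
    println!("FooWithBool.value offset: {} of {}", single, size_of::<FooWithBool>());
    println!("FooWithBoolArray.value offset: {} of {}", array, size_of::<FooWithBoolArray>());
}
```

If the single bool turns out to sit after the buffer, dumping the bytes around that offset would show whether None is really all zeros for that struct.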
The solution
The only reliable solution I can think of to ensure a static var initializes with 0 is to use MaybeUninit.
The static_cell crate handles this well for you when you just need a local &mut. It will not really help you share the static with an IRQ.
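For illustration, here is a rough sketch of the MaybeUninit approach. StaticSlot and CanBuffers are names invented for this example, and this is not the static_cell API. The wrapper does no runtime checking, so the caller has to guarantee that init is called only once and never concurrently. A static like this has no meaningful initial bytes, so it typically ends up in .bss, but that is worth confirming in your own binary.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;

// Stand-in for the large struct from the post; field names are illustrative.
struct CanBuffers {
    ready: [bool; 4],
    frames: [u32; 256],
}

/// Hypothetical wrapper: storage for a `T` that starts out uninitialized.
struct StaticSlot<T>(UnsafeCell<MaybeUninit<T>>);

// Demo only: callers must uphold the contract documented on `init`.
unsafe impl<T: Send> Sync for StaticSlot<T> {}

impl<T> StaticSlot<T> {
    const fn new() -> Self {
        Self(UnsafeCell::new(MaybeUninit::uninit()))
    }

    /// # Safety
    /// Must be called at most once per slot, with no concurrent access.
    unsafe fn init(&'static self, value: T) -> &'static mut T {
        // SAFETY: the caller guarantees exclusive, one-time access.
        unsafe { (*self.0.get()).write(value) }
    }
}

static BUFFERS: StaticSlot<CanBuffers> = StaticSlot::new();

fn main() {
    // SAFETY: called once, at startup, before anything else touches BUFFERS.
    let buffers = unsafe {
        BUFFERS.init(CanBuffers { ready: [false; 4], frames: [0; 256] })
    };
    buffers.frames[0] = 0x123;
    buffers.ready[0] = true;
    println!("first frame: {:#x}, ready: {}", buffers.frames[0], buffers.ready[0]);
}
```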