About
In this code snippet, we will take a look at unsafe code, pointers, stack allocation and spans in C#.
C# has managed memory, unlike, for example, C/C++. This means that you don’t have to allocate or free memory yourself; in the case of .NET, the CLR (Common Language Runtime) takes care of memory allocation and garbage collection. It also means that by default you can’t work with raw pointers or explicitly allocate a block of memory on the stack in C#, both of which are common practice in C/C++.
However, if you enable the “unsafe code” compiler option in your project settings (AllowUnsafeBlocks in the project file), you can use the unsafe code block, which allows you to do stack allocations and use pointers within it. You can also mark a method as unsafe, and you will be able to use pointers and stackalloc anywhere within it.
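To illustrate the second option, here is a minimal sketch of a method marked as unsafe. The Sum function and the values array are hypothetical names made up for this example, and the snippet assumes that unsafe code is enabled for the project; without that option it will not compile.

```csharp
using System;

int[] values = { 1, 2, 3, 4 };
Console.WriteLine(Sum(values)); // 10

// Hypothetical helper: the unsafe modifier on the method allows pointers
// in its whole body without a separate unsafe { } block.
static unsafe int Sum(int[] data)
{
    int total = 0;

    // Pin the array so the garbage collector can't move it while we hold a pointer.
    fixed (int* p = data)
    {
        for (int i = 0; i < data.Length; i++)
        {
            total += p[i];
        }
    }

    return total;
}
```

The pointer only exists inside the fixed statement, so callers of Sum never see it and can stay in safe code.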
Spans are a type that gives us a memory-safe way to work with a contiguous chunk of memory. Span<T> is a ref struct: a value type that can only live on the stack, never on the heap, which makes creating one cheap. The span itself is small (a reference to the first element plus a length), and the memory it refers to can be on the stack or on the heap. A great use case for spans is working with strings (getting substrings) or arrays (splitting/slicing), as slicing does not create a new copy of the string/array but just records where that particular section of memory starts and how long it is.
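A small sketch of the "no copy" behaviour described above: a slice is only a view, so writing through it changes the original array. The variable names (numbers, middle, copy) are made up for this example.

```csharp
using System;

// A regular array allocated on the heap.
int[] numbers = { 10, 20, 30, 40, 50 };

// Slice(1, 3) is a view over the elements 20, 30 and 40. Nothing is copied.
Span<int> middle = numbers.AsSpan().Slice(1, 3);

// Writing through the span writes into the original array.
middle[0] = 99;
Console.WriteLine(numbers[1]);    // 99
Console.WriteLine(middle.Length); // 3

// ToArray() is the explicit way to get an independent copy.
int[] copy = middle.ToArray();
copy[0] = -1;
Console.WriteLine(numbers[1]);    // still 99
```

If you need the view to be read-only, ReadOnlySpan<T> offers the same slicing without the ability to write.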
Next, let’s look at the memory layout, or memory segmentation, of a program. This will help us understand the points made in the previous section.
Note: The memory segmentation might vary between programming languages and target computer architectures.
The text segment is where your program’s instructions are located.
The data, or initialized data, segment is where all of your global and static variables that have an initial value are located.
The bss, or uninitialized data, segment is where all of your global and static variables without an initial value are located. They hold no meaningful value yet (the space is zeroed when the program starts), but that “empty space” is reserved and can be considered used, because when the variable is finally given a value it will be written to that space.
The stack is where the data of the currently executing function, such as its local variables, is located. When you make a function call, a block holding the function’s parameters, local variables and return address is pushed onto the stack. This is referred to as a stack frame. The function’s code itself is not copied; it stays in the text segment. If that function calls another function, another stack frame is added on top of the previous one. When a function is done executing and returns execution to its caller, its stack frame is removed automatically.
The heap is dynamically allocated data. In C# this is where instances of reference types go, for example objects of a class and arrays created with the new keyword (using new on a struct does not by itself allocate on the heap). So, for example, when the executing code encounters new for an array/object, that array/object is allocated on the heap (this can be a slow operation). The local variable you store your object in will itself be on the stack, but it will contain only a reference to the actual object located on the heap.
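The difference between a variable that holds a value and one that holds a reference to a heap object can be seen in how assignment behaves. Below is a minimal sketch using a tuple (a value type) and an array (a reference type); all variable names are made up for this example.

```csharp
using System;

// A tuple is a value type: the variable holds the data itself.
(int X, int Y) a = (1, 2);
var b = a;   // copies the whole value
b.X = 10;    // changes only the copy
Console.WriteLine(a.X); // 1

// An array is a reference type: the variable holds a reference to a heap object.
int[] c = { 1, 2 };
int[] d = c; // copies only the reference
d[0] = 10;   // changes the single shared array
Console.WriteLine(c[0]); // 10

// Both variables refer to the same object on the heap.
Console.WriteLine(ReferenceEquals(c, d)); // True
```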
Note: A while ago I made a tutorial on compiling custom quadrotor firmware. There you can see a practical example where the compiler gives you the information about the memory used.
Now let’s have a look at the code below to see how to use everything mentioned above.
Code:
using System;

//Any unsafe code must be written inside a code block marked by the unsafe keyword.
//Unsafe code is anything that has to do with pointers and memory allocation done by the programmer.
//A pointer points to the location in memory where the actual value of a variable is stored.
//In some languages like C/C++ this is how all the code is written, but in a managed language like C# the memory is managed by the CLR(Common Language Runtime).
//Usually the garbage collector takes care of the memory, so you don't deal with dangling pointers, malloc(), calloc(), realloc() or free().
//However, if we want to manage the memory ourselves we can do so with the aforementioned unsafe code block.
//Note: A method can also be marked as unsafe, which enables you to write unsafe code in the method without using the unsafe code block.
//Note: The unsafe option has to be enabled in your project settings, else your code won't compile.

//Declaring a regular variable.
int myValue = 3;

//If we print it out we'll get its value.
Console.WriteLine("Value: " + myValue);

unsafe
{
    //The & gets the pointer to a variable.
    //While the * enables a variable data type to store a pointer.
    int* myValuePointer = &myValue;

    //If we print the pointer to myValue we'll get the memory address of where myValue is stored.
    Console.WriteLine("Pointer: " + (long)myValuePointer);

    //To get the HEX value:
    Console.WriteLine($"Pointer HEX: {(long)myValuePointer:X}");

    //You can use the * to dereference a pointer and get the actual value it's pointing to.
    Console.WriteLine("Value: " + *myValuePointer);

    //We can use the stackalloc keyword to allocate memory on the stack and then get a pointer to it like so:
    char* myStackArrayPtr = stackalloc char[5] { 'H', 'e', 'l', 'l', 'o' };

    //Performance wise this can be much faster than using the new keyword, as new would create the array on the heap.
    //The heap is slower than the stack, however the stack is much smaller in size compared to the heap.
    //You also risk getting a stack overflow, which is when the stack grows past its size limit. In .NET this terminates the process.
    //A heap allocation goes through the runtime's allocator and has to be cleaned up by the garbage collector later, which costs more than a stack allocation.
    //Also, stack allocated memory is released automatically once the method returns execution back to its caller.

    //As C# has managed memory, some objects might get moved around by the garbage collector.
    //This means that a pointer might end up pointing to the wrong thing after some time.
    //To avoid this we must pin the object with the fixed keyword while we hold a pointer to it.
    char[] myCharArray = { 'H', 'e', 'l', 'l', 'o' };
    int myCharArrayLength = myCharArray.Length;

    fixed (char* myArrayPtr = myCharArray) //fixed can only be used within an unsafe block/method.
    {
        for (long i = 0; i < myCharArrayLength; i++)
        {
            Console.WriteLine("Value: " + myArrayPtr[i]);
        }
    }
}

//Spans are a type and a memory safe way to represent a contiguous block of memory.
//Spans can refer to memory both on the stack and heap.
char[] myOtherCharArray = new char[] { 'H', 'e', 'l', 'l', 'o' };
Span<char> heapSpan = myOtherCharArray.AsSpan();

//Note: you can use stackalloc outside the unsafe block if used with a Span.
Span<char> stackSpan = stackalloc char[] { 'H', 'e', 'l', 'l', 'o' };

//A great way to use spans is when working with strings.
//When you get a substring of a string a new string gets created and the substring is copied into it.
//This takes up memory and time. If we use a span instead we'll just get a reference to the start and the length of the substring.
string myString = "Hello World";

//Convert string to span.
ReadOnlySpan<char> stringSpan = myString.AsSpan();

//Get "substring".
ReadOnlySpan<char> mySubStringSpan = stringSpan.Slice(0, 5);

//Note: Span<T> can only live on the stack, so it can't be kept across an await. If you want the Span functionality in an asynchronous context use Memory<T>.
//Memory<T> is a regular struct that can be stored on the heap, which means it can also be used within async methods, unlike Span<T>.
//More about Memory<T> here: https://learn.microsoft.com/en-us/archive/msdn-magazine/2018/january/csharp-all-about-span-exploring-a-new-net-mainstay#what-is-memoryt-and-why-do-you-need-it:~:text=by%20Memory%3CT%3E.-,What%20Is%20Memory%3CT%3E%20and%20Why%20Do%20You%20Need%20It%3F,-Span%3CT%3E%20is
Console.WriteLine(mySubStringSpan.ToString());
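Since the code above only mentions Memory<T> in a comment, here is a minimal sketch of how it could be used in an async method. UpperCaseAsync and ToUpperInPlace are hypothetical helpers made up for this example, and Task.Yield() merely stands in for real asynchronous work.

```csharp
using System;
using System.Threading.Tasks;

char[] buffer = "Hello World".ToCharArray();

// Memory<char> can be sliced just like a span, again without copying.
Memory<char> firstWord = buffer.AsMemory().Slice(0, 5);

string result = await UpperCaseAsync(firstWord);
Console.WriteLine(result);             // HELLO
Console.WriteLine(new string(buffer)); // HELLO World

// Hypothetical async helper: Memory<char> is allowed as a parameter here,
// a Span<char> parameter would not compile.
static async Task<string> UpperCaseAsync(Memory<char> chars)
{
    await Task.Yield(); // stand-in for real asynchronous work

    // The span is only obtained after the await and handed to synchronous code.
    return ToUpperInPlace(chars.Span);
}

// Hypothetical synchronous helper that does the actual work on a span.
static string ToUpperInPlace(Span<char> span)
{
    for (int i = 0; i < span.Length; i++)
    {
        span[i] = char.ToUpperInvariant(span[i]);
    }

    return span.ToString();
}
```

The idea is to pass Memory<T> around in asynchronous code and only turn it into a Span<T> via the Span property at the point where the synchronous work happens.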