Pinning in Rust is an essential concept for scenarios where certain values in memory must remain in a fixed location, making it critical for Rust developers working with async programming, self-referential structs, and Foreign Function Interfaces (FFI). In this article, we’ll dive deeper into pinning, explore the mechanics of the Pin type, and implement a practical, real-world example to solidify your understanding.
Why Pinning is Essential
Rust’s ownership model allows values to be freely moved in memory by default: a move is a plain bitwise copy to a new location, which keeps moves cheap. However, certain cases require that a value’s memory address remains constant:
- Self-referential Types: Data structures that reference themselves, creating a direct link to their own fields.
- Async Programming: Many async tasks in Rust (Futures) rely on pinned data structures to avoid issues during suspension and resumption.
- FFI (Foreign Function Interface): When working with C libraries or other external code, values need to remain at fixed addresses to keep pointers valid.
Pinning helps manage these cases by marking a value as immovable, which Rust enforces through the Pin type.
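To make the async case concrete, here is a minimal sketch of polling a future by hand. The names NoopWake and poll_once are made up for this illustration, and a real program would leave this work to an executor. The point is only that poll takes Pin<&mut Self>, so a future has to be pinned before it can make progress.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing when woken (illustration only).
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: pins a trivial future on the heap and polls it once.
fn poll_once() -> Poll<u32> {
    // `Box::pin` gives us a `Pin<Box<...>>`, fixing the future's address.
    let mut fut = Box::pin(async { 41u32 + 1 });
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // `as_mut` produces the `Pin<&mut _>` that `Future::poll` requires.
    fut.as_mut().poll(&mut cx)
}
```

Because the async block contains no await points, the single call to poll completes it immediately.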
Overview of Pin and Unpin
Rust’s Pin type ensures that once a value is pinned, it cannot be moved through safe code unless its type implements Unpin. Here’s how it works:
- Pin<P> Wrapper: This type is a wrapper around pointers like Box, Rc, or &mut, enforcing that the value they point to stays at a fixed memory address.
- Unpin Trait: By default, most types in Rust are Unpin, meaning they can still be moved freely even after being pinned, so Pin places no real restriction on them. Types that are sensitive to movement (like self-referential structs and async futures) do not implement Unpin, and it is for these types that Pin actually enforces immovability.
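The difference between Unpin and non-Unpin types is easiest to see with a type that is Unpin. The following sketch (the function name swap_through_pin is hypothetical) shows that a pinned String can still be moved, because String implements Unpin:

```rust
use std::pin::Pin;

/// Hypothetical helper for illustration: swaps two "pinned" `String`s.
fn swap_through_pin() -> (String, String) {
    let mut a = String::from("first");
    let mut b = String::from("second");
    // `Pin::new` is only available because `String: Unpin`.
    let pinned_a: Pin<&mut String> = Pin::new(&mut a);
    let pinned_b: Pin<&mut String> = Pin::new(&mut b);
    // `get_mut` is also gated on `Unpin`; it hands back a plain `&mut String`,
    // so the two values can be swapped (moved) despite being wrapped in `Pin`.
    std::mem::swap(pinned_a.get_mut(), pinned_b.get_mut());
    (a, b)
}
```

For a type that contains PhantomPinned, neither Pin::new nor get_mut would compile, which is exactly the restriction a self-referential struct needs.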
Building a Self-Referential Struct with Pinning
Let’s create a more advanced example: a self-referential struct that relies on Pin to safely hold a reference to its own data. Self-referential structs are tricky in Rust because moving them would invalidate any internal references, leading to undefined behavior. By using Pin, we ensure the struct remains in a fixed location, making internal references safe.
Implementing a Self-Referential Cache Struct
In this example, we’ll implement a Cache struct that stores a raw pointer to its own data in a cached_data field. This structure simulates a self-referential cache whose content would, in a real system, come from a computationally intensive function, and it relies on Pin to ensure that the self-referential structure is memory-safe.
- Define the Struct: We create the Cache struct with fields for the original data and a reference to the cached data.
- Use Pin: To prevent the Cache instance from moving, we pin it inside a Box.
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Cache {
    data: String,
    cached_data: Option<*const String>,
    _pinned: PhantomPinned, // Prevents the struct from being `Unpin`
}

impl Cache {
    /// Creates a new Cache instance with no cached data initially.
    fn new(data: String) -> Self {
        Cache {
            data,
            cached_data: None,
            _pinned: PhantomPinned,
        }
    }

    /// Initializes or refreshes the cached reference.
    fn refresh_cache(self: Pin<&mut Self>) {
        // Taking the address is fine: `data` lives inside a pinned value and will not move.
        let self_ptr: *const String = &self.data;
        // SAFETY: we only write to a field in place and never move `self`,
        // so `self_ptr` remains valid as long as `self` is pinned.
        unsafe { self.get_unchecked_mut().cached_data = Some(self_ptr) };
    }

    /// Returns the cached data reference.
    fn get_cached_data(&self) -> Option<&String> {
        // SAFETY: `cached_data` is only ever set by `refresh_cache`, so it points to `data`.
        self.cached_data.map(|ptr| unsafe { &*ptr })
    }
}

fn main() {
    // Step 1: Pin the Cache instance in memory
    let mut cache = Box::pin(Cache::new("Initial Data".to_string()));
    // Step 2: Refresh the cache to set up the self-referential pointer
    cache.as_mut().refresh_cache();
    // Access the cached data reference
    if let Some(cached_data) = cache.get_cached_data() {
        println!("Cached data: {}", cached_data);
    } else {
        println!("No data in cache.");
    }
    // Update data and refresh the cache
    // SAFETY: assigning to `data` replaces the String in place; the Cache itself is not moved.
    unsafe { cache.as_mut().get_unchecked_mut().data = "Updated Data".to_string() };
    cache.as_mut().refresh_cache();
    // Access the updated cached data
    if let Some(cached_data) = cache.get_cached_data() {
        println!("Updated cached data: {}", cached_data);
    }
}
1. The Cache Struct:
- data: Stores the primary data.
- cached_data: Stores a raw pointer to data, making it self-referential.
- _pinned: A PhantomPinned marker that prevents Cache from automatically implementing Unpin.
2. Pinning the Cache:
- We pin the Cache instance using Box::pin, ensuring that it won’t move in memory. This is required for the self-referential structure to be safe.
3. Refreshing the Cache:
- The refresh_cache method uses Pin<&mut Self> to modify cached_data, storing a pointer to data.
- The Pin wrapper ensures that data won’t be moved, so any references to it inside the struct remain valid.
4. Accessing Cached Data:
- get_cached_data returns a safe reference to data by dereferencing the pointer stored in cached_data.
- This method uses unsafe, but the pointer remains valid as long as data is pinned, making it safe.
5. Modifying and Refreshing Data:
- We modify data directly (while using unsafe to bypass pinning restrictions), then call refresh_cache to update the self-referential pointer in cached_data.
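Step 5 asks the caller to write unsafe code. One possible alternative, sketched here under the assumption that we are free to add a method to Cache, is to wrap the update and the refresh in a single method so that callers stay in safe code. The names set_data and updated_value are hypothetical and not part of the example above.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Cache {
    data: String,
    cached_data: Option<*const String>,
    _pinned: PhantomPinned,
}

impl Cache {
    fn new(data: String) -> Self {
        Cache { data, cached_data: None, _pinned: PhantomPinned }
    }

    /// Hypothetical helper: replaces `data` and re-points `cached_data` in one step.
    fn set_data(self: Pin<&mut Self>, new_data: String) {
        // SAFETY: we only assign to fields in place; the `Cache` itself is never moved.
        let this = unsafe { self.get_unchecked_mut() };
        this.data = new_data;
        this.cached_data = Some(&this.data as *const String);
    }

    fn get_cached_data(&self) -> Option<&String> {
        // SAFETY: `cached_data` is only set by `set_data`, and the value is pinned.
        self.cached_data.map(|ptr| unsafe { &*ptr })
    }
}

/// Hypothetical usage: the caller needs no `unsafe` block.
fn updated_value() -> String {
    let mut cache = Box::pin(Cache::new("Initial Data".to_string()));
    cache.as_mut().set_data("Updated Data".to_string());
    cache.get_cached_data().cloned().unwrap_or_default()
}
```

Keeping the unsafe block inside the type makes it easier to check that the Cache is never moved once it has been pinned.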
What Would Happen Without Pinning?
Without pinning, the Cache struct could be moved in memory after it’s created. When a Rust value is moved, it is effectively relocated to a different memory address. If our Cache struct holds a pointer to one of its own fields (like data in our example), moving the struct would invalidate that pointer, creating a dangling pointer.
In this scenario:
- Creating the Self-Referential Pointer: Initially, creating the cached_data pointer might work because it would correctly point to data.
- Moving the Struct: If the Cache instance were moved after setting up cached_data, the pointer in cached_data would still point to the old memory location of data, which is now incorrect.
- Dereferencing the Pointer: Attempting to use cached_data to access data would result in undefined behavior, potentially causing memory corruption, crashes, or other serious issues.
Let’s break down the problems we’d encounter without pinning and why pinning is necessary.
Key Issues Without Pinning
1. Dangling Pointers
A dangling pointer occurs when a pointer references memory that is no longer valid. If the Cache struct were moved in memory, the cached_data pointer would no longer point to the correct data field.
For example:
// Assumes an unpinned variant of `Cache` whose `refresh_cache` takes `&mut self`
let mut cache = Cache::new("Initial Data".to_string());
cache.refresh_cache(); // cached_data now points to &cache.data
// Moving `cache` by reassigning it to a new location in memory
let cache = cache; // This move may place `cache` at a new address
// Attempting to access `cache.get_cached_data()`
// could now reference an invalid memory location.
Because cache was moved, cached_data would now be an invalid pointer, pointing to the previous address of data, which could lead to undefined behavior when dereferenced.
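The move can be observed without dereferencing anything, which keeps the demonstration free of undefined behavior. The sketch below uses a stripped-down, unpinned Cache and a hypothetical addresses_across_move function (neither is part of our main example), and forces a relocation by moving the struct from the stack into a Box:

```rust
/// Stripped-down, unpinned variant of `Cache` (illustration only).
struct Cache {
    data: String,
    cached_data: Option<*const String>,
}

/// Hypothetical helper: returns (address of `data` before the move,
/// address of `data` after the move, address stored in `cached_data`).
fn addresses_across_move() -> (usize, usize, usize) {
    let mut cache = Cache {
        data: "Initial Data".to_string(),
        cached_data: None,
    };
    cache.cached_data = Some(&cache.data as *const String);
    let before = &cache.data as *const String as usize;

    // Moving into a `Box` relocates the struct from the stack to the heap.
    let boxed = Box::new(cache);
    let after = &boxed.data as *const String as usize;

    // The stored pointer was copied bit for bit, so it still holds the old address.
    // We only inspect its value here; dereferencing it would be undefined behavior.
    let stale = boxed.cached_data.map(|p| p as usize).unwrap_or(0);
    (before, after, stale)
}
```

The stored pointer keeps the old stack address while data now lives on the heap, which is exactly the dangling-pointer situation described above.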
2. Rust’s Safety Guarantees Broken
Rust’s memory safety model is designed to prevent issues like dangling pointers and invalid references. In safe Rust, the borrow checker rejects structs that hold references to their own fields, as the compiler cannot guarantee those references will remain valid if the struct is moved. Raw pointers sidestep that check, but the compiler then no longer verifies that they stay valid.
In our example, we circumvented this using unsafe and raw pointers. However, without pinning, the structure is inherently unsafe because it relies on an assumption that data will never move, which Rust cannot guarantee.
3. Undefined Behavior on Dereference
If cached_data becomes a dangling pointer due to a move, dereferencing it leads to undefined behavior. Undefined behavior in Rust can have various consequences, including:
- Memory Corruption: Accessing or modifying memory that the pointer no longer owns could overwrite other data or cause unexpected behavior.
- Crashes: Attempting to access invalid memory could lead to segmentation faults or crashes.
- Silent Bugs: In some cases, undefined behavior might not immediately crash the program but could produce incorrect results, leading to hard-to-trace bugs.
Why Pinning Solves These Issues
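Pinning solves these problems by guaranteeing that a value will not move in memory once it’s pinned. With Pin<Box<Cache>>, we create a heap-allocated instance of Cache that safe code cannot move out of its allocation: the Pin<Box<Cache>> handle itself can still be passed around, but the Cache it points to stays put. This ensures that any pointers within the struct (like cached_data) keep referencing valid memory for as long as the Cache lives. Rust’s Pin type does not remove the need for unsafe inside such a type, but it lets the type expose a safe API built on that immovability guarantee.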
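A common point of confusion is that the Pin<Box<Cache>> handle can itself be moved. The following sketch, using a reduced Cache and the hypothetical helpers data_addr and address_survives_handle_move, illustrates that moving the handle only moves the pointer, while the pinned value keeps its heap address:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Reduced `Cache` for illustration; only the fields needed here.
struct Cache {
    data: String,
    _pinned: PhantomPinned,
}

/// Hypothetical helper: address of the `data` field behind the pinned handle.
fn data_addr(cache: &Pin<Box<Cache>>) -> usize {
    &cache.data as *const String as usize
}

/// Hypothetical helper: returns the address of `data` before and after
/// the pinned handle is moved into a `Vec`.
fn address_survives_handle_move() -> (usize, usize) {
    let cache = Box::pin(Cache {
        data: "Initial Data".to_string(),
        _pinned: PhantomPinned,
    });
    let before = data_addr(&cache);

    // This moves the `Pin<Box<Cache>>` (a pointer), not the `Cache` it points to.
    let mut holder = Vec::new();
    holder.push(cache);

    let after = data_addr(&holder[0]);
    (before, after)
}
```

Both addresses are equal, so a pointer into the Cache stored before the handle was moved would still be valid afterwards.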
Pinned vs Unpinned in Action
To illustrate how pinning keeps the memory address constant and how, in contrast, an unpinned implementation allows the memory address to change, we can add logging to showcase the memory addresses of both data and cached_data during execution.
Let’s start by implementing a version with pinning and logging to show the constant memory address, then proceed to an unpinned version where the address might change.
Example 1: Pinned Implementation with Constant Memory Address Logging
In this pinned version, we’ll log the memory address of data and show that it remains constant throughout the lifetime of the Cache instance.
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Cache {
    data: String,
    cached_data: Option<*const String>,
    _pinned: PhantomPinned, // Prevents the struct from being `Unpin`
}

impl Cache {
    fn new(data: String) -> Self {
        Cache {
            data,
            cached_data: None,
            _pinned: PhantomPinned,
        }
    }

    fn refresh_cache(self: Pin<&mut Self>) {
        // SAFETY: we only write to a field in place and never move the struct.
        // Binding the result once avoids using `self` after it has been consumed.
        let this = unsafe { self.get_unchecked_mut() };
        let self_ptr: *const String = &this.data;
        this.cached_data = Some(self_ptr);
        println!("Pinned data address: {:p}", &this.data);
        println!("Pinned cached_data address: {:p}", self_ptr);
    }

    fn get_cached_data(&self) -> Option<&String> {
        // SAFETY: `cached_data` is only ever set by `refresh_cache`, so it points to `data`.
        self.cached_data.map(|ptr| unsafe { &*ptr })
    }
}

fn main() {
    // Step 1: Pin the Cache instance in memory
    let mut cache = Box::pin(Cache::new("Initial Data".to_string()));
    // Step 2: Refresh the cache to set up the self-referential pointer
    cache.as_mut().refresh_cache();
    // Access the cached data reference
    if let Some(cached_data) = cache.get_cached_data() {
        println!("Accessing cached data: {}", cached_data);
    } else {
        println!("No data in cache.");
    }
    // Update data and refresh the cache
    // SAFETY: assigning to `data` replaces the String in place; the Cache itself is not moved.
    unsafe { cache.as_mut().get_unchecked_mut().data = "Updated Data".to_string() };
    cache.as_mut().refresh_cache();
    // Access the updated cached data
    if let Some(cached_data) = cache.get_cached_data() {
        println!("Accessing updated cached data: {}", cached_data);
    }
}
Output
With this pinned version, you’ll see consistent memory addresses for both data and cached_data before and after updates, since the Cache instance cannot move.
Example 2: Unpinned Implementation with Memory Address Logging
Now, let’s implement the same structure without pinning and add similar logging. We’ll see that if the Cache instance is moved, the memory addresses might change, demonstrating how an unpinned implementation is unsafe for self-references.
struct Cache {
    data: String,
    cached_data: Option<*const String>,
}

impl Cache {
    fn new(data: String) -> Self {
        Cache {
            data,
            cached_data: None,
        }
    }

    fn refresh_cache(&mut self) {
        let self_ptr: *const String = &self.data;
        self.cached_data = Some(self_ptr);
        println!("Unpinned data address: {:p}", &self.data);
        println!("Unpinned cached_data address: {:p}", self_ptr);
    }

    fn get_cached_data(&self) -> Option<&String> {
        // NOT sound: nothing guarantees the pointer still refers to `data` after a move.
        self.cached_data.map(|ptr| unsafe { &*ptr })
    }
}

fn main() {
    let mut cache = Cache::new("Initial Data".to_string());
    // Refresh the cache to set up the self-referential pointer
    cache.refresh_cache();
    // Move `cache` by reassigning it to a new variable
    let cache = cache; // This moves `cache`
    println!("Unpinned data address after move: {:p}", &cache.data);
    // Try accessing the cached data reference
    // WARNING: if the move relocated `cache`, this dereferences a dangling pointer (undefined behavior).
    if let Some(cached_data) = cache.get_cached_data() {
        println!("Accessing cached data: {}", cached_data);
    } else {
        println!("No data in cache.");
    }
    // Update data and refresh the cache
    let mut cache = cache; // Moving it again
    cache.data = "Updated Data".to_string();
    cache.refresh_cache();
    // Access the updated cached data
    if let Some(cached_data) = cache.get_cached_data() {
        println!("Accessing updated cached data: {}", cached_data);
    }
}
Explanation of the Output
In this unpinned version, you may observe different memory addresses printed for data and cached_data:
- The first time refresh_cache is called, data and cached_data will have the same address.
- After the first move (let cache = cache;), the cached_data pointer may now point to an old memory location, and accessing it could lead to undefined behavior.
- Calling refresh_cache again may print a new memory address for data, showing that without pinning, data can move in memory, making the cached_data pointer potentially invalid and unsafe to dereference.
Whether the address actually changes depends on the compiler and optimization level, since a move is allowed, but not required, to relocate the value.
Summary
- Pinned Version: The memory address remains constant, ensuring that cached_data points to a valid location.
- Unpinned Version: Moving the struct changes the address of data, potentially leading to a dangling pointer in cached_data.
This comparison highlights why pinning is essential for self-referential structs in Rust. Without pinning, any movement of the struct can leave its internal pointer dangling, and dereferencing that pointer is undefined behavior.
All the best,
Luis Soares
luis@luissoares.dev
