Expanding a VecDeque to limit size - collections

I want to implement a VecDeque with a maximum size limit. I have two strategies, but I can't complete either.
First approach: inheritance by composition.
I created a new struct:
pub struct LimVecDeque<T> {
    deque: VecDeque<T>,
    limit: usize,
}
And created a new push function:
impl<T> LimVecDeque<T> {
    ...
    // `&mut self` is required because pushing and popping mutate the deque.
    pub fn push(&mut self, elem: T) {
        self.deque.push_back(elem);
        if self.limit < self.deque.len() {
            self.deque.pop_front();
        }
    }
    ...
}
This works but, as my program grows, I need to add more functionality to my LimVecDeque struct. Most of it is copied from the original VecDeque:
pub fn len(&self) -> usize {
    self.deque.len()
}
Exporting VecDeque::iter() is harder: I ran into problems with types and iterators (I'm not very good with iterators yet).
This approach forces me to clone/export each function of VecDeque into LimVecDeque. A lot of work!
Second approach: create a new trait and implement it for VecDeque:
trait Limited {
    type Elem;
    fn push_bounded(&mut self, limit: usize, elem: Self::Elem);
}
and later implement the trait for VecDeque.
But then I have to pass the limit value on each insertion. How can I pass the limit value only once?
In general, what is an easy way to add functionality to a struct from std (without losing/hiding the existing functionality)?

As pointed out by edkeveked's answer, there is a crate available (BoundedVecDeque) that implements exactly the case you are trying to implement.
If you look at the implementation of BoundedVecDeque you will see that it uses the first pattern you describe: it creates wrapper methods where it is necessary to modify the behaviour of the wrapped type, and delegates method calls where it is not.
As you point out, this could result in a lot of boilerplate. To reduce the amount of work involved, you might like to try the delegate crate, which adds a macro that does the delegation for you:
use delegate::delegate;
impl<T> LimVecDeque<T> {
    delegate! {
        // `deque` is the wrapped field from the struct in the question
        to self.deque {
            pub fn len(&self) -> usize;
            pub fn truncate(&mut self, new_len: usize);
            pub fn as_mut_slices(&mut self) -> (&mut [T], &mut [T]);
            // etc
        }
    }
    // ...
}
Caveat: I have not actually used this crate myself yet, so I can't vouch for its quality.
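If an extra crate is not wanted, read-only delegation can also be sketched with std::ops::Deref. This is a minimal sketch, not the method used by either crate mentioned here: the constructor `new` and the helper `demo_limited` are names invented for illustration, and using Deref on a non-pointer wrapper is a debated style choice. Leaving DerefMut unimplemented keeps callers from bypassing the limit through push_back.

```rust
use std::collections::VecDeque;
use std::ops::Deref;

/// Bounded wrapper from the question; `deque` and `limit` are its field names.
pub struct LimVecDeque<T> {
    deque: VecDeque<T>,
    limit: usize,
}

impl<T> LimVecDeque<T> {
    /// Hypothetical constructor, not part of the original question.
    pub fn new(limit: usize) -> Self {
        LimVecDeque { deque: VecDeque::with_capacity(limit), limit }
    }

    /// Push at the back; evict from the front once the limit is exceeded.
    pub fn push(&mut self, elem: T) {
        self.deque.push_back(elem);
        if self.deque.len() > self.limit {
            self.deque.pop_front();
        }
    }
}

// Read-only delegation: `len`, `iter`, `front`, `get`, ... resolve through Deref.
// `DerefMut` is deliberately not implemented.
impl<T> Deref for LimVecDeque<T> {
    type Target = VecDeque<T>;

    fn deref(&self) -> &VecDeque<T> {
        &self.deque
    }
}

/// Illustrative usage: returns the remaining elements and the length.
fn demo_limited() -> (Vec<i32>, usize) {
    let mut d = LimVecDeque::new(2);
    d.push(1);
    d.push(2);
    d.push(3); // evicts 1
    (d.iter().copied().collect(), d.len())
}
```

Here `iter()` and `len()` are never written on LimVecDeque itself; method resolution finds them on the inner VecDeque.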

BoundedVecDeque does just that
use ::bounded_vec_deque::BoundedVecDeque;
fn main() {
    let mut a = BoundedVecDeque::new(2);
    a.push_front(2);
    a.push_front(3);
    a.push_front(4);
    println!("{:?}", a); //4, 3
}
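For completeness, the question's second approach (an extension trait on VecDeque) can be sketched as follows. The trait name `PushBounded` and the helper `demo_ext` are hypothetical. A trait can add methods to VecDeque but not fields, so there is nowhere to remember the limit; that is why it has to be passed on every call, and why the wrapper struct is the usual answer to "pass the limit once".

```rust
use std::collections::VecDeque;

/// Hypothetical extension trait; the name is illustrative only.
pub trait PushBounded<T> {
    /// Push `elem` at the back, then drop front elements until `len <= limit`.
    fn push_bounded(&mut self, limit: usize, elem: T);
}

impl<T> PushBounded<T> for VecDeque<T> {
    fn push_bounded(&mut self, limit: usize, elem: T) {
        self.push_back(elem);
        while self.len() > limit {
            self.pop_front();
        }
    }
}

/// Illustrative usage: push 1..=4 with a limit of 3.
fn demo_ext() -> VecDeque<i32> {
    let mut d = VecDeque::new();
    for i in 1..=4 {
        d.push_bounded(3, i);
    }
    d
}
```

All existing VecDeque methods stay available, but nothing stops a caller from using push_back directly and exceeding the limit.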

Related

How to please the borrow checker when implementing a 'too large to fit in memory' datastructure?

I am implementing a simple B-tree in Rust (as hobby project).
The basic implementation works well.
As long as everything fits in memory, nodes can store direct references to their children and it is easy to traverse and use them.
However, to be able to store more data inside than fits in RAM, it should be possible to store tree nodes on disk and retrieve parts of the tree only when they are needed.
But now it becomes incredibly hard to return a reference to a part of the tree.
For instance, when implementing Node::lookup_value:
extern crate arrayvec;
use arrayvec::ArrayVec;
use std::collections::HashMap;
type NodeId = usize;
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum NodeContents<Value> {
Leaf(ArrayVec<Value, 15>),
Internal(ArrayVec<NodeId, 16>)
}
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Node<Key, Value> {
pub keys: ArrayVec<Key, 15>,
pub contents: NodeContents<Value>,
}
pub trait Pool<T> {
type PoolId;
/// Invariants:
/// - Only call `lookup` with a key which you *know* exists. This should always be possible, as stuff is never 'simply' removed.
/// - Do not remove or alter the things backing the pool in the meantime (Adding more is obviously allowed!).
fn lookup(&self, key: &Self::PoolId) -> T;
}
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Debug)]
pub struct EphemeralPoolId(usize);
// An example of a pool that keeps all nodes in memory
// Similar (more complicated) implementations can be made for pools storing nodes on disk, remotely, etc.
pub struct EphemeralPool<T> {
next_id: usize,
contents: HashMap<usize, T>,
}
impl<T: Clone> Pool<T> for EphemeralPool<T> {
type PoolId = EphemeralPoolId;
fn lookup(&self, key: &Self::PoolId) -> T {
let val = self.contents.get(&key.0).expect("Tried to look up a value in the pool which does not exist. This breaks the invariant that you should only use it with a key which you received earlier from writing to it.").clone();
val
}
}
impl<Key: Ord, Value> Node<Key, Value> {
pub fn lookup_value<'pool>(&self, key: &Key, pool: &'pool impl Pool<Node<Key, Value>, PoolId=NodeId>) -> Option<&Value> {
match &self.contents {
NodeContents::Leaf(values) => { // Simple base case. No problems here.
self.keys.binary_search(key).map(|index| &values[index]).ok()
},
NodeContents::Internal(children) => {
let index = self.keys.binary_search(key).map(|index| index + 1).unwrap_or_else(|index| index);
let child_id = &children[index];
// Here the borrow checker gets mad:
let child = pool.lookup(child_id); // NodePool::lookup(&self, NodeId) -> Node<K, V>
child.lookup_value(key, pool)
}
}
}
}
The borrow checker gets mad.
I understand why, but not how to solve it.
It gets mad, because NodePool::lookup returns a node by value. (And it has to do so, because it reads it from disk. In the future there may be a cache in the middle, but elements from this cache might be removed at any time, so returning references to elements in this cache is also not possible.)
However, lookup_value returns a reference to a tiny part of this value. But the value stops being around once the function returns.
In the Borrow Checker's parlance:
child.lookup_value(key, pool)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returns a reference to data owned by the current function
How to solve this problem?
The simplest solution would be to change the function to always return by value (Option<V>). However, these values are often large and I would like to avoid needless copies. It is quite likely that lookup_value is called in quick succession with closely related keys. (There is also a more complex scenario which the borrow checker similarly complains about, where we look up a range of sequential values with an iterator. The same situation applies there.)
Are there ways in which I can make sure that child lives long enough?
I have tried to pass in an extra &mut HashMap<NodeId, Arc<Node<K, V>>> argument to store all children needed in the current traversal to keep their references alive for long enough. (See this variant of the code on the Rust Playground here)
However, then you end up with a recursive function that takes (and modifies) a hashmap as well as an element in the hashmap. Rust (rightfully) blocks this from working: It's (a) not possible to convince Rust that you won't overwrite earlier values in the hashmap, and (b) it might need to be reallocated when it grows which would also invalidate the references.
So that does not work.
Are there other solutions? (Even ones possibly involving unsafe?)
Minimal Reproducible Example
Your example contained a lot of errors and undefined types. Please provide a proper minimal reproducible example next time to avoid frustration and increase the chance of getting a quality answer.
EDIT: I saw your modified question too late with the proper minimal example, I hope this one fits. If not, I'll take another look. Let me know.
Either way, I attempted to reverse-engineer what your minimal example could have been:
use std::marker::PhantomData;
type NodeId = usize;
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum NodeContents<Value> {
Leaf(Vec<Value>),
Internal(Vec<NodeId>),
}
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Node<Key, Value> {
pub keys: Vec<Key>,
pub contents: NodeContents<Value>,
}
struct NodePool<Key, Value> {
_p: PhantomData<(Key, Value)>,
}
impl<Key, Value> NodePool<Key, Value> {
fn lookup(&self, _child_id: NodeId) -> Node<Key, Value> {
todo!()
}
}
impl<Key, Value> Node<Key, Value>
where
Key: Ord,
{
pub fn lookup_value(&self, key: &Key, pool: &NodePool<Key, Value>) -> Option<&Value> {
match &self.contents {
NodeContents::Leaf(values) => {
// Simple base case. No problems here.
self.keys
.binary_search(key)
.map(|index| &values[index])
.ok()
}
NodeContents::Internal(children) => {
let index = self
.keys
.binary_search(key)
.map(|index| index + 1)
.unwrap_or_else(|index| index);
let child_id = &children[index];
// Here the borrow checker gets mad:
let child = pool.lookup(*child_id); // NodePool::lookup(&self, NodeId) -> Node<K, V>
child.lookup_value(key, pool)
}
}
}
}
error[E0515]: cannot return reference to local variable `child`
--> src/lib.rs:50:17
|
50 | child.lookup_value(key, pool)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returns a reference to data owned by the current function
Thoughts and Potential Solutions
The main question you have to ask yourself: Who owns the data once it's loaded from disk?
This is not a simple question with a simple answer. It opens many new questions:
If you query the same key twice, should it load from disk twice? Or should it be cached and some other previously cached value should be moved to disk?
If it should be loaded from disk twice, should the entire Node that lookup returned be kept alive until the borrow is returned, or should only the Value be moved out and kept alive?
If your answer is "it should be loaded twice" and "only the Value should be kept alive", you could use an enum that can carry either a borrowed or owned Value:
enum ValueHolder<'a, Value> {
Owned(Value),
Borrowed(&'a Value),
}
impl<'a, Value> std::ops::Deref for ValueHolder<'a, Value> {
type Target = Value;
fn deref(&self) -> &Self::Target {
match self {
ValueHolder::Owned(val) => val,
ValueHolder::Borrowed(val) => val,
}
}
}
In your code, it could look like this:
use std::marker::PhantomData;
type NodeId = usize;
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum NodeContents<Value> {
Leaf(Vec<Value>),
Internal(Vec<NodeId>),
}
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Node<Key, Value> {
pub keys: Vec<Key>,
pub contents: NodeContents<Value>,
}
pub struct NodePool<Key, Value> {
_p: PhantomData<(Key, Value)>,
}
impl<Key, Value> NodePool<Key, Value> {
fn lookup(&self, _child_id: NodeId) -> Node<Key, Value> {
todo!()
}
}
pub enum ValueHolder<'a, Value> {
Owned(Value),
Borrowed(&'a Value),
}
impl<'a, Value> std::ops::Deref for ValueHolder<'a, Value> {
type Target = Value;
fn deref(&self) -> &Self::Target {
match self {
ValueHolder::Owned(val) => val,
ValueHolder::Borrowed(val) => val,
}
}
}
impl<Key, Value> Node<Key, Value>
where
Key: Ord,
{
pub fn lookup_value(
&self,
key: &Key,
pool: &NodePool<Key, Value>,
) -> Option<ValueHolder<Value>> {
match &self.contents {
NodeContents::Leaf(values) => {
// Simple base case. No problems here.
self.keys
.binary_search(key)
.map(|index| ValueHolder::Borrowed(&values[index]))
.ok()
}
NodeContents::Internal(children) => {
let index = self
.keys
.binary_search(key)
.map(|index| index + 1)
.unwrap_or_else(|index| index);
let child_id = &children[index];
// The child is owned by this function, so move the value out of it instead of borrowing:
let child = pool.lookup(*child_id); // NodePool::lookup(&self, NodeId) -> Node<K, V>
child
.into_lookup_value(key, pool)
.map(|val| ValueHolder::Owned(val))
}
}
}
pub fn into_lookup_value(self, key: &Key, pool: &NodePool<Key, Value>) -> Option<Value> {
match self.contents {
NodeContents::Leaf(mut values) => {
// Simple base case. No problems here.
self.keys
.binary_search(key)
.map(|index| values.swap_remove(index))
.ok()
}
NodeContents::Internal(children) => {
let index = self
.keys
.binary_search(key)
.map(|index| index + 1)
.unwrap_or_else(|index| index);
let child_id = &children[index];
// The child is owned again, so keep recursing by value:
let child = pool.lookup(*child_id); // NodePool::lookup(&self, NodeId) -> Node<K, V>
child.into_lookup_value(key, pool)
}
}
}
}
In all other cases, however, it seems like further thought needs to be put into restructuring the project.
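As an aside to the ValueHolder enum above: the standard library's std::borrow::Cow has the same borrowed-or-owned shape and already implements Deref. The sketch below is not the B-tree code; `value_or_load` is a hypothetical helper, and `cached` / `load` stand in for "the value is already in memory" and "the value has to be read from disk". The trade-off is that Cow requires `V: Clone`, which the hand-written enum does not.

```rust
use std::borrow::Cow;

/// Hypothetical helper: borrow the value if it is already in memory,
/// otherwise produce an owned one (e.g. by loading it from disk).
fn value_or_load<'a, V: Clone>(
    cached: Option<&'a V>,
    load: impl FnOnce() -> V,
) -> Cow<'a, V> {
    match cached {
        // Same role as ValueHolder::Borrowed
        Some(v) => Cow::Borrowed(v),
        // Same role as ValueHolder::Owned
        None => Cow::Owned(load()),
    }
}

/// Illustrative usage: both results can be read through Deref as `&String`.
fn demo_cow() -> (usize, usize) {
    let in_memory = String::from("in memory");
    let a = value_or_load(Some(&in_memory), || String::from("unused"));
    let b: Cow<String> = value_or_load(None, || String::from("from disk"));
    (a.len(), b.len())
}
```

Callers do not need to know which variant they received unless they want to take ownership, in which case `into_owned()` clones only in the borrowed case.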
After a lot of searching and trying, and looking at what other databases are doing (great suggestion, @Finomnis!), the solution presented itself.
Use an Arena
The intuition to use a HashMap was a good one, but it runs into lifetime problems for two reasons:
A HashMap might be reallocated when growing, which would invalidate old references.
You cannot convince the compiler that new insertions into the HashMap will not overwrite existing values.
However, there is an alternative datastructure which is able to provide these guarantees: Arena allocators.
With these, you essentially 'tie' the lifetimes of the values you insert into them, to the lifetime of the allocator as a whole.
Internally (at least conceptually), an arena keeps track of a list of memory regions in which data might be stored, the most recent one of which is not entirely full. Individual memory regions are never reallocated to ensure that old references remain valid; instead, when the current memory region is full, a new (larger) one is allocated and added to the list, becoming the new 'current'.
Everything is destroyed at once at the point the arena itself is deallocated.
There are two prevalent Rust libraries that provide arenas. They are very similar, with slightly different trade-offs:
typed-arena allows only a single type T per arena, but runs all Drop implementations when the arena is dropped.
bumpalo allows different objects to be stored in the same arena, but their Drop implementations do not run.
In the common situation of having only a single datatype with no special drop implementation, you might want to benchmark which one is faster in your use-case.
In the original question, we are dealing with a cache, and it makes sense to reduce copying of large datastructures by passing around Arcs between the cache and the arena. This means that you can only use typed-arena in this case, as Arc has a special Drop implementation.

How to update all the values in a BTreeSet?

I have a collection which is a field in a struct in some module. I want to update all the values in the collection from another module.
I wrote some code to mimic what I want to achieve. It's shortened a bit, but I think it has all needed parts. There is no struct holding the collection in this code, but imagine this is a getter which returns the collection. I added in comments how I think it should look.
pub mod pos {
use std::cmp::{Ordering, PartialEq};
#[derive(PartialOrd, PartialEq, Eq, Hash, Debug, Copy, Clone)]
pub struct Pos {
pub x: i32,
pub y: i32,
}
#[allow(dead_code)]
impl Pos {
pub fn of(x: i32, y: i32) -> Self {
Self { x, y }
}
pub fn offset(&mut self, pos: &Self) -> Self {
self.x += pos.x;
self.y += pos.y;
*self
}
}
impl Ord for Pos {
fn cmp(&self, other: &Self) -> Ordering {
if self.x < other.x {
Ordering::Less
} else if self.eq(other) {
Ordering::Equal
} else {
Ordering::Greater
}
}
}
}
mod test {
use crate::pos::Pos;
use std::collections::BTreeSet;
#[test]
fn test_iterators() {
let mut data_in_some_strct: BTreeSet<Pos> = BTreeSet::new();
data_in_some_strct.insert(Pos::of(1, 1));
data_in_some_strct.insert(Pos::of(2, 2));
data_in_some_strct.insert(Pos::of(3, 3));
data_in_some_strct.insert(Pos::of(4, 4));
// mimic getter call ( get_data(&mut self) -> &BTreeSet<Pos> {...}
// let set = data_in_some_strct; // works, but not a reference
let set = &data_in_some_strct; // doesn't work, How to adjust code to make it work??
data_in_some_strct = set
.into_iter()
.map(|mut p| p.offset(&Pos::of(1, 0)))
.inspect(|p| println!("{:?}", *p))
.collect();
assert_eq!(data_in_some_strct.contains(&Pos::of(2, 1)), true);
assert_eq!(data_in_some_strct.contains(&Pos::of(3, 2)), true);
assert_eq!(data_in_some_strct.contains(&Pos::of(4, 3)), true);
assert_eq!(data_in_some_strct.contains(&Pos::of(5, 4)), true);
}
}
error[E0596]: cannot borrow `*p` as mutable, as it is behind a `&` reference
--> src/lib.rs:56:26
|
56 | .map(|mut p| p.offset(&Pos::of(1, 0)))
| - ^ `p` is a `&` reference, so the data it refers to cannot be borrowed as mutable
| |
| help: consider changing this to be a mutable reference: `&mut pos::Pos`
I managed to make it work without borrowing, but I would like to make it work with borrowing. I guess there is more than one way to achieve it. Comments to help my Rust brain dendrites connect are welcome.
You can't mutate items that are part of a HashSet or BTreeSet because the value of the items determines how they are stored and accessed. If you mutate them then, as Stargateur mentioned, you would break the mechanics of the collection. In the case of a HashSet, you would change the hash of the item, which determines the location where the data is stored. In the case of a BTreeSet, the algorithm is based on how the items are sorted.
You are able to do it by taking ownership because you consume the original set and produce a new, well-formed one. You can't take ownership of a borrowed value because that would leave behind a dangling pointer, which Rust will not let you do.
One possible solution is to temporarily replace the original set with an empty one. Then you can take ownership of its contents, as in your working code, and finally write the newly updated set over the original:
let set = std::mem::replace(&mut data_in_some_strct, BTreeSet::new());
data_in_some_strct = set.into_iter()
.map(|mut p| p.offset(&Pos::of(1,0)))
.inspect(|p| println!("{:?}", *p))
.collect();
BTreeSet doesn't implement IntoIterator for &'a mut BTreeSet<T> (handing out mutable references to the elements would let you break the tree's ordering).
You can only do this with types that implement IntoIterator for a mutable reference, such as Vec (impl<'a, T> IntoIterator for &'a mut Vec<T>).
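A minimal sketch of the Vec case, for contrast. The `Pos` struct here is a cut-down stand-in for the one in the question and `shifted` is a hypothetical helper. Because a Vec does not depend on the values of its elements for its layout, iterating over `&mut points` yields `&mut Pos` and in-place mutation is allowed.

```rust
/// Cut-down stand-in for the question's `Pos` type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Pos {
    x: i32,
    y: i32,
}

/// Illustrative helper: shift every point one step along x, in place.
fn shifted() -> Vec<Pos> {
    let mut points = vec![Pos { x: 1, y: 1 }, Pos { x: 2, y: 2 }];
    // `&mut Vec<Pos>` implements IntoIterator<Item = &mut Pos>
    for p in &mut points {
        p.x += 1;
    }
    points
}
```

The same loop over `&mut BTreeSet<Pos>` does not compile, because that IntoIterator impl does not exist.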
For sets that are not a field of a struct
Even std::mem::replace is not required.
data_in_some_strct = data_in_some_strct
.into_iter()
.map(|mut p| p.offset(&Pos::of(1, 0)))
.inspect(|p| println!("{:?}", *p))
.collect();
Explanation
We essentially build an iterator for moving out the BTreeSet’s contents.
Then we call .map() over the elements and call the required methods.
It should be noted that it works here because .offset(...) returns Self. If the method does not return Self, you can simply write:
// ...
.map(|mut p| {
p.offset(&Pos::of(1, 0));
p
})
// ...
At last, we use .collect() to construct a BTreeSet with all newly "updated" values.
For sets that are a field of a struct
Let's assume that the field's name is data_in_some_strct of some given struct Foobar.
struct Foobar {
data_in_some_strct: BTreeSet<Pos>,
}
Let's say hypothetically we also have method called Foobar::update() that updates the values in the set.
impl Foobar {
fn update(&mut self) {
// ...
}
}
To update the set from within, we'd have to use std::mem::take().
fn update(&mut self) {
self.data_in_some_strct = std::mem::take(&mut self.data_in_some_strct)
.into_iter()
.map(|mut p| p.offset(&Pos::of(1, 0)))
.inspect(|p| println!("{:?}", *p))
.collect();
}
Explanation
The key takeaway here is the use of std::mem::take(). Its documentation says:
[It] Replaces dest with the default value of T, returning the previous dest value.
This essentially means that the destination value (here data_in_some_strct) is replaced with a default value (which happens to be an empty set for BTreeSet), and the original value is returned.
We then perform the same operations on the value returned by take() as in the previous explanation above.
As stated above, collect() builds a new BTreeSet for us by inferring its type from self.data_in_some_strct. We then move the newly created BTreeSet to self.data_in_some_strct.
Note: You can replace *p with p.
This is especially valuable and noteworthy for people who will use (or are already using) Rc<RefCell<T>> or its thread-safe variant.
Instead of using map() to mutate the values in-place, thus putting the code at risk of logic error, one should follow the steps above.
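Put together, the struct-field case can be sketched as one self-contained snippet. Assumptions in this sketch: Ord is derived rather than hand-written, so that ordering and equality agree; the field is called `data`; and `offset_all` and `demo_take` are hypothetical names standing in for Foobar::update and a caller.

```rust
use std::collections::BTreeSet;

// Ord is derived here (an assumption; the question implements it by hand).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Pos {
    pub x: i32,
    pub y: i32,
}

pub struct Foobar {
    data: BTreeSet<Pos>,
}

impl Foobar {
    /// Rebuild the set with every element shifted by (dx, dy).
    pub fn offset_all(&mut self, dx: i32, dy: i32) {
        // `take` leaves an empty set behind, so we own the old contents.
        self.data = std::mem::take(&mut self.data)
            .into_iter()
            .map(|p| Pos { x: p.x + dx, y: p.y + dy })
            .collect();
    }
}

/// Illustrative usage: returns the updated elements in sorted order.
fn demo_take() -> Vec<Pos> {
    let mut f = Foobar {
        data: vec![Pos { x: 1, y: 1 }, Pos { x: 2, y: 2 }].into_iter().collect(),
    };
    f.offset_all(1, 0);
    f.data.into_iter().collect()
}
```

Collecting into a fresh BTreeSet re-sorts and de-duplicates the updated values, which is exactly the bookkeeping that in-place mutation would skip.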

Generic map as function argument

I wrote a method:
fn foo(input: HashMap<String, Vec<String>>) {...}
I then realized that for the purpose of writing tests, I'd like to have control of the iteration order (maybe a BTreeMap or LinkedHashMap). This led to two questions:
Is there some trait or combination of traits I could use that would essentially express "a map of string to string-vector"? I didn't see anything promising in the docs for HashMap.
It turns out that in this method, I just want to iterate over the map entries, and then the items in each string vector, but couldn't figure out the right syntax for specifying this. What's the correct way to write this?
fn foo(input: IntoIterator<(String, IntoIterator<String>)>) {...}
There's no such trait to describe an abstract HashMap. I believe there's no plan to make one. The best answer so far is your #2 suggestion: for a read-only HashMap you probably just want something to iterate on.
To answer at the syntax level, you tried to write:
fn foo(input: IntoIterator<(String, IntoIterator<String>)>)
But this is not valid because IntoIterator takes no generic type parameters:
pub trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
    fn into_iter(self) -> Self::IntoIter;
}
It takes two associated types, however, so what you really wanted to express is probably the following (internally I changed the nested IntoIterator to a concrete type like Vec for simplicity):
fn foo<I>(input: I)
where
    I: IntoIterator<
        Item = (String, Vec<String>),
        IntoIter = std::collections::hash_map::IntoIter<String, Vec<String>>,
    >,
However, the choice of IntoIterator is not always suitable because it implies a transfer of ownership. If you just want to borrow the HashMap for read-only purposes, you are probably better off with the standard iterator type of a HashMap, Iterator<Item=(&'a String, &'a Vec<String>)>.
fn foo_iter<'a, I>(input: I)
where I: Iterator<Item=(&'a String, &'a Vec<String>)>
Which you can use several times by asking for a new iterator, unlike the first version.
let mut h = HashMap::new();
h.insert("The Beatles".to_string(),
vec!["Come Together".to_string(),
"Twist And Shout".to_string()]);
h.insert("The Rolling Stones".to_string(),
vec!["Paint It Black".to_string(),
"Satisfaction".to_string()]);
foo_iter(h.iter());
foo_iter(h.iter());
foo(h);
//foo(h); <-- error: use of moved value: `h`
EDIT
As asked in comments, here is the version of foo for nested IntoIterators instead of the simpler Vec:
fn foo<I, IVecString>(input: I)
where
I: IntoIterator<
Item=(String, IVecString),
IntoIter=std::collections::hash_map::IntoIter<String, IVecString>>,
IVecString: IntoIterator<
Item=String,
IntoIter=std::vec::IntoIter<String>>
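Since the original goal was to swap HashMap for an ordered map in tests, here is a hedged sketch of a signature that leaves IntoIter unconstrained and takes the map by reference. The names `flatten` and `demo_flatten` are made up for this example. Both `&HashMap<K, V>` and `&BTreeMap<K, V>` implement IntoIterator with `Item = (&K, &V)`, so one function covers both.

```rust
use std::collections::{BTreeMap, HashMap};

/// Accepts any borrowed "map of string to string-vector" that iterates as (&K, &V).
fn flatten<'a, I>(input: I) -> Vec<String>
where
    I: IntoIterator<Item = (&'a String, &'a Vec<String>)>,
{
    let mut out = Vec::new();
    for (artist, songs) in input {
        for song in songs {
            out.push(format!("{}: {}", artist, song));
        }
    }
    out
}

/// Illustrative usage with an ordered and an unordered map.
fn demo_flatten() -> (Vec<String>, usize) {
    let mut ordered = BTreeMap::new();
    ordered.insert("B".to_string(), vec!["x".to_string()]);
    ordered.insert("A".to_string(), vec!["y".to_string(), "z".to_string()]);
    let unordered: HashMap<String, Vec<String>> = ordered.clone().into_iter().collect();
    // BTreeMap iterates in key order; HashMap order is unspecified, so only count it.
    (flatten(&ordered), flatten(&unordered).len())
}
```

Because the maps are only borrowed, both can be passed again afterwards, unlike the by-value `foo(h)` call.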
There are no traits that define a common interface for containers. The only trait that may suit your needs is the Index trait.
See below for a working example of the correct syntax for the IntoIterator and Index traits. You need to use references if you don't want to consume the input, so be careful with lifetime parameters.
use std::ops::Index;
use std::iter::IntoIterator;
use std::collections::HashMap;
// this consumes the input
fn foo<I: IntoIterator<Item = (String, String)>>(input: I) {
let mut c = 0;
for _ in input {
c += 1;
}
println!("{}", c);
}
// maybe you want this
fn foo_ref<'a, I: IntoIterator<Item = (&'a String, &'a String)>>(input: I) {
let mut c = 0;
for _ in input {
c += 1;
}
println!("{}", c);
}
fn get<'a, I: Index<&'a String, Output = String>>(table: &I, k: &'a String) {
println!("{}", table[k]);
}
fn main() {
let mut h = HashMap::<String, String>::new();
h.insert("one".to_owned(), "1".to_owned());
h.insert("two".to_owned(), "2".to_owned());
h.insert("three".to_owned(), "3".to_owned());
foo_ref(&h);
get(&h, &"two".to_owned());
}
Edit
I changed the value type to anything that implements the IntoIterator trait:
use std::ops::Index;
use std::iter::IntoIterator;
use std::collections::HashMap;
use std::collections::LinkedList;
fn foo_ref<'a, B, I, >(input: I)
where B : IntoIterator<Item = String>, I: IntoIterator<Item = (&'a String, &'a B)> {
//
}
fn get<'a, B, I>(table: &I, k: &'a String)
where B : IntoIterator<Item = String>, I: Index<&'a String, Output = B>
{
// do something with table[k];
}
fn main() {
let mut h1 = HashMap::<String, Vec<String>>::new();
let mut h2 = HashMap::<String, LinkedList<String>>::new();
foo_ref(&h1);
get(&h1, &"two".to_owned());
foo_ref(&h2);
get(&h2, &"two".to_owned());
}

Pointer-stashing generics via `mem::transmute()`

I'm attempting to write Rust bindings for a C collection library (Judy Arrays [1]) which only provides itself room to store a pointer-width value. My company has a fair amount of existing code which uses this space to directly store non-pointer values such as pointer-width integers and small structs. I'd like my Rust bindings to allow type-safe access to such collections using generics, but am having trouble getting the pointer-stashing semantics working correctly.
The mem::transmute() function seems like one potential tool for implementing the desired behavior, but attempting to use it on an instance of a parameterized type yields a compilation error that confuses me.
Example code:
use std::marker::PhantomData;
use std::mem;
pub struct Example<T> {
    v: usize,
    t: PhantomData<T>,
}
impl<T> Example<T> {
    pub fn new() -> Example<T> {
        Example { v: 0, t: PhantomData }
    }
    pub fn insert(&mut self, val: T) {
        unsafe {
            self.v = mem::transmute(val);
        }
    }
}
Resulting error:
src/lib.rs:95:22: 95:36 error: cannot transmute to or from a type that contains type parameters in its interior [E0139]
src/lib.rs:95 self.v = mem::transmute(val);
^~~~~~~~~~~~~~
Does this mean a type consisting only of a parameter "contains type parameters in its interior" and thus transmute() just won't work here? Any suggestions of the right way to do this?
(Related question, attempting to achieve the same result, but not necessarily via mem::transmute().)
[1] I'm aware of the existing rust-judy project, but it doesn't support the pointer-stashing I want, and I'm writing these new bindings largely as a learning exercise anyway.
Instead of transmuting T to usize directly, you can transmute a &T to &usize:
pub fn insert(&mut self, val: T) {
unsafe {
let usize_ref: &usize = mem::transmute(&val);
self.v = *usize_ref;
}
}
Beware that this may read from an invalid memory location if the size of T is smaller than the size of usize or if the alignment requirements differ. This could cause a segfault. You can add an assertion to prevent this:
assert_eq!(mem::size_of::<T>(), mem::size_of::<usize>());
assert!(mem::align_of::<usize>() <= mem::align_of::<T>());
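A related option is std::mem::transmute_copy, which performs the same bitwise copy through a reference. The sketch below is an illustration under stated assumptions, not a drop-in binding: `Stash`, `new`, `get` and `demo_stash` are hypothetical names, `T: Copy` is required so that no destructor is skipped or run twice, and `T` is assumed to have no padding bytes. The size assertion still guards against reading past the end of the value.

```rust
use std::marker::PhantomData;
use std::mem;

/// Hypothetical pointer-width slot holding the bits of a `T`.
pub struct Stash<T> {
    v: usize,
    t: PhantomData<T>,
}

impl<T: Copy> Stash<T> {
    pub fn new(val: T) -> Self {
        // Refuse types that are not exactly pointer-width.
        assert_eq!(mem::size_of::<T>(), mem::size_of::<usize>());
        // SAFETY: sizes are equal (checked above) and `T` is assumed padding-free.
        let v = unsafe { mem::transmute_copy::<T, usize>(&val) };
        Stash { v, t: PhantomData }
    }

    pub fn get(&self) -> T {
        // SAFETY: `v` was produced from a valid `T` of the same size in `new`.
        unsafe { mem::transmute_copy::<usize, T>(&self.v) }
    }
}

/// Illustrative round trip with a pointer-width integer.
fn demo_stash() -> isize {
    Stash::new(-5isize).get()
}
```

The `Copy` bound sidesteps the ownership question that the reference-transmute version leaves open: there, `val` is still dropped at the end of `insert` even though its bits were stashed.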

Cannot move out of borrowed content when borrowing a generic type

I have a program that more or less looks like this
struct Test<T> {
vec: Vec<T>
}
impl<T> Test<T> {
fn get_first(&self) -> &T {
&self.vec[0]
}
fn do_something_with_x(&self, x: T) {
// Irrelevant
}
}
fn main() {
let t = Test { vec: vec![1i32, 2, 3] };
let x = t.get_first();
t.do_something_with_x(*x);
}
Basically, we call a method on the struct Test that borrows some value. Then we call another method on the same struct, passing the previously obtained value.
This example works perfectly fine. Now, when we make the content of main generic, it doesn't work anymore.
fn generic_main<T>(t: Test<T>) {
let x = t.get_first();
t.do_something_with_x(*x);
}
Then I get the following error:
error: cannot move out of borrowed content
src/main.rs:14 let raw_x = *x;
I'm not completely sure why this is happening. Can someone explain to me why Test<i32> isn't borrowed when calling get_first while Test<T> is?
The short answer is that i32 implements the Copy trait, but T does not. If you use fn generic_main<T: Copy>(t: Test<T>), then your immediate problem is fixed.
The longer answer is that Copy is a special trait which means values can be copied by simply copying bits. Types like i32 implement Copy. Types like String do not implement Copy because, for example, it requires a heap allocation. If you copied a String just by copying bits, you'd end up with two String values pointing to the same chunk of memory. That would not be good (it's unsafe!).
Therefore, giving your T a Copy bound is quite restrictive. A less restrictive bound would be T: Clone. The Clone trait is similar to Copy (in that it copies values), but it's usually done by more than just "copying bits." For example, the String type will implement Clone by creating a new heap allocation for the underlying memory.
This requires you to change how your generic_main is written:
fn generic_main<T: Clone>(t: Test<T>) {
let x = t.get_first();
t.do_something_with_x(x.clone());
}
Alternatively, if you don't want to have either the Clone or Copy bounds, then you could change your do_something_with_x method to take a reference to T rather than an owned T:
impl<T> Test<T> {
// other methods elided
fn do_something_with_x(&self, x: &T) {
// Irrelevant
}
}
And your generic_main stays mostly the same, except you don't dereference x:
fn generic_main<T>(t: Test<T>) {
let x = t.get_first();
t.do_something_with_x(x);
}
You can read more about Copy in the docs. There are some nice examples, including how to implement Copy for your own types.
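The two generic variants can be compared side by side in a small, self-contained sketch. The method names `count_owned` and `count_borrowed` and the `PartialEq` bound are additions for illustration, so that the functions return something observable; they stand in for do_something_with_x.

```rust
struct Test<T> {
    vec: Vec<T>,
}

impl<T: PartialEq> Test<T> {
    fn get_first(&self) -> &T {
        &self.vec[0]
    }

    /// Takes ownership of `x`, like the original do_something_with_x.
    fn count_owned(&self, x: T) -> usize {
        self.vec.iter().filter(|v| **v == x).count()
    }

    /// Borrows `x`, like the reference-taking variant.
    fn count_borrowed(&self, x: &T) -> usize {
        self.vec.iter().filter(|v| *v == x).count()
    }
}

/// Needs `T: Clone` because an owned `T` must be produced from `&T`.
fn with_clone<T: Clone + PartialEq>(t: &Test<T>) -> usize {
    let x = t.get_first();
    t.count_owned(x.clone())
}

/// No Clone or Copy bound: the borrowed value is passed straight through.
fn with_ref<T: PartialEq>(t: &Test<T>) -> usize {
    let x = t.get_first();
    t.count_borrowed(x)
}
```

With `T = String`, which is Clone but not Copy, both functions compile; writing `t.count_owned(*x)` instead would bring back the "cannot move out of borrowed content" error.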

Resources