How do you replace the value of a mutable variable by taking ownership of it?

I am working with a LinkedList and I want to remove all elements which do not pass a test. However, I am running into the error cannot move out of borrowed content.
From what I understand, this is because I am working with &mut self, so I do not have the right to invalidate (i.e. move) one of the contained values even for a moment to construct a new list of its values.
In C++/Java, I would simply iterate over the list and remove any elements which match a criterion. Since I have not yet found a remove method, I have approached it as iterate, filter, and collect.
The goal is to avoid creating a temporary list, cloning values, and needing to take self and return a "new" object. I have constructed an example which produces the same error.
use std::collections::LinkedList;
#[derive(Debug)]
struct Example {
list: LinkedList<i8>,
// Other stuff here
}
impl Example {
pub fn default() -> Example {
let mut list = LinkedList::new();
list.push_back(-5);
list.push_back(3);
list.push_back(-1);
list.push_back(6);
Example { list }
}
// Similar idea, but with creating a new list
pub fn get_positive(&self) -> LinkedList<i8> {
self.list.iter()
.filter(|&&x| x > 0)
.map(|x| x.clone())
.collect()
}
// Now, attempt to filter the elements without cloning anything
pub fn remove_negative(&mut self) {
self.list = self.list.into_iter()
.filter(|&x| x > 0)
.collect()
}
}
fn main() {
let mut e = Example::default();
println!("{:?}", e.get_positive());
println!("{:?}", e);
}
In my actual case, I cannot simply consume the wrapping object because it needs to be referenced from different places and contains other important values.
In my research, I found some unsafe code which leads me to question if a safe function could be constructed to perform this action in a similar way to std::mem::replace.

You can use std::mem::swap to exchange your field with a temporary, and then replace it with your filtered list, like this. The big downside is the creation of the new LinkedList. I don't know how expensive that is.
pub fn remove_negative(&mut self) {
let mut temp = LinkedList::new();
std::mem::swap(&mut temp, &mut self.list);
self.list = temp.into_iter()
.filter(|&x| x > 0)
.collect();
}
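The swap above can also be written with std::mem::take, which moves the old list out and leaves LinkedList::default() (an empty list) in its place. The following is a minimal sketch under that assumption, reusing the Example struct from the question. Creating an empty LinkedList does not allocate, but collecting into the new list does build a fresh node for every element that is kept; the values themselves are moved, not cloned.

```rust
use std::collections::LinkedList;

#[derive(Debug)]
struct Example {
    list: LinkedList<i8>,
}

impl Example {
    pub fn remove_negative(&mut self) {
        // mem::take swaps an empty list into self.list and returns the
        // old list by value, so into_iter() is allowed to consume it.
        self.list = std::mem::take(&mut self.list)
            .into_iter()
            .filter(|&x| x > 0)
            .collect();
    }
}

fn main() {
    let mut e = Example {
        list: vec![-5, 3, -1, 6].into_iter().collect(),
    };
    e.remove_negative();
    println!("{:?}", e);
}
```

The signature stays &mut self, so callers holding a mutable reference to the wrapping object do not need to give it up.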

If the goal is to avoid cloning the values, you may use a reference-counting pointer: the clone method on Rc only increments the reference counter.
use std::collections::LinkedList;
use std::rc::Rc;
#[derive(Debug)]
struct Example {
list: LinkedList<Rc<i8>>,
// ...
}
impl Example {
pub fn default() -> Example {
let mut list = LinkedList::new();
list.push_back(Rc::new(-5));
list.push_back(Rc::new(3));
list.push_back(Rc::new(-1));
list.push_back(Rc::new(6));
Example { list }
}
// Similar idea, but with creating a new list
pub fn get_positive(&self) -> LinkedList<Rc<i8>> {
self.list.iter()
.filter(|&x| x.as_ref() > &0)
.map(|x| x.clone())
.collect()
}
// Filter the elements in place; only the Rc pointers are cloned, not the values
pub fn remove_negative(&mut self) {
self.list = self.list.iter()
.filter(|&x| x.as_ref() > &0)
.map(|x| x.clone())
.collect()
}
}
fn main() {
let mut e = Example::default();
e.remove_negative();
println!("{:?}", e.get_positive());
println!("{:?}", e);
}

Related

Parallel Recursion Fix

Quite new to Rust and trying to tackle toy problems. Trying to write a directory traversal with only Rayon.
struct Node {
path: PathBuf,
files: Vec<PathBuf>,
hashes: Vec<String>,
folders: Vec<Box<Node>>,
}
impl Node {
pub fn new(path: PathBuf) -> Self {
Node {
path: path,
files: Vec::new(),
hashes: Vec::new(),
folders: Vec::new(),
}
}
pub fn burrow(&mut self) {
let mut contents: Vec<PathBuf> = ls_dir(&self.path);
contents.par_iter().for_each(|item|
if item.is_file() {
self.files.push(*item);
} else if item.is_dir() {
let mut new_folder = Node::new(*item);
new_folder.burrow();
self.folders.push(Box::new(new_folder));
});
}
}
The errors I am receiving are
error[E0596]: cannot borrow `*self.files` as mutable, as it is a captured variable in a `Fn` closure
--> src/main.rs:40:37
|
40 | ... self.files.push(*item);
| ^^^^^^^^^^^^^^^^^^^^^^ cannot borrow as mutable
error[E0507]: cannot move out of `*item` which is behind a shared reference
--> src/main.rs:40:53
|
40 | ... self.files.push(*item);
| ^^^^^ move occurs because `*item` has type `PathBuf`, which does not implement the `Copy` trait
error[E0507]: cannot move out of `*item` which is behind a shared reference
--> src/main.rs:42:68
|
42 | ... let mut new_folder = Node::new(*item);
| ^^^^^ move occurs because `*item` has type `PathBuf`, which does not implement the `Copy` trait
error[E0596]: cannot borrow `*self.folders` as mutable, as it is a captured variable in a `Fn` closure
--> src/main.rs:44:37
|
44 | ... self.folders.push(Box::new(new_folder));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot borrow as mutable
The errors are clear in that they are preventing different threads from accessing mutable memory, but I'm just not sure how to start to address the errors.
Below is the original (non-parallel) version of burrow
pub fn burrow(&mut self) {
let mut contents: Vec<PathBuf> = ls_dir(&self.path);
for item in contents {
if item.is_file() {
self.files.push(item);
} else if item.is_dir() {
let mut new_folder = Node::new(item);
new_folder.burrow();
self.folders.push(Box::new(new_folder));
}
}
}
The best option in this case is to use ParallelIterator::partition_map() which allows you to turn a parallel iterator into two different collections according to some condition, which is exactly what you need to do.
Example program:
use rayon::iter::{Either, IntoParallelIterator, ParallelIterator};
fn main() {
let input = vec!["a", "bb", "c", "dd"];
let (chars, strings): (Vec<char>, Vec<&str>) =
input.into_par_iter().partition_map(|s| {
if s.len() == 1 {
Either::Left(s.chars().next().unwrap())
} else {
Either::Right(s)
}
});
dbg!(chars, strings);
}
If you had three different outputs, unfortunately Rayon does not support that. I haven't looked at whether it'd be possible to build using Rayon's traits, but what I would suggest as a more general (though not quite as efficient) solution is to use channels. A channel like std::sync::mpsc allows any number of threads to insert items while another thread removes them — in your case, to move them into a collection. This would not be quite as efficient as parallel collection, but in an IO-dominated problem like yours, it would not be significant.
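The channel idea can be sketched with the standard library alone. In the sketch below, plain scoped threads stand in for Rayon's workers, and the Bucket enum, classify, and split_three_ways are hypothetical names invented for illustration. Each worker sends its classified items into a std::sync::mpsc channel, and the receiving side sorts them into three collections.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical three-way classification, for illustration only.
enum Bucket {
    Small(u32),
    Medium(u32),
    Large(u32),
}

fn classify(n: u32) -> Bucket {
    if n < 10 {
        Bucket::Small(n)
    } else if n < 100 {
        Bucket::Medium(n)
    } else {
        Bucket::Large(n)
    }
}

fn split_three_ways(input: &[u32]) -> (Vec<u32>, Vec<u32>, Vec<u32>) {
    let (tx, rx) = mpsc::channel();
    // Scoped threads may borrow `input`; the scope returns once all workers finish.
    thread::scope(|scope| {
        for chunk in input.chunks(2) {
            let tx = tx.clone(); // every worker gets its own sender
            scope.spawn(move || {
                for &n in chunk {
                    tx.send(classify(n)).expect("receiver is still alive");
                }
            });
        }
    });
    // Drop the last sender so that iterating over `rx` terminates.
    drop(tx);

    let (mut small, mut medium, mut large) = (Vec::new(), Vec::new(), Vec::new());
    // The channel is unbounded, so the workers never blocked on send;
    // everything they produced is drained here.
    for bucket in rx {
        match bucket {
            Bucket::Small(n) => small.push(n),
            Bucket::Medium(n) => medium.push(n),
            Bucket::Large(n) => large.push(n),
        }
    }
    (small, medium, large)
}

fn main() {
    let (small, medium, large) = split_three_ways(&[1, 50, 500, 7, 70, 700]);
    println!("{:?} {:?} {:?}", small, medium, large);
}
```

The order within each output depends on thread scheduling, so sort the results if order matters. With Rayon, the cloned sender could be handed to the workers through for_each_with instead of spawning threads by hand.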
I'm going to skip the separation of files and folders, ignore the structure, and demonstrate a simple recursive approach that gets all the files in a directory recursively:
fn burrow(dir: &Path) -> Vec<PathBuf> {
let mut contents = vec![];
for entry in std::fs::read_dir(dir).unwrap() {
let entry = entry.unwrap().path();
if entry.is_dir() {
contents.extend(burrow(&entry));
} else {
contents.push(entry);
}
}
contents
}
The first step, if you want to use the parallel iterators from rayon, is to convert this loop into a non-parallel iterator chain. The best way to do that is with .flat_map(), which flattens results that yield more than one element:
fn burrow(dir: &Path) -> Vec<PathBuf> {
std::fs::read_dir(dir)
.unwrap()
.flat_map(|entry| {
let entry = entry.unwrap().path();
if entry.is_dir() {
burrow(&entry)
} else {
vec![entry] // use a single-element Vec if not a directory
}
})
.collect()
}
Then, to process this iteration in parallel with rayon, use .par_bridge() to convert the iterator into a parallel iterator. And that's it, actually:
use rayon::iter::{ParallelBridge, ParallelIterator};
use std::path::{Path, PathBuf};
fn burrow(dir: &Path) -> Vec<PathBuf> {
std::fs::read_dir(dir)
.unwrap()
.par_bridge()
.flat_map(|entry| {
let entry = entry.unwrap().path();
if entry.is_dir() {
burrow(&entry)
} else {
vec![entry]
}
})
.collect()
}
See it working on the playground. You can extend on this to collect more complex results (like folders and hashes and whatever else).

What is the most efficient way to return/move a Vec/Field in Rust while also emptying it? [duplicate]

I have a struct with a field:
struct A {
field: SomeType,
}
Given a &mut A, how can I move the value of field and swap in a new value?
fn foo(a: &mut A) {
let mut my_local_var = a.field;
a.field = SomeType::new();
// ...
// do things with my_local_var
// some operations may modify the NEW field's value as well.
}
The end goal would be the equivalent of a get_and_set() operation. I'm not worried about concurrency in this case.
Use std::mem::swap().
use std::mem;
fn foo(a: &mut A) {
let mut my_local_var = SomeType::new();
mem::swap(&mut a.field, &mut my_local_var);
}
Or std::mem::replace().
use std::mem;
fn foo(a: &mut A) {
let mut my_local_var = mem::replace(&mut a.field, SomeType::new());
}
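For the get_and_set() operation mentioned in the question, mem::replace already has the right shape: it stores the new value and returns the old one. Here is a minimal sketch; the get_and_set helper and the contents of SomeType are assumptions made up for this example and are not part of the standard library.

```rust
use std::mem;

#[derive(Debug)]
struct SomeType {
    values: Vec<u32>,
}

impl SomeType {
    fn new() -> Self {
        SomeType { values: Vec::new() }
    }
}

struct A {
    field: SomeType,
}

/// Hypothetical helper: stores `new` in `slot` and returns the previous value.
fn get_and_set<T>(slot: &mut T, new: T) -> T {
    mem::replace(slot, new)
}

fn main() {
    let mut a = A {
        field: SomeType { values: vec![1, 2, 3] },
    };
    // `old` owns the previous contents; `a.field` holds the fresh value.
    let old = get_and_set(&mut a.field, SomeType::new());
    println!("old = {:?}, new = {:?}", old, a.field);
}
```

Because the old value is moved out and the new one moved in, nothing is cloned, and a.field is never left uninitialized.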
If your type implements Default, you can use std::mem::take:
#[derive(Default)]
struct SomeType;
fn foo(a: &mut A) {
let mut my_local_var = std::mem::take(&mut a.field);
}
If your field happens to be an Option, there's a specific method you can use — Option::take:
struct A {
field: Option<SomeType>,
}
fn foo(a: &mut A) {
let old = a.field.take();
// a.field is now None, old is whatever a.field used to be
}
Option::take does the same thing as mem::take (the default value of an Option is None), just like the more generic answer above shows, but it is wrapped up nicely for you. Conceptually:
pub fn take(&mut self) -> Option<T> {
mem::take(self)
}
See also:
Temporarily move out of borrowed content
Change enum variant while moving the field to the new variant

Is there a shorter way of appending to a vector in a method call that takes self by value?

I need to append "Bar" to a vector that contains "Foo". This solution works for me:
trait AppendBar {
fn append_bar(self) -> Self;
}
impl AppendBar for Vec<String> {
fn append_bar(self) -> Vec<String> {
let mut v = self;
v.push("Bar".to_string());
v
}
}
fn main(){
let foo = vec![String::from("Foo")].append_bar();
println!("{:#?}", foo)
}
Can I do something like this:
impl AppendBar for Vec<String> {
fn append_bar(self) -> Vec<String> {
let mut v = self.push("Bar".to_string());
v
}
}
Or this:
impl AppendBar for Vec<String> {
fn append_bar(self) -> Vec<String> {
self.push("Bar".to_string())
}
}
Since I could not get either of the last two attempts to compile, I assume this is just how the language is, but I want to make sure I am not missing something that would make it simpler.
There is no need to rebind self to v to achieve mutability. Just declare it as mut self:
fn append_bar(mut self) -> Vec<String> {
self.push("Bar".to_string());
self
}
Beyond that, there are no standard Vec methods for chaining modifications (consuming self and returning the modified version); pretty much all methods modify the Vec in place through &mut self.
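If chaining is what you are after, the same mut self trick generalizes to a small extension trait. The sketch below is only an illustration: the Pushed trait and its pushed method are hypothetical names, not something Vec or the standard library provides.

```rust
/// Hypothetical extension trait: push a value and hand the collection back.
trait Pushed<T> {
    fn pushed(self, value: T) -> Self;
}

impl<T> Pushed<T> for Vec<T> {
    fn pushed(mut self, value: T) -> Self {
        // `mut self` makes the owned vector mutable inside the method.
        self.push(value);
        self
    }
}

fn main() {
    let foo = vec![String::from("Foo")]
        .pushed("Bar".to_string())
        .pushed("Baz".to_string());
    println!("{:#?}", foo);
}
```

The vector is moved through each call rather than copied, so the chain costs no more than the equivalent sequence of push calls.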

Why does a node in a linked list using raw pointers become corrupted?

I am struggling to learn raw pointers while implementing a linked list. A simple piece of code gives me unintended results for which I struggle to find any explanation whatsoever:
use std::cmp::PartialEq;
use std::default::Default;
use std::ptr;
pub struct LinkedListElement<T> {
pub data: T,
pub next: *mut LinkedListElement<T>,
}
pub struct LinkedList<T> {
head: *mut LinkedListElement<T>,
}
impl<T: PartialEq> LinkedListElement<T> {
pub fn new(elem: T, next: Option<*mut LinkedListElement<T>>) -> LinkedListElement<T> {
let mut_ptr = match next {
Some(t) => t,
None => ptr::null_mut(),
};
let new_elem = LinkedListElement {
data: elem,
next: mut_ptr,
};
if !mut_ptr.is_null() {
println!(
"post create ll mut ptr: {:p}, post create ll mut ptr next {:p}",
mut_ptr,
unsafe { (*mut_ptr).next }
);
}
new_elem
}
}
impl<T: PartialEq + Default> LinkedList<T> {
pub fn new(elem: T) -> LinkedList<T> {
LinkedList {
head: &mut LinkedListElement::new(elem, None),
}
}
pub fn insert(&mut self, elem: T) {
println!("head: {:p} . next: {:p}", self.head, unsafe {
(*self.head).next
});
let next = Some(self.head);
let mut ll_elem = LinkedListElement::new(elem, next);
println!(
"before pointer head: {:p}. before pointer next {:p}",
self.head,
unsafe { (*self.head).next }
);
let ll_elem_ptr = &mut ll_elem as *mut LinkedListElement<T>;
self.head = ll_elem_ptr;
}
}
fn main() {
let elem: i32 = 32;
let second_elem: i32 = 64;
let third_elem: i32 = 72;
let mut list = LinkedList::new(elem);
list.insert(second_elem);
list.insert(third_elem);
}
(playground)
This code gives me the following output:
head: 0x7ffe163275e8 . next: 0x0
post create ll mut ptr: 0x7ffe163275e8, post create ll mut ptr next 0x0
before pointer head: 0x7ffe163275e8. before pointer next 0x0
head: 0x7ffe16327560 . next: 0x7ffe163275e8
post create ll mut ptr: 0x7ffe16327560, post create ll mut ptr next 0x7ffe163275e8
before pointer head: 0x7ffe16327560. before pointer next 0x7ffe16327560
For the first 2 elements the code behaves as expected: it creates an element with null pointer as its next element. Here is the state of things after adding second element:
{
head: {
elem: 64,
next: {
elem: 32,
next: nullptr
}
}
}
64 -> 32 -> null
When the third element is added, things become weird and the linked list transforms into something like this:
{
head: {
elem: 72,
next: {
elem: 72,
next: {
elem: 72,
next: ...
}
}
}
}
72 -> 72 -> 72 -> ...
It seems that the linked list element's next field starts pointing at the element itself.
I have debugged the LinkedListElement::new method and found that the proper element should get returned from it:
{
elem: 72,
next: {
elem: 64,
next: {
elem: 32,
next: nullptr
}
}
}
For some reason, immediately after it is returned to the LinkedList::insert method, even before self.head is reassigned, the contents of the LinkedList self become "corrupted".
I know using raw pointers in Rust is not idiomatic but I still want to learn them.
Congratulations, you have successfully proven why Rust needs to exist in the first place: programmers write memory-unsafe code.
First, please read why this is disallowed when using safe Rust:
Is there any way to return a reference to a variable created in a function?
TL;DR: the memory address of LinkedListElement changes when it's moved. A move occurs when a value is returned from a function (among other times). By using a raw pointer, you've subverted the borrow checker and get no useful feedback from the compiler.
Second, please read Learning Rust With Entirely Too Many Linked Lists. For whatever reason, programmers think that linked lists are "easy" and a good way to learn a language. This is generally not true in Rust, where memory safety is paramount.
TL;DR: you can use a Box to allocate memory on the heap. This memory address will not change when the pointer to it is moved. You will need to ensure that you appropriately free the pointer when your linked list goes out of scope to prevent memory leaks.
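A minimal sketch of that Box-based approach follows, assuming a simplified version of the types from the question (no Option parameter, no trait bounds) plus a to_vec helper added purely for inspection. Box::into_raw yields a heap address that stays valid when the list itself is moved, and the Drop implementation turns each pointer back into a Box so the node is freed.

```rust
use std::ptr;

pub struct LinkedListElement<T> {
    pub data: T,
    pub next: *mut LinkedListElement<T>,
}

pub struct LinkedList<T> {
    head: *mut LinkedListElement<T>,
}

impl<T> LinkedList<T> {
    pub fn new() -> Self {
        LinkedList { head: ptr::null_mut() }
    }

    /// Inserts `elem` at the front of the list.
    pub fn insert(&mut self, elem: T) {
        // The node lives on the heap, so its address does not change
        // when local variables go out of scope or the list is moved.
        let node = Box::new(LinkedListElement { data: elem, next: self.head });
        self.head = Box::into_raw(node);
    }

    /// Copies the elements out, front to back (helper for inspection).
    pub fn to_vec(&self) -> Vec<T>
    where
        T: Clone,
    {
        let mut out = Vec::new();
        let mut cur = self.head;
        while !cur.is_null() {
            // SAFETY: every non-null pointer in the list came from
            // Box::into_raw and is only freed in Drop.
            let node = unsafe { &*cur };
            out.push(node.data.clone());
            cur = node.next;
        }
        out
    }
}

impl<T> Drop for LinkedList<T> {
    fn drop(&mut self) {
        let mut cur = self.head;
        while !cur.is_null() {
            // SAFETY: each node was created by Box::into_raw and is freed exactly once.
            let node = unsafe { Box::from_raw(cur) };
            cur = node.next;
        }
    }
}

fn main() {
    let mut list = LinkedList::new();
    list.insert(32);
    list.insert(64);
    list.insert(72);
    println!("{:?}", list.to_vec());
}
```

Unlike the code in the question, no pointer here ever refers to a stack temporary or a local variable, which is what caused the dangling head.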
See also:
How to copy a raw pointer when implementing a linked list in Rust?
Box::into_raw / Box::from_raw
NonNull

Returning Error Enumeration with an Arbitrary Variable

I have a function in Rust using try! that attempts to collect all files in a directory recursively and insert them into a vector. Because the function uses try! to check errors, the compiler seems to expect an io::Result return from the function, and doesn't let me include the vector because the try! macro only returns a result. I need the vector to be returned.
Code is as follows:
mod os{
use std::io;
use std::fs::{self, DirEntry};
//use std::fs;
use std::path::Path;
// one possible implementation of walking a directory only visiting files
pub fn visit_dirs(dir: &Path, cb: &Fn(&DirEntry)) -> (io::Result<()>,Vec<String>) {
let mut filevec: Vec<String> = Vec::new();
if try!(fs::metadata(dir)).is_dir() {
for entry in try!(fs::read_dir(dir)) {
let entry = try!(entry);
if try!(fs::metadata(entry.path())).is_dir() {
try!(visit_dirs(&entry.path(), cb));
} else {
cb(&entry);
}
}
}
(Ok(()),filevec)
}
fn push_path_to_vec(p:&DirEntry,v:Vec<String>){
v.push(p.path().to_str().unwrap().to_string());
}}
Here is the error:
<std macros>:5:8: 6:42 error: mismatched types:
expected `(core::result::Result<(), std::io::error::Error>, collections::vec::Vec<collections::string::String>)`
found `core::result::Result<_, _>`
(expected tuple,
found enum `core::result::Result`) [E0308]
I wonder if there's any idiomatic way to do this that I've missed.
The return type of visit_dirs is wrong. The function should return a Result, but right now it returns a tuple. Since try! only works for functions returning a Result, your code doesn't compile. You can change the return value of visit_dirs in order to fix it:
pub fn visit_dirs(dir: &Path, cb: &Fn(&DirEntry)) -> io::Result<Vec<String>>
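The new definition means that a Vec<String> will be stored in the Result upon success. With some minor tweaks, the code is accepted by the compiler (see below). Note that although it compiles, it still never adds anything to filevec: the result of the recursive call is discarded, and push_path_to_vec takes its vector by value, so you will still need to collect the paths yourself.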
mod os{
use std::io;
use std::fs::{self, DirEntry};
//use std::fs;
use std::path::Path;
// one possible implementation of walking a directory only visiting files
pub fn visit_dirs(dir: &Path, cb: &Fn(&DirEntry)) -> io::Result<Vec<String>> {
let mut filevec: Vec<String> = Vec::new();
if try!(fs::metadata(dir)).is_dir() {
for entry in try!(fs::read_dir(dir)) {
let entry = try!(entry);
if try!(fs::metadata(entry.path())).is_dir() {
try!(visit_dirs(&entry.path(), cb));
} else {
cb(&entry);
}
}
}
Ok(filevec)
}
fn push_path_to_vec(p:&DirEntry,mut v:Vec<String>){
v.push(p.path().to_str().unwrap().to_string());
}}
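On current Rust, try! has been superseded by the ? operator. The sketch below assumes that the callback can be dropped and that the goal is simply to return every file path. It keeps the io::Result<Vec<String>> return type from the answer and actually fills the vector, merging the results of the recursive calls. Converting paths with to_string_lossy (instead of to_str().unwrap()) is a choice made for this example, so that non-UTF-8 names do not panic.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively collects the paths of all files below `dir`.
pub fn visit_dirs(dir: &Path) -> io::Result<Vec<String>> {
    let mut filevec = Vec::new();
    if fs::metadata(dir)?.is_dir() {
        for entry in fs::read_dir(dir)? {
            let path = entry?.path();
            if fs::metadata(&path)?.is_dir() {
                // `?` propagates an error; on success, merge the nested results.
                filevec.extend(visit_dirs(&path)?);
            } else {
                filevec.push(path.to_string_lossy().into_owned());
            }
        }
    }
    Ok(filevec)
}

fn main() -> io::Result<()> {
    for file in visit_dirs(Path::new("."))? {
        println!("{}", file);
    }
    Ok(())
}
```

Any I/O error aborts the traversal and is returned to the caller, the same behavior try! gave in the original code.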
