How to return a single element from a Vec from a function? - vector

I'm new to Rust, and I'm trying to make an interface where the user can choose a file by typing the filename from a list of available files.
This function is supposed to return the DirEntry corresponding to the chosen file:
fn ask_user_to_pick_file(available_files: Vec<DirEntry>) -> DirEntry {
println!("Which month would you like to sum?");
print_file_names(&available_files);
let input = read_line_from_stdin();
let chosen = available_files.iter()
.find(|dir_entry| dir_entry.file_name().into_string().unwrap() == input )
.expect("didnt match any files");
return chosen
}
However, it appears chosen is somehow borrowed here? I get the following error:
35 | return chosen
| ^^^^^^ expected struct `DirEntry`, found `&DirEntry`
Is there a way I can "unborrow" it? Or do I have to implement the Copy trait for DirEntry?
If it matters I don't care about theVec after this method, so if "unborrowing" chosen destroys the Vec, thats okay by me (as long as the compiler agrees).

Use into_iter() instead of iter() so you get owned values instead of references out of the iterator. After that change the code will compile and work as expected:
fn ask_user_to_pick_file(available_files: Vec<DirEntry>) -> DirEntry {
println!("Which month would you like to sum?");
print_file_names(&available_files);
let input = read_line_from_stdin();
let chosen = available_files
.into_iter() // changed from iter() to into_iter() here
.find(|dir_entry| dir_entry.file_name().into_string().unwrap() == input)
.expect("didnt match any files");
chosen
}

Related

How to convert Vector of Box to Vector of reference?

I'm looking for a way to conver Vec<Box<u32>> to Vec<&u32>. Here is what I tried:
fn conver_to_ref(){
let test: Vec<Box<u32>> = vec![Box::new(1), Box::new(2)];
let _test2: Vec<&u32> = test.into_iter().map(|elem| &*elem).collect();
}
Unfortunately it does not compile: demo. The error message:
error[E0515]: cannot return reference to local data `*elem`
--> src/lib.rs:3:57
|
3 | let _test2: Vec<&u32> = test.into_iter().map(|elem| &*elem).collect();
| ^^^^^^ returns a reference to data owned by the current function
How to do such conversion?
into_iter() consumes the original vector and its items. If the code compiled as written, all the references in _test2 would be dangling because the boxes would be destroyed along with test.
You can build a vector of references, but you will need to not consume the original test vector, so that the boxes retain an owner. You can simply use iter() instead of into_iter():
fn convert_to_ref() {
let test: Vec<Box<u32>> = vec![Box::new(1), Box::new(2)];
let _test2: Vec<&u32> = test.iter().map(Box::as_ref).collect();
}
Note that test.iter() yields references to test elements, i.e. to the boxes themselves (&Box<u32>), and not to the boxed items (&u32) which we're interested in. This is why we must apply as_ref to get the latter.

How to convert a vector of vectors into a vector of slices without creating a new object? [duplicate]

I have the following:
enum SomeType {
VariantA(String),
VariantB(String, i32),
}
fn transform(x: SomeType) -> SomeType {
// very complicated transformation, reusing parts of x in order to produce result:
match x {
SomeType::VariantA(s) => SomeType::VariantB(s, 0),
SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
}
}
fn main() {
let mut data = vec![
SomeType::VariantA("hello".to_string()),
SomeType::VariantA("bye".to_string()),
SomeType::VariantB("asdf".to_string(), 34),
];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in-place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust but it has been removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
let original = ::std::mem::replace(x, SomeType::Dummy);
*x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Your first problem is not map, it's transform.
transform takes ownership of its argument, while Vec has ownership of its arguments. Either one has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?
The best fix, thus, is to change the signature of transform to:
fn transform(x: &mut SomeType) { ... }
then you can just do:
for x in &mut data { transform(x) }
Other solutions will be clunky, as they will need to deal with the fact that transform might panic.
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; only 3 pointer-sized values and no heap allocation:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
let old = std::mem::replace(self, VariantA(String::new()));
// Note this line for the detailed explanation
*self = match old {
VariantA(s) => VariantB(s, 0),
VariantB(s, i) => VariantB(s, 2 * i),
};
}
}
for x in &mut data {
x.transform();
}
An alternate implementation that just replaces the String:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
*self = match self {
VariantA(s) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 0)
}
VariantB(s, i) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 2 * *i)
}
};
}
}
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect .
See also:
Change enum variant while moving the field to the new variant
Why is it so complicated?
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's an time period where your value would have an undefined value, which is not safe. If a panic were to happen at that exact second, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value - to always have a known-good value stored.
You even hinted at this (emphasis mine):
I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop
That "should" is the problem. During a panic, that dummy value is visible.
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
Temporarily move out of borrowed content
How do I move out of a struct field that is an Option?
The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of having to deal with unsafe code and reasoning why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.
You can write a map_in_place in terms of the take_mut or replace_with crates:
fn map_in_place<T, F>(v: &mut [T], f: F)
where
F: Fn(T) -> T,
{
for e in v {
take_mut::take(e, f);
}
}
However, if this panics in the supplied function, the program aborts completely; you cannot recover from the panic.
Alternatively, you could supply a placeholder element that sits in the empty spot while the inner function executes:
use std::mem;
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
F: Fn(T) -> T,
{
for e in v {
let mut tmp = mem::replace(e, placeholder);
tmp = f(tmp);
placeholder = mem::replace(e, tmp);
}
}
If this panics, the placeholder you supplied will sit in the panicked slot.
Finally, you could produce the placeholder on-demand; basically replace take_mut::take with take_mut::take_or_recover in the first version.

How can I handle "if may be missing an else clause" when using an if inside of a map?

I'm working on a code challenge which will detect case-insensitive anagrams of a given word from a list of words.
My first cut is to use something like this:
pub fn anagrams_for(s: &'static str, v: &[&'static str]) -> Vec<&'static str> {
let mut outputs: Vec<&str> = vec![];
// Find the case-insensitive, sorted word to check
let mut s_sorted: Vec<_> = s.to_string().to_lowercase().chars().collect();
s_sorted.sort();
for word in v {
// Case-desensitize and sort each word in the slice
let mut word_sorted: Vec<_> = word.to_string().to_lowercase().chars().collect();
word_sorted.sort();
// if the case-insensitive words are the same post sort and not presort (to avoid self-anagrams), add it to the vector
if word_sorted == s_sorted && s.to_string().to_lowercase() != word.to_string().to_lowercase() {
outputs.push(word)
}
}
outputs
}
This works as expected, but is not very idiomatic. I'm now trying a second iteration which uses more functional features of Rust:
pub fn anagrams_for(s: &'static str, v: &[&'static str]) -> Vec<&'static str> {
let mut s_sorted: Vec<_> = s.to_string().to_lowercase().chars().collect();
s_sorted.sort();
v.iter().map(&|word: &str| {
let mut word_sorted: Vec<_> = word.to_string().to_lowercase().chars().collect();
word_sorted.sort();
if word_sorted == s_sorted && s.to_string().to_lowercase() != word.to_string().to_lowercase() {
word
}
}).collect()
}
I'm currently getting a few errors (most of which I could likely resolve), but the one I'm interested in solving is
if may be missing an else clause:
expected `()`,
found `&str`
(expected (),
found &-ptr) [E0308]
This is because in the case of a non-anagram, map attempts to push something into the vector (seemingly ()).
How can I handle this? It's possible that map isn't the best idiom because it requires some operation to be performed on each element in a list, not a subset (maybe filter?).
As you noticed, the problem is that in the non-anagram-case your closure (the || { ... } block) doesn't return a value.
You can solve this by using filter_map instead of map. That function takes a closure that returns Option<U> instead of U, so the last expression of your closure looks something like:
if /* ... */ {
Some(word)
} else {
None
}
Unrelated to the main question, some notes on your code:
You can remove the .to_string() calls before .to_lowercase() calls. the latter method belongs to the type str, so it works fine. Calling to_string() adds unnecessary allocations.
the & in front of the closure (&|...|) can most probably be removed...
... as can the : &str type annotation in the closures argument list

Creating Sequence of Sequences is Causing a StackOverflowException

I'm trying to take a large file and split it into many smaller files. The location where each split occurs is based on a predicate returned from examining the contents of each given line (isNextObject function).
I have attempted to read in the large file via the File.ReadLines function so that I can iterate through the file one line at a time without having to hold the entire file in memory. My approach was to group the sequence into a sequence of smaller sub-sequences (one per file to be written out).
I found a useful function that Tomas Petricek created on fssnip called groupWhen. This function worked great for my initial testing on a small subset of the file, but a StackoverflowException is thrown when using the real file. I am not sure how to adjust the groupWhen function to prevent this (I'm still an F# greenie).
Here is a simplified version of the code showing only the relevant parts that will recreate the StackoverflowExcpetion::
// This is the function created by Tomas Petricek where the StackoverflowExcpetion is occuring
module Seq =
/// Iterates over elements of the input sequence and groups adjacent elements.
/// A new group is started when the specified predicate holds about the element
/// of the sequence (and at the beginning of the iteration).
///
/// For example:
/// Seq.groupWhen isOdd [3;3;2;4;1;2] = seq [[3]; [3; 2; 4]; [1; 2]]
let groupWhen f (input:seq<_>) = seq {
use en = input.GetEnumerator()
let running = ref true
// Generate a group starting with the current element. Stops generating
// when it founds element such that 'f en.Current' is 'true'
let rec group() =
[ yield en.Current
if en.MoveNext() then
if not (f en.Current) then yield! group() // *** Exception occurs here ***
else running := false ]
if en.MoveNext() then
// While there are still elements, start a new group
while running.Value do
yield group() |> Seq.ofList }
This is the gist of the code making use Tomas' function:
module Extractor =
open System
open System.IO
open Microsoft.FSharp.Reflection
// ... elided a few functions include "isNextObject" which is
// a string -> bool (examines the line and returns true
// if the string meets the criteria to that we are at the
// start of the next inner file)
let writeFile outputDir file =
// ... write out "file" to the file system
// NOTE: file is a seq<string>
let writeFiles outputDir (files : seq<seq<_>>) =
files
|> Seq.iter (fun file -> writeFile outputDir file)
And here is the relevant code in the console application that makes use of the functions:
let lines = inputFile |> File.ReadLines
writeFiles outputDir (lines |> Seq.groupWhen isNextObject)
Any ideas on the proper way to stop groupWhen from blowing the stack? I'm not sure how I would convert the function to use an accumulator (or to use a continuation instead, which I think is the correct terminology).
The problem with this is that the group() function returns a list, which is an eagerly evaluated data structure, which means that every time you call group() it has to run to the end, collect all results in a list, and return the list. This means that the recursive call happens within that same evaluation - i.e. truly recursively, - thus creating stack pressure.
To mitigate this problem, you could just replace the list with a lazy sequence:
let rec group() = seq {
yield en.Current
if en.MoveNext() then
if not (f en.Current) then yield! group()
else running := false }
However, I would consider less drastic approaches. This example is a good illustration of why you should avoid doing recursion yourself and resort to ready-made folds instead.
For example, judging by your description, it seems that Seq.windowed may work for you.
It's easy to overuse sequences in F#, IMO. You can accidentally get stack overflows, plus they are slow.
So (not actually answering your question),
personally I would just fold over the seq of lines using something like this:
let isNextObject line =
line = "---"
type State = {
fileIndex : int
filename: string
writer: System.IO.TextWriter
}
let makeFilename index =
sprintf "File%i" index
let closeFile (state:State) =
//state.writer.Close() // would use this in real code
state.writer.WriteLine("=== Closing {0} ===",state.filename)
let createFile index =
let newFilename = makeFilename index
let newWriter = System.Console.Out // dummy
newWriter.WriteLine("=== Creating {0} ===",newFilename)
// create new state with new writer
{fileIndex=index + 1; writer = newWriter; filename=newFilename }
let writeLine (state:State) line =
if isNextObject line then
/// finish old file here
closeFile state
/// create new file here and return updated state
createFile state.fileIndex
else
//write the line to the current file
state.writer.WriteLine(line)
// return the unchanged state
state
let processLines (lines: string seq) =
//setup
let initialState = createFile 1
// process the file
let finalState = lines |> Seq.fold writeLine initialState
// tidy up
closeFile finalState
(Obviously a real version would use files rather than the console)
Yes, it is crude, but it is easy to reason about, with
no unpleasant surprises.
Here's a test:
processLines [
"a"; "b"
"---";"c"; "d"
"---";"e"; "f"
]
And here's what the output looks like:
=== Creating File1 ===
a
b
=== Closing File1 ===
=== Creating File2 ===
c
d
=== Closing File2 ===
=== Creating File3 ===
e
f
=== Closing File3 ===

How do I convert a vector of strings to a vector of integers in a functional way?

I'm trying to convert Vec<&str> to Vec<u16> but I can't figure out a functional way to do it.
let foo: &str = "1,2,3"; // Parsing a string here
let bar: Vec<&str> = foo.split(",").collect(); // Bar is a nice vector of &str's
I need to get bar into a Vec<u16>.
There's an iterator adapter map! You'd use it like this:
let bar: Vec<u16> = foo.split(",").map(|x| x.parse::<u16>().unwrap()).collect();
parse is a library function that relies on the trait FromStr, and it can return an error, so we need to unwrap() the error type. (This is a good idea for a short example, but in real code, you will want to handle the error properly - if you have a value that's not a u16 there, your program will just crash).
map takes a closure that takes it's parameter by value and then returns the iterator obtained by lazily applying that function. You're collecting all of the values here, but if you only take(5) of them, you would only parse 5 of the strings.
You haven't fully specified your problem. Specifically, what should happen when one of the strings cannot be parsed into a number? When you parse a number from a string using parse, it can fail. That is why the function returns a Result:
fn parse<F>(&self) -> Result<F, F::Err>
where
F: FromStr,
Here's a solution that takes the vector, gets an iterator with iter, changes each item using map and ultimately returns a Result using collect. If the parsing was a success, you get an Ok. If any failed, you get an Err:
fn main() {
let input = "1,2,3";
let strings: Vec<_> = input.split(",").collect();
let numbers: Result<Vec<u16>, _> = strings.iter().map(|x| x.parse()).collect();
println!("{:?}", numbers);
}
Or you could remove failed conversions by filtering out Err values with flat_map:
fn main() {
let input = "1,2,3";
let strings: Vec<_> = input.split(",").collect();
let numbers: Vec<u16> = strings.iter().flat_map(|x| x.parse()).collect();
println!("{:?}", numbers);
}
Of course, it's a bit silly to convert the string into a vector of strings and then convert it again to a vector of integers. If you actually have a comma-separated string and want numbers, do it in one go:
fn main() {
let input = "1,2,3";
let numbers: Result<Vec<u16>, _> = input.split(",").map(|x| x.parse()).collect();
println!("{:?}", numbers);
}
See also:
Why does `Option` support `IntoIterator`?
My take as someone not really experienced in Rust yet.
fn main() {
let foo: &str = "1,2,3"; // Parsing a string here
let bar: Vec<&str> = foo.split(",").collect(); // Bar is a nice vector of &str's
// here the magic happens
let baz = bar.iter().map(|x| x.parse::<i64>());
for x in baz {
match x {
Ok(i) => println!("{}", i),
Err(_) => println!("parse failed"),
}
}
}
Note that since parse returns a Result, you have to extract the value from each parsed element. You might want to behave in a different way, e.g. filter only the succeeded results.

Resources