Incomprehensible behavior of () -> &str function - pointers

I'm new to Rust and don't understand one thing:
The Pointer's Guide says: "You shouldn't really return pointers." I'm okay with that, but in my HelloWorld-like program I tried to return a &str. I don't need a boxed String, just a &str to print immediately. And I succeeded in this challenge, but only partially.
So, the question itself:
Why can I do
fn foo(p: &[uint]) -> &str { "STRING" }
but can't do
fn foo(p: Vec<uint>) -> &str { "STRING" } //error: missing lifetime specifier [E0106]
yet still can do
fn foo(p: Vec<uint>) -> &'static str { "STRING" }
What does switching from a slice to a Vec change?
P.S. Sorry for my poor English and basic question. I think I just don't get the point of Rust's borrow checker.

This is how lifetime specifiers are automatically added:
This:
fn foo(p: &[uint]) -> &str { "STRING" }
is what, in the past, you had to write explicitly:
fn foo<'a>(p: &'a [uint]) -> &'a str { "STRING" }
These two are equivalent (though not really accurate, since the input p and the returned str are not actually related). It works because 'static outlives any 'a, so a 'static string is valid for any lifetime.
The second example won't work because there is no input reference, so no automatic lifetime parameters are added and the whole thing makes no sense (every reference needs a lifetime, explicit or implicit):
fn foo(p: Vec<uint>) -> &str { "STRING" }
As you already discovered, you fix it by adding an explicit lifetime:
fn foo(p: Vec<uint>) -> &'static str { "STRING" }
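A minimal sketch of what the elided signature means for callers (using u32 in place of the pre-1.0 uint; the function names elided and explicit are just illustrative):

fn elided(_p: &[u32]) -> &str { "STRING" }              // returned borrow tied to _p by elision
fn explicit(_p: Vec<u32>) -> &'static str { "STRING" }  // returned borrow independent of the argument

fn main() {
    let s;
    {
        let data = vec![1, 2, 3];
        s = elided(&data);
        // As far as the signature is concerned, `s` borrows from `data`,
        // so it may only be used while `data` is alive:
        println!("{}", s);
    }
    // println!("{}", s); // error: `data` does not live long enough (if uncommented)

    let t = explicit(vec![1, 2, 3]);
    println!("{}", t); // fine: &'static str does not depend on the argument
}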

Related

Print Vec using a placeholder [duplicate]

I tried the following code:
fn main() {
    let v2 = vec![1; 10];
    println!("{}", v2);
}
But the compiler complains:
error[E0277]: `std::vec::Vec<{integer}>` doesn't implement `std::fmt::Display`
--> src/main.rs:3:20
|
3 | println!("{}", v2);
| ^^ `std::vec::Vec<{integer}>` cannot be formatted with the default formatter
|
= help: the trait `std::fmt::Display` is not implemented for `std::vec::Vec<{integer}>`
= note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
= note: required by `std::fmt::Display::fmt`
Does anyone implement this trait for Vec<T>?
let v2 = vec![1; 10];
println!("{:?}", v2);
{} is for strings and other values which can be displayed directly to the user. There's no single way to show a vector to a user.
The {:?} formatter can be used to debug it, and it will look like:
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Display is the trait that provides the method behind {}, and Debug is for {:?}
Does anyone implement this trait for Vec<T> ?
No.
And surprisingly, this is a demonstrably correct answer, which is rare, since proving the absence of things is usually hard or impossible. So how can we be so certain?
Rust has very strict coherence rules: the impl Trait for Struct can only be written:
either in the same crate as Trait
or in the same crate as Struct
and nowhere else; let's try it:
impl<T> std::fmt::Display for Vec<T> {
    fn fmt(&self, _: &mut std::fmt::Formatter) -> Result<(), std::fmt::Error> {
        Ok(())
    }
}
yields:
error[E0210]: type parameter `T` must be used as the type parameter for some local type (e.g., `MyStruct<T>`)
--> src/main.rs:1:1
|
1 | impl<T> std::fmt::Display for Vec<T> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type parameter `T` must be used as the type parameter for some local type
|
= note: only traits defined in the current crate can be implemented for a type parameter
Furthermore, to use a trait, it needs to be in scope (and therefore, you need to be linked to its crate), which means that:
you are linked both with the crate of Display and the crate of Vec
neither implements Display for Vec
which leads us to conclude that no one implements Display for Vec.
As a workaround, as indicated by Manishearth, you can use the Debug trait, which is invoked via the "{:?}" format specifier.
If you know the type of the elements that the vector contains, you could make a struct that takes the vector as an argument and implement Display for that struct.
use std::fmt::{Display, Formatter, Error};

struct NumVec(Vec<u32>);

impl Display for NumVec {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
        let mut comma_separated = String::new();

        // Note: the indexing below assumes the vector is non-empty.
        for num in &self.0[0..self.0.len() - 1] {
            comma_separated.push_str(&num.to_string());
            comma_separated.push_str(", ");
        }

        comma_separated.push_str(&self.0[self.0.len() - 1].to_string());
        write!(f, "{}", comma_separated)
    }
}

fn main() {
    let numbers = NumVec(vec![1; 10]);
    println!("{}", numbers);
}
Here is a one-liner which should also work for you:
println!("[{}]", v2.iter().fold(String::new(), |acc, &num| acc + &num.to_string() + ", "));
Here is a runnable example.
In my own case, I was receiving a Vec<&str> from a function call. I did not want to change the function signature to a custom type (for which I could implement the Display trait).
For my one-off case, I was able to turn the display of my Vec into a one-liner which I used with println!() directly, as follows:
println!("{}", myStrVec.iter().fold(String::new(), |acc, &arg| acc + arg));
(The lambda can be adapted for use with different data types, or for more concise Display trait implementations.)
Starting with Rust 1.58, there is a slightly more concise way to print a vector (or any other variable). This lets you put the variable you want to print inside the curly braces, instead of needing to put it at the end. For the debug formatting needed to print a vector, you add :? in the braces, like this:
fn main() {
    let v2 = vec![1; 10];
    println!("{v2:?}");
}
Sometimes you don't want to use something like the accepted answer
let v2 = vec![1; 10];
println!("{:?}", v2);
because you want each element to be displayed using its Display trait, not its Debug trait; however, as noted, you can't implement Display on Vec because of Rust's coherence rules. Instead of implementing a wrapper struct with the Display trait, you can implement a more general solution with a function like this:
use std::fmt;

pub fn iterable_to_str<I, D>(iterable: I) -> String
where
    I: IntoIterator<Item = D>,
    D: fmt::Display,
{
    let mut iterator = iterable.into_iter();

    let head = match iterator.next() {
        None => return String::from("[]"),
        Some(x) => format!("[{}", x),
    };
    let body = iterator.fold(head, |a, v| format!("{}, {}", a, v));
    format!("{}]", body)
}
which doesn't require wrapping your vector in a struct. As long as it implements IntoIterator and the element type implements Display, you can then call:
println!("{}", iterable_to_str(it));
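For instance, with iterable_to_str in scope, a call on the question's v2 might look like this sketch:

fn main() {
    let v2 = vec![1; 10];
    // &Vec<i32> implements IntoIterator<Item = &i32>, and &i32 implements Display,
    // so this prints "[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]" using Display, not Debug.
    println!("{}", iterable_to_str(&v2));
}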
Is there any reason not to write the vector's contents item by item, without collecting them into a String first? *)
use std::fmt::{Display, Formatter, Error};

struct NumVec(Vec<u32>);

impl Display for NumVec {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
        let v = &self.0;
        if v.len() == 0 {
            return Ok(());
        }
        for num in &v[0..v.len() - 1] {
            if let Err(e) = write!(f, "{}, ", &num.to_string()) {
                return Err(e);
            }
        }
        write!(f, "{}", &v[v.len() - 1])
    }
}

fn main() {
    let numbers = NumVec(vec![1; 10]);
    println!("{}", numbers);
}
*) No there isn't.
Because we want to display something, the Display trait is certainly implemented. So this is correct Rust, because the documentation says about the ToString trait:
"This trait is automatically implemented for any type which implements the Display trait. As such, ToString shouldn’t be implemented directly: Display should be implemented instead, and you get the ToString implementation for free."
In particular on microcontrollers, where space is limited, I would definitely go with this solution and write immediately.
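To illustrate the quoted point with the NumVec type defined above: because it implements Display, the blanket ToString implementation comes along for free, so inside main one could also write (a small sketch):

let numbers = NumVec(vec![1; 10]);
let s: String = numbers.to_string(); // provided automatically via the Display impl
println!("{}", s); // prints "1, 1, 1, 1, 1, 1, 1, 1, 1, 1"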

Folding over references inside a match results in a lifetime error

I want to build a string s by iterating over a vector of simple structs, appending different strings to acc depending on the struct.
#[derive(Clone, Debug)]
struct Point(Option<i32>, Option<i32>);

impl Point {
    fn get_first(&self) -> Option<i32> {
        self.0
    }
}

fn main() {
    let mut vec = vec![Point(None, None); 10];
    vec[5] = Point(Some(1), Some(1));

    let s: String = vec.iter().fold(
        String::new(),
        |acc, &ref e| acc + match e.get_first() {
            None => "",
            Some(ref content) => &content.to_string()
        }
    );

    println!("{}", s);
}
Running this code results in the following error:
error: borrowed value does not live long enough
Some(ref content) => &content.to_string()
^~~~~~~~~~~~~~~~~~~
note: reference must be valid for the expression at 21:22...
|acc, &ref e| acc + match e.get_first() {
^
note: ...but borrowed value is only valid for the expression at 23:33
Some(ref content) => &content.to_string()
^~~~~~~~~~~~~~~~~~~~
The problem is that the lifetime of the &str I create seems to end immediately. However, if to_string() had returned a &str in the first place, the compiler would not have complained. So what is the difference?
How can I make the compiler understand that I want the string references to live as long as I am constructing s?
There is a difference between the result of your branches:
"" is of type &'static str
content is of type i32, so you are converting it to a String and then from that to a &str... but this &str has the same lifetime as the String returned by to_string, which dies too early
A quick workaround, as mentioned by @Dogbert, is to move the acc + inside the branches:
let s: String = vec.iter().fold(
    String::new(),
    |acc, &ref e| match e.get_first() {
        None => acc,
        Some(ref content) => acc + &content.to_string(),
    }
);
However, it's a bit wasteful, because every time we have an integer, we are allocating a String (via to_string) just to immediately discard it.
A better solution is to use the write! macro instead, which just appends to the original string buffer. This means there are no wasted allocations.
use std::fmt::Write;

let s = vec.iter().fold(
    String::new(),
    |mut acc, &ref e| {
        if let Some(ref content) = e.get_first() {
            write!(&mut acc, "{}", content).expect("Should have been able to format!");
        }
        acc
    }
);
It's maybe a bit more complicated, notably because formatting adds in error handling, but is more efficient as it only uses a single buffer.
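For reference, here is a sketch of how that write!-based fold might slot into the original main (reusing the Point struct from the question and simplifying the &ref e closure pattern to e):

use std::fmt::Write;

#[derive(Clone, Debug)]
struct Point(Option<i32>, Option<i32>);

impl Point {
    fn get_first(&self) -> Option<i32> {
        self.0
    }
}

fn main() {
    let mut vec = vec![Point(None, None); 10];
    vec[5] = Point(Some(1), Some(1));

    let s = vec.iter().fold(String::new(), |mut acc, e| {
        if let Some(content) = e.get_first() {
            write!(&mut acc, "{}", content).expect("Should have been able to format!");
        }
        acc
    });

    println!("{}", s); // only index 5 holds Some(1), so this prints "1"
}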
There are multiple solutions to your problem. But first some explanations:
If to_string() had returned a &str in the first place, the compiler would not have complained. So what is the difference?
Suppose there is a method to_str() that returns a &str. What would the signature look like?
fn to_str(&self) -> &str {}
To better understand the issue, let's add explicit lifetimes (which are not necessary thanks to lifetime elision):
fn to_str<'a>(&'a self) -> &'a str {}
It becomes clear that the returned &str lives as long as the receiver of the method (self). This would be OK, since the receiver lives long enough for your acc + ... operation. In your case, however, the .to_string() call creates a new object that only lives in the second match arm. After the arm's body is left, it is destroyed. Therefore you can't pass a reference to it to the outer scope (in which acc + ... takes place).
So one possible solution looks like this:
let s = vec.iter().fold(
    String::new(),
    |acc, e| {
        acc + &e.get_first()
            .map(|f| f.to_string())
            .unwrap_or(String::new())
    }
);
It's not optimal, but luckily your default value is an empty string and the owned version of an empty string (String::new()) does not require any heap allocations, so there is no performance penalty.
However, we are still allocating once per integer. For a more efficient solution, see Matthieu M.'s answer.

How can I handle "if may be missing an else clause" when using an if inside of a map?

I'm working on a code challenge which will detect case-insensitive anagrams of a given word from a list of words.
My first cut is to use something like this:
pub fn anagrams_for(s: &'static str, v: &[&'static str]) -> Vec<&'static str> {
    let mut outputs: Vec<&str> = vec![];

    // Find the case-insensitive, sorted word to check
    let mut s_sorted: Vec<_> = s.to_string().to_lowercase().chars().collect();
    s_sorted.sort();

    for word in v {
        // Case-desensitize and sort each word in the slice
        let mut word_sorted: Vec<_> = word.to_string().to_lowercase().chars().collect();
        word_sorted.sort();

        // If the case-insensitive words are the same post-sort and not pre-sort
        // (to avoid self-anagrams), add the word to the vector
        if word_sorted == s_sorted && s.to_string().to_lowercase() != word.to_string().to_lowercase() {
            outputs.push(word)
        }
    }
    outputs
}
This works as expected, but is not very idiomatic. I'm now trying a second iteration which uses more functional features of Rust:
pub fn anagrams_for(s: &'static str, v: &[&'static str]) -> Vec<&'static str> {
    let mut s_sorted: Vec<_> = s.to_string().to_lowercase().chars().collect();
    s_sorted.sort();

    v.iter().map(&|word: &str| {
        let mut word_sorted: Vec<_> = word.to_string().to_lowercase().chars().collect();
        word_sorted.sort();

        if word_sorted == s_sorted && s.to_string().to_lowercase() != word.to_string().to_lowercase() {
            word
        }
    }).collect()
}
I'm currently getting a few errors (most of which I could likely resolve), but the one I'm interested in solving is
if may be missing an else clause:
expected `()`,
found `&str`
(expected (),
found &-ptr) [E0308]
This is because in the case of a non-anagram, map attempts to push something into the vector (seemingly ()).
How can I handle this? It's possible that map isn't the best idiom because it requires some operation to be performed on each element in a list, not a subset (maybe filter?).
As you noticed, the problem is that in the non-anagram case your closure (the || { ... } block) doesn't return a value.
You can solve this by using filter_map instead of map. That function takes a closure that returns Option<U> instead of U, so the last expression of your closure looks something like:
if /* ... */ {
    Some(word)
} else {
    None
}
Unrelated to the main question, some notes on your code:
You can remove the .to_string() calls before the .to_lowercase() calls: the latter method is defined on str, so it works directly; calling to_string() only adds unnecessary allocations.
The & in front of the closure (&|...|) can most probably be removed...
... as can the : &str type annotation in the closure's argument list.
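Putting the filter_map suggestion and these notes together, a full version might look like the sketch below (keeping the &'static str signature from the question; untested against the original exercise, so treat it as an outline):

pub fn anagrams_for(s: &'static str, v: &[&'static str]) -> Vec<&'static str> {
    let s_lower = s.to_lowercase();
    let mut s_sorted: Vec<_> = s_lower.chars().collect();
    s_sorted.sort();

    v.iter()
        .filter_map(|&word| {
            let word_lower = word.to_lowercase();
            let mut word_sorted: Vec<_> = word_lower.chars().collect();
            word_sorted.sort();

            // Same letters after sorting, but not the same word ignoring case
            if word_sorted == s_sorted && word_lower != s_lower {
                Some(word)
            } else {
                None
            }
        })
        .collect()
}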

Generic map as function argument

I wrote a method:
fn foo(input: HashMap<String, Vec<String>>) {...}
I then realized that for the purpose of writing tests, I'd like to have control of the iteration order (maybe a BTreeMap or LinkedHashMap). This led to two questions:
1. Is there some trait or combination of traits I could use that would essentially express "a map of string to string-vector"? I didn't see anything promising in the docs for HashMap.
2. It turns out that in this method, I just want to iterate over the map entries, and then over the items in each string vector, but I couldn't figure out the right syntax for specifying this. What's the correct way to write this?
fn foo(input: IntoIterator<(String, IntoIterator<String>)>) {...}
There's no such trait to describe an abstract HashMap. I believe there's no plan to make one. The best answer so far is your #2 suggestion: for a read-only HashMap you probably just want something to iterate on.
To answer at the syntax level, you tried to write:
fn foo(input: IntoIterator<(String, IntoIterator<String>)>)
But this is not valid because IntoIterator takes no generic type parameters:
pub trait IntoIterator where Self::IntoIter::Item == Self::Item {
    type Item;
    type IntoIter: Iterator;
    fn into_iter(self) -> Self::IntoIter;
}
It takes two associated types, however, so what you really wanted to express is probably the following (internally I changed the nested IntoIterator to a concrete type like Vec for simplicity):
fn foo<I>(input: I)
    where I: IntoIterator<
        Item=(String, Vec<String>),
        IntoIter=IntoIter<String, Vec<String>>>
However, the choice of IntoIterator is not always suitable because it implies a transfer of ownership. If you just want to borrow the HashMap for read-only purposes, you'd probably be better off with the standard iterator type of a HashMap, Iterator<Item=(&'a String, &'a Vec<String>)>.
fn foo_iter<'a, I>(input: I)
    where I: Iterator<Item=(&'a String, &'a Vec<String>)>
You can use this version several times, by asking for a new iterator each time, unlike the first version:
let mut h = HashMap::new();
h.insert("The Beatles".to_string(),
         vec!["Come Together".to_string(),
              "Twist And Shout".to_string()]);
h.insert("The Rolling Stones".to_string(),
         vec!["Paint It Black".to_string(),
              "Satisfaction".to_string()]);

foo_iter(h.iter());
foo_iter(h.iter());
foo(h);
//foo(h); <-- error: use of moved value: `h`
Full gist
EDIT
As asked in comments, here is the version of foo for nested IntoIterators instead of the simpler Vec:
fn foo<I, IVecString>(input: I)
    where
        I: IntoIterator<
            Item=(String, IVecString),
            IntoIter=std::collections::hash_map::IntoIter<String, IVecString>>,
        IVecString: IntoIterator<
            Item=String,
            IntoIter=std::vec::IntoIter<String>>
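A self-contained sketch of how this edited signature might be used, with a trivial body added and IVecString instantiated as Vec<String> (the body and the data are illustrative, not from the original answer):

use std::collections::HashMap;

fn foo<I, IVecString>(input: I)
    where
        I: IntoIterator<
            Item=(String, IVecString),
            IntoIter=std::collections::hash_map::IntoIter<String, IVecString>>,
        IVecString: IntoIterator<
            Item=String,
            IntoIter=std::vec::IntoIter<String>>
{
    // Consumes the map: each item is an owned (String, IVecString) pair.
    for (artist, songs) in input {
        for song in songs {
            println!("{}: {}", artist, song);
        }
    }
}

fn main() {
    let mut h = HashMap::new();
    h.insert("The Beatles".to_string(),
             vec!["Come Together".to_string(), "Twist And Shout".to_string()]);
    foo(h); // `h` is moved here, matching the ownership transfer discussed above
}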
There are no traits that define a common interface for containers. The only trait that might be suited for your case is the Index trait.
See below for a working example of the correct syntax for the IntoIterator and Index traits. You need to use references if you don't want to consume the input, so be careful with lifetime parameters.
use std::ops::Index;
use std::iter::IntoIterator;
use std::collections::HashMap;

// this consumes the input
fn foo<I: IntoIterator<Item = (String, String)>>(input: I) {
    let mut c = 0;
    for _ in input {
        c += 1;
    }
    println!("{}", c);
}

// maybe you want this
fn foo_ref<'a, I: IntoIterator<Item = (&'a String, &'a String)>>(input: I) {
    let mut c = 0;
    for _ in input {
        c += 1;
    }
    println!("{}", c);
}

fn get<'a, I: Index<&'a String, Output = String>>(table: &I, k: &'a String) {
    println!("{}", table[k]);
}

fn main() {
    let mut h = HashMap::<String, String>::new();
    h.insert("one".to_owned(), "1".to_owned());
    h.insert("two".to_owned(), "2".to_owned());
    h.insert("three".to_owned(), "3".to_owned());

    foo_ref(&h);
    get(&h, &"two".to_owned());
}
Edit
I changed the value type so that everything implements the IntoIterator trait:
use std::ops::Index;
use std::iter::IntoIterator;
use std::collections::HashMap;
use std::collections::LinkedList;

fn foo_ref<'a, B, I>(input: I)
    where B: IntoIterator<Item = String>, I: IntoIterator<Item = (&'a String, &'a B)>
{
    //
}

fn get<'a, B, I>(table: &I, k: &'a String)
    where B: IntoIterator<Item = String>, I: Index<&'a String, Output = B>
{
    // do something with table[k];
}

fn main() {
    let mut h1 = HashMap::<String, Vec<String>>::new();
    let mut h2 = HashMap::<String, LinkedList<String>>::new();

    foo_ref(&h1);
    get(&h1, &"two".to_owned());

    foo_ref(&h2);
    get(&h2, &"two".to_owned());
}

Pointer-stashing generics via `mem::transmute()`

I'm attempting to write Rust bindings for a C collection library (Judy Arrays [1]) which only provides room to store a pointer-width value. My company has a fair amount of existing code which uses this space to directly store non-pointer values such as pointer-width integers and small structs. I'd like my Rust bindings to allow type-safe access to such collections using generics, but I'm having trouble getting the pointer-stashing semantics working correctly.
The mem::transmute() function seems like one potential tool for implementing the desired behavior, but attempting to use it on an instance of a parameterized type yields a compilation error that confuses me.
Example code:
use std::marker::PhantomData;
use std::mem;

pub struct Example<T> {
    v: usize,
    t: PhantomData<T>,
}

impl<T> Example<T> {
    pub fn new() -> Example<T> {
        Example { v: 0, t: PhantomData }
    }

    pub fn insert(&mut self, val: T) {
        unsafe {
            self.v = mem::transmute(val);
        }
    }
}
Resulting error:
src/lib.rs:95:22: 95:36 error: cannot transmute to or from a type that contains type parameters in its interior [E0139]
src/lib.rs:95 self.v = mem::transmute(val);
^~~~~~~~~~~~~~
Does this mean a type consisting only of a parameter "contains type parameters in its interior" and thus transmute() just won't work here? Any suggestions of the right way to do this?
(Related question, attempting to achieve the same result, but not necessarily via mem::transmute().)
[1] I'm aware of the existing rust-judy project, but it doesn't support the pointer-stashing I want, and I'm writing these new bindings largely as a learning exercise anyway.
Instead of transmuting T to usize directly, you can transmute a &T to &usize:
pub fn insert(&mut self, val: T) {
    unsafe {
        let usize_ref: &usize = mem::transmute(&val);
        self.v = *usize_ref;
    }
}
Beware that this may read from an invalid memory location if the size of T is smaller than the size of usize or if the alignment requirements differ. This could cause a segfault. You can add an assertion to prevent this:
assert_eq!(mem::size_of::<T>(), mem::size_of::<usize>());
assert!(mem::align_of::<usize>() <= mem::align_of::<T>());
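Putting the pieces of this answer together, a self-contained sketch might look like the following (the main function and the Example<usize> instantiation are just for illustration):

use std::marker::PhantomData;
use std::mem;

pub struct Example<T> {
    v: usize,
    t: PhantomData<T>,
}

impl<T> Example<T> {
    pub fn new() -> Example<T> {
        Example { v: 0, t: PhantomData }
    }

    pub fn insert(&mut self, val: T) {
        // Guard against reading past the value or violating alignment.
        assert_eq!(mem::size_of::<T>(), mem::size_of::<usize>());
        assert!(mem::align_of::<usize>() <= mem::align_of::<T>());
        unsafe {
            // Reinterpret a &T as a &usize and copy the pointer-width bits out.
            let usize_ref: &usize = mem::transmute(&val);
            self.v = *usize_ref;
        }
    }
}

fn main() {
    let mut e: Example<usize> = Example::new();
    e.insert(42);
    println!("{}", e.v); // prints 42
}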

Resources