Why there is a & symbol before self in the full_name() function but there isn't any in the to_tuple() function? When I look at them, the usage of self is similar in both function, but why use &. Also when I add & to to_tuple() or delete it from full_name() it would throw an error. Can someone explain it?
fn full_name(&self) -> String {
format!("{} {}", self.first_name, self.last_name)
}
fn to_tuple(self) -> (String, String) {
(self.first_name, self.last_name)
}
full_name does not consume self, it uses a reference via &self: The members are only used via references as arguments to format!(), so a reference suffices.
to_tuple (as the name to_... suggests) consumes self: It moves the members from self into the returned tuple. Since the original self is no longer valid memory after the move (self no longer owns the memory), it has to be consumed, hence a move via self.
You can change full_name to use self, that is move ownership. This would become unhandy, though, as calling the function would consume the struct without the need to.
to_tuple could be changed to not consume self, yet it would need to .clone() (make a copy) of the members, which is costly.
Related
I am trying to create a user input validation function in rust utilising functional programming and recursion. How can I return an immutable vector with one element concatenated onto the end?
fn get_user_input(output_vec: Vec<String>) -> Vec<String> {
// Some code that has two variables: repeat(bool) and new_element(String)
if !repeat {
return output_vec.add_to_end(new_element); // What function could "add_to_end" be?
}
get_user_input(output_vec.add_to_end(new_element)) // What function could "add_to_end" be?
}
There are functions for everything else:
push adds a mutable vector to a mutable vector
append adds an element to the end of a mutable vector
concat adds an immutable vector to an immutable vector
??? adds an element to the end of a immutable vector
The only solution I have been able to get working is using:
[write_data, vec![new_element]].concat()
but this seems inefficient as I'm making a new vector for just one element (so the size is known at compile time).
You are confusing Rust with a language where you only ever have references to objects. In Rust, code can have exclusive ownership of objects, and so you don't need to be as careful about mutating an object that could be shared, because you know whether or not the object is shared.
For example, this is valid JavaScript code:
const a = [];
a.push(1);
This works because a does not contain an array, it contains a reference to an array.1 The const prevents a from being repointed to a different object, but it does not make the array itself immutable.
So, in these kinds of languages, pure functional programming tries to avoid mutating any state whatsoever, such as pushing an item onto an array that is taken as an argument:
function add_element(arr) {
arr.push(1); // Bad! We mutated the array we have a reference to!
}
Instead, we do things like this:
function add_element(arr) {
return [...arr, 1]; // Good! We leave the original data alone.
}
What you have in Rust, given your function signature, is a totally different scenario! In your case, output_vec is owned by the function itself, and no other entity in the program has access to it. There is therefore no reason to avoid mutating it, if that is your goal:
fn get_user_input(mut output_vec: Vec<String>) -> Vec<String> {
// Add mut ^^^
You have to keep in mind that any non-reference is an owned value. &Vec<String> would be an immutable reference to a vector something else owns, but Vec<String> is a vector this code owns and nobody else has access to.
Don't believe me? Here's a simple example of broken code that demonstrates this:
fn take_my_vec(y: Vec<String>) { }
fn main() {
let mut x = Vec::<String>::new();
x.push("foo".to_string());
take_my_vec(x);
println!("{}", x.len()); // E0382
}
The expression x.len() causes a compile-time error, because the vector x was moved into the function argument and we don't own it anymore.
So why shouldn't the function mutate the vector it owns now? The caller can't use it anymore.
In summary, functional programming looks a bit different in Rust. In other languages that have no way to communicate "I'm giving you this object" you must avoid mutating values you are given because the caller may not expect you to change them. In Rust, who owns a value is clear, and the argument reflects that:
Is the argument a value (Vec<String>)? The function owns the value now, the caller gave it away and can't use it anymore. Mutate it if you need to.
Is the argument an immutable reference (&Vec<String>)? The function doesn't own it, and it can't mutate it anyway because Rust won't allow it. You could clone it and mutate the clone.
Is the argument a mutable reference (&mut Vec<String>)? The caller must explicitly give the function a mutable reference and is therefore giving the function permission to mutate it -- but the function still doesn't own the value. The function can mutate it, clone it, or both -- it depends what the function is supposed to do.
If you take an argument by value, there is very little reason not to make it mut if you need to change it for whatever reason. Note that this detail (mutability of function arguments) isn't even part of the function's public signature simply because it's not the caller's business. They gave the object away.
Note that with types that have type arguments (like Vec) other expressions of ownership are possible. Here are a few examples (this is not an exhaustive list):
Vec<&String>: You now own a vector, but you don't own the String objects that it contains references to.
&Vec<&String>: You are given read-only access to a vector of string references. You could clone this vector, but you still couldn't change the strings, just rearrange them, for example.
&Vec<&mut String>: You are given read-only access to a vector of mutable string references. You can't rearrange the strings, but you can change the strings themselves.
&mut Vec<&String>: Like the above but opposite: you are allowed to rearrange the string references but you can't change the strings.
1 A good way to think of it is that non-primitive values in JavaScript are always a value of Rc<RefCell<T>>, so you're passing around a handle to the object with interior mutability. const only makes the Rc<> immutable.
Looking through the documentation for std::cell::Cell, I don't see anywhere how I can retrieve a non-mutable reference to inner data. There is only the get_mut method: https://doc.rust-lang.org/std/cell/struct.Cell.html#method.get_mut
I don't want to use this function because I want to have &self instead of &self mut.
I found an alternative solution of taking the raw pointer:
use std::cell::Cell;
struct DbObject {
key: Cell<String>,
data: String
}
impl DbObject {
pub fn new(data: String) -> Self {
Self {
key: Cell::new("some_uuid".into()),
data,
}
}
pub fn assert_key(&self) -> &str {
// setup key in the future if is empty...
let key = self.key.as_ptr();
unsafe {
let inner = key.as_ref().unwrap();
return inner;
}
}
}
fn main() {
let obj = DbObject::new("some data...".into());
let key = obj.assert_key();
println!("Key: {}", key);
}
Is there any way to do this without using unsafe? If not, perhaps RefCell will be more practical here?
Thank you for help!
First of, if you have a &mut T, you can trivially get a &T out of it. So you can use get_mut to get &T.
But to get a &mut T from a Cell<T> you need that cell to be mutable, as get_mut takes a &mut self parameter. And this is by design the only way to get a reference to the inner object of a cell.
By requiring the use of a &mut self method to get a reference out of a cell, you make it possible to check for exclusive access at compile time with the borrow checker. Remember that a cell enables interior mutability, and has a method set(&self, val: T), that is, a method that can modify the value of a non-mut binding! If there was a get(&self) -> &T method, the borrow checker could not ensure that you do not hold a reference to the inner object while setting the object, which would not be safe.
TL;DR: By design, you can't get a &T out of a non-mut Cell<T>. Use get_mut (which requires a mut cell), or set/replace (which work on a non-mut cell). If this is not acceptable, then consider using RefCell, which can get you a &T out of a non-mut instance, at some runtime cost.
In addition to to #mcarton answer, in order to keep interior mutability sound, that is, disallow mutable reference to coexist with other references, we have three different ways:
Using unsafe with the possibility of Undefined Behavior. This is what UnsafeCell does.
Have some runtime checks, involving runtime overhead. This is the approach RefCell, RwLock and Mutex use.
Restrict the operations that can be done with the abstraction. This is what Cell, Atomic* and (the unstable) OnceCell (and thus Lazy that uses it) does (note that the thread-safe types also have runtime overhead because they need to provide some sort of locking). Each provides a different set of allowed operations:
Cell and Atomic* do not let you to get a reference to the contained value, and only replace it as whole (basically, get() and set, though convenience methods are provided on top of these, such as swap()). Projection (cell-of-slice to slice-of-cells) is also available for Cell (field projection is possible, but not provided as part of std).
OnceCell allows you to assign only once and only then take shared reference, guaranteeing that when you assign you have no references and while you have shared references you cannot assign anymore.
Thus, when you need to be able to take a reference into the content, you cannot choose Cell as it was not designed for that - the obvious choice is RefCell, indeed.
According to the docs on task::spawn (note I'm using tokio::spawn which seems similar but lacks the description)
The 'static constraint means that the closure and its return value must have a lifetime of the whole program execution. The reason for this is that threads can detach and outlive the lifetime they have been created in. Indeed if the thread, and by extension its return value, can outlive their caller, we need to make sure that they will be valid afterwards, and since we can't know when it will return we need to have them valid as long as possible, that is until the end of the program, hence the 'static lifetime.
I'm trying to pass a Url into a thread, like this,
do_update(confg.to_feed_url()).await;
Where I do this in do_update,
async fn do_update(url: Url) {
let task = task::spawn(async {
let duration = Duration::from_millis(5_000);
let mut stream = time::interval(duration);
stream.tick().await;
loop {
feeds::MyFeed::from_url(url.clone(), true);
This doesn't work and generates these errors,
error[E0373]: async block may outlive the current function, but it borrows url, which is owned by the current function
I've read the docs, but this doesn't much make sense to me because I'm cloning in the block. How can I resolve this error?
In this case, if you don't need the URL elsewhere, you can write async move for your block instead of async, which will move the URL (and anything else the block captures, which it looks like is nothing) into the block.
If you do need it elsewhere, then you can clone the url variable outside of the block and then do the same thing (async move) to move just that particular variable into the block and not url itself. The clone call will probably not be necessary in such a case.
This occurs because url lives outside the block and calling url.clone borrows it (probably immutably, but it depends on the implementation). Since its scope is this function and not the life of the program, it won't live for the 'static lifetime and hence you need to move it into the block instead of borrowing.
I am trying to implement a set of functions in go. The context is an event server; I would like to prevent (or at least warn) adding the same handler more than once for an event.
I have read that maps are idiomatic to use as sets because of the ease of checking for membership:
if _, ok := set[item]; ok {
// don't add item
} else {
// do add item
}
I'm having some trouble with using this paradigm for functions though. Here is my first attempt:
// this is not the actual signature
type EventResponse func(args interface{})
type EventResponseSet map[*EventResponse]struct{}
func (ers EventResponseSet) Add(r EventResponse) {
if _, ok := ers[&r]; ok {
// warn here
return
}
ers[&r] = struct{}{}
}
func (ers EventResponseSet) Remove(r EventResponse) {
// if key is not there, doesn't matter
delete(ers, &r)
}
It is clear why this doesn't work: functions are not reference types in Go, though some people will tell you they are. I have proof, though we shouldn't need it since the language specification says that everything other than maps, slices, and pointers are passed by value.
Attempt 2:
func (ers EventResponseSet) Add(r *EventResponse) {
// ...
}
This has a couple of problems:
Any EventResponse has to be declared like fn := func(args interface{}){} because you can't address functions declared in the usual manner.
You can't pass a closure at all.
Using a wrapper is not an option because any function passed to the wrapper will get a new address from the wrapper - no function will be uniquely identifiable by address, and all this careful planning is for nought.
Is it silly of me to not accept defining functions as variables as a solution? Is there another (good) solution?
To be clear, I accept that there are cases that I can't catch (closures), and that's fine. The use case that I envision is defining a bunch of handlers and being relatively safe that I won't accidentally add one to the same event twice, if that makes sense.
You could use reflect.Value presented by Uvelichitel, or the function address as a string acquired by fmt.Sprint() or the address as uintptr acquired by reflect.Value.Pointer() (more in the answer How to compare 2 functions in Go?), but I recommend against it.
Since the language spec does not allow to compare function values, nor does it allow to take their addresses, you have no guarantee that something that works at a time in your program will work always, including a specific run, and including different (future) Go compilers. I would not use it.
Since the spec is strict about this, this means compilers are allowed to generate code that would for example change the address of a function at runtime (e.g. unload an unused function, then load it again later if needed again). I don't know about such behavior currently, but this doesn't mean that a future Go compiler will not take advantage of such thing.
If you store a function address (in whatever format), that value does not count as keeping the function value anymore. And if no one else would "own" the function value anymore, the generated code (and the Go runtime) would be "free" to modify / relocate the function (and thus changing its address) – without violating the spec and Go's type safety. So you could not be rightfully angry at and blame the compiler, but only yourself.
If you want to check against reusing, you could work with interface values.
Let's say you need functions with signature:
func(p ParamType) RetType
Create an interface:
type EventResponse interface {
Do(p ParamType) RetType
}
For example, you could have an unexported struct type, and a pointer to it could implement your EventResponse interface. Make an exported function to return the single value, so no new values may be created.
E.g.:
type myEvtResp struct{}
func (m *myEvtResp) Do(p ParamType) RetType {
// Your logic comes here
}
var single = &myEvtResp{}
func Get() EventResponse { return single }
Is it really needed to hide the implementation in a package, and only create and "publish" a single instance? Unfortunately yes, because else you could create other value like &myEvtResp{} which may be different pointers still having the same Do() method, but the interface wrapper values might not be equal:
Interface values are comparable. Two interface values are equal if they have identical dynamic types and equal dynamic values or if both have value nil.
[...and...]
Pointer values are comparable. Two pointer values are equal if they point to the same variable or if both have value nil. Pointers to distinct zero-size variables may or may not be equal.
The type *myEvtResp implements EventResponse and so you can register a value of it (the only value, accessible via Get()). You can have a map of type map[EventResponse]bool in which you may store your registered handlers, the interface values as keys, and true as values. Indexing a map with a key that is not in the map yields the zero value of the value type of the map. So if the value type of the map is bool, indexing it with a non-existing key will result in false – telling it's not in the map. Indexing with an already registered EventResponse (an existing key) will result in the stored value – true – telling it's in the map, it's already registered.
You can simply check if one already been registered:
type EventResponseSet map[*EventResponse]bool
func (ers EventResponseSet) Add(r EventResponse) {
if ers[r] {
// warn here
return
}
ers[r] = true
}
Closing: This may seem a little too much hassle just to avoid duplicated use. I agree, and I wouldn't go for it. But if you want to...
Which functions you mean to be equal? Comparability is not defined for functions types in language specification. reflect.Value gives you the desired behaviour more or less
type EventResponseSet map[reflect.Value]struct{}
set := make(EventResponseSet)
if _, ok := set[reflect.ValueOf(item)]; ok {
// don't add item
} else {
// do add item
set[reflect.ValueOf(item)] = struct{}{}
}
this assertion will treat as equal items produced by assignments only
//for example
item1 := fmt.Println
item2 := fmt.Println
item3 := item1
//would have all same reflect.Value
but I don't think this behaviour guaranteed by any documentation.
Is it possible to declare a tuple struct where the members are hidden for all intents and purposes, except for declaring?
// usize isn't public since I don't want users to manipulate it directly
struct MyStruct(usize);
// But now I can't initialize the struct using an argument to it.
let my_var = MyStruct(0xff)
// ^^^^
// How to make this work?
Is there a way to keep the member private but still allow new structs to be initialized with an argument as shown above?
As an alternative, a method such as MyStruct::new can be implemented, but I'm still interested to know if its possible to avoid having to use a method on the type since it's shorter, and nice for types that wrap a single variable.
Background
Without going into too many details, the only purpose of this type is to wrap a single type (a helper which hides some details, adds some functionality and is optimized away completely when compiled), in this context it's not exactly exposing hidden internals to use the Struct(value) style initializing.
Further, since the wrapper is zero overhead, its a little misleading to use the new method which is often associated with allocation/creation instead of casting.
Just as it's convenient type (int)v or int(v), instead of int::new(v), I'd like to do this for my own type.
It's used often, so the ability to use short expression is very convenient. Currently I'm using a macro which calls a new method, its OK but a little awkward/indirect, hence this question.
Strictly speaking this isn't possible in Rust.
However the desired outcome can be achieved using a normal struct with a like-named function (yes, this works!)
pub struct MyStruct {
value: usize,
}
#[allow(non_snake_case)]
pub fn MyStruct(value: usize) -> MyStruct {
MyStruct { value }
}
Now, you can write MyStruct(5) but not access the internals of MyStruct.
I'm afraid that such a concept is not possible, but for a good reason. Each member of a struct, unless marked with pub, is admitted as an implementation detail that should not raise to the surface of the public API, regardless of when and how the object is currently being used. Under this point of view, the question's goal reaches a conundrum: wishing to keep members private while letting the API user define them arbitrarily is not only uncommon but also not very sensible.
As you mentioned, having a method named new is the recommended approach of doing that. It's not like you're compromising code readability with the extra characters you have to type. Alternatively, for the case where the struct is known to wrap around an item, making the member public can be a possible solution. That, on the other hand, would allow any kind of mutations through a mutable borrow (thus possibly breaking the struct's invariants, as mentioned by #MatthieuM). This decision depends on the intended API.