Given this code:
trait Base {
fn a(&self);
fn b(&self);
fn c(&self);
fn d(&self);
}
trait Derived : Base {
fn e(&self);
fn f(&self);
fn g(&self);
}
struct S;
impl Derived for S {
fn e(&self) {}
fn f(&self) {}
fn g(&self) {}
}
impl Base for S {
fn a(&self) {}
fn b(&self) {}
fn c(&self) {}
fn d(&self) {}
}
Unfortunately, I cannot cast &Derived to &Base:
fn example(v: &Derived) {
v as &Base;
}
error[E0605]: non-primitive cast: `&Derived` as `&Base`
--> src/main.rs:30:5
|
30 | v as &Base;
| ^^^^^^^^^^
|
= note: an `as` expression can only be used to convert between primitive types. Consider using the `From` trait
Why is that? The Derived vtable has to reference the Base methods in one way or another.
Inspecting the LLVM IR reveals the following:
@vtable4 = internal unnamed_addr constant {
void (i8*)*,
i64,
i64,
void (%struct.S*)*,
void (%struct.S*)*,
void (%struct.S*)*,
void (%struct.S*)*
} {
void (i8*)* @_ZN2i813glue_drop.98717h857b3af62872ffacE,
i64 0,
i64 1,
void (%struct.S*)* @_ZN6S.Base1a20h57ba36716de00921jbaE,
void (%struct.S*)* @_ZN6S.Base1b20h3d50ba92e362d050pbaE,
void (%struct.S*)* @_ZN6S.Base1c20h794e6e72e0a45cc2vbaE,
void (%struct.S*)* @_ZN6S.Base1d20hda31e564669a8cdaBbaE
}
@vtable26 = internal unnamed_addr constant {
void (i8*)*,
i64,
i64,
void (%struct.S*)*,
void (%struct.S*)*,
void (%struct.S*)*,
void (%struct.S*)*,
void (%struct.S*)*,
void (%struct.S*)*,
void (%struct.S*)*
} {
void (i8*)* @_ZN2i813glue_drop.98717h857b3af62872ffacE,
i64 0,
i64 1,
void (%struct.S*)* @_ZN9S.Derived1e20h9992ddd0854253d1WaaE,
void (%struct.S*)* @_ZN9S.Derived1f20h849d0c78b0615f092aaE,
void (%struct.S*)* @_ZN9S.Derived1g20hae95d0f1a38ed23b8aaE,
void (%struct.S*)* @_ZN6S.Base1a20h57ba36716de00921jbaE,
void (%struct.S*)* @_ZN6S.Base1b20h3d50ba92e362d050pbaE,
void (%struct.S*)* @_ZN6S.Base1c20h794e6e72e0a45cc2vbaE,
void (%struct.S*)* @_ZN6S.Base1d20hda31e564669a8cdaBbaE
}
All Rust vtables start with a pointer to the destructor, followed by the size and alignment. Subtrait vtables neither duplicate those fields when referencing supertrait methods nor point indirectly to the supertrait vtables; they simply contain verbatim copies of the method pointers and nothing else.
Given that design, it's easy to understand why this does not work. A new vtable would need to be constructed at runtime, which would likely reside on the stack, and that isn't exactly an elegant (or optimal) solution.
There are some workarounds, of course, like adding explicit upcast methods to the interface, but that requires quite a bit of boilerplate (or macro frenzy) to work properly.
Now, the question is - why isn't it implemented in some way that would enable trait object upcasting? Like, adding a pointer to the supertrait's vtable in the subtrait's vtable. For now, Rust's dynamic dispatch doesn't seem to satisfy the Liskov substitution principle, which is a very basic principle for object-oriented design.
Of course you can use static dispatch, which is indeed very elegant to use in Rust, but it easily leads to code bloat which is sometimes more important than computational performance - like on embedded systems, and Rust developers claim to support such use cases of the language. Also, in many cases you can successfully use a model which is not purely Object-Oriented, which seems to be encouraged by Rust's functional design. Still, Rust supports many of the useful OO patterns... so why not the LSP?
Does anyone know the rationale for such design?
Actually, I think I got the reason. I found an elegant way to add upcasting support to any trait that desires it, and that way the programmer is able to choose whether to add that additional vtable entry to the trait, or prefer not to, which is a similar trade-off as in C++'s virtual vs. non-virtual methods: elegance and model correctness vs. performance.
The code can be implemented as follows:
trait Base: AsBase {
// ...
}
trait AsBase {
fn as_base(&self) -> &Base;
}
impl<T: Base> AsBase for T {
fn as_base(&self) -> &Base {
self
}
}
One may add additional methods for casting a &mut pointer or a Box (that adds a requirement that T must be a 'static type), but this is a general idea. This allows for safe and simple (although not implicit) upcasting of every derived type without boilerplate for every derived type.
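The additional methods mentioned above might look like the following sketch. It is written with the dyn keyword that current Rust editions require for trait objects. The method names as_base_mut and into_base, the name() and extra() methods, and the struct S are illustrative assumptions, not part of the original code.

```rust
trait AsBase {
    fn as_base(&self) -> &dyn Base;
    fn as_base_mut(&mut self) -> &mut dyn Base;
    fn into_base(self: Box<Self>) -> Box<dyn Base>;
}

trait Base: AsBase {
    fn name(&self) -> String;
}

trait Derived: Base {
    fn extra(&self) -> u32;
}

// Blanket impl: every sized 'static type that implements Base gets the casts.
// The 'static bound is only needed because of the Box conversion.
impl<T: Base + 'static> AsBase for T {
    fn as_base(&self) -> &dyn Base {
        self
    }
    fn as_base_mut(&mut self) -> &mut dyn Base {
        self
    }
    fn into_base(self: Box<Self>) -> Box<dyn Base> {
        self
    }
}

// Hypothetical implementor used only for the demonstration.
struct S;

impl Base for S {
    fn name(&self) -> String {
        "S".to_string()
    }
}

impl Derived for S {
    fn extra(&self) -> u32 {
        7
    }
}

fn main() {
    let mut d: Box<dyn Derived> = Box::new(S);
    println!("{}", d.as_base().name());
    println!("{}", d.as_base_mut().name());
    // Consumes the derived box and hands back a base box.
    let b: Box<dyn Base> = d.into_base();
    println!("{}", b.name());
}
```

Derived types need no extra code: the casts are reachable through any trait that has Base as a supertrait.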
As of Jun 2017, the status of this "sub-trait coercion" (or "super-trait coercion") is as follows:
An accepted RFC #0401 mentions this as a part of coercion. So this conversion should be done implicitly.
coerce_inner(T) = U where T is a sub-trait of U;
However, this is not yet implemented. There is a corresponding issue #18600.
There is also a duplicate issue #5665. Comments there explain what prevents this from being implemented.
Basically, the problem is how to derive vtables for super-traits. Current layout of vtables is as follows (in x86-64 case):
+-----+-------------------------------+
| 0- 7|pointer to "drop glue" function|
+-----+-------------------------------+
| 8-15|size of the data |
+-----+-------------------------------+
|16-23|alignment of the data |
+-----+-------------------------------+
|24- |methods of Self and supertraits|
+-----+-------------------------------+
It doesn't contain a vtable for a super-trait as a subsequence, so at the very least the vtable layout would have to be tweaked.
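The size and alignment slots in that layout are observable from safe code: for a trait object, std::mem::size_of_val and std::mem::align_of_val report the values of the erased concrete type at runtime. A minimal sketch, in which the Shape trait and the two structs are made up for illustration:

```rust
use std::mem::{align_of_val, size_of, size_of_val};

trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Unit; // zero-sized, like `struct S` in the question

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Unit {
    fn area(&self) -> f64 {
        1.0
    }
}

// Size and alignment of the erased value, looked up at runtime.
fn layout_of(shape: &dyn Shape) -> (usize, usize) {
    (size_of_val(shape), align_of_val(shape))
}

fn main() {
    println!("Square: {:?}", layout_of(&Square(2.0)));
    println!("Unit:   {:?}", layout_of(&Unit));
    // A reference to a trait object is two pointers wide: data + vtable.
    println!("fat: {}", size_of::<&dyn Shape>() == 2 * size_of::<&()>());
}
```

For the zero-sized struct this reports size 0 and alignment 1, which matches the `i64 0, i64 1` entries in the vtables shown in the question.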
Of course there are ways to mitigate this problem, but there are many of them, each with different advantages and disadvantages. One reduces the vtable size when there is diamond inheritance. Another is supposed to be faster.
There, @typelist says they prepared a draft RFC, which looks well organized, but they seem to have disappeared after that (Nov 2016).
I ran into the same wall when I started with Rust.
Now, when I think about traits, I have a different image in mind than when I think about classes.
trait X: Y {} means when you implement trait X for struct S you also need to implement trait Y for S.
Of course this means that a &X knows it also is a &Y, and therefore offers the appropriate functions.
It would require some runtime-effort (more pointer dereferences) if you needed to traverse pointers to Y's vtable first.
Then again, the current design + additional pointers to other vtables probably wouldn't hurt much, and would allow easy casting to be implemented. So maybe we need both? This is something to be discussed on internals.rust-lang.org
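The point that a reference to the subtrait already offers the supertrait's functions can be sketched as follows. The method bodies and the describe function are illustrative assumptions; no cast is involved, because the supertrait's methods are part of the subtrait's vtable.

```rust
trait Base {
    fn a(&self) -> &'static str;
}

trait Derived: Base {
    fn e(&self) -> &'static str;
}

struct S;

impl Base for S {
    fn a(&self) -> &'static str {
        "a"
    }
}

impl Derived for S {
    fn e(&self) -> &'static str {
        "e"
    }
}

// Both calls are dispatched dynamically through the Derived vtable.
fn describe(v: &dyn Derived) -> String {
    format!("{} {}", v.a(), v.e())
}

fn main() {
    println!("{}", describe(&S));
}
```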
This feature is so desired that there is both a tracking issue for adding it to the language and a dedicated initiative repository for the people contributing to its implementation.
Tracking Issue: https://github.com/rust-lang/rust/issues/65991
Initiative Repository: https://github.com/rust-lang/dyn-upcasting-coercion-initiative
This now works on stable Rust (trait object upcasting was stabilized in Rust 1.86): a &dyn Derived coerces to &dyn Base, and you can also call the base trait's functions directly on the derived trait object.
trait Base {
fn a(&self) {
println!("a from base");
}
}
trait Derived: Base {
fn e(&self) {
println!("e from derived");
}
}
fn call_derived(d: &dyn Derived) {
d.e();
d.a();
call_base(d);
}
fn call_base(b: &dyn Base) {
b.a();
}
struct S;
impl Base for S {}
impl Derived for S {}
fn main() {
let s = S;
call_derived(&s);
}
What I am trying to do
I have built a Rust interface that I want to interact with from C (or C#, but that does not really matter for the sake of the question). Because it does not seem to be possible to make a Rust struct directly accessible to C, I am trying to build some wrapper functions that I can call and that will create the struct in Rust, call functions of the struct and eventually free the struct from memory manually.
In order to do this I thought I would pass the pointer to the struct instance that I create in the init function back to C (or C#, temporarily storing it as an IntPtr). Then, when I call the other functions, I would pass the pointer to Rust again, dereference it and call the appropriate functions on the dereferenced struct, mutating it in the process.
I know that I will have to use unsafe code to do this and I am fine with that. I should probably also point out that I don't know a lot about lifetime management in Rust, and it might very well be that what I am trying to do is impossible, because it is quite easy to produce a dangling pointer somewhere. In that case, I would wonder how I would need to adjust my approach, because I think I am not the first person who is trying to mutate some sort of state from C inside Rust.
What I tried first
So first of all I made sure to output the correct library and add my native functions to it. In the Cargo.toml I set the lib type to:
[lib]
crate-type = ["cdylib"]
Then I created some functions to interact with the struct and exposed them like this:
#[no_mangle]
pub extern fn init() -> *mut MyStruct {
let mut struct_instance = MyStruct::default();
struct_instance.init();
let raw_pointer_mut = &mut struct_instance as *mut MyStruct;
return raw_pointer_mut;
}
#[no_mangle]
pub extern fn add_item(struct_instance_ref: *mut MyStruct) {
unsafe {
let struct_instance = &mut *struct_instance_ref;
struct_instance.add_item();
}
}
As you can see in the init function I am creating the struct and then I return the (mutable) pointer.
I then take the pointer in the add_item function and use it.
Now I tried to test this implementation, because I had some doubts about the pointer still being valid. In another Rust module I loaded the .dll and .lib files (I am on Windows, but that should not matter for the question) and then called the functions accordingly like so:
fn main() {
unsafe {
let struct_pointer = init();
add_item(struct_pointer);
println!("The pointer address: {:?}", struct_pointer);
}
}
#[link(name = "my_library.dll")]
extern {
fn init() -> *mut u32;
fn add_item(struct_ref: *mut u32);
}
What happened: I did get a memory address as output, and (because I am actually creating a file in the real implementation) I could also see that the functions were executed as planned. However, the struct's fields did not seem to be mutated. They were basically all empty, which they should not have been after I called the add_item function (nor after I called the init function).
What I tried after that
I read a bit on lifetime management in Rust and therefore tried to allocate the struct on the heap by using a Box like so:
#[no_mangle]
pub extern fn init() -> *mut Box<MyStruct> {
let mut struct_instance = MyStruct::default();
struct_instance.init();
let raw_pointer_mut = &mut Box::new(struct_instance) as *mut Box<MyStruct>;
return raw_pointer_mut;
}
#[no_mangle]
pub extern fn add_box(struct_instance_ref: *mut Box<MyStruct>) {
unsafe {
let struct_instance = &mut *struct_instance_ref;
struct_instance.add_box();
}
}
Unfortunately, the result was the same as above.
Additional Information
I figured it might be good to also include how the Struct is made up in principle:
#[derive(Default)]
#[repr(C)]
pub struct MyStruct{
// Some fields...
}
impl MyStruct{
/// Initializes a new struct.
pub fn init(&mut self) {
self.some_field = whatever;
}
/// Adds an item to the struct.
pub fn add_item(
&mut self,
maybe_more_data: of_type // Obviously the call in the external function would need to be adjusted to accommodate that...
){
some_other_function(self); // Calls another function in Rust, that will take the struct instance as an argument and mutate it.
}
}
Rust has a strong notion of ownership. Ask yourself: who owns the MyStruct instance? It's the struct_instance variable, whose lifetime is the scope of the init() function. So after init() returns, the instance is dropped and an invalid pointer is returned.
Allocating the MyStruct on the heap would be the solution, but not in the way you tried: the instance is moved to the heap, but the Box wrapper is a temporary tied to the same problematic lifetime, so it destroys the heap-allocated object when init() returns.
A solution is to use Box::into_raw to take the heap-allocated value back out of the box before the box is dropped:
#[no_mangle]
pub extern fn init() -> *mut MyStruct {
let mut struct_instance = MyStruct::default();
struct_instance.init();
let boxed = Box::new(struct_instance); // `box` is a reserved keyword, so use another name
Box::into_raw(boxed)
}
To destroy the value later, use Box::from_raw to create a new Box that owns it, then let that box deallocate its contained value when it goes out of scope:
#[no_mangle]
pub extern fn destroy(struct_instance: *mut MyStruct) {
// Safety: the pointer must come from init() and must not be used afterwards.
unsafe { drop(Box::from_raw(struct_instance)); }
}
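Put together, the whole lifecycle might look like the sketch below. The items field, the value parameter and the item_count function are assumptions made so that the example is self-contained. The functions are called from Rust here; to export them from a cdylib they would additionally need the no_mangle attribute shown above.

```rust
#[derive(Default)]
pub struct MyStruct {
    items: Vec<u32>, // hypothetical field, just to have observable state
}

impl MyStruct {
    pub fn add_item(&mut self, value: u32) {
        self.items.push(value);
    }
}

/// Allocates the struct on the heap and hands ownership to the caller.
pub extern "C" fn init() -> *mut MyStruct {
    Box::into_raw(Box::new(MyStruct::default()))
}

/// Borrows the struct behind the pointer and mutates it in place.
pub extern "C" fn add_item(ptr: *mut MyStruct, value: u32) {
    if ptr.is_null() {
        return;
    }
    // Safety: the caller promises that `ptr` came from init() and is still alive.
    let instance = unsafe { &mut *ptr };
    instance.add_item(value);
}

/// Reads state back without taking ownership.
pub extern "C" fn item_count(ptr: *const MyStruct) -> usize {
    if ptr.is_null() {
        return 0;
    }
    let instance = unsafe { &*ptr };
    instance.items.len()
}

/// Takes ownership back and frees the struct; the pointer is dead afterwards.
pub extern "C" fn destroy(ptr: *mut MyStruct) {
    if !ptr.is_null() {
        unsafe { drop(Box::from_raw(ptr)) };
    }
}

fn main() {
    let handle = init();
    add_item(handle, 1);
    add_item(handle, 2);
    println!("items: {}", item_count(handle));
    destroy(handle);
}
```

The C side only ever sees an opaque pointer, so the struct does not need a C-compatible layout for this to work.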
This seems like a common problem, so there might be a more idiomatic solution. Hopefully someone more experienced will chime in.
I'm adding a simple answer for anyone who comes across this question but doesn't need to box: &mut struct_instance as *mut _ is the correct syntax to get a mutable pointer to a struct on the stack. This syntax is a bit tricky to find documented anywhere, and it's easy to miss the initial mut.
Notably, this does not solve the original poster's issue, as returning a pointer to a local is undefined behavior. However, this is the correct solution for calling something via FFI (for which there don't seem to be any better results on Google).
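A minimal sketch of that stack-pointer case, where the pointer is only used for the duration of the call. The Point struct and fill_point are hypothetical; fill_point is written in Rust here as a stand-in for a foreign function that writes through an out-pointer.

```rust
#[repr(C)]
#[derive(Debug, Default, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Stand-in for a foreign function that fills in a caller-provided struct.
extern "C" fn fill_point(out: *mut Point) {
    if out.is_null() {
        return;
    }
    unsafe {
        (*out).x = 3;
        (*out).y = 4;
    }
}

fn filled() -> Point {
    let mut p = Point::default();
    // The pointer is valid because `p` outlives the call.
    fill_point(&mut p as *mut _);
    p // returned by value, so no pointer to a local escapes
}

fn main() {
    println!("{:?}", filled());
}
```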
Despite the fact that Rust has absorbed many good modern programming ideas, it looks like one very basic feature is not present.
The modern (pseudo-)functional code is based on a large number of classes of the following kind:
pub struct NamedTuple {
a: i8,
b: char,
}
impl NamedTuple {
fn new(a: i8, b: char) -> NamedTuple {
NamedTuple { a: a, b: b }
}
fn a(&self) -> i8 {
self.a
}
fn b(&self) -> char {
self.b
}
}
As you can see, there is a lot of boilerplate code here. Is there really no way to describe such types compactly, without the boilerplate?
When you have boilerplate, think macros:
macro_rules! ro {
(
pub struct $name:ident {
$($fname:ident : $ftype:ty),*
}
) => {
pub struct $name {
$($fname : $ftype),*
}
impl $name {
fn new($($fname : $ftype),*) -> $name {
$name { $($fname),* }
}
$(fn $fname(&self) -> $ftype {
self.$fname
})*
}
}
}
ro!(pub struct NamedTuple {
a: i8,
b: char
});
fn main() {
let n = NamedTuple::new(42, 'c');
println!("{}", n.a());
println!("{}", n.b());
}
This is a basic macro and could be extended to handle specifying visibility as well as attributes / documentation on the struct and the fields.
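One possible shape for such an extension is sketched below. It accepts attributes (including doc comments) on the struct and on the fields, a visibility for the struct, and an optional trailing comma. Reusing the struct's visibility for new and the getters is a simplifying assumption of this sketch, and the getters still return by value, so the fields must be Copy.

```rust
macro_rules! ro {
    (
        $(#[$smeta:meta])*
        $svis:vis struct $name:ident {
            $(
                $(#[$fmeta:meta])*
                $fname:ident : $ftype:ty
            ),* $(,)?
        }
    ) => {
        $(#[$smeta])*
        $svis struct $name {
            $(
                $(#[$fmeta])*
                $fname: $ftype,
            )*
        }

        impl $name {
            $svis fn new($($fname: $ftype),*) -> $name {
                $name { $($fname),* }
            }

            $(
                $svis fn $fname(&self) -> $ftype {
                    self.$fname
                }
            )*
        }
    };
}

ro! {
    /// A read-only pair of values.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct NamedTuple {
        /// The number.
        a: i8,
        /// The character.
        b: char,
    }
}

fn main() {
    let n = NamedTuple::new(42, 'c');
    println!("{:?} {} {}", n, n.a(), n.b());
}
```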
I'd challenge that you have as much boilerplate as you think you do. For example, you only show Copy types. As soon as you add a String or a Vec to your structs, this will fall apart and you need to either return a reference or take self.
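To make that concrete, here is a sketch of what hand-written accessors tend to look like once the fields are not Copy. The Person type and its fields are invented for illustration.

```rust
pub struct Person {
    name: String,
    tags: Vec<String>,
}

impl Person {
    pub fn new(name: String, tags: Vec<String>) -> Person {
        Person { name, tags }
    }

    // Returning by value would move the field out of a borrowed struct,
    // so the getters hand out borrows instead.
    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn tags(&self) -> &[String] {
        &self.tags
    }

    // The alternative: consume the whole value to give away a field.
    pub fn into_name(self) -> String {
        self.name
    }
}

fn main() {
    let p = Person::new("Ada".to_string(), vec!["math".to_string()]);
    println!("{} has {} tag(s)", p.name(), p.tags().len());
    let owned: String = p.into_name();
    println!("{}", owned);
}
```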
Editorially, I don't think this is good or idiomatic Rust code. If you have a value type where people need to dig into it, just make the fields public:
pub struct NamedTuple {
pub a: i8,
pub b: char,
}
fn main() {
let n = NamedTuple { a: 42, b: 'c' };
println!("{}", n.a);
println!("{}", n.b);
}
Existing Rust features prevent most of the problems that getter methods attempt to solve in the first place.
Variable binding-based mutability
n.a = 43;
error[E0594]: cannot assign to field `n.a` of immutable binding
The rules of references
struct Something;
impl Something {
fn value(&self) -> &NamedTuple { /* ... */ }
}
fn main() {
let s = Something;
let n = s.value();
n.a = 43;
}
error[E0594]: cannot assign to field `n.a` of immutable binding
If you've transferred ownership of a value type to someone else, who cares if they change it?
Note that I'm making a distinction about value types as described by Growing Object-Oriented Software Guided by Tests, which they distinguish from objects. Objects should not have exposed internals.
Rust doesn't offer a built-in way to generate getters. However, there are multiple Rust features that can be used to tackle boilerplate code! The two most important ones for your question:
Custom Derives via #[derive(...)] attribute
Macros by example via macro_rules! (see @Shepmaster's answer on how to use those to solve your problem)
I think the best way to avoid boilerplate code like this is to use custom derives. This allows you to add a #[derive(...)] attribute to your type and generate these getters at compile time.
There is already a crate that offers exactly this: derive-getters. It works like this:
#[derive(Getters)]
pub struct NamedTuple {
a: i8,
b: char,
}
There is also getset, but it has two problems: getset should have derive in its crate name, but more importantly, it encourages the "getters & setters for everything" anti-pattern by offering to also generate setters which don't perform any checks.
Finally, you might want to consider rethinking your approach to programming in Rust. Honestly, from my experience, "getter boilerplate" is hardly a problem. Sure, sometimes you need to write getters, but not "a large number" of them.
Mutability is also not unidiomatic in Rust. Rust is a multi-paradigm language, supporting many styles of programming. Idiomatic Rust uses the most useful paradigm for each situation. Completely avoiding mutation might not be the best way to program in Rust. Furthermore, avoiding mutability is not only achieved by providing getters for your fields -- binding and reference mutability is far more important!
So, use read-only access to fields where it's useful, but not everywhere.
When I started learning Rust, I naively assumed Rust's pointers to traits were implemented just like a C++ pointer to a base class, and wrote some code that worked even under that assumption. Specifically, the code I wrote interfaced with an FFI library that needed to read and seek a stream, and it was something like this:
struct StreamParts {
reader: *mut Read,
seeker: *mut Seek,
}
fn new_ffi_object<T: Read + Seek + 'static>(stream: T) -> FFIObject {
let stream_ptr = Box::into_raw(Box::new(stream));
let stream_parts = Box::into_raw(Box::new(StreamParts {
reader: stream_ptr as *mut Read,
seeker: stream_ptr as *mut Seek,
}));
ffi_library::new_object(stream_parts, ffi_read, ffi_seek, ffi_close)
}
extern "C" fn ffi_read(stream_parts: *mut StreamParts, ...) -> c_ulong {
(*stream_parts.reader).read(...)
...
}
extern "C" fn ffi_seek(stream_parts: *mut StreamParts, ...) -> c_ulong {
(*stream_parts.seeker).seek(...)
...
}
extern "C" fn ffi_close(stream_parts: *mut StreamParts) {
mem::drop(Box::from_raw(stream_parts.reader));
mem::drop(Box::from_raw(stream_parts));
}
And it worked. However, there are three things I don't fully understand about why it works:
1. Rust's trait objects are fat, containing two pointers. Thus, unlike C++, *mut Read is a pointer to a trait object, correct? And where is this trait object allocated? The Rust docs don't touch on this specific case.
2. Am I correct to assume that mem::drop(Box::from_raw(stream_parts.reader)) fully drops the original stream?
3. Why is the 'static needed in new_ffi_object()?
Pointers and references behave exactly the same, except for the borrow-checker which forbids you to have dangling references and the fact that you need to wrap pointer dereferencing into an unsafe block.
1. So yes, size_of::<*mut Read>() == size_of::<*mut ()>() * 2. The trait object itself isn't allocated anywhere: it is nothing more than a struct with two fields, a pointer to your data and a pointer to the vtable. The vtable lives in static memory.
2. Correct. It follows the vtable pointer of reader and looks up the drop implementation in the vtable.
3. Without the 'static bound, your T might contain references with lifetimes shorter than 'static. All that lifetime bound says is that T contains no such references and may thus be moved anywhere without restrictions, even onto the heap.
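The first two points can be checked with a small experiment, sketched here with the dyn keyword required by current editions. The Tracked type and its flag are assumptions made for the demonstration: the value is dropped through a *mut dyn Read, and the flag shows that the concrete destructor ran.

```rust
use std::cell::Cell;
use std::io::{self, Read};
use std::mem::size_of;
use std::rc::Rc;

// Hypothetical stream type that records when it is dropped.
struct Tracked {
    dropped: Rc<Cell<bool>>,
}

impl Read for Tracked {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Ok(0) // always at end of stream
    }
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

fn drop_through_trait_pointer() -> bool {
    let flag = Rc::new(Cell::new(false));
    let concrete: *mut Tracked = Box::into_raw(Box::new(Tracked {
        dropped: Rc::clone(&flag),
    }));
    // The cast only attaches a vtable pointer; nothing new is allocated.
    let reader = concrete as *mut dyn Read;
    // Rebuilding the Box from the fat pointer drops the concrete value
    // via the vtable and frees the original allocation.
    unsafe { drop(Box::from_raw(reader)) };
    flag.get()
}

fn main() {
    println!(
        "fat pointer: {}",
        size_of::<*mut dyn Read>() == 2 * size_of::<*mut ()>()
    );
    println!("destructor ran: {}", drop_through_trait_pointer());
}
```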
When one has a box pointer to some heap-allocated memory, I assume that Rust has 'hardcoded' knowledge of ownership, so that when ownership is transferred by calling some function, the resources are moved and the argument in the function is the new owner.
However, how does this happen for vectors for example? They too 'own' their resources, and ownership mechanics apply like for box pointers -- yet they are regular values stored in variables themselves, and not pointers. How does Rust (know to) apply ownership mechanics in this situation?
Can I make my own type which owns resources?
tl;dr: "owning" types in Rust are not magic, and they are most certainly not hardcoded into the compiler or language. They are just types which are written in a certain way (they do not implement Copy and likely have a destructor) and have certain semantics which are enforced through non-copyability and the destructor.
In its core Rust's ownership mechanism is very simple and has very simple rules.
First of all, let's define what move is. It is simple - a value is said to be moved when it becomes available under a new name and stops being available under the old name:
struct X(u32);
let x1 = X(12);
let x2 = x1;
// x1 is no longer accessible here, trying to use it will cause a compiler error
Same thing happens when you pass a value into a function:
fn do_something(x: X) {}
let x1 = X(12);
do_something(x1);
// x1 is no longer accessible here
Note that there is absolutely no magic here - it is just that by default every value of every type behaves like in the above examples. Values of each struct or enum you or someone else creates by default will be moved.
Another important thing is that you can give every type a destructor, that is, a piece of code which is invoked when a value of this type goes out of scope and is destroyed. For example, the destructors associated with Vec or Box will free the corresponding piece of memory. Destructors are declared by implementing the Drop trait:
struct X(u32);
impl Drop for X {
fn drop(&mut self) {
println!("Dropping {}", self.0);
}
}
{
let x1 = X(12);
} // x1 is dropped here, and "Dropping 12" will be printed
There is a way to opt-out of non-copyability by implementing Copy trait which marks the type as automatically copyable - its values will no longer be moved but copied:
#[derive(Copy, Clone)] struct X(u32);
let x1 = X(12);
let x2 = x1;
// x1 is still available here
The copy is done bytewise - x2 will contain a byte-identical copy of x1.
Not every type can be made Copy - only those which have Copy interior and do not implement Drop. All primitive types (except &mut references but including *const and *mut raw pointers) are Copy in Rust, so each struct which contains only primitives can be made Copy. On the other hand, structs like Vec or Box are not Copy - they deliberately do not implement it because bytewise copy of them will lead to double frees because their destructors can be run twice over the same pointer.
The Copy bit above is a slight digression on my side, just to give a clearer picture. Ownership in Rust is based on move semantics. When we say that some value owns something, like in "Box<T> owns the given T", we mean a semantic connection between them, not something magical or something which is built into the language. It is just that most such values, like Vec or Box, do not implement Copy and are thus moved instead of copied, and they also (optionally) have a destructor which cleans up anything these types may have allocated for them (memory, sockets, files, etc.).
Given the above, of course you can write your own "owning" types. This is one of the cornerstones of idiomatic Rust, and a lot of code in the standard library and external libraries is written in such a way. For example, some C APIs provide functions for creating and destroying objects. Writing an "owning" wrapper around them is very easy in Rust and it is probably very close to what you're asking for:
extern {
fn create_widget() -> *mut WidgetStruct;
fn destroy_widget(w: *mut WidgetStruct);
fn use_widget(w: *mut WidgetStruct) -> u32;
}
struct Widget(*mut WidgetStruct);
impl Drop for Widget {
fn drop(&mut self) {
unsafe { destroy_widget(self.0); }
}
}
impl Widget {
fn new() -> Widget { Widget(unsafe { create_widget() }) }
fn use_it(&mut self) -> u32 {
unsafe { use_widget(self.0) }
}
}
Now you can say that Widget owns some foreign resource represented by *mut WidgetStruct.
Here is another example of how a value might own memory and free it when the value is destroyed:
extern crate libc;
use libc::{malloc, free, c_void};
struct OwnerOfMemory {
ptr: *mut c_void
}
impl OwnerOfMemory {
fn new() -> OwnerOfMemory {
OwnerOfMemory {
ptr: unsafe { malloc(128) }
}
}
}
impl Drop for OwnerOfMemory {
fn drop(&mut self) {
unsafe { free(self.ptr); }
}
}
fn main() {
let value = OwnerOfMemory::new();
}
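The same idea can be sketched without the external libc crate by using the allocator API from the standard library. The OwnedBuffer type and its len method are illustrative assumptions; the pattern is identical: acquire in the constructor, release in Drop, and let move semantics guarantee a single owner.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

struct OwnedBuffer {
    ptr: *mut u8,
    layout: Layout,
}

impl OwnedBuffer {
    fn new(size: usize) -> OwnedBuffer {
        // The global allocator must not be asked for zero bytes.
        assert!(size > 0, "this sketch does not support empty buffers");
        let layout = Layout::from_size_align(size, 1).expect("invalid layout");
        let ptr = unsafe { alloc(layout) };
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        OwnedBuffer { ptr, layout }
    }

    fn len(&self) -> usize {
        self.layout.size()
    }
}

impl Drop for OwnedBuffer {
    fn drop(&mut self) {
        // Runs exactly once, for whichever binding owns the value last.
        unsafe { dealloc(self.ptr, self.layout) };
    }
}

fn main() {
    let first = OwnedBuffer::new(128);
    let second = first; // ownership moves; `first` can no longer be used
    println!("owning {} bytes", second.len());
} // `second` is dropped here and the memory is freed
```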
Simple question: Where is sin()? I've searched and only found in the Rust docs that there are traits like std::num::Float that require sin, but no implementation.
The Float trait was removed, and the methods are inherent implementations on the types now (f32, f64). That means there's a bit less typing to access math functions:
fn main() {
let val: f32 = 3.14159;
println!("{}", val.sin());
}
However, it's ambiguous if 3.14159.sin() refers to a 32- or 64-bit number, so you need to specify it explicitly. Above, I set the type of the variable, but you can also use a type suffix:
fn main() {
println!("{}", 3.14159_f64.sin());
}
You can also use fully qualified syntax:
fn main() {
println!("{}", f32::sin(3.14159));
}
Real code should use the PI constant; I've used an inline number to avoid complicating the matter.
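A short sketch of that, using the constant from the standard library (the variable name is arbitrary):

```rust
use std::f64::consts::PI;

fn main() {
    let quarter_turn = PI / 2.0;
    // Method syntax on a typed value.
    println!("{}", quarter_turn.sin());
    // Fully qualified syntax works with the constant too.
    println!("{}", f64::sin(PI / 6.0));
}
```

An f32 version of the constant is available as std::f32::consts::PI.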
In older versions of Rust, before the Float trait was removed, Float was a trait that provided the implementation; you had to import it to use these methods on f32 or f64:
use std::num::Float;
fn main() {
println!("{}", 1.0f64.sin());
}