I'm writing code to interface with the NIC in my emulator (QEMU). I need to write the high and low 32 bits of my descriptor array's address to two separate addresses in memory. I'm struggling to mask (and shift) my raw pointer to write both halves to memory.
I have:
#[repr(align(16))]
struct e1000_rx_desc {
address : u64,
length : u16,
checksum : u16,
status : u8,
errors : u8,
special : u16,
}
And:
#[repr(align(16))]
struct e1000_tx_desc {
address : u64,
length : u16,
cso : u8,
cmd : u8,
status : u8,
css : u8,
special : u16,
}
An array of these descriptors is stored inside:
pub struct E1000 {
base : u32,
rx : [*mut e1000_rx_desc; 32],
tx : [*mut e1000_tx_desc; 8],
}
And I've been attempting to write to the hardware registers with:
self.write_u32_to_register(Registers::REG_RX_DESC_LO, 0, ((&self.rx as **mut e1000_rx_desc) & 0xFFFFFFFF) as u32);
self.write_u32_to_register(Registers::REG_RX_DESC_HI, 0, ((&self.rx as **mut e1000_rx_desc) >> 32) as u32);
I get the following error:
no implementation for `*const *mut e1000::e1000_rx_desc & {integer}`
What is the best way to get access to the raw address so I can manipulate it?
Pointers can be cast to and from a usize using the as operator:
let x: i32 = 5;
let x_ptr: *const i32 = &x as *const i32;
let x_ptr_addr: usize = x_ptr as usize;
let new_ptr = (x_ptr_addr & 0xffff) as *const i32;
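Applied to the question, the same cast can be followed by a widening to u64 so that the shift by 32 is valid even on a 32-bit target. The following is only a minimal sketch: the helper name split_addr is hypothetical, and it assumes that the address of the descriptor ring is the value the two registers expect.

```rust
/// Split the address held by a raw pointer into (low, high) 32-bit halves.
/// `split_addr` is a hypothetical helper, not part of any driver API.
fn split_addr<T>(ptr: *const T) -> (u32, u32) {
    // Pointer -> usize -> u64, so the shift below is well defined everywhere.
    let addr = ptr as usize as u64;
    ((addr & 0xFFFF_FFFF) as u32, (addr >> 32) as u32)
}

fn main() {
    // Stand-in for the descriptor ring; only its address matters here.
    let ring = [0u64; 4];
    let (lo, hi) = split_addr(ring.as_ptr());

    // Recombining both halves gives back the original address.
    let rebuilt = ((hi as u64) << 32) | (lo as u64);
    assert_eq!(rebuilt, ring.as_ptr() as usize as u64);
    println!("lo = {:#010x}, hi = {:#010x}", lo, hi);
}
```

Each half could then be handed to something like write_u32_to_register from the question.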
Running the following code gives a segmentation fault:
fn main() {
let val = 1;
let ptr = val as *const i32;
unsafe { println!("{:?}", *ptr) };
}
Output:
[1] 69212 segmentation fault (core dumped) cargo r
However, when val is taken by reference (&) while declaring the raw pointer, the code runs as intended and the value of val is printed out.
fn main() {
let val = 1;
let ptr = &val as *const i32;
unsafe { println!("{:?}", *ptr) };
}
Output:
1
So what is the shared reference doing here, and why does the program fail without it? Isn't a reference in Rust also a pointer with extra semantics? Why do we need to create a pointer from a reference and not directly from val itself?
This can be answered by looking at the different semantics of the two lines of code you provided.
fn main() {
let val = 1;
println!("{:?}", val as *const i32); // Output: 0x1
println!("{:?}", &val as *const i32); // Output: 0x7ff7b36a4eec (the exact address will vary)
}
Without the reference, the value of the variable (1) is itself used as the address to dereference. This of course leads to a segmentation fault, since that address is not in the program's allowed address range.
Only when the reference operator is used is the address of the variable cast to a raw pointer, which can then be dereferenced without a segmentation fault.
I'm trying to load a DLL into my R script. The DLL is written in Rust. I read in the R Studio documentation that .Call passes integers as int * in C, which I interpret as &i32 in Rust (also assuming that mutability is just a Rust thing, and I don't have to make it &mut i32 if I don't intend to mutate it). However, R kept crashing the session, so I started doing trial and error. I made this file and tried to load it (the base is taken from this repo):
#![cfg(windows)]
use winapi::shared::minwindef;
use winapi::shared::minwindef::{BOOL, DWORD, HINSTANCE, LPVOID};
use winapi::um::consoleapi;
/// Entry point which will be called by the system once the DLL has been loaded
/// in the target process. Declaring this function is optional.
///
/// # Safety
///
/// What you can safely do inside here is very limited, see the Microsoft documentation
/// about "DllMain". Rust also doesn't officially support a "life before main()",
/// though it is unclear what that means exactly for DllMain.
#[no_mangle]
#[allow(non_snake_case, unused_variables)]
extern "system" fn DllMain(
dll_module: HINSTANCE,
call_reason: DWORD,
reserved: LPVOID)
-> BOOL
{
const DLL_PROCESS_ATTACH: DWORD = 1;
const DLL_PROCESS_DETACH: DWORD = 0;
match call_reason {
DLL_PROCESS_ATTACH => demo_init(),
DLL_PROCESS_DETACH => (),
_ => ()
}
minwindef::TRUE
}
fn demo_init() {
unsafe { consoleapi::AllocConsole() };
println!("Hello, world!");
}
#[no_mangle]
extern "cdecl" fn seven_cdecl_u32() -> u32 {
7
}
#[no_mangle]
extern "cdecl" fn seven_cdecl_u64() -> u64 {
7
}
#[no_mangle]
extern "cdecl" fn seven_cdecl_i32() -> i32 {
7
}
#[no_mangle]
extern "cdecl" fn seven_cdecl_i64() -> i64 {
7
}
#[no_mangle]
extern "stdcall" fn seven_stdcall_u32() -> u32 {
7
}
#[no_mangle]
extern "stdcall" fn seven_stdcall_u64() -> u64 {
7
}
#[no_mangle]
extern "stdcall" fn seven_stdcall_i32() -> i32 {
7
}
#[no_mangle]
extern "stdcall" fn seven_stdcall_i64() -> i64 {
7
}
#[no_mangle]
extern "system" fn seven_system_u32() -> u32 {
7
}
#[no_mangle]
extern "system" fn seven_system_i32() -> i32 {
7
}
#[no_mangle]
extern "system" fn seven_system_u64() -> u64 {
7
}
#[no_mangle]
extern "system" fn seven_system_i64() -> i64 {
7
}
#[no_mangle]
extern "C" fn seven_c_u32() -> u32 {
7
}
#[no_mangle]
extern "C" fn seven_c_i32() -> i32 {
7
}
#[no_mangle]
extern "C" fn seven_c_u64() -> u64 {
7
}
#[no_mangle]
extern "C" fn seven_c_i64() -> i64 {
7
}
CWD = r"(C:\\Users\grass\Desktop\codes\R\dlload)"
dllname = paste(CWD,r"(\rdll.dll)", sep="")
print(getwd())
dyn.load(dllname)
#print(.Call("seven_cdecl_i32", pakage=dllname))
#print(.Call("seven_cdecl_u32", pakage=dllname))
#print(.Call("seven_cdecl_i64", pakage=dllname))
#print(.Call("seven_cdecl_u64", pakage=dllname))
#print(.Call("seven_stdcall_i32", pakage=dllname))
#print(.Call("seven_stdcall_u32", pakage=dllname))
#print(.Call("seven_stdcall_i64", pakage=dllname))
#print(.Call("seven_stdcall_u64", pakage=dllname))
#print(.Call("seven_system_i32", pakage=dllname))
#print(.Call("seven_system_u32", pakage=dllname))
#print(.Call("seven_system_i64", pakage=dllname))
#print(.Call("seven_system_u64", pakage=dllname))
#print(.Call("seven_c_i32", pakage=dllname))
#print(.Call("seven_c_u32", pakage=dllname))
#print(.Call("seven_c_i64", pakage=dllname))
#print(.Call("seven_c_u64", pakage=dllname))
I tried the calls one line at a time, but none of them worked. The entry point did work, though, and "Hello, world!" was printed. When I try to print the value of an integer I pass to a function (7), I get absolute garbage, which made me think that the memory layout is different. I read that all values in R are vectors, which changes the layout, but I assumed that .Call is designed with this in mind.
Finally, the documentation in R Studio says that .C should be used for functions that are unaware of R, but I don't understand how to get a return value from .C, as it evaluates to a list of the parameters and a package name.
If anyone can tell me how to properly receive arguments in Rust from R and return values from Rust to R, I would be grateful.
Following Rodrigo's comment, I looked at whether I could mutate the value passed in instead of returning it. It seems there are limited capabilities for passing opaque pointers, so returning structs this way is impossible. But I managed to take and mutate a string value, as shown here:
use std::iter::once;
struct RString {
base_ptr: *mut u8,
len: usize,
}
impl From<*mut *mut u8> for RString {
fn from(base_ptr: *mut *mut u8) -> Self {
Self {
base_ptr: unsafe { base_ptr.read() },
len: {
let mut off: isize = 0;
while '\0' as u8 != unsafe { base_ptr.read().offset(off).read() } {
off += 1;
}
off as usize
}
}
}
}
impl RString {
pub fn value(&self) -> String {
let mut buff: String = String::new();
for off in 0..(self.len as isize) {
buff.push(unsafe { self.base_ptr.clone().offset(off).read() } as char);
}
buff
}
pub fn edit(&mut self, new_value: String) {
// The caller's buffer must have room for new_value plus a NUL terminator.
self.len = new_value.len();
// Write the bytes followed by the terminating NUL (len + 1 bytes in total).
for (off, val) in new_value.bytes().chain(once(0_u8)).enumerate() {
unsafe { self.base_ptr.add(off).write(val) };
}
}
}
/// takes a single string argument <a>, returns "Hello <a>!"
#[no_mangle]
extern "system" fn meet_n_greet(nameptr: *mut *mut u8) {
let mut rs: RString = RString::from(nameptr);
println!("Hello {}!", rs.value());
rs.edit(format!("Hello {}!", rs.value()));
}
CWD = r"(C:\\Users\grass\Desktop\codes\R\dlload)"
dllname = paste(CWD,r"(\rdll.dll)", sep="")
dyn.load(dllname)
print(.C("meet_n_greet", "Leroy Jenkins",package=dllname)[1])
dyn.unload(dllname)
The Rust code is ugly and unsafe, but the example does work. This answer also does not solve the issue of opaque data, so I'm just posting it to help people on their way.
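For the integer case from the question, the same in-place approach could look like the sketch below. It relies on the .C convention mentioned above (an R integer vector arrives as int *, and .C returns the possibly modified arguments as a list). As far as the R documentation describes it, .Call instead passes and expects R object pointers (SEXP), which would explain the garbage values seen earlier. The function name add_seven is hypothetical, and the R-side call, something like .C("add_seven", as.integer(35))[[1]], is an untested assumption.

```rust
/// Hypothetical `.C`-style entry point: the integer is updated in place
/// through the pointer and nothing is returned.
#[unsafe(no_mangle)] // older code, like the examples above, writes `#[no_mangle]`
pub unsafe extern "C" fn add_seven(x: *mut i32) {
    // SAFETY: the caller must pass either null or a valid, writable pointer to an i32.
    if let Some(v) = unsafe { x.as_mut() } {
        *v += 7;
    }
}

fn main() {
    // Simulate what the foreign caller would do: pass a pointer to one integer.
    let mut n: i32 = 35;
    unsafe { add_seven(&mut n) };
    assert_eq!(n, 42);
    println!("{n}");
}
```

Checking for null before writing keeps a mistaken call from the foreign side from turning into an immediate crash.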
Here is what I found in Rust's source code. I have difficulty in understanding &mut *(self as *mut str as *mut [u8]) and self as *const str as *const u8.
Is it a two-step conversion? First convert to a *mut str or *const str, next as a *mut [u8] or *const u8?
#[stable(feature = "str_mut_extras", since = "1.20.0")]
#[inline(always)]
pub unsafe fn as_bytes_mut(&mut self) -> &mut [u8] {
&mut *(self as *mut str as *mut [u8])
}
#[stable(feature = "rust1", since = "1.0.0")]
#[inline]
pub const fn as_ptr(&self) -> *const u8 {
self as *const str as *const u8
}
In Rust, the as operator performs only one conversion step at a time.
There are a few conversions allowed, such as:
&T to *const T,
&mut T to *mut T,
*mut T to *mut U (pending some conditions on T and U),
...
However, even though you can go from &mut T to *mut T to *mut U by using as twice, you cannot go directly from &mut T to *mut U, because both the compiler and humans would have a hard time figuring out the intermediate steps.
So, what's this conversion sequence about?
Going from reference to pointer: typical &T to *const T, or the mut variant.
Going from pointer to str to pointer to [u8]: a typical *const T to *const U for suitable T and U. str actually has the same representation as [u8], but only a subset of values is valid (proper UTF-8 ones).
It's interesting to note that one is safe and not the other:
Since all str are [u8], converting from *str to *[u8] is always safe.
However, exposing &mut [u8] allows breaking invariants inside str, and therefore as_bytes_mut is unsafe.
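The two steps can be written out separately to make each conversion visible. This is a small sketch for illustration, not the standard library's code; the helper name bytes_of is made up.

```rust
/// View a `&str` as bytes using the same two-step cast as the standard library.
/// `bytes_of` is a hypothetical helper written for this illustration.
fn bytes_of(s: &str) -> &[u8] {
    // Step 1: reference -> raw pointer of the same pointee type.
    let str_ptr = s as *const str;
    // Step 2: raw pointer -> raw pointer with a different pointee type.
    let bytes_ptr = str_ptr as *const [u8];
    // SAFETY: str and [u8] share a representation, and every str is a valid [u8].
    unsafe { &*bytes_ptr }
}

fn main() {
    let s = "héllo";
    assert_eq!(bytes_of(s), s.as_bytes());

    // The second snippet from the question: a wide pointer cast down to a thin one.
    let first_byte = s as *const str as *const u8;
    assert_eq!(first_byte, s.as_ptr());
    println!("{:?}", bytes_of(s));
}
```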
I'm working on kernel module code where I need to peek into the routing table to fetch the ARP table entry for my daddr:
|--------------------------------------------------|
-------+ enp0s1 192.168.2.0/24 192.168.3.0/24 enp0s2 +-----
|--------------------------------------------------|
For example, I need to obtain neighbour entry for 192.168.3.111, and this entry has been permanently added in the table:
% ip neigh add 192.168.3.111 lladdr 00:11:22:33:44:55 dev enp0s2 nud permanent
% ip neigh sh
...
192.168.3.111 dev enp0s2 lladdr 00:11:22:33:44:55 PERMANENT
% ip route show
...
192.168.3.0/24 dev enp0s2 proto kernel scope link src 192.168.3.2
I came up with the following code:
struct rtable *rt;
struct flowi4 fl4;
struct dst_entry dst;
struct neighbour *neigh;
u8 mac[ETH_ALEN];
...
memset(&fl4, 0, sizeof fl4);
fl4.daddr = daddr;
fl4.flowi4_proto = IPPROTO_UDP;
rt = ip_route_output_key(net, &fl4);
if (IS_ERR(rt))
goto err;
...
dst = rt->dst;
neigh = dst_neigh_lookup(&dst, &fl4.daddr);
if (!neigh) {
...
}
neigh_ha_snapshot(mac, neigh, neigh->dev);
neigh_release(neigh);
ip_rt_put(rt);
However neigh_ha_snapshot does not return correct MAC address, in fact I think it returns garbage, sometimes ff:ff:ff:ff:ff:ff, sometimes multicast 01:xx:xx:xx:xx:xx.
What am I doing wrong?
The issue is fixed as follows:
neigh = dst_neigh_lookup(&rt->dst, &fl4.daddr);
So instead of keeping a struct dst_entry object on the stack, assigning it a value from rt, and passing a pointer to it to dst_neigh_lookup(), just pass a pointer to the dst member of the current rt object.
The reason lies within the following code:
static inline struct neighbour *dst_neigh_lookup(const struct dst_entry *dst, const void *daddr)
{
struct neighbour *n = dst->ops->neigh_lookup(dst, NULL, daddr);
return IS_ERR(n) ? NULL : n;
}
where neigh_lookup is initialized to function ipv4_neigh_lookup() defined in net/ipv4/route.c :
static struct neighbour *ipv4_neigh_lookup(const struct dst_entry *dst,
struct sk_buff *skb,
const void *daddr)
{
struct net_device *dev = dst->dev;
const __be32 *pkey = daddr;
const struct rtable *rt;
struct neighbour *n;
rt = (const struct rtable *) dst;
...
}
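ipv4_neigh_lookup() casts dst back to the struct rtable that contains it, which is only valid when dst points at the dst member embedded in a real rtable. With a copy of the dst_entry on the stack, the cast yields a pointer into unrelated stack memory, so from this point rt is bogus and so is the rest.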
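The difference between the embedded member and a copy of it can be seen in a user-space analogy. The sketch below is written in Rust rather than kernel C, and DstEntry, RTable, and points_into are hypothetical stand-ins, not kernel types. It only compares addresses and never performs the invalid cast.

```rust
// Hypothetical stand-ins for the kernel structures, reduced to a few fields.
#[repr(C)]
#[derive(Clone, Copy)]
struct DstEntry {
    dev: u32,
}

#[repr(C)]
struct RTable {
    dst: DstEntry, // embedded member, as with `dst` inside the kernel's rtable
    gateway: u32,
}

/// Returns true if `dst` points somewhere inside the memory occupied by `rt`.
fn points_into(rt: &RTable, dst: *const DstEntry) -> bool {
    let start = rt as *const RTable as usize;
    let end = start + std::mem::size_of::<RTable>();
    (start..end).contains(&(dst as usize))
}

fn main() {
    let rt = RTable { dst: DstEntry { dev: 2 }, gateway: 7 };
    let copy = rt.dst; // like `dst = rt->dst;` in the question

    // The embedded member lives inside rt, so casting back to the container is meaningful.
    assert!(points_into(&rt, &rt.dst));
    // The copy is a separate object; casting its address to RTable would read unrelated memory.
    assert!(!points_into(&rt, &copy));
    println!("dev = {}, gateway = {}", copy.dev, rt.gateway);
}
```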
I have a type:
struct Foo {
memberA: Bar,
memberB: Baz,
}
and a pointer which I know is a pointer to memberB in Foo:
p: *const Baz
What is the correct way to get a new pointer p: *const Foo which points to the original struct Foo?
My current implementation is the following, which I'm pretty sure invokes undefined behavior due to the dereference of (p as *const Foo) where p is not a pointer to a Foo:
let p2 = p as usize -
((&(*(p as *const Foo)).memberB as *const _ as usize) - (p as usize));
This is part of FFI - I can't easily restructure the code to avoid needing to perform this operation.
This is very similar to "Get pointer to object from pointer to some member", but for Rust, which as far as I know has no offsetof macro.
The dereference expression produces an lvalue, but that lvalue is not actually read from, we're just doing pointer math on it, so in theory, it should be well defined. That's just my interpretation though.
My solution involves using a null pointer to retrieve the offset to the field, so it's a bit simpler than yours as it avoids one subtraction (we'd be subtracting 0). I believe I saw some C compilers/standard libraries implementing offsetof by essentially returning the address of a field from a null pointer, which is what inspired the following solution.
fn main() {
let p: *const Baz = 0x1248 as *const _;
let p2: *const Foo = unsafe { ((p as usize) - (&(*(0 as *const Foo)).memberB as *const _ as usize)) as *const _ };
println!("{:p}", p2);
}
We can also define our own offset_of! macro:
macro_rules! offset_of {
($ty:ty, $field:ident) => {
unsafe { &(*(0 as *const $ty)).$field as *const _ as usize }
}
}
fn main() {
let p: *const Baz = 0x1248 as *const _;
let p2: *const Foo = ((p as usize) - offset_of!(Foo, memberB)) as *const _;
println!("{:p}", p2);
}
With the implementation of RFC 2582, raw reference MIR operator, it is now possible to get the address of a field in a struct without an instance of the struct and without invoking undefined behavior.
use std::{mem::MaybeUninit, ptr};
struct Example {
a: i32,
b: u8,
c: bool,
}
fn main() {
let offset = unsafe {
let base = MaybeUninit::<Example>::uninit();
let base_ptr = base.as_ptr();
let c = ptr::addr_of!((*base_ptr).c);
(c as usize) - (base_ptr as usize)
};
println!("{}", offset);
}
The implementation of this is tricky and nuanced. It is best to use a crate that is well-maintained, such as memoffset.
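Newer toolchains also ship an offset_of! macro in the standard library (std::mem::offset_of!, stable since Rust 1.77), which can be combined with raw-pointer byte arithmetic to get back from a field pointer to the struct. The sketch below assumes such a toolchain; the field names are snake_case versions of those in the question, Bar and Baz are placeholder types, and foo_from_member_b is a hypothetical helper.

```rust
use std::mem::offset_of;

// Placeholder field types standing in for the question's Bar and Baz.
struct Bar(u64);
struct Baz(u32);

#[repr(C)]
struct Foo {
    member_a: Bar,
    member_b: Baz,
}

/// Recover a pointer to the enclosing Foo from a pointer to its member_b field.
///
/// # Safety
/// `p` must point at the `member_b` field of a live `Foo`, and must have been
/// derived from a pointer to that whole `Foo`.
unsafe fn foo_from_member_b(p: *const Baz) -> *const Foo {
    // Step back by the field's byte offset, then reinterpret as a Foo pointer.
    unsafe { p.byte_sub(offset_of!(Foo, member_b)) }.cast::<Foo>()
}

fn main() {
    let foo = Foo { member_a: Bar(1), member_b: Baz(2) };
    let foo_ptr: *const Foo = &foo;

    // Derive the field pointer from the struct pointer without an intermediate reference.
    let b_ptr = unsafe { std::ptr::addr_of!((*foo_ptr).member_b) };

    let back = unsafe { foo_from_member_b(b_ptr) };
    assert_eq!(back, foo_ptr);
    println!("member_a = {}, member_b = {}", unsafe { (*back).member_a.0 }, foo.member_b.0);
}
```

Deriving the field pointer from a pointer to the whole struct, rather than from a reference to the field alone, keeps the recovered pointer valid for the entire Foo.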
Before this functionality was stabilized, you had to have a valid instance of the struct. You can use tools like once_cell to minimize the overhead of the dummy value that you need to create:
use once_cell::sync::Lazy; // 1.4.1
struct Example {
a: i32,
b: u8,
c: bool,
}
static DUMMY: Lazy<Example> = Lazy::new(|| Example {
a: 0,
b: 0,
c: false,
});
static OFFSET_C: Lazy<usize> = Lazy::new(|| {
let base: *const Example = &*DUMMY;
let c: *const bool = &DUMMY.c;
(c as usize) - (base as usize)
});
fn main() {
println!("{}", *OFFSET_C);
}
If you must have this at compile time, you can place similar code into a build script and write out a Rust source file with the offsets. However, that will span multiple compiler invocations, so you are relying on the struct layout not changing between those invocations. Using something with a known representation would reduce that risk.
See also:
How do I create a global, mutable singleton?
How to create a static string at compile time