What are glue and chain dependencies in an LLVM DAG?

I'm somewhat new to LLVM and compilers.
I've decided to generate a DAG using the following command
llc -view-sched-dags hello_world.ll
I got a really big graph with different dependency types. The book "Getting Started with LLVM Core Libraries" explains that:
Black arrows mean data flow dependency
Red arrows mean glue dependency
Blue dashed arrows mean chain dependency
I clearly remember talking about data flow dependencies in my compiler class at school, but I don't remember talking about the other two. Can someone explain the meaning of the other dependencies? Any help is appreciated.
hello_world.cpp
#include <stdio.h>
#include <assert.h>
int sum(int a, int b) {
return a + b;
}
int main(int argc, char** argv) {
printf("Hello World! %d\n", sum(argc, 1));
return 0;
}
hello_world.ll
; ModuleID = 'hello_world.cpp'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
@.str = private unnamed_addr constant [17 x i8] c"Hello World! %d\0A\00", align 1
; Function Attrs: nounwind uwtable
define i32 @_Z3sumii(i32 %a, i32 %b) #0 {
entry:
%a.addr = alloca i32, align 4
%b.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
store i32 %b, i32* %b.addr, align 4
%0 = load i32* %a.addr, align 4
%1 = load i32* %b.addr, align 4
%add = add nsw i32 %0, %1
ret i32 %add
}
; Function Attrs: uwtable
define i32 @main(i32 %argc, i8** %argv) #1 {
entry:
%retval = alloca i32, align 4
%argc.addr = alloca i32, align 4
%argv.addr = alloca i8**, align 8
store i32 0, i32* %retval
store i32 %argc, i32* %argc.addr, align 4
store i8** %argv, i8*** %argv.addr, align 8
%0 = load i32* %argc.addr, align 4
%call = call i32 @_Z3sumii(i32 %0, i32 1)
%call1 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([17 x i8]* @.str, i32 0, i32 0), i32 %call)
ret i32 0
}
declare i32 @printf(i8*, ...) #2
attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"clang version 3.5.0 "}
hello_world.main.jpg
hello_world.sum.jpg

Chain dependencies prevent nodes with side effects (including memory operations and explicit register operations) from being scheduled out of order relative to each other.
Glue prevents the two nodes from being broken up during scheduling. It's actually more subtle than that [1], but most of the time you don't need to worry about it. (If you're implementing your own backend that requires two instructions to be adjacent to each other, you really want to be using a pseudoinstruction instead, and expand that after scheduling happens.)
[1]: See http://lists.llvm.org/pipermail/llvm-dev/2014-June/074046.html for example

Related

ThreadSanitizer (TSan) instrumentation using LLVM opt and TSan passes

My goal is to instrument my initial IR with proper calls to TSan runtime library functions using LLVM opt tool and TSan passes. In other words, I want to end up with similar TSan instrumentation as when using clang -fsanitize=thread -S but by directly using opt and TSan passes instead.
As far as I know, LLVM has two passes for TSan instrumentation: tsan-module (a module pass) and tsan (a function pass). Both passes are available by default in opt, i.e. are included in opt -print-passes report.
I chose tiny_race.c as my sample program, in which the main thread and the thread it spawns (Thread1) race on a global variable, Global.
Here are the two steps I take to instrument the code my way:
Generating the initial LLVM IR for tiny_race.c:
clang -S -emit-llvm tiny_race.c -o tiny_race.ll
Using LLVM opt to instrument tiny_race.ll with the two TSan passes:
opt -passes='tsan-module,tsan' tiny_race.ll -S -o myInstrumented.ll
The above pass pipeline executes fine but the resulting myInstrumented.ll lacks some TSan instrumentations. More specifically:
Thread1 (child thread) is left completely un-instrumented.
The main thread only has @__tsan_func_entry and @__tsan_func_exit instrumentation; its accesses to Global are not instrumented.
Could anyone please explain why my approach produces a partially-instrumented output? Any suggestion is greatly appreciated.
To better show the difference between the IR produced by my approach and the expected one, below are the definitions of main and Thread1 in each.
Here is myInstrumented.ll:
; Function Attrs: noinline nounwind optnone uwtable
define dso_local ptr @Thread1(ptr noundef %x) #0 {
entry:
%x.addr = alloca ptr, align 8
store ptr %x, ptr %x.addr, align 8
store i32 42, ptr @Global, align 4
%0 = load ptr, ptr %x.addr, align 8
ret ptr %0
}
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
entry:
%0 = call ptr @llvm.returnaddress(i32 0)
call void @__tsan_func_entry(ptr %0) *****TSAN INSTRUMENTATION*****
%retval = alloca i32, align 4
%t = alloca i64, align 8
store i32 0, ptr %retval, align 4
%call = call i32 @pthread_create(ptr noundef %t, ptr noundef null, ptr noundef @Thread1, ptr noundef null) #4
store i32 43, ptr @Global, align 4
%1 = load i64, ptr %t, align 8
%call1 = call i32 @pthread_join(i64 noundef %1, ptr noundef null)
%2 = load i32, ptr @Global, align 4
call void @__tsan_func_exit() *****TSAN INSTRUMENTATION*****
ret i32 %2
}
And here is the resulting IR when using clang -fsanitize=thread -S -emit-llvm tiny_race.c which is my expected result:
; Function Attrs: noinline nounwind optnone sanitize_thread uwtable
define dso_local ptr @Thread1(ptr noundef %x) #0 {
entry:
%0 = call ptr @llvm.returnaddress(i32 0)
call void @__tsan_func_entry(ptr %0) *****TSAN INSTRUMENTATION*****
%x.addr = alloca ptr, align 8
store ptr %x, ptr %x.addr, align 8
call void @__tsan_write4(ptr @Global) *****TSAN INSTRUMENTATION*****
store i32 42, ptr @Global, align 4
%1 = load ptr, ptr %x.addr, align 8
call void @__tsan_func_exit() *****TSAN INSTRUMENTATION*****
ret ptr %1
}
; Function Attrs: noinline nounwind optnone sanitize_thread uwtable
define dso_local i32 @main() #0 {
entry:
%0 = call ptr @llvm.returnaddress(i32 0)
call void @__tsan_func_entry(ptr %0) *****TSAN INSTRUMENTATION*****
%retval = alloca i32, align 4
%t = alloca i64, align 8
store i32 0, ptr %retval, align 4
%call = call i32 @pthread_create(ptr noundef %t, ptr noundef null, ptr noundef @Thread1, ptr noundef null) #4
call void @__tsan_write4(ptr @Global) *****TSAN INSTRUMENTATION*****
store i32 43, ptr @Global, align 4
call void @__tsan_read8(ptr %t) *****TSAN INSTRUMENTATION*****
%1 = load i64, ptr %t, align 8
%call1 = call i32 @pthread_join(i64 noundef %1, ptr noundef null)
call void @__tsan_read4(ptr @Global) *****TSAN INSTRUMENTATION*****
%2 = load i32, ptr @Global, align 4
call void @__tsan_func_exit() *****TSAN INSTRUMENTATION*****
ret i32 %2
}

How to bulk insert into a vector in rust? [duplicate]

Is there any straightforward way to insert or replace multiple elements from &[T] and/or Vec<T> in the middle or at the beginning of a Vec in linear time?
I could only find std::vec::Vec::insert, but that's only for inserting a single element in O(n) time, so I obviously cannot call that in a loop.
I could do a split_off at that index, extend the new elements into the left half of the split, and then extend the second half into the first, but is there a better way?
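For reference, the split_off approach described above could be sketched roughly as follows. The helper name insert_slice_at is hypothetical and only serves the illustration; it assumes T: Clone and allocates a temporary Vec for the tail.

```rust
/// Hypothetical helper (not part of std): inserts `items` into `v` at `index`
/// by splitting off the tail, appending the new items, then re-appending the tail.
fn insert_slice_at<T: Clone>(v: &mut Vec<T>, index: usize, items: &[T]) {
    // split_off panics if index > v.len(); the tail lands in a new allocation.
    let tail = v.split_off(index);
    v.extend_from_slice(items);
    v.extend(tail);
}

fn main() {
    let mut v = vec![1, 5];
    insert_slice_at(&mut v, 1, &[2, 3, 4]);
    println!("{:?}", v); // [1, 2, 3, 4, 5]
}
```

Every element is moved a constant number of times, so this is linear, but it pays for one extra allocation to hold the tail.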
As of Rust 1.21.0, Vec::splice is available and allows inserting at any point, including fully prepending:
let mut vec = vec![1, 5];
let slice = &[2, 3, 4];
vec.splice(1..1, slice.iter().cloned());
println!("{:?}", vec); // [1, 2, 3, 4, 5]
The docs state:
Note 4: This is optimal if:
The tail (elements in the vector after range) is empty
or replace_with yields fewer elements than range’s length
or the lower bound of its size_hint() is exact.
In this case, the lower bound of the slice's iterator should be exact, so it should perform one memory move.
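To check the size_hint claim for a particular iterator, a small sketch like the one below may help. The helper has_exact_hint is hypothetical, and "lower bound equals upper bound" is used here as a simple stand-in for "the lower bound is exact".

```rust
/// Hypothetical helper: true when the iterator reports the same lower and
/// upper bound, i.e. it knows exactly how many items remain.
fn has_exact_hint<I: Iterator>(iter: &I) -> bool {
    let (lower, upper) = iter.size_hint();
    upper == Some(lower)
}

fn main() {
    let slice = &[2, 3, 4];
    // A cloned slice iterator knows its exact remaining length.
    assert!(has_exact_hint(&slice.iter().cloned()));
    // A filter cannot know how many items will pass, so its lower bound is 0.
    assert!(!has_exact_hint(&slice.iter().filter(|x| **x > 2)));
    println!("{:?}", slice.iter().cloned().size_hint()); // (3, Some(3))
}
```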
splice is a bit more powerful in that it allows you to remove a range of values (the first argument), insert new values (the second argument), and optionally get the old values (the result of the call).
Replacing a set of items
let mut vec = vec![0, 1, 5];
let slice = &[2, 3, 4];
vec.splice(..2, slice.iter().cloned());
println!("{:?}", vec); // [2, 3, 4, 5]
Getting the previous values
let mut vec = vec![0, 1, 2, 3, 4];
let slice = &[9, 8, 7];
let old: Vec<_> = vec.splice(3.., slice.iter().cloned()).collect();
println!("{:?}", vec); // [0, 1, 2, 9, 8, 7]
println!("{:?}", old); // [3, 4]
Okay, there is no appropriate method in Vec interface (as I can see). But we can always implement the same thing ourselves.
memmove
When T is Copy, probably the most obvious way is to move the memory, like this:
fn push_all_at<T>(v: &mut Vec<T>, offset: usize, s: &[T]) where T: Copy {
    match (v.len(), s.len()) {
        (_, 0) => (),
        (current_len, _) => {
            assert!(offset <= current_len);
            v.reserve_exact(s.len());
            unsafe {
                let to_move = current_len - offset;
                let src = v.as_mut_ptr().add(offset);
                if to_move > 0 {
                    // Shift the tail right; the ranges may overlap, so use ptr::copy (memmove).
                    let dst = src.add(s.len());
                    std::ptr::copy(src, dst, to_move);
                }
                // Fill the gap with the new elements (memcpy).
                std::ptr::copy_nonoverlapping(s.as_ptr(), src, s.len());
                // Only expose the new length once every slot is initialized.
                v.set_len(current_len + s.len());
            }
        },
    }
}
shuffle
If T is not Copy but implements Clone, we can append the given slice to the end of the Vec and move it to the required position using swaps, in linear time:
fn push_all_at<T>(v: &mut Vec<T>, mut offset: usize, s: &[T]) where T: Clone + Default {
    match (v.len(), s.len()) {
        (_, 0) => (),
        (0, _) => { v.extend_from_slice(s); },
        (_, _) => {
            assert!(offset <= v.len());
            // Pad so that the distance from offset to the end is a multiple of s.len().
            let pad = s.len() - ((v.len() - offset) % s.len());
            v.extend(std::iter::repeat(T::default()).take(pad));
            v.extend_from_slice(s);
            let total = v.len();
            // Swap the block of new elements toward the front, s.len() items at a time.
            while total - offset >= s.len() {
                for i in 0 .. s.len() { v.swap(offset + i, total - s.len() + i); }
                offset += s.len();
            }
            // The padding has been rotated to the end; drop it.
            v.truncate(total - pad);
        },
    }
}
iterators concat
Maybe the best choice is not to modify the Vec at all. For example, if you only need to access the result through an iterator, you can build an iterator chain from the chunks:
let v: &[usize] = &[0, 1, 2];
let s: &[usize] = &[3, 4, 5, 6];
let offset = 2;
let chain = v.iter().take(offset).chain(s.iter()).chain(v.iter().skip(offset));
let result: Vec<_> = chain.collect();
println!("Result: {:?}", result);
I was trying to prepend to a vector in Rust and found a closed question that was linked here. This question covers prepending, inserting, and efficiency; my answer would fit better under that other, more precise question, because I can't attest to the efficiency. Still, the following code helped me prepend (and do the opposite). I'm sure the other two answers are more efficient, but I learn best from answers that can be cut and pasted, with examples that demonstrate an application of the answer.
pub trait Unshift<T> { fn unshift(&mut self, s: &[T]) -> (); }
pub trait UnshiftVec<T> { fn unshift_vec(&mut self, s: Vec<T>) -> (); }
pub trait UnshiftMemoryHog<T> { fn unshift_memory_hog(&mut self, s: Vec<T>) -> (); }
pub trait Shift<T> { fn shift(&mut self) -> (); }
pub trait ShiftN<T> { fn shift_n(&mut self, s: usize) -> (); }
impl<T: std::clone::Clone> ShiftN<T> for Vec<T> {
fn shift_n(&mut self, s: usize) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..s);
}
}
impl<T: std::clone::Clone> Shift<T> for Vec<T> {
fn shift(&mut self) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..1);
}
}
impl<T: std::clone::Clone> Unshift<T> for Vec<T> {
fn unshift(&mut self, s: &[T]) -> ()
// where
// T: std::clone::Clone,
{
self.splice(0..0, s.to_vec());
}
}
impl<T: std::clone::Clone> UnshiftVec<T> for Vec<T> {
fn unshift_vec(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
self.splice(0..0, s);
}
}
impl<T: std::clone::Clone> UnshiftMemoryHog<T> for Vec<T> {
fn unshift_memory_hog(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
let mut tmp: Vec<_> = s.to_owned();
//let mut tmp: Vec<_> = s.clone(); // this also works for some data types
/*
let local_s: Vec<_> = self.clone(); // explicit clone()
tmp.extend(local_s); // to vec is possible
*/
tmp.extend(self.clone());
*self = tmp;
//*self = (*tmp).to_vec(); // Just because it compiles, doesn't make it right.
}
}
// this works for: v = unshift(v, &vec![8]);
// (If you don't want to impl Unshift for Vec<T>)
#[allow(dead_code)]
fn unshift_fn<T>(v: Vec<T>, s: &[T]) -> Vec<T>
where
T: Clone,
{
// create a mutable vec and fill it
// with a clone of the array that we want
// at the start of the vec.
let mut tmp: Vec<_> = s.to_owned();
// then we add the existing vector to the end
// of the temporary vector.
tmp.extend(v);
// return the tmp vec, which is identical
// to unshift-ing the original vec.
tmp
}
/*
N.B. It is sometimes (often?) more memory efficient to reverse
the vector and use push/pop, rather than splice/drain;
Especially if you create your vectors in "stack order" to begin with.
*/
fn main() {
let mut v: Vec<usize> = vec![1, 2, 3];
println!("Before push:\t {:?}", v);
v.push(0);
println!("After push:\t {:?}", v);
v.pop();
println!("popped:\t\t {:?}", v);
v.drain(0..1);
println!("drain(0..1)\t {:?}", v);
/*
// We could use a function
let c = v.clone();
v = unshift_fn(c, &vec![0]);
*/
v.splice(0..0, vec![0]);
println!("splice(0..0, vec![0]) {:?}", v);
v.shift_n(1);
println!("shift\t\t {:?}", v);
v.unshift_memory_hog(vec![8, 16, 31, 1]);
println!("MEMORY guzzler unshift {:?}", v);
//v.drain(0..3);
v.drain(0..=2);
println!("back to the start: {:?}", v);
v.unshift_vec(vec![0]);
println!("zerothed with unshift: {:?}", v);
let mut w = vec![4, 5, 6];
/*
let prepend_this = &[1, 2, 3];
w.unshift_vec(prepend_this.to_vec());
*/
w.unshift(&[1, 2, 3]);
assert_eq!(&w, &[1, 2, 3, 4, 5, 6]);
println!("{:?} == {:?}", &w, &[1, 2, 3, 4, 5, 6]);
}

LLVM global constructor is not called for Atmel processors

I have compiled a cpp code and downloaded it to Arduino Uno for blinking an LED. The code works fine.
However, when I convert it to .ll and from .ll to an object file then hex and upload, the code stops working. No LED blinks by the Arduino.
If I address the ports directly:
typedef unsigned char uint8_t;
typedef uint8_t * volatile port_type;
const port_type portB = (port_type) 0x25;
const port_type ddrB = (port_type) 0x24;
it works fine, but if I initialize the port addresses via a global constructor, it does not work:
int getPortB() {return 0x25;}
int getDdrB() {return 0x24;}
const port_type portB = (port_type) getPortB();
const port_type ddrB = (port_type) getDdrB();
This is because the global constructor is not called at all. If I call it from the main function via
call addrspace(1) void @global_var_init()
it will work.
I use the following commands to compile and download the ll file to the Arduino uno:
llvm-as-9 blink1.ll -o blink1.bc
llc-9 -filetype=obj blink1.bc
avr-g++ -mmcu=atmega328p blink1.o -o blink1
avr-objcopy -O ihex -R .eeprom blink1 blink1.hex
avrdude -F -V -c arduino -p ATMEGA328P -P /dev/ttyUSB0 -b 115200 -U flash:w:blink1.hex
blink1.ll
; ModuleID = 'blink1.cpp'
source_filename = "blink1.cpp"
target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
target triple = "avr"
@portB = dso_local global i8* null, align 1
@ddrB = dso_local global i8* null, align 1
@llvm.global_ctors = appending global [1 x { i32, void () addrspace(1)*, i8* }] [{ i32, void () addrspace(1)*, i8* } { i32 65535, void () addrspace(1)* @global_var_init, i8* null }]
; Function Attrs: noinline
define internal void @global_var_init() addrspace(1) {
%1 = inttoptr i16 37 to i8*
store volatile i8* %1, i8** @portB, align 1
%2 = inttoptr i16 36 to i8*
store volatile i8* %2, i8** @ddrB, align 1
ret void
}
; Function Attrs: noinline nounwind optnone
define dso_local void @delay_500ms() addrspace(1) {
call addrspace(0) void asm sideeffect "ldi r19, 150 \0A\09ldi r20, 128 \0A\09ldi r23, 41 \0A\09L1: \0A\09dec r20 \0A\09brne L1 \0A\09dec r19 \0A\09brne L1 \0A\09dec r23 \0A\09brne L1 \0A\09", ""() #3, !srcloc !2
ret void
}
; Function Attrs: noinline norecurse nounwind optnone
define dso_local i16 @main() addrspace(1) {
; call addrspace(1) void @global_var_init()
%1 = alloca i16, align 1
store i16 0, i16* %1, align 1
%2 = load volatile i8*, i8** @ddrB, align 1
store i8 32, i8* %2, align 1
br label %3
3: ; preds = %0, %3
%4 = load volatile i8*, i8** @portB, align 1
store i8 32, i8* %4, align 1
call addrspace(1) void @delay_500ms()
%5 = load volatile i8*, i8** @portB, align 1
store i8 0, i8* %5, align 1
call addrspace(1) void @delay_500ms()
br label %3
}
!0 = !{i32 1, !"wchar_size", i32 2}
!1 = !{!"clang version 9.0.1-+20210314105943+c1a0a213378a-1~exp1~20210314220516.107 "}
!2 = !{i32 1296, i32 1313, i32 1338, i32 1362, i32 1377, i32 1397, i32 1416, i32 1436, i32 1455, i32 1475, i32 1494}
Is this an LLVM bug, or am I making a mistake?

Cannot assign to immutable indexed content while iterating

I'm writing a library in Rust for a Java application and I'm trying to send data from the Java code to the Rust code. This data consists of structs called Chunks which I construct on the Rust side. I'm also sending data to modify these structs, so they need to be mutable. I'm getting an error saying the Chunks inside the HashSet are immutable, which shouldn't be the case.
#[derive(Eq, PartialEq, Hash)]
struct Chunk {
x: i32,
y: i32,
z: i32,
blocks: [[[i32; 16]; 16]; 16],
}
lazy_static! {
// static mutable list (or at least it should be)
static ref CHUNKS: Mutex<HashSet<Chunk>> = Mutex::new(HashSet::new());
}
#[no_mangle]
pub extern fn add_chunk(cx: i32, cy: i32, cz: i32, c_blocks: [[[i32; 16]; 16]; 16]) {
// create Chunk and put it in the global list
CHUNKS.lock().unwrap().insert(Chunk {x: cx, y: cy, z: cz, blocks: c_blocks});
}
#[no_mangle]
pub extern fn update_block(x: i32, y: i32, z: i32, id: i32) {
let cx: i32 = x / 16;
let cy: i32 = y / 16;
let cz: i32 = z / 16;
let rx: i32 = if x > 0 { x % 16 } else { 16 + (x % 16) };
let ry: i32 = if y > 0 { y % 16 } else { 16 + (y % 16) };
let rz: i32 = if z > 0 { z % 16 } else { 16 + (z % 16) };
for c in CHUNKS.lock().unwrap().iter() {
if c.x == cx && c.y == cy && c.z == cz {
// ERROR: cannot assign to immutable indexed content `c.blocks[..][..][..]`
c.blocks[rx as usize][ry as usize][rz as usize] = id;
}
}
}
I don't know if I should be using a Vec or HashSet, I went with the latter because it seemed the easiest.
The original answer is incorrect: HashSet does not have an iter_mut() method. Changing elements of a hash table in place is unsafe because an element's position is determined by its hash. If a value changes, its hash changes too, but because the value was modified in place it no longer sits in the right position in the table and will likely be lost.
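If the data has to stay in a HashSet, one way around the hazard described above is to take the element out, modify it, and insert it again so that it is rehashed. The sketch below uses a small hypothetical Point type and a hypothetical helper set_y rather than the Chunk from the question, purely to keep it short.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq, Hash)]
struct Point {
    x: i32,
    y: i32,
}

/// Hypothetical helper: removes `target` from the set, changes its `y`,
/// and re-inserts it so it is stored under its new hash.
/// Returns false if `target` was not in the set.
fn set_y(set: &mut HashSet<Point>, target: &Point, new_y: i32) -> bool {
    match set.take(target) {
        Some(mut p) => {
            p.y = new_y;
            set.insert(p);
            true
        }
        None => false,
    }
}

fn main() {
    let mut set = HashSet::new();
    set.insert(Point { x: 1, y: 2 });
    assert!(set_y(&mut set, &Point { x: 1, y: 2 }, 5));
    assert!(set.contains(&Point { x: 1, y: 5 }));
    assert!(!set.contains(&Point { x: 1, y: 2 }));
}
```

This costs a removal and an insertion per update, and it needs the full old value to find the element, which is awkward when the element carries a large payload.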
Therefore, the most natural approach would be to use a HashMap<(i32, i32, i32), Chunk>, as suggested by @starblue:
lazy_static! {
static ref CHUNKS: Mutex<HashMap<(i32, i32, i32), Chunk>> = Mutex::new(HashMap::new());
}
#[no_mangle]
pub extern fn add_chunk(cx: i32, cy: i32, cz: i32, c_blocks: [[[i32; 16]; 16]; 16]) {
CHUNKS.lock().unwrap().insert((cx, cy, cz), Chunk {x: cx, y: cy, z: cz, blocks: c_blocks});
}
#[no_mangle]
pub extern fn update_block(x: i32, y: i32, z: i32, id: i32) {
    let cx: i32 = x / 16;
    let cy: i32 = y / 16;
    let cz: i32 = z / 16;
    // The guard must be mutable so that get_mut can borrow the map mutably.
    let mut guard = CHUNKS.lock().unwrap();
    // get_mut takes the key by reference.
    if let Some(chunk) = guard.get_mut(&(cx, cy, cz)) {
        // rem_euclid keeps each local index in 0..16, including for zero and negative coordinates.
        let rx = x.rem_euclid(16) as usize;
        let ry = y.rem_euclid(16) as usize;
        let rz = z.rem_euclid(16) as usize;
        chunk.blocks[rx][ry][rz] = id;
    }
}
Additionally, with a hash map you don't need to walk through the whole collection to get an item by its coordinates.
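One detail of the coordinate arithmetic may be worth a sketch. In Rust, / and % on i32 truncate toward zero, so negative world coordinates do not map onto chunks the way floor division would. Assuming chunks are meant to tile the world with floor semantics (the question does not say so), a hypothetical helper split_coord could compute both parts with div_euclid and rem_euclid:

```rust
/// Hypothetical helper: splits a world coordinate into
/// (chunk index, index inside the chunk) for 16-wide chunks,
/// using floor semantics so negative coordinates tile correctly.
fn split_coord(v: i32) -> (i32, usize) {
    (v.div_euclid(16), v.rem_euclid(16) as usize)
}

fn main() {
    assert_eq!(split_coord(0), (0, 0));
    assert_eq!(split_coord(17), (1, 1));
    assert_eq!(split_coord(-1), (-1, 15));
    assert_eq!(split_coord(-16), (-1, 0));
    println!("{:?}", split_coord(-1)); // (-1, 15)
}
```

Whether this matches the convention used on the Java side is an assumption that would need checking.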
The original answer is below.
Your code is almost correct, you just need to use iter_mut() instead of iter():
for c in CHUNKS.lock().unwrap().iter_mut()
or, alternatively:
for c in &mut *CHUNKS.lock().unwrap()
iter() returns an iterator which yields immutable references, so you can't modify anything through it. iter_mut(), on the other hand, returns an iterator yielding mutable references - exactly what you need.
Also, instead of directly calling iter_mut(), it is more idiomatic to rely on IntoIterator implementations for references to collections: for example, &mut HashSet<T> implements IntoIterator by calling iter_mut() on the set, so for x in &mut hash_set is equivalent to for x in hash_set.iter_mut(). Additional * here is required because unwrap() returns not just the contained value, but a MutexGuard which derefs to whatever the mutex contains.

How would I write this C function in Rust?

How would I write the function below in Rust? Is there a way to write replace() safely or is the operation inherently unsafe? list does not have to be an array, a vector would work as well. It's the replacement operation that I'm interested in.
void replace(int *list[], int a, int b) {
*list[a] = *list[b];
}
I would like the following behavior:
int a = 1;
int b = 2;
int *list[] = { &a, &a, &b, &b };
*list[0] = 3; // list has pointers to values: [3, 3, 2, 2]
replace(list, 2, 0); // list has pointers to values: [3, 3, 3, 3]
*list[0] = 4; // list has pointers to values: [4, 4, 4, 4]
Answer for modified question
Rust does not allow you to have multiple mutable references (aliasing) to the same item. This means you'd never be able to run the equivalent of your third line:
fn main() {
let mut a = 1;
let vals = &[&mut a, &mut a];
}
This fails with:
cannot borrow `a` as mutable more than once at a time
What about using Rc and RefCell?
Rc doesn't let us mutate the value:
A reference-counted pointer type over an immutable value.
(Emphasis mine)
RefCell::borrow_mut won't allow multiple concurrent borrows:
Panics if the value is currently borrowed.
Answer for original question
It's basically the same. I picked a u8 cause it's easier to type. :-)
fn replace(v: &mut [&mut u8], a: usize, b: usize) {
*v[a] = *v[b]
}
fn main() {
let mut vals = vec![1,2,3,4];
{
let mut val_refs: Vec<&mut u8> = vals.iter_mut().collect();
replace(&mut val_refs, 0, 3);
}
println!("{:?}", vals);
}
Rust does do boundary-checking, so if you call with an index bigger than the slice, the program will panic and you don't get memory corruption.
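If a panic on a bad index is not wanted, the bounds can also be checked up front. The variant below is only a sketch; try_replace is a hypothetical name, and it reports failure with a bool instead of panicking.

```rust
/// Hypothetical non-panicking variant of replace: returns false when
/// either index is out of bounds and leaves the data untouched.
fn try_replace(v: &mut [&mut u8], a: usize, b: usize) -> bool {
    if a >= v.len() || b >= v.len() {
        return false;
    }
    *v[a] = *v[b];
    true
}

fn main() {
    let mut vals: Vec<u8> = vec![1, 2, 3, 4];
    {
        let mut refs: Vec<&mut u8> = vals.iter_mut().collect();
        assert!(try_replace(&mut refs, 0, 3));
        assert!(!try_replace(&mut refs, 0, 10));
    }
    println!("{:?}", vals); // [4, 2, 3, 4]
}
```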
