What is the difference between Vec<struct> and &[struct]? - vector

I often find myself getting an error like this:
mismatched types: expected `collections::vec::Vec<u8>`, found `&[u8]` (expected struct collections::vec::Vec, found &-ptr)
As far as I know, one is mutable and one isn't but I've no idea how to go between the types, i.e. take a &[u8] and make it a Vec<u8> or vice versa.
What's the difference between them? Is it the same as String and &str?

Is it the same as String and &str?
Yes. A Vec<T> is the owned variant of a &[T]. &[T] is a reference to a set of Ts laid out sequentially in memory (a.k.a. a slice). It represents a pointer to the beginning of the items and the number of items. A reference refers to something that you don't own, so the set of actions you can perform with it is limited. There is a mutable variant (&mut [T]), which allows you to mutate the items in the slice. You can't change how many items are in the slice though. Said another way, you can't mutate the slice itself.
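To make the distinction concrete, here is a minimal sketch. The helper name double_all is made up for illustration and is not part of any library. It mutates items through a &mut [u8], while only the owning Vec can change the length.

```rust
// Hypothetical helper: doubles every element in place through a mutable slice.
fn double_all(items: &mut [u8]) {
    for item in items.iter_mut() {
        *item *= 2;
    }
    // A slice has no `push`: it cannot grow or shrink the storage it points into.
}

fn main() {
    let mut v: Vec<u8> = vec![1, 2, 3];
    double_all(&mut v); // &mut Vec<u8> coerces to &mut [u8]
    v.push(10); // only the owner (the Vec) can change the length
    println!("{:?}", v); // prints [2, 4, 6, 10]
}
```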
take a &[u8] and make it a Vec
For this specific case:
let s: &[u8] = &[1, 2, 3]; // any byte slice
let v: Vec<u8> = Vec::from(s); // s.to_vec() does the same
However, this has to allocate memory on the heap, then copy each value into that memory. It's more expensive than the other way, but might be the correct thing for a given situation.
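A small sketch of what that copy means in practice: the new Vec is independent of the data the slice points to. The helper to_owned_bytes is a hypothetical name used only for this example.

```rust
// Hypothetical helper: copies a borrowed slice into a freshly allocated Vec.
fn to_owned_bytes(s: &[u8]) -> Vec<u8> {
    s.to_vec()
}

fn main() {
    let original = [1u8, 2, 3];
    let s: &[u8] = &original;

    let mut a = Vec::from(s); // allocates and copies
    let b = to_owned_bytes(s); // same effect via to_vec()

    // Changing the copy does not touch the original data.
    a.push(4);
    a[0] = 9;

    println!("{:?} {:?} {:?}", original, a, b); // prints [1, 2, 3] [9, 2, 3, 4] [1, 2, 3]
}
```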
or vice versa
let v = vec![1u8, 2, 3];
let s = v.as_slice();
This is basically "free" as v still owns the data, we are just handing out a reference to it. That's why many APIs try to take slices when it makes sense.
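As an illustration of why slice parameters are convenient, the following sketch uses a made-up function named sum that accepts a whole Vec, part of a Vec, or a fixed-size array, all without copying.

```rust
// Hypothetical function: takes a borrowed slice, so callers keep ownership.
fn sum(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}

fn main() {
    let v = vec![1u8, 2, 3];
    let arr = [4u8, 5];

    println!("{}", sum(&v)); // whole Vec, prints 6
    println!("{}", sum(&v[1..])); // sub-slice, prints 5
    println!("{}", sum(&arr)); // array, prints 9
}
```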

Related

Pointer vs value receiver in Go | heap.Interface vs sort.Interface

I came across priorityqueue example under heap.Interface package
Link: https://golang.org/pkg/container/heap/#Interface
For Push() and Pop() function required by heap.Interface, the implementation is on pointer receiver. But for Swap() function required by sort.Interface, the implementation is on value.
Why this difference ?
As per my understanding, Push() and Pop() are implemented on pointer type, as they need to change the underlying data. But going by that logic, Swap() should also be implemented on pointer type.
How and why does the Swap() implementation work on value, but Push() and Pop() do not ?
A pointer receiver is needed when the value passed needs to be modified. In the case of Swap, the value itself (which is a slice) doesn't get modified, although the array backing the slice does get modified.
In the case of Push and Pop, the slice does get modified since in both cases the length changes (and in the case of Push the underlying array may get replaced by a new one if it has reached its capacity).
Take a look at Push implementation:
func (pq *PriorityQueue) Push(x interface{}) {
    n := len(*pq)
    item := x.(*Item)
    item.index = n
    *pq = append(*pq, item) // Here, the slice is assigned a new value
}
Push (and Pop) modify the underlying slice as well as the slice elements for the priority queue, whereas Swap will only swap two elements in the slice, and will not change the slice itself. Thus, Swap can work with a value receiver.
Internally, a slice variable holds a length, a capacity, and a pointer to the data. Swapping items changes the data, but doesn't change any of the fields in the slice header. Russ Cox explained this in a blog post.
Adding items to the slice, like to push something onto a heap, may require the array to be re-allocated, which will change the capacity and the location that needs to be pointed to.
You may find this answer on pointers vs. values generally to be useful. There are other types, like channels and maps, that contain references such that you don't need a pointer to mess with the data underneath.

Memory leak in golang slice

I just started learning Go, and while going through the slice tricks, a couple of points are very confusing. Can anyone help me clarify them?
To cut elements from a slice, this is given:
Approach 1:
a = append(a[:i], a[j:]...)
but there is a note saying that it may cause memory leaks if pointers are used, and the recommended way is
Approach 2:
copy(a[i:], a[j:])
for k, n := len(a)-j+i, len(a); k < n; k++ {
    a[k] = nil // or the zero value of T
}
a = a[:len(a)-j+i]
Can any one help me understand how memory leaks happen.
I understood sub slice will be backed by the main array. My thought is irrespective of pointer or not we have to follow approach 2 always.
Update after @icza's and @Volker's answers:
Lets say you have a struct
type Books struct {
    title string
    author string
}
var Book1 Books
var Book2 Books
/* book 1 specification */
Book1.title = "Go Programming"
Book1.author = "Mahesh Kumar"
Book2.title = "Go Programming"
Book2.author = "Mahesh Kumar"
var bkSlice = []Books{Book1, Book2}
var bkprtSlice = []*Books{&Book1, &Book2}
now doing
bkSlice = bkSlice[:1]
bkSlice still holds the Book2 in backing array which is still in memory and is not required to be.
so do we need to do
bkSlice[1] = Books{}
so that it will be GCed. I understood pointers have to be nil-ed as the slice will hold unnecessary references to the objects outside backing array.
The simplest way to demonstrate this is with a simple slice expression.
Let's start with a slice of *int pointers:
s := []*int{new(int), new(int)}
This slice has a backing array with a length of 2, and it contains 2 non-nil pointers, pointing to allocated integers (outside of the backing array).
Now if we reslice this slice:
s = s[:1]
The length becomes 1. The backing array (holding 2 pointers) is not touched; it still holds 2 valid pointers. Even though we no longer use the 2nd pointer, it is still in memory (in the backing array), so the pointed-to object (the memory space storing an int value) cannot be freed by the garbage collector.
The same thing happens if you "cut" multiple elements from the middle. If the original slice (and its backing array) was filled with non-nil pointers, and if you don't zero them (with nil), they will be kept in memory.
Why isn't this an issue with non-pointers?
Actually, this is an issue with all pointer and "header" types (like slices and strings), not just pointers.
If you would have a slice of type []int instead of []*int, then slicing it will just "hide" elements that are of int type which must stay in memory as part of the backing array regardless of if there's a slice that contains it or not. The elements are not references to objects stored outside of the array, while pointers refer to objects being outside of the array.
If the slice contains pointers and you nil them before the slicing operation, if there are no other references to the pointed objects (if the array was the only one holding the pointers), they can be freed, they will not be kept due to still having a slice (and thus the backing array).
Update:
When you have a slice of structs:
var bkSlice = []Books{Book1, Book2}
If you slice it like:
bkSlice = bkSlice[:1]
Book2 will become unreachable via bkSlice, but will still be in memory (as part of the backing array).
You can't nil it because nil is not a valid value for structs. You can however assign its zero value to it like this:
bkSlice[1] = Books{}
bkSlice = bkSlice[:1]
Note that a Books struct value will still be in memory, being the second element of the backing array, but that struct will be a zero value, and thus will not hold string references, thus the original book author and title strings can be garbage collected (if no one else references them; more precisely the byte slice referred from the string header).
The general rule is "recursive": You only need to zero elements that refer to memory located outside of the backing array. So if you have a slice of structs that only have e.g. int fields, you do not need to zero it, in fact it's just unnecessary extra work. If the struct has fields that are pointers, or slices, or e.g. other struct type that have pointers or slices etc., then you should zero it in order to remove the reference to the memory outside of the backing array.

Does Kotlin have pointers?

Does Kotlin have pointers?
If yes,
How to increment a Pointer?
How to decrement a Pointer?
How to do Pointer Comparisons?
It has references, and it doesn't support pointer arithmetic (so you can't increment or decrement).
Note that the only thing that "having pointers" allows you is the ability to create a pointer and to dereference it.
The closest thing to a "pointer comparison" is referential equality, which is performed with the === operator.
There are no pointers in Kotlin for low-level processing as in C.
However, it's possible to emulate pointers in high-level programming.
For low-level programming it is necessary to use special system APIs to simulate arrays in memory, which exist in Windows, Linux, etc. Read about memory mapped files here and here. Java has a library to read and write directly in memory.
Simple types (numeric, string and boolean) behave as values; other types are references (high-level pointers) in Kotlin, which one can compare, assign, etc.
If one needs to increment or decrement pointers, just encapsulate the desired data into an array and use an index into it.
To simulate a pointer to a simple value, just wrap the value in a class:
data class pStr( // Pointer to a String
    var s: String = ""
)
fun main() {
    val st = pStr("banana")
    val tt = st // tt and st refer to the same object
    tt.s = "melon"
    println(st.s) // displays "melon"
    val s: String = "banana"
    var t: String = s
    t = "melon" // rebinds t only; s is unchanged
    println(s) // displays "banana"
}
I found this question while googling over some interesting code I found and thought that I would contribute my own proverbial "two cents". So Kotlin does have an operator which might be confused as a pointer, based on syntax, the spread operator. The spread operator is often used to pass an array as a vararg parameter.
For example, one might see something like the following line of code which looks suspiciously like the use of a pointer:
val process = ProcessBuilder(*args.toTypedArray()).start()
This line isn't calling the toTypedArray() method on a pointer to the args array, as you might expect if you come from a C/C++ background like me. Rather, this code is actually just calling the toTypedArray() method on the args array (as one would expect) and then passing the elements of the array as an arbitrary number of varargs arguments. Without the spread operator (i.e. *), a single argument would be passed, which would be the typed args array, itself.
That's the key difference: the spread operator enables the developer to pass the elements of the array as a list of varargs as opposed to passing a pointer to the array, itself, as a single argument.
I hope that helps.

Why are the keys and values of a borrowed HashMap accessed by reference, not value?

I have a function that takes a borrowed HashMap and I need to access values by keys. Why are the keys and values taken by reference, and not by value?
My simplified code:
use std::collections::HashMap;
fn print_found_so(ids: &Vec<i32>, file_ids: &HashMap<u16, String>) {
    for pos in ids {
        let whatever: u16 = *pos as u16;
        let last_string: &String = file_ids.get(&whatever).unwrap();
        println!("found: {:?}", last_string);
    }
}
Why do I have to specify the key as a reference, i.e., file_ids.get(&whatever).unwrap() instead of file_ids.get(whatever).unwrap()?
As I understand it, the last_string has to be of type &String, meaning a borrowed string, because the owning collection is borrowed. Is that right?
Similar to the above point, am I correct in assuming pos is of type &i32 because it takes borrowed values from ids?
Think about the semantics of passing parameters as references or as values:
As reference: no ownership transfer. The called function merely borrows the parameter.
As value: the called function takes ownership of the parameter, which the caller may not use anymore.
Since the function HashMap::get does not need ownership of the key to find an element, the less restrictive passing method was chosen: by reference.
Also, it does not return the value of the element, only a reference. If it returned the value, the value inside the HashMap would no longer be owned by the HashMap and thus be inaccessible in the future.
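A minimal sketch of both points, using only standard HashMap methods; the helper lookup_len and the sample data are made up for illustration. get borrows the key and hands back a reference, while remove is the operation that actually gives up ownership of the value.

```rust
use std::collections::HashMap;

// Hypothetical helper: borrows the map and the value, copies nothing but a length.
fn lookup_len(map: &HashMap<u16, String>, key: u16) -> Option<usize> {
    map.get(&key).map(|s| s.len())
}

fn main() {
    let mut file_ids: HashMap<u16, String> = HashMap::new();
    file_ids.insert(7, String::from("seven"));

    let key: u16 = 7;

    // `get` takes &u16 and returns Option<&String>: the map keeps the value.
    let borrowed: Option<&String> = file_ids.get(&key);
    println!("{:?}", borrowed); // prints Some("seven")
    println!("{:?}", lookup_len(&file_ids, key)); // prints Some(5)

    // `remove` returns Option<String>: the value now belongs to the caller,
    // and the map no longer contains it.
    let owned: Option<String> = file_ids.remove(&key);
    println!("{:?}", owned); // prints Some("seven")
    println!("{}", file_ids.contains_key(&key)); // prints false
}
```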
TL;DR: Rust is not Java.
Rust may have high-level constructs and data structures, but it is at heart a low-level language, as illustrated by one of its guiding principles: You don't pay for what you don't use.
As a result, the language and its libraries will as much as possible attempt to eliminate any cost that is superfluous, such as allocating memory needlessly.
Case 1: Taking the key by value.
If the key is a String, this means allocating (and deallocating) memory for each and every look-up, when you could use a local buffer that is only allocated once and for all.
Case 2: Returning by value.
Returning by value means that either:
you remove the entry from the container to give it to the user
you copy the entry in the container to give it to the user
The latter is obviously inefficient (copy means allocation). The former means that if the user wants the value back in the container, another insertion has to take place, which means another look-up etc., and is also inefficient.
In short, returning by value is inefficient in this case.
Rust, therefore, takes the most logical choice as far as efficiency is concerned and passes and returns by reference whenever practical.
While it seems unhelpful when the key is a u16, think about how it would work with a more complex key such as a String.
In that case taking the key by value would often mean having to allocate and initialise a new String for each lookup, which would be expensive.
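The String case can be sketched as follows. The function count_hits and the sample words are hypothetical; the point is that a map keyed by String can be queried with a plain &str, so no String is allocated per lookup.

```rust
use std::collections::HashMap;

// Hypothetical helper: sums the counts of the words that are present in the map.
// Each lookup borrows a &str; no new String is created.
fn count_hits(map: &HashMap<String, u32>, words: &[&str]) -> u32 {
    words.iter().filter_map(|w| map.get(*w)).sum()
}

fn main() {
    let mut counts = HashMap::new();
    counts.insert(String::from("apple"), 3);
    counts.insert(String::from("pear"), 4);

    // "kiwi" is absent and contributes nothing.
    println!("{}", count_hits(&counts, &["apple", "kiwi", "pear"])); // prints 7
}
```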

Difference in mutability between reference and box

I'm trying to understand Rust pointer types and their relation to mutability. Specifically, the ways of declaring a variable which holds the pointer and is itself mutable -- i.e. can be pointed to some other memory, and declaring that the data itself is mutable -- i.e. can be changed through the value of the pointer variable.
This is how I understand plain references work:
let mut a = &5; // a is a mutable pointer to immutable data
let b = &mut 5; // b is an immutable pointer to mutable data
So a can be changed to point to something else, while b can't. However, the data to which b points to can be changed through b, while it can't through a. Do I understand this correctly?
For the second part of the question -- why does Box::new seem to behave differently? This is my current understanding:
let mut a = Box::new(5); // a is a mutable pointer to mutable data
let c = Box::new(7); // c is an immutable pointer to immutable data
new should return a pointer to some heap-allocated data, but the data it points to seems to inherit mutability from the variable which holds the pointer, unlike in the example with references where these two states of mutability are independent! Is that how Box::new is supposed to work? If so, how can I create a pointer value to mutable data on the heap that is stored in an immutable variable?
First, you do understand how references behave correctly. mut a is a mutable variable (or, more correctly, a mutable binding), while &mut 5 is a mutable reference pointing to a mutable piece of data (which is implicitly allocated on the stack for you).
Second, Box behaves differently from references because it is fundamentally different from references. Another name for Box is owning/owned pointer. Each Box owns the data it holds, and it does so uniquely, therefore mutability of this data is inherited from mutability of the box itself. So yes, this is exactly how Box should work.
Another, probably more practical, way to understand it is to consider Box<T> equivalent to just T, except for its fixed size and its allocation method. In other words, Box provides value semantics: it is moved around just like any value and its mutability depends on the binding it is stored in.
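A short sketch of that value-like behaviour; the function bump is a made-up name for this example. A box held in an immutable binding cannot be modified, but moving it into a mutable binding (or a mut parameter) makes its contents mutable, exactly as it would for a plain i32.

```rust
// Hypothetical helper: takes ownership of the box through a mutable binding.
fn bump(mut boxed: Box<i32>) -> Box<i32> {
    *boxed += 1;
    boxed
}

fn main() {
    let c = Box::new(7);
    // *c += 1; // would not compile: `c` is not declared as mutable

    let mut d = c; // move the box into a mutable binding
    *d += 1; // now the heap data can be changed
    println!("{}", d); // prints 8

    let e = bump(d);
    println!("{}", e); // prints 9
}
```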
There are several ways to create a pointer to a mutable piece of data on the heap while keeping the pointer immutable. The most generic one is RefCell:
use std::cell::RefCell;
struct X { id: u32 }
let x: Box<RefCell<X>> = Box::new(RefCell::new(X { id: 0 }));
x.borrow_mut().id = 1;
Alternatively, you can use Cell (for Copy types):
use std::cell::Cell;
let x: Box<Cell<u32>> = Box::new(Cell::new(0));
x.set(1);
Note that the above examples use so-called "interior mutability", which is better avoided unless you really need it for something. If you want to create a Box with a mutable interior only to keep mutability properties, you really shouldn't. It isn't idiomatic and will only result in a syntactic and semantic burden.
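One part of that burden, sketched under the assumption that RefCell is used as in the example above: borrow rules are checked at run time instead of compile time. The helper second_borrow_fails is hypothetical and relies on the standard try_borrow_mut method.

```rust
use std::cell::RefCell;

// Hypothetical helper: holds one mutable borrow and reports whether a second
// one is refused. Calling `borrow_mut()` a second time here would panic.
fn second_borrow_fails(cell: &RefCell<u32>) -> bool {
    let _first = cell.borrow_mut();
    cell.try_borrow_mut().is_err()
}

fn main() {
    let x = Box::new(RefCell::new(0u32)); // immutable binding, mutable interior
    *x.borrow_mut() += 1;
    println!("{}", x.borrow()); // prints 1
    println!("{}", second_borrow_fails(&x)); // prints true
}
```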
You can find a lot of useful information here:
Ownership
References and borrowing
Mutability
std::cell - internal mutability types
In fact, if you have a question about such fundamental things as mutability, it is probably already explained in the book :)
