I am trying to create a user input validation function in rust utilising functional programming and recursion. How can I return an immutable vector with one element concatenated onto the end?
fn get_user_input(output_vec: Vec<String>) -> Vec<String> {
    // Some code that has two variables: repeat (bool) and new_element (String)
    if !repeat {
        return output_vec.add_to_end(new_element); // What function could "add_to_end" be?
    }
    get_user_input(output_vec.add_to_end(new_element)) // What function could "add_to_end" be?
}
There are functions for everything else:
push adds an element to the end of a mutable vector
append adds a mutable vector to the end of a mutable vector
concat adds an immutable vector to an immutable vector
??? adds an element to the end of an immutable vector
The only solution I have been able to get working is using:
[write_data, vec![new_element]].concat()
but this seems inefficient as I'm making a new vector for just one element (so the size is known at compile time).
You are confusing Rust with a language where you only ever have references to objects. In Rust, code can have exclusive ownership of objects, and so you don't need to be as careful about mutating an object that could be shared, because you know whether or not the object is shared.
For example, this is valid JavaScript code:
const a = [];
a.push(1);
This works because a does not contain an array, it contains a reference to an array.¹ The const prevents a from being repointed to a different object, but it does not make the array itself immutable.
So, in these kinds of languages, pure functional programming tries to avoid mutating any state whatsoever, such as pushing an item onto an array that is taken as an argument:
function add_element(arr) {
    arr.push(1); // Bad! We mutated the array we have a reference to!
}
Instead, we do things like this:
function add_element(arr) {
    return [...arr, 1]; // Good! We leave the original data alone.
}
What you have in Rust, given your function signature, is a totally different scenario! In your case, output_vec is owned by the function itself, and no other entity in the program has access to it. There is therefore no reason to avoid mutating it, if that is your goal:
fn get_user_input(mut output_vec: Vec<String>) -> Vec<String> {
//                ^^^ add mut
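For illustration, here is a minimal sketch of what the whole function might look like with that one change. The repeat/new_element logic isn't shown in the question, so the two lines that stand in for it here are placeholders:

fn get_user_input(mut output_vec: Vec<String>) -> Vec<String> {
    // Placeholders for the elided input logic: pretend the user typed something
    // and that we stop once we have five entries.
    let new_element = String::from("user input");
    let repeat = output_vec.len() < 4;

    output_vec.push(new_element); // mutate the vector we own -- no copy, no temporary Vec
    if !repeat {
        return output_vec;        // hand ownership back to the caller
    }
    get_user_input(output_vec)
}

fn main() {
    let answers = get_user_input(Vec::new());
    println!("{:?}", answers); // five entries
}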
You have to keep in mind that any non-reference is an owned value. &Vec<String> would be an immutable reference to a vector something else owns, but Vec<String> is a vector this code owns and nobody else has access to.
Don't believe me? Here's a simple example of broken code that demonstrates this:
fn take_my_vec(y: Vec<String>) { }

fn main() {
    let mut x = Vec::<String>::new();
    x.push("foo".to_string());

    take_my_vec(x);

    println!("{}", x.len()); // E0382
}
The expression x.len() causes a compile-time error, because the vector x was moved into the function argument and we don't own it anymore.
So why shouldn't the function mutate the vector it owns now? The caller can't use it anymore.
In summary, functional programming looks a bit different in Rust. In other languages that have no way to communicate "I'm giving you this object" you must avoid mutating values you are given because the caller may not expect you to change them. In Rust, who owns a value is clear, and the argument reflects that:
Is the argument a value (Vec<String>)? The function owns the value now, the caller gave it away and can't use it anymore. Mutate it if you need to.
Is the argument an immutable reference (&Vec<String>)? The function doesn't own it, and it can't mutate it anyway because Rust won't allow it. You could clone it and mutate the clone.
Is the argument a mutable reference (&mut Vec<String>)? The caller must explicitly give the function a mutable reference and is therefore giving the function permission to mutate it -- but the function still doesn't own the value. The function can mutate it, clone it, or both -- it depends what the function is supposed to do.
If you take an argument by value, there is very little reason not to make it mut if you need to change it for whatever reason. Note that this detail (mutability of function arguments) isn't even part of the function's public signature simply because it's not the caller's business. They gave the object away.
Note that with types that have type arguments (like Vec) other expressions of ownership are possible. Here are a few examples (this is not an exhaustive list):
Vec<&String>: You now own a vector, but you don't own the String objects that it contains references to.
&Vec<&String>: You are given read-only access to a vector of string references. You could clone this vector, but you still couldn't change the strings, just rearrange them, for example.
&Vec<&mut String>: You are given read-only access to a vector of mutable string references. You can't rearrange the references, and you can't actually change the strings through it either: mutating through an &mut requires unique access along the whole path, and the outer & only gives you shared access.
&mut Vec<&String>: You are allowed to rearrange the string references, but you can't change the strings themselves.
¹ A good way to think of it is that non-primitive values in JavaScript are always like an Rc<RefCell<T>>, so you're passing around a handle to the object with interior mutability. const only makes the Rc<> immutable.
Looking through the documentation for std::cell::Cell, I don't see anywhere how I can retrieve a non-mutable reference to inner data. There is only the get_mut method: https://doc.rust-lang.org/std/cell/struct.Cell.html#method.get_mut
I don't want to use this function because I want to have &self instead of &mut self.
I found an alternative solution of taking the raw pointer:
use std::cell::Cell;

struct DbObject {
    key: Cell<String>,
    data: String,
}

impl DbObject {
    pub fn new(data: String) -> Self {
        Self {
            key: Cell::new("some_uuid".into()),
            data,
        }
    }

    pub fn assert_key(&self) -> &str {
        // setup key in the future if is empty...
        let key = self.key.as_ptr();
        unsafe {
            let inner = key.as_ref().unwrap();
            return inner;
        }
    }
}

fn main() {
    let obj = DbObject::new("some data...".into());
    let key = obj.assert_key();
    println!("Key: {}", key);
}
Is there any way to do this without using unsafe? If not, perhaps RefCell will be more practical here?
Thank you for help!
First off, if you have a &mut T, you can trivially get a &T out of it. So you can use get_mut to get a &T.
But to get a &mut T from a Cell<T> you need that cell to be mutable, as get_mut takes a &mut self parameter. And this is by design the only way to get a reference to the inner object of a cell.
By requiring the use of a &mut self method to get a reference out of a cell, you make it possible to check for exclusive access at compile time with the borrow checker. Remember that a cell enables interior mutability, and has a method set(&self, val: T), that is, a method that can modify the value of a non-mut binding! If there was a get(&self) -> &T method, the borrow checker could not ensure that you do not hold a reference to the inner object while setting the object, which would not be safe.
TL;DR: By design, you can't get a &T out of a non-mut Cell<T>. Use get_mut (which requires a mut cell), or set/replace (which work on a non-mut cell). If this is not acceptable, then consider using RefCell, which can get you a &T out of a non-mut instance, at some runtime cost.
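For a concrete sketch of what that RefCell route could look like for the DbObject from the question (same field names; Ref::map turns the borrow guard into one that dereferences to str):

use std::cell::{Ref, RefCell};

struct DbObject {
    key: RefCell<String>,
    data: String,
}

impl DbObject {
    pub fn new(data: String) -> Self {
        Self {
            key: RefCell::new("some_uuid".into()),
            data,
        }
    }

    // Takes &self and needs no unsafe: the borrow is checked at runtime instead.
    pub fn assert_key(&self) -> Ref<'_, str> {
        Ref::map(self.key.borrow(), |k| k.as_str())
    }
}

fn main() {
    let obj = DbObject::new("some data...".into());
    let key = obj.assert_key();
    println!("Key: {}", key); // the Ref guard dereferences to str
}

The visible difference is the return type: callers get a Ref guard rather than a plain &str, and the runtime borrow is released when that guard goes out of scope.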
In addition to @mcarton's answer: in order to keep interior mutability sound, that is, to disallow a mutable reference from coexisting with other references, we have three different ways:
Using unsafe with the possibility of Undefined Behavior. This is what UnsafeCell does.
Have some runtime checks, involving runtime overhead. This is the approach RefCell, RwLock and Mutex use.
Restrict the operations that can be done with the abstraction. This is what Cell, Atomic* and (the unstable) OnceCell (and thus Lazy, which uses it) do (note that the thread-safe types also have runtime overhead because they need to provide some sort of locking). Each provides a different set of allowed operations:
Cell and Atomic* do not let you get a reference to the contained value; you can only replace it as a whole (basically get() and set(), though convenience methods such as swap() are provided on top of these). Projection (cell-of-slice to slice-of-cells) is also available for Cell (field projection is possible, but not provided as part of std).
OnceCell allows you to assign only once and only then take shared references, guaranteeing that when you assign you have no references, and that while you have shared references you cannot assign anymore.
Thus, when you need to be able to take a reference into the content, you cannot choose Cell, as it was not designed for that; the obvious choice is indeed RefCell.
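To make Cell's restricted API concrete, here is a small sketch of the whole-value operations it does allow on a non-Copy type such as String; note that no reference to the inside ever exists:

use std::cell::Cell;

fn main() {
    let key: Cell<String> = Cell::new("some_uuid".into());

    // We can only move values in and out as a whole, never borrow the inside.
    let old = key.replace("another_uuid".into()); // swap in a new value, take the old one
    println!("old key: {}", old);

    let current = key.take();                     // leaves String::default() behind
    println!("current key: {}", current);
    key.set(current);                             // put it back; set also takes &self
}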
I have C code which uses a pointer to a struct. I'm trying to figure out how to pass it to CUDA without much luck.
I have
typedef struct node {           /* describes a tip species or an ancestor */
    struct node *next, *back;   /* pointers to nodes */
    /* etc... */
} node;
Then
typedef node **pointptr;
static pointptr treenode;
In my code I iterate through all of these, and I'm trying to figure out how to pass them to the kernel so I can perform the following operation:
for (i = 1; i <= nonodes; i++) {
    treenode[i - 1]->back = NULL;
    /* etc.... */
}
But I can't figure out how to pass it.
Any ideas?
The problem is that in order to use your tree inside the kernel, your next and back should probably point somewhere in device memory. Assuming you construct your tree on the host and then pass it, you could do something like:
node* traverse(node* n) {
    if (n == NULL)
        return NULL;
    node x = *n;                 /* copy the node's other fields into a host staging copy */
    node* d;
    x.back = traverse(n->back);  /* the children now point into device memory */
    x.next = traverse(n->next);
    cudaMalloc(&d, sizeof(node));
    cudaMemcpy(d, &x, sizeof(node), cudaMemcpyHostToDevice);
    return d;
}
and by calling it on the root you'd end up with a pointer to the root of the tree in device memory, which you could pass to your kernel directly. I haven't tested this code, and you'd have to write something similar to delete the tree afterwards.
Alternatively, you could store your tree nodes contiguously inside an array, with indices in the back and next instead of pointers (possibly changing them back to pointers in device code if necessary).
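Here is a minimal sketch of that flattened layout, assuming a stripped-down node that carries only the two links (the flat_node and reset_back names are invented for the example). The whole tree is one contiguous array, so a single cudaMemcpy moves it to the device, and a kernel can then do the equivalent of treenode[i]->back = NULL using an index of -1 for NULL:

#include <cuda_runtime.h>
#include <stdio.h>

#define NONODES 4

typedef struct {
    int next, back;   /* indices into the node array; -1 plays the role of NULL */
} flat_node;

__global__ void reset_back(flat_node *nodes, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        nodes[i].back = -1;   /* equivalent of treenode[i]->back = NULL */
}

int main(void) {
    flat_node h_nodes[NONODES] = {{1, 2}, {2, 3}, {3, 0}, {-1, 1}};
    flat_node *d_nodes;

    cudaMalloc(&d_nodes, sizeof(h_nodes));
    cudaMemcpy(d_nodes, h_nodes, sizeof(h_nodes), cudaMemcpyHostToDevice);

    reset_back<<<1, NONODES>>>(d_nodes, NONODES);

    cudaMemcpy(h_nodes, d_nodes, sizeof(h_nodes), cudaMemcpyDeviceToHost);
    for (int i = 0; i < NONODES; i++)
        printf("node %d: next=%d back=%d\n", i, h_nodes[i].next, h_nodes[i].back);
    cudaFree(d_nodes);
    return 0;
}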
Check this question:
Copying a multi-branch tree to GPU memory
Although it does not answer your question exactly, I think it may clear some things out and ultimately help you tackle your problem.
When using reference counting, what are possible solutions/techniques to deal with circular references?
The most well-known solution is using weak references, however many articles about the subject imply that there are other methods as well, but keep repeating the weak-referencing example. Which makes me wonder, what are these other methods?
I am not asking what are alternatives to reference counting, rather what are solutions to circular references when using reference counting.
This question isn't about any specific problem/implementation/language rather a general question.
I've looked at the problem a dozen different ways over the years, and the only solution I've found that works every time is to re-architect my solution to not use a circular reference.
Edit:
Can you expand? For example, how would you deal with a parent-child relation when the child needs to know about/access the parent? – OB OB
As I said, the only good solution is to avoid such constructs unless you are using a runtime that can deal with them safely.
That said, if you must have a tree / parent-child data structure where the child knows about the parent, you're going to have to implement your own, manually called teardown sequence (i.e. external to any destructors you might implement) that starts at the root (or at the branch you want to prune) and does a depth-first search of the tree to remove references from the leaves.
It gets complex and cumbersome, so IMO the only solution is to avoid it entirely.
Here is a solution I've seen:
Add a method to each object to tell it to release its references to the other objects, say call it Teardown().
Then you have to know who 'owns' each object, and the owner of an object must call Teardown() on it when they're done with it.
If there is a circular reference, say A <-> B, and C owns A, then when C's Teardown() is called, it calls A's Teardown, which calls Teardown on B, B then releases its reference to A, A then releases its reference to B (destroying B), and then C releases its reference to A (destroying A).
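Here is a minimal sketch of that Teardown() pattern, using C++ shared_ptr as the reference-counting mechanism (the A/B names mirror the description above; main() plays the role of C, the owner):

#include <memory>
#include <cstdio>

struct B;

struct A {
    std::shared_ptr<B> b;            // A <-> B will form the cycle
    ~A() { std::puts("A destroyed"); }
    void Teardown();
};

struct B {
    std::shared_ptr<A> a;
    ~B() { std::puts("B destroyed"); }
    void Teardown() { a.reset(); }   // B releases its reference to A
};

void A::Teardown() {
    if (b) b->Teardown();            // ask B to let go of us first
    b.reset();                       // then release our reference to B
}

int main() {
    auto a = std::make_shared<A>();  // the "owner" holds A
    a->b = std::make_shared<B>();
    a->b->a = a;                     // the circular reference

    a->Teardown();                   // the owner tears A down when done with it
    a.reset();                       // now both objects really are destroyed
}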
I guess another method, used by garbage collectors, is "mark and sweep":
Set a flag in every object instance
Traverse the graph of every instance that's reachable, clearing that flag
Every remaining instance which still has the flag set is unreachable, even if some of those instances have circular references to each other.
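A compact sketch of that pass in C++ (the Obj type is hypothetical); the three loops in sweep() correspond to the three steps listed above:

#include <vector>

struct Obj {
    std::vector<Obj*> refs;   // outgoing references; cycles are fine
    bool flag = false;
};

void clear_reachable(Obj* o) {
    if (o == nullptr || !o->flag) return;     // null, or already visited
    o->flag = false;
    for (Obj* r : o->refs) clear_reachable(r);
}

// Returns every object unreachable from the roots, cycles included.
std::vector<Obj*> sweep(const std::vector<Obj*>& heap, const std::vector<Obj*>& roots) {
    for (Obj* o : heap) o->flag = true;       // 1. set a flag in every instance
    for (Obj* r : roots) clear_reachable(r);  // 2. clear it on everything reachable
    std::vector<Obj*> garbage;
    for (Obj* o : heap)                       // 3. whatever is still flagged is unreachable
        if (o->flag) garbage.push_back(o);
    return garbage;
}

int main() {
    Obj a, b, c;
    b.refs = {&c};
    c.refs = {&b};                            // b <-> c cycle, unreachable from a
    std::vector<Obj*> heap = {&a, &b, &c};
    std::vector<Obj*> garbage = sweep(heap, {&a});  // yields b and c
    (void)garbage;
}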
I'd like to suggest a slightly different method that occurred to me; I don't know if it has any official name:
Objects by themselves don't have a reference counter. Instead, groups of one or more objects have a single reference counter for the entire group, which defines the lifetime of all the objects in the group.
In a similar fashion, references share groups with objects, or belong to a null group.
A reference to an object affects the reference count of the object's group only if the reference is external to that group.
If two objects form a circular reference, they should be made part of the same group. If two groups create a circular reference, they should be united into a single group.
Bigger groups allow more reference-freedom, but objects of the group are more likely to stay alive when no longer needed.
Put things into a hierarchy
Having weak references is one solution. The only other solution I know of is to avoid circular owning references altogether. If you have shared pointers to objects, then this means semantically that you own that object in a shared manner. If you use shared pointers only in this way, then you can hardly get cyclic references. It does not occur very often that objects own each other in a cyclic manner; instead, objects are usually connected through a hierarchical tree-like structure. This is the case I'll describe next.
Dealing with trees
If you have a tree with objects having a parent-child relationship, then the child does not need an owning reference to its parent, since the parent will outlive the child anyway. Hence a non-owning raw back pointer will do. This also applies to elements pointing to a container in which they are situated. The container should use unique pointers or values instead of shared pointers, if possible.
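A minimal C++ sketch of that layout (names invented): owning links go strictly downward via unique_ptr, and the child's pointer back to its parent is a plain raw pointer, so no ownership cycle can form:

#include <memory>
#include <vector>

struct Node {
    Node* parent = nullptr;                       // non-owning back pointer
    std::vector<std::unique_ptr<Node>> children;  // the parent owns its children

    Node* add_child() {
        children.push_back(std::make_unique<Node>());
        children.back()->parent = this;
        return children.back().get();
    }
};

int main() {
    Node root;
    Node* child = root.add_child();  // the child can reach the root via child->parent
    (void)child;
    // When root goes out of scope the whole tree is freed: no cycle, no leak.
}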
Emulating garbage collection
If you have a bunch of objects that can wildly point to each other and you want to clean up as soon as some objects are not reachable, then you might want to build a container for them and an array of root references in order to do garbage collection manually.
Use unique pointers, raw pointers and values
In the real world I have found that the actual use cases of shared pointers are very limited and they should be avoided in favor of unique pointers, raw pointers, or -- even better -- just value types. Shared pointers are usually used when you have multiple references pointing to a shared variable. Sharing causes friction and contention and should be avoided in the first place, if possible. Unique pointers and non-owning raw pointers and/or values are much easier to reason about. However, sometimes shared pointers are needed; they are also used to extend the lifetime of an object. This usually does not lead to cyclic references.
Bottom line
Use shared pointers sparingly. Prefer unique pointers and non-owning raw pointers or plain values. Shared pointers indicate shared ownership. Use them in this way. Order your objects in a hierarchy. Child objects or objects on the same level in a hierarchy should not use owning shared references to each other or to their parent, but they should use non-owning raw pointers instead.
No one has mentioned that there is a whole class of algorithms that collect cycles, not by doing mark and sweep looking for non-collectable data, but only by scanning a smaller set of possibly circular data, detecting cycles in them and collecting them without a full sweep.
To add more detail, one idea for making a set of possible nodes for scanning would be ones whose reference count was decremented but which didn't go to zero on the decrement. Only nodes to which this has happened can be the point at which a loop was cut off from the root set.
Python has a collector that does that, as does PHP.
I'm still trying to get my head around the algorithm because there are advanced versions that claim to be able to do this in parallel without stopping the program...
In any case it's not simple: it requires multiple scans, an extra set of reference counters, and decrementing elements (in the extra counter) in a "trial" to see whether the self-referential data ends up being collectable.
Some papers:
"Down for the Count? Getting Reference Counting Back in the Ring" Rifat Shahriyar, Stephen M. Blackburn and Daniel Frampton
http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rc-ismm-2012.pdf
"A Unified Theory of Garbage Collection" by David F. Bacon, Perry Cheng and V.T. Rajan
http://www.cs.virginia.edu/~cs415/reading/bacon-garbage.pdf
There are lots more topics in reference counting, such as exotic ways of reducing or getting rid of interlocked instructions. I can think of 3 ways, 2 of which have been written up.
I have always redesigned to avoid the issue. One of the common cases where this comes up is the parent-child relationship where the child needs to know about the parent. There are 2 solutions to this:
Convert the parent to a service; the parent then does not know about the children, and the parent dies when there are no more children or the main program drops the parent reference.
If the parent must have access to the children, then have a register method on the parent which accepts a pointer that is not reference counted, such as a plain object pointer, and a corresponding unregister method (see the sketch below). The child will need to call the register and unregister methods. When the parent needs to access a child, it typecasts the object pointer to the reference-counted interface.
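A small C++ sketch of that second option (class names invented): the child holds the reference-counted pointer to its parent, while the parent only keeps plain, non-counted pointers that the children register and unregister themselves with:

#include <algorithm>
#include <memory>
#include <vector>

class Child;

class Parent {
    std::vector<Child*> children_;    // raw pointers: no ownership, no cycle
public:
    void register_child(Child* c)   { children_.push_back(c); }
    void unregister_child(Child* c) {
        children_.erase(std::remove(children_.begin(), children_.end(), c),
                        children_.end());
    }
};

class Child {
    std::shared_ptr<Parent> parent_;  // the child keeps the parent alive
public:
    explicit Child(std::shared_ptr<Parent> parent) : parent_(std::move(parent)) {
        parent_->register_child(this);
    }
    ~Child() { parent_->unregister_child(this); }
};

int main() {
    auto parent = std::make_shared<Parent>();
    Child a(parent), b(parent);  // children register themselves; destructors unregister
    // The parent is destroyed only after the last child (and main's handle) lets go.
}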
When using reference counting, what are possible solutions/techniques to deal with circular references?
Three solutions:
Augment naive reference counting with a cycle detector: counts decremented to non-zero values are considered to be potential sources of cycles and the heap topology around them is searched for cycles.
Augment naive reference counting with a conventional garbage collector like mark-sweep.
Constrain the language such that its programs can only ever produce acyclic (aka unidirectional) heaps. Erlang and Mathematica do this.
Replace references with dictionary lookups and then implement your own garbage collector that can collect cycles.
i too am looking for a good solution to the circularly reference counted problem.
i was borrowing (okay, stealing) an API from World of Warcraft dealing with achievements. i was implicitly translating it into interfaces when i realized i had circular references.
Note: You can replace the word achievements with orders if you don't like achievements. But who doesn't like achievements?
There's the achievement itself:
IAchievement = interface(IUnknown)
   function GetName: string;
   function GetDescription: string;
   function GetPoints: Integer;
   function GetCompleted: Boolean;
   function GetCriteriaCount: Integer;
   function GetCriteria(Index: Integer): IAchievementCriteria;
end;
And then there's the list of criteria of the achievement:
IAchievementCriteria = interface(IUnknown)
   function GetDescription: string;
   function GetCompleted: Boolean;
   function GetQuantity: Integer;
   function GetRequiredQuantity: Integer;
end;
All achievements register themselves with a central IAchievementController:
IAchievementController = interface
   procedure RegisterAchievement(Achievement: IAchievement);
   procedure UnregisterAchievement(Achievement: IAchievement);
end;
And the controller can then be used to get a list of all the achievements:
IAchievementController = interface
   procedure RegisterAchievement(Achievement: IAchievement);
   procedure UnregisterAchievement(Achievement: IAchievement);
   function GetAchievementCount(): Integer;
   function GetAchievement(Index: Integer): IAchievement;
end;
The idea was going to be that as something interesting happened, the system would call the IAchievementController and notify it that something interesting happened:
IAchievementController = interface
   ...
   procedure Notify(eventType: Integer; gParam: TGUID; nParam: Integer);
end;
And when an event happens, the controller will iterate through each child and notify them of the event through their own Notify method:
IAchievement = interface(IUnknown)
   function GetName: string;
   ...
   function GetCriteriaCount: Integer;
   function GetCriteria(Index: Integer): IAchievementCriteria;
   procedure Notify(eventType: Integer; gParam: TGUID; nParam: Integer);
end;
If the Achievement object decides the event is something it would be interested in it will notify its child criteria:
IAchievementCriteria = interface(IUnknown)
   function GetDescription: string;
   ...
   procedure Notify(eventType: Integer; gParam: TGUID; nParam: Integer);
end;
Up until now the dependency graph has always been top-down:
IAchievementController --> IAchievement --> IAchievementCriteria
But what happens when the achievement's criteria have been met? The Criteria object was going to have to notify its parent Achievement:
IAchievementController --> IAchievement --> IAchievementCriteria
                                ^                    |
                                |                    |
                                +--------------------+
Meaning that the Criteria will need a reference to its parent; the two are now referencing each other - memory leak.
And when an achievement is finally completed, it is going to have to notify its parent controller, so it can update views:
IAchievementController --> IAchievement --> IAchievementCriteria
          ^                  |      ^                |
          |                  |      |                |
          +------------------+      +----------------+
Now the Controller and its child Achievements circularly reference each other - more memory leaks.
i thought that perhaps the Criteria object could instead notify the Controller, removing the reference to its parent. But we still have a circular reference, it just takes longer:
IAchievementController --> IAchievement --> IAchievementCriteria
          ^                  |                       |
          |                  |                       |
          +<-----------------+                       |
          |                                          |
          +------------------------------------------+
The World of Warcraft solution
Now the World of Warcraft API is not object-oriented friendly. But it does solve the circular reference problem:
Do not pass references to the Controller. Have a single, global, singleton Controller class. That way an achievement doesn't have to reference the controller, just use it.
Cons: Makes testing and mocking impossible, because you have to have a known global variable.
An achievement doesn't know its list of criteria. If you want the Criteria for an Achievement you ask the Controller for them:
IAchievementController = interface(IUnknown)
   function GetAchievementCriteriaCount(AchievementGUID: TGUID): Integer;
   function GetAchievementCriteria(Index: Integer): IAchievementCriteria;
end;
Cons: An Achievement can no longer decide to pass notifications to its Criteria, because it doesn't have any criteria. You now have to register Criteria with the Controller.
When a Criteria is completed, it notifies the Controller, who notifies the Achievement:
IAchievementController --> IAchievement        IAchievementCriteria
          ^                                            |
          |                                            |
          +--------------------------------------------+
Cons: Makes my head hurt.
i'm sure a Teardown method is much more desirable than re-architecting an entire system into a horribly messy API.
But, like you wonder, perhaps there's a better way.
If you need to store the cyclic data, for a snapshot, into a string,
I attach an is_cyclic boolean to any object that may be cyclic.
Step 1:
When stringifying the data to JSON, I push any object flagged is_cyclic that hasn't been used yet into an array and save its index into the string (any already-used objects are replaced with their existing index).
Step 2: I traverse the array of objects, setting any cyclic children to the saved index, or pushing any new objects onto the array. Then I stringify that array into a JSON string as well.
NOTE: Pushing new cyclic objects to the end of the array forces recursion until all cyclic references are removed.
Step 3: Last, I combine both JSON strings into a single string.
Here is a javascript fiddle...
https://jsfiddle.net/7uondjhe/5/
function my_json(item) {
  var parse_key = 'restore_reference',
      stringify_key = 'is_cyclic';
  var referenced_array = [];

  var json_replacer = function(key,value) {
    if(typeof value == 'object' && value[stringify_key]) {
      var index = referenced_array.indexOf(value);
      if(index == -1) {
        index = referenced_array.length;
        referenced_array.push(value);
      };
      return {
        [parse_key]: index
      }
    }
    return value;
  }

  var json_reviver = function(key, value) {
    if(typeof value == 'object' && value[parse_key] >= 0) {
      return referenced_array[value[parse_key]];
    }
    return value;
  }

  var unflatten_recursive = function(item, level) {
    if(!level) level = 1;
    for(var key in item) {
      if(!item.hasOwnProperty(key)) continue;
      var value = item[key];
      if(typeof value !== 'object') continue;
      if(level < 2 || !value.hasOwnProperty(parse_key)) {
        unflatten_recursive(value, level+1);
        continue;
      }
      var index = value[parse_key];
      item[key] = referenced_array[index];
    }
  };

  var flatten_recursive = function(item, level) {
    if(!level) level = 1;
    for(var key in item) {
      if(!item.hasOwnProperty(key)) continue;
      var value = item[key];
      if(typeof value !== 'object') continue;
      if(level < 2 || !value[stringify_key]) {
        flatten_recursive(value, level+1);
        continue;
      }
      var index = referenced_array.indexOf(value);
      if(index == -1) (item[key] = {})[parse_key] = referenced_array.push(value)-1;
      else (item[key] = {})[parse_key] = index;
    }
  };

  return {
    clone: function(){
      return JSON.parse(JSON.stringify(item,json_replacer),json_reviver)
    },
    parse: function() {
      var object_of_json_strings = JSON.parse(item);
      referenced_array = JSON.parse(object_of_json_strings.references);
      unflatten_recursive(referenced_array);
      return JSON.parse(object_of_json_strings.data,json_reviver);
    },
    stringify: function() {
      var data = JSON.stringify(item,json_replacer);
      flatten_recursive(referenced_array);
      return JSON.stringify({
        data: data,
        references: JSON.stringify(referenced_array)
      });
    }
  }
}
Here are some techniques described in Algorithms for Dynamic Memory Management by R. Jones and R. Lins. Both suggest that you can look at the cyclic structures as a whole in some way or another.
Friedman and Wise
The first way is one suggested by Friedman and Wise. It is for handling cyclic references when implementing recursive calls in functional programming languages. They suggest that you can observe the cycle as a single entity with a root object. To do that, you should be able to:
Create the cyclic structure all at once
Access the cyclic structure only through a single node - a root, and mark which reference closes the cycle
If you need to reuse just a part of the cyclic structure, you shouldn't add external references but instead make copies of what you need
That way you should be able to count the structure as a single entity and whenever the RC to the root falls to zero - collect it.
This should be the paper for it if anyone is interested in more details - https://www.academia.edu/49293107/Reference_counting_can_manage_the_circular_invironments_of_mutual_recursion
Bobrow's technique
Here you could have more than one reference to the cyclic structure. We just distinguish between internal and external references for it.
Overall, all allocated objects are assigned by the programmer to a group. And all groups are reference counted. We distinguish between internal and external references for the group and of course - objects can be moved between groups. In this case, intra-group cycles are easily reclaimed whenever there are no more external references to the group. However, inter-group cyclic structures should still be an issue.
If I'm not mistaken, this should be the original paper - https://dl.acm.org/doi/pdf/10.1145/357103.357104
I'm still searching for other general-purpose alternatives besides those algorithms and weak pointers.
Y Combinator
I wrote a programming language once in which every object was immutable. As such, an object could only contain pointers to objects that were created earlier. Since all reference pointed backwards in time, there could not possibly be any cyclic references, and reference counting was a perfectly viable way of managing memory.
The question then is, "How do you create self-referencing data structures without circular references?" Well, functional programming has been doing this trick forever using the fixed-point/Y combinator to write recursive functions when you can only refer to previously-written functions. See: https://en.wikipedia.org/wiki/Fixed-point_combinator
That Wikipedia page is complicated, but the concept is really not that complicated. How it works in terms of data structures is like this:
Let's say you want to make a Map<String,Object>, where the values in the map can refer to other values in the map. Well, instead of actually storing those objects, you store functions that generate those objects on-demand given a pointer to the map.
So you might say object = map.get(key), and what this will do is call a private method like _getDefinition(key) that returns this function and calls it:
Object get(String key) {
    var definition = _getDefinition(key);
    return definition == null ? null : definition(this);
}
So the map has a reference to the function that defines the value. This function doesn't need to have a reference to the map, because it will be passed the map as an argument.
When the definition is called, it returns a new object that does have a pointer to the map in it somewhere, but the map itself does not have a pointer to this object, so there are no circular references.
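For concreteness, here is a small sketch of that idea in Java (the LazyMap name and API are invented): the map stores definition functions rather than finished objects, and every definition receives the map as an argument only when it runs, so the map never holds a reference to the objects it produces:

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class LazyMap {
    // Definitions, not objects: each one builds its value on demand.
    private final Map<String, Function<LazyMap, Object>> definitions = new HashMap<>();

    void define(String key, Function<LazyMap, Object> definition) {
        definitions.put(key, definition);
    }

    Object get(String key) {
        Function<LazyMap, Object> definition = definitions.get(key);
        return definition == null ? null : definition.apply(this);
    }
}

class LazyMapDemo {
    public static void main(String[] args) {
        LazyMap map = new LazyMap();
        map.define("greeting", m -> "hello, " + m.get("name")); // "refers to" another entry
        map.define("name", m -> "world");
        System.out.println(map.get("greeting")); // hello, world
    }
}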
There are a couple of ways I know of to work around this:
The first (and preferred) one is simply extracting the common code into a third assembly, and making both projects reference that one.
The second one is adding the reference as a "file reference" (dll) instead of a "project reference".
Hope this helps
I need to basically merge a Binary Heap, and Linear Probing Hashtable to make a "compound" data structure, which has the functionality of a heap, with the sorting power of a hashtable.
What I need to do is create two 2-dimensional arrays, one for each data structure (Binary Heap and Hash), then link them to each other with pointers so that when I change things, such as deleting a value in the Binary Heap, it also gets deleted in the Hash table.
Therefore, I need to have one row of the Heap array pointing from the Heap to the Hashtable, and one row of the Hashtable array pointing from the Hashtable to the Heap.
Create a container that contains both, with accessor functions/methods (depending on your language of implementation) that performs all the operations required of your algorithm.
IE:
Delete from container: does a delete from the binary heap and from the hash table.
Add to container: adds to the binary heap and to the hash table.
EDIT:
Oh, an assignment - fun! :)
I'd do this:
Still implement a container. But instead of using a standard library for the btree/hash, implement them like this:
Make a type that can be put in your data element and that holds a pointer to the BTree node and the Hashtable node the data element lives in.
To delete a data element, given a pointer to it, you can perform the delete algorithm on the btree (navigate to the parent from the node pointer, delete the child (left or right), restructure the tree) and on the hash table (delete from the hash list). When adding a value, perform the add algorithm on the btree and the hash, but be sure you update the node pointers in the data before you return.
Some pseudocode (I'll use C, but I'm not sure what language you're using):
typedef struct
{
    BTreeNode* btree;
    HashNode* hash;
} ContainerNode;
to put data in your container:
typedef struct
{
    ContainerNode node;
    void* data; /* whatever the data is */
} Data;
a BTreeNode has something like:
typedef struct _BTreeNode
{
    struct _BTreeNode* parent;
    struct _BTreeNode* left;
    struct _BTreeNode* right;
} BTreeNode;
and a HashNode has something like:
typedef struct _HashNode
{
    struct _HashNode* next; /* ala singly linked list */
} HashNode;
and your BTree would be a pointer to a BTreeNode and your hashtable would be an array of pointers to HashNodes. Like this:
typedef struct
{
    BTreeNode* btree;
    HashNode* hashtable[HASHTABLESIZE];
} Container;
void delete(Container* c, ContainerNode* n)
{
    delete_btree_node(n->btree);
    delete_hashnode(n->hash);
}
ContainerNode* add(Container* c, void* data)
{
    ContainerNode* n = malloc(sizeof(ContainerNode));
    n->btree = add_to_btree(n);
    n->hash = add_to_hash(n);
    return n;  /* hand the new node back to the caller */
}
I'll let you complete those other functions (can't do the whole assignment for you ;) )
Why bother with the links?
You have two associative structures, so just duplicate any operation on one to the other (ensuring that if one operation throws, you either crash the whole thing or leave the object in a valid state, if you care about such things).
Unless you can make use of the structure of one to help you with the other (and I don't see how you can, since either one can entirely rearrange its internal state on any modification operation), this is just as effective and much simpler.
Of course this means that the O() cost of any modification operation is the cost of the most expensive one, and memory costs are doubled, but that is true of the original plan unless there is some trick I'm missing.