Rudimentary Tree, and Pointers, in Rust - recursion

Coming from a scripting language background with some C, trying to 'learn' Rust leads me to question my competence. I'm trying to figure out how to change an owned pointer, and struggling to do it.
Besides copying in from the extra libs, I can't figure out the recursion I need on a binary tree. Particularly, I don't know how to swap out the pointer branches. Whereas with a linked list I can cheat and use a temporary vector to return a new list, or prepend a new Cons(value, ~Cons) to the list head, branches have got me boggled.
enum NaiveTreeNode {
NNil,
NNode(~NaiveTreeNode, ~NaiveTreeNode, int, char)
// left right key val
}
impl NaiveTreeNode {
fn eq(first_node: &NaiveTreeNode, second_node: &NaiveTreeNode) -> bool {
match (first_node, second_node) {
(&NNil, &NNil) => true,
( &NNode( ~ref left_lval, ~ref left_rval, left_leafkey, left_leafval ),
&NNode( ~ref right_lval, ~ref right_rval, right_leafkey, right_leafval )
) if left_leafkey == right_leafkey && left_leafval == right_leafval => {
NaiveTreeNode::eq(left_lval, right_lval) && NaiveTreeNode::eq(left_rval, right_rval)
},
_ => false
}
}
fn add_branch(&mut self, node_to_add: ~NaiveTreeNode) {
match (self, node_to_add) {
(&NaiveTreeNode(~NNil, ~ref r_branch, leaf_key, leaf_val), ~NaiveTreeNode(_, _, new_node_key, _) )
if leaf_key > new_node_key => self = &NaiveTreeNode(node_to_add, *r_branch, leaf_key, leaf_val),
(&NaiveTreeNode(~ref l_branch, ~NNil, leaf_key, leaf_val), ~NaiveTreeNode(_, _, new_node_key, _))
if leaf_key < new_node_key => self = &NaiveTreeNode(*l_branch, node_to_add, leaf_key, leaf_val),
(&NaiveTreeNode(~ref l_branch, _, leaf_key, _), ~NaiveTreeNode(_, _, new_node_key, _))
if leaf_key > new_node_key => self.add_branch(l_branch, node_to_add),
(&NaiveTreeNode(_, ~ref r_branch, leaf_key, _), ~NaiveTreeNode(_, _, new_node_key, _))
if leaf_key < new_node_key => self.add_branch(l_branch, node_to_add),
(_, ~NNil) => fail!("NNil branch. failing"),
(&NNil, _) => fail!("NNil trunk. failing"),
_ => fail!("something is wrong. failing.")
};
}
}
The compiler throws 11 errors on this, and when I type it out, it feels like pseudocode. I'm frustrated because I feel okay implementing a tree with C pointers.
What I'm trying to do is update the pointers in-place--this is part of the reason I'm using them, right?--rather than copying the entire tree every time I want to make a change. But I don't even know how to get to them.
I'm not sure how I'd go about doing this with structs rather than enums. I've looked at the Treemap lib, but it seems to introduce too much complexity for what I want to accomplish right now, which is proof of concept--I might be trying to run when I should crawl, though!

I believe that you would do better with a different data representation:
struct NaiveTreeNode {
left: Option<~NaiveTreeNode>,
right: Option<~NaiveTreeNode>,
key: int,
val: char,
}
This will be easier to work with and is slightly more efficient (Option<~T> can be represented as a nullable pointer, while your current solution has a leaf node still requiring a pointer lookup to check if it's NNil).
You don't need to implement your eq method yourself; an implementation of the Eq trait can be derived by putting #[deriving(Eq)] immediately before the struct.
As for your add_branch method, you must understand that self.add_branch is a method bound to self. Calling self.add_branch(l_branch, node_to_add) is invalid because you are passing two arguments to a method that expects one. What you meant was l_branch.add_branch(node_to_add).
I've restructured the add_branch method significantly; here's the complete code that I would write:
#[deriving(Eq)]
struct NaiveTreeNode {
    left: Option<~NaiveTreeNode>,
    right: Option<~NaiveTreeNode>,
    key: int,
    val: char,
}
impl NaiveTreeNode {
    fn add_branch(&mut self, node: ~NaiveTreeNode) {
        // cmp takes its argument by reference
        match (self.key.cmp(&node.key), self.left, self.right) {
            (Greater, None, _) => self.left = Some(node),
            (Greater, Some(~ref mut left), _) => left.add_branch(node),
            (Less, _, None) => self.right = Some(node),
            (Less, _, Some(~ref mut right)) => right.add_branch(node),
            // the field is named val, not value
            (Equal, _, _) => fail!("this tree already has a node with key {} \
                                    (value {}, attempted value {})",
                                   self.key, self.val, node.val),
        }
    }
}
The match could also be expanded to the following, if you desired:
match self.key.cmp(&node.key) {
    Greater => match self.left {
        None => self.left = Some(node),
        Some(~ref mut left) => left.add_branch(node),
    },
    Less => match self.right {
        None => self.right = Some(node),
        Some(~ref mut right) => right.add_branch(node),
    },
    Equal => fail!("this tree already has a node with key {} \
                    (value {}, attempted value {})",
                   self.key, self.val, node.val),
}
If there's anything you don't understand in this code, just holler and I'll explain it.
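The answer above uses pre-1.0 syntax (~T, int, fail!, #[deriving(Eq)]). For readers on current Rust, here is a minimal sketch of the same nested-match approach, assuming ~T becomes Box<T>, int becomes i32, fail! becomes panic!, and the derive is spelled #[derive(PartialEq)]. The constructor new is a hypothetical helper added only for the illustration; it is not part of the original answer.

```rust
use std::cmp::Ordering::{Equal, Greater, Less};

#[derive(Debug, PartialEq)]
struct NaiveTreeNode {
    left: Option<Box<NaiveTreeNode>>,
    right: Option<Box<NaiveTreeNode>>,
    key: i32,
    val: char,
}

impl NaiveTreeNode {
    // Hypothetical convenience constructor for a node with no children.
    fn new(key: i32, val: char) -> NaiveTreeNode {
        NaiveTreeNode { left: None, right: None, key, val }
    }

    // Walk down the tree and attach `node` in place; nothing is copied.
    fn add_branch(&mut self, node: Box<NaiveTreeNode>) {
        match self.key.cmp(&node.key) {
            // Our key is larger, so the new node belongs on the left.
            Greater => match self.left {
                None => self.left = Some(node),
                Some(ref mut left) => left.add_branch(node),
            },
            // Our key is smaller, so the new node belongs on the right.
            Less => match self.right {
                None => self.right = Some(node),
                Some(ref mut right) => right.add_branch(node),
            },
            Equal => panic!(
                "this tree already has a node with key {} \
                 (value {}, attempted value {})",
                self.key, self.val, node.val
            ),
        }
    }
}
```

Usage would look like creating a root with NaiveTreeNode::new(5, 'a') and calling root.add_branch(Box::new(NaiveTreeNode::new(3, 'b'))), after which root.left holds the node with key 3.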

Related

Why does the decorator pattern work for owned types, but causes a trait evaluation overflow (E0275) for references? [duplicate]

While experimenting with the decorator design pattern in Rust, I came across what I believe may be a compiler error, but I am too new to the language to be confident.
I think that the following example code should not generate a recursive trait E0275 error.
A simple type that can be converted to an i64:
enum MyNumbers {
Zero,
One,
Two,
}
impl From<MyNumbers> for i64 {
fn from(n: MyNumbers) -> Self {
match n {
MyNumbers::Zero => 0,
MyNumbers::One => 1,
MyNumbers::Two => 2,
}
}
}
And here, a struct that might be used in a decorator:
struct MyWrapper<N> {
n: N,
}
MyWrapper<N> can be converted to an i64 if N can be converted to i64.
impl<N> From<MyWrapper<N>> for i64
where
N: Into<i64>,
{
fn from(wrapper: MyWrapper<N>) -> Self {
wrapper.n.into()
}
}
This works as I expect.
Now I want to be able to construct an i64 from MyWrapper without consuming it. I change my From trait implementations to operate on references:
impl From<&MyNumbers> for i64 {
fn from(n: &MyNumbers) -> Self {
match n {
MyNumbers::Zero => 0,
MyNumbers::One => 1,
MyNumbers::Two => 2,
}
}
}
impl<'a, N> From<&'a MyWrapper<N>> for i64
where
&'a N: Into<i64>,
{
fn from(wrapper: &'a MyWrapper<N>) -> Self {
(&wrapper.n).into()
}
}
But now...
error[E0275]: overflow evaluating the requirement `i64: From<&MyWrapper<_>>`
--> src/main.rs:34:13
|
34 | let i = i64::from(&w);
| ^^^^^^^^^
|
= help: consider adding a `#![recursion_limit="256"]` attribute to your crate (`playground`)
= note: required because of the requirements on the impl of `Into<i64>` for `&MyWrapper<_>`
= note: required because of the requirements on the impl of `From<&MyWrapper<MyWrapper<_>>>` for `i64`
= note: 126 redundant requirements hidden
= note: required because of the requirements on the impl of `From<&MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<MyWrapper<_>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>` for `i64`
Why was this fine for owned types, but not references?
Also, this really confuses me:
With no calls to i64::from() at all, the code compiles fine (Playground).
A call to a different i64::from fails with the same error, even though it should not be evaluating my code at all (Playground):
fn main() {
i64::from(32i32);
}
Compiler bug. https://github.com/rust-lang/rust/issues/37748
As a workaround, I have had to resort to #[derive(Copy)] to prevent my types from being consumed at each level of the conversion. Perhaps From/Into is poorly suited to this pattern, at least until the language matures a little more.
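Another possible workaround, sketched here under the assumption that you do not strictly need From/Into, is to give the decorator its own borrowing conversion trait. ToI64 and to_i64 are hypothetical names invented for this illustration; they are not part of the standard library. Because the bound is on N itself rather than on &'a N: Into<i64>, this sketch does not go through From/Into at all.

```rust
// Hypothetical trait: convert to i64 without consuming the value.
trait ToI64 {
    fn to_i64(&self) -> i64;
}

enum MyNumbers {
    Zero,
    One,
    Two,
}

impl ToI64 for MyNumbers {
    fn to_i64(&self) -> i64 {
        match self {
            MyNumbers::Zero => 0,
            MyNumbers::One => 1,
            MyNumbers::Two => 2,
        }
    }
}

struct MyWrapper<N> {
    n: N,
}

// The wrapper converts by delegating to the wrapped value.
impl<N: ToI64> ToI64 for MyWrapper<N> {
    fn to_i64(&self) -> i64 {
        self.n.to_i64()
    }
}
```

With this in place, a value such as MyWrapper { n: MyWrapper { n: MyNumbers::Two } } can be converted repeatedly through to_i64() and remains usable afterwards.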

Is it possible to pattern match on a (nested) Vec in Rust?

A library presents me with a deeply nested data structure that I would like to match on. It contains Vecs internally. I would like something like one of the commented out lines to work:
struct Foo {
bar: Vec<bool>,
}
let foo = Foo {
bar: vec![true, false],
};
match foo {
// Foo{bar:[true,false]} => Ok(()), // expected an array or slice, found Vec<bool>
// Foo{bar:&[true, false]} => Ok(()), // expected struct `Vec`, found reference
// Foo{bar:vec![true,false]} => Ok(()), // Arbitrary expressions aren't allowed in patterns
Foo { bar: v } => match v.as_slice() {
[true, false] => Ok(()),
_ => bail!("match failed!"),
}, // Ugly when nesting deeply
_ => bail!("match failed!"),
}
The match statement can be broken into smaller pieces that first do some dereferencing/unpacking on the value being matched, turning it into a slice first. I am currently doing this in my code, but it is quite ugly, and obscures the structure of the thing being destructured.
The issue is that Vec is in the standard library, not part of the language, but I'm still hoping there is some pattern matching magic that can get around this.
No, pattern-matching Vecs (let alone in place) is not currently supported. Rust currently supports only a somewhat limited form of slice patterns, and even that is fairly recent (1.42).
You could use some of Rust's other facilities, such as if let or matches!, to make the code slightly terser, but that's about it:
match foo {
    Foo { bar: v } if matches!(v.as_slice(), [true, false]) => Ok(()),
    _ => bail!("match failed!"),
}
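For reference, here is a self-contained sketch of the matches! guard shown above. It assumes a plain Result<(), String> in place of the bail! macro from the question, and the function name check is made up for the example.

```rust
struct Foo {
    bar: Vec<bool>,
}

// Hypothetical helper: succeeds only when `bar` is exactly [true, false].
fn check(foo: &Foo) -> Result<(), String> {
    match foo {
        // The Vec cannot appear in the pattern itself, so the guard
        // views it as a slice and applies a slice pattern to that.
        Foo { bar } if matches!(bar.as_slice(), [true, false]) => Ok(()),
        _ => Err("match failed!".to_string()),
    }
}
```

The struct pattern still shows the outer shape of the data, and only the Vec field is pushed into the guard.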

Rust match mutable enum reference with vectors

I'm trying to change an enum's named property but getting this error.
cannot move out of a mutable reference rustc(E0507)
parser.rs(323, 36): data moved here
parser.rs(323, 36): move occurs because `statements` has type `std::vec::Vec<std::boxed::Box<ast::ast::StatementType>>`, which does not implement the `Copy` trait
I saw that we can change enum's named props with match statements. But I couldn't understand why there's a move occurrence, since I'm borrowing the enum itself. Here's the code:
match &mut block {
StatementType::Block { mut statements, .. } => {
statements = block_statements;
},
_ => panic!()
};
return block;
I've tried mem::swap too but still it's the same error:
match &mut block {
StatementType::Block { mut statements, .. } => {
// statements = block_statements;
std::mem::swap(&mut statements, &mut block_statements);
},
_ => panic!()
};
return block;
BUT when I do this:
std::mem::swap(&mut *statements, &mut *block_statements);
The error changes to:
the size for values of type `[std::boxed::Box<ast::ast::StatementType>]` cannot be known at compilation time
doesn't have a size known at compile-time
Types are:
StatementType is an enum that derives Clone
Block is mutable variable of StatementType
Block's statements is a variable of Vec<Box<StatementType>>
block_statements is another variable of Vec<Box<StatementType>>
Please do not say that it happens because statements' type is Vector: come with a solution as I can read error messages too.
You have to think what the type of statements is and what you would like it to be.
With the code as you wrote it, it is of type Vec<_> (sorry, I said it), but since the match captures the block by reference, it cannot take the contents by value, hence the error. Note that the error is not in the assignment but in the match brace itself:
error[E0507]: cannot move out of a mutable reference
--> src/main.rs:15:11
|
15 | match &mut block {
| ^^^^^^^^^^
16 | StatementType::Block { mut statements, .. } => {
| --------------
| |
| data moved here
| move occurs because `statements` has type `std::vec::Vec<std::boxed::Box<StatementType>>`, which does not implement the `Copy` trait
You would like statements to be of type &mut Vec<_>, of course, and you get that by using the ref mut capture mode:
match block {
    StatementType::Block { ref mut statements, .. } => {
        *statements = block_statements;
    },
    _ => panic!()
};
And remember to use *statements when assigning, as it is now a reference. You could also use mem::swap if you want, of course:
std::mem::swap(statements, &mut block_statements);
But note that with ref mut you do not need to match &mut block; you can match block directly.
There is also a feature called match ergonomics that lets you match against a reference and omit the ref mut capture mode, which makes your code easier to write and understand:
match &mut block {
    StatementType::Block { statements, .. } => {
        *statements = block_statements;
    },
    _ => panic!()
};
The problem in your original code is that if you specify any capture mode, match ergonomics is disabled.
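Both variants can be tried side by side in the sketch below. The enum is a stand-in for the asker's type: the label field, the Empty variant, and the two function names are assumptions made up for the example.

```rust
#[derive(Debug, Clone, PartialEq)]
enum StatementType {
    // `label` is a hypothetical extra field standing in for the `..` part.
    Block { statements: Vec<Box<StatementType>>, label: u32 },
    Empty,
}

// Variant 1: match the value itself and borrow the field with `ref mut`.
fn replace_with_ref_mut(
    mut block: StatementType,
    block_statements: Vec<Box<StatementType>>,
) -> StatementType {
    match block {
        StatementType::Block { ref mut statements, .. } => {
            *statements = block_statements;
        }
        _ => panic!("not a block"),
    }
    block
}

// Variant 2: match on `&mut block`; match ergonomics makes
// `statements` a `&mut Vec<_>` without any explicit capture mode.
fn replace_with_ergonomics(
    mut block: StatementType,
    block_statements: Vec<Box<StatementType>>,
) -> StatementType {
    match &mut block {
        StatementType::Block { statements, .. } => {
            *statements = block_statements;
        }
        _ => panic!("not a block"),
    }
    block
}
```

In both functions statements is a mutable reference into block, so the assignment through *statements updates the enum in place and block can be returned afterwards.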

Fastest way to check if a given word exists in a fixed list of words

This may not be specific to Rust, although it's the language I'm currently focusing on.
I'm writing a function to parse a language (MySQL) into tokens and output them in a formatted way, and part of that includes looking up the current word token to see if it's a name, a function, or a column/table name.
Currently, I'm using a match statement like
pub fn is_word(word: &str) -> bool {
match word {
"accessible"
| "account"
| "action"
| "active"
| "add"
// ...
| "year"
| "year_month"
| "zerofill" => true,
_ => false,
}
}
The actual list is much, much longer.
Is this the best way to go about this? I've tried using a HashMap as well with .contains_key(), but that was notably slower.
My HashMap implementation looks like this:
use std::collections::HashMap;
lazy_static! {
static ref words: HashMap<&'static str, u8> = hashmap!{
"accessible" => 0,
"account" => 0,
"action" => 0,
"active" => 0,
"add" => 0,
// ...
"year" => 0,
"year_month" => 0,
"zerofill" => 0,
};
}
pub fn is_word(word: &str) -> bool {
words.contains_key(word)
}
Since your list is fixed at compile time, use a perfect hash, such as that provided by the phf crate:
build.rs
extern crate phf_codegen;
use std::env;
use std::fs::File;
use std::io::{BufWriter, Write};
use std::path::Path;
fn main() {
let path = Path::new(&env::var("OUT_DIR").unwrap()).join("codegen.rs");
let mut file = BufWriter::new(File::create(&path).unwrap());
write!(&mut file, "static KEYWORDS: phf::Set<&'static str> = ").unwrap();
phf_codegen::Set::new()
.entry("accessible")
.entry("account")
.entry("action")
.entry("active")
.entry("add")
// ...
.entry("year")
.entry("year_month")
.entry("zerofill")
.build(&mut file)
.unwrap();
write!(&mut file, ";\n").unwrap();
}
src/main.rs
extern crate phf;
include!(concat!(env!("OUT_DIR"), "/codegen.rs"));
pub fn is_word(word: &str) -> bool {
KEYWORDS.contains(word)
}
According to the benchmarking code you've provided, this is at least as fast.
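If adding the phf crate and a build script is not an option, one dependency-free alternative is a sorted static slice searched with binary_search. This is a different technique from the perfect hash above, it has not been benchmarked here, and the sketch contains only the words visible in the question; the real list would have to be kept in sorted order.

```rust
// Must stay sorted in byte order, because binary_search relies on it.
static KEYWORDS: &[&str] = &[
    "accessible",
    "account",
    "action",
    "active",
    "add",
    // ...
    "year",
    "year_month",
    "zerofill",
];

pub fn is_word(word: &str) -> bool {
    // O(log n) string comparisons against the sorted list.
    KEYWORDS.binary_search(&word).is_ok()
}

// Helper to verify the ordering invariant, e.g. from a unit test.
pub fn keywords_are_sorted() -> bool {
    KEYWORDS.windows(2).all(|pair| pair[0] < pair[1])
}
```

Whether this beats the match statement or the phf set depends on the list and the inputs, so it is worth running it through the same benchmark before choosing.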

Scala: Most concise conversion of a CSS color string to RGB integers

I am trying to get the RGB values of a CSS color string and wonder how good my code is:
object Color {
def stringToInts(colorString: String): Option[(Int, Int, Int)] = {
val trimmedColorString: String = colorString.trim.replaceAll("#", "")
val longColorString: Option[String] = trimmedColorString.length match {
// allow only strings with either 3 or 6 letters
case 3 => Some(trimmedColorString.flatMap(character => s"$character$character"))
case 6 => Some(trimmedColorString)
case _ => None
}
val values: Option[Seq[Int]] = longColorString.map(_
.foldLeft(Seq[String]())((accu, character) => accu.lastOption.map(_.toSeq) match {
case Some(Seq(_, _)) => accu :+ s"$character" // previous value is complete => start with succeeding
case Some(Seq(c)) => accu.dropRight(1) :+ s"$c$character" // complete the previous value
case _ => Seq(s"$character") // start with an incomplete first value
})
.flatMap(hexString => scala.util.Try(Integer.parseInt(hexString, 16)).toOption)
// .flatMap(hexString => try {
// Some(Integer.parseInt(hexString, 16))
// } catch {
// case _: Exception => None
// })
)
values.flatMap(values => values.size match {
case 3 => Some((values.head, values(1), values(2)))
case _ => None
})
}
}
// example:
println(Color.stringToInts("#abc")) // prints Some((170,187,204))
You may run that example on https://scastie.scala-lang.org
The parts of that code I am most unsure about are
the match in the foldLeft (is it a good idea to use string interpolation or can the code be written shorter without string interpolation?)
Integer.parseInt in conjunction with try (can I use a prettier alternative in Scala?) (solved thanks to excellent comment by Xavier Guihot)
But I expect most parts of my code to be improvable. I do not want to introduce new libraries in addition to com.itextpdf to shorten my code, but using com.itextpdf functions is an option. (The result of stringToInts is going to be converted into a new com.itextpdf.kernel.colors.DeviceRgb(...), thus I have installed com.itextpdf anyway.)
Tests defining the expected function:
import org.scalatest.{BeforeAndAfterEach, FunSuite}
class ColorTest extends FunSuite with BeforeAndAfterEach {
test("shorthand mixed case color") {
val actual: Option[(Int, Int, Int)] = Color.stringToInts("#Fa#F")
val expected = (255, 170, 255)
assert(actual === Some(expected))
}
test("mixed case color") {
val actual: Option[(Int, Int, Int)] = Color.stringToInts("#1D9a06")
val expected = (29, 154, 6)
assert(actual === Some(expected))
}
test("too short long color") {
val actual: Option[(Int, Int, Int)] = Color.stringToInts("#1D9a6")
assert(actual === None)
}
test("too long shorthand color") {
val actual: Option[(Int, Int, Int)] = Color.stringToInts("#1D9a")
assert(actual === None)
}
test("invalid color") {
val actual: Option[(Int, Int, Int)] = Color.stringToInts("#1D9g06")
assert(actual === None)
}
}
At the moment of writing this answer the other answers don't properly handle rgb(), rgba() and named colors cases. Color strings that start with hashes (#) are only a part of the deal.
You already have iText7 as a dependency, and iText7 has a pdfHTML add-on, which means the logic for parsing CSS colors must be somewhere in iText7 and, more importantly, must handle a wide range of CSS color cases. The question is only about finding the right place. Fortunately, this API is public and easy to use.
The method you are interested in is WebColors.getRGBAColor() from package com.itextpdf.kernel.colors, which accepts a CSS color string and returns a 4-element array with R, G, B, A values (the last one stands for alpha, i.e. transparency).
You can use those values to create a color right away (code in Java):
float[] rgbaColor = WebColors.getRGBAColor("#ababab");
Color color = new DeviceRgb(rgbaColor[0], rgbaColor[1], rgbaColor[2]);
In Scala it should look something like this:
val rgbaColor = WebColors.getRGBAColor("#ababab");
val color = new DeviceRgb(rgbaColor(0), rgbaColor(1), rgbaColor(2));
I came up with this fun answer (untested); I guess the biggest help for you will be the use of sliding(2,2) instead of the foldLeft.
def stringToInts(colorString: String): Option[(Int, Int, Int)] = {
val trimmedString: String => String = _.trim.replaceAll("#", "")
val validString: String => Option[String] = s => s.length match {
case 3 => Some(s.flatMap(c => s"$c$c"))
case 6 => Some(s)
case _ => None
}
val hex2rgb: String => List[Option[Int]] = _.sliding(2, 2).toList
.map(hex => Try(Integer.parseInt(hex, 16)).toOption)
val listOpt2OptTriple: List[Option[Int]] => Option[(Int, Int, Int)] = {
case Some(r) :: Some(g) :: Some(b) :: Nil => Some((r, g, b))
case _ => None
}
for {
valid <- validString(trimmedString(colorString))
rgb = hex2rgb(valid)
answer <- listOpt2OptTriple(rgb)
} yield answer
}
Here is a possible implementation of your function
def stringToInts(css: String): Option[(Int, Int, Int)] = {
def cssColour(s: String): Int = {
val v = Integer.parseInt(s, 16)
if (s.length == 1) v*16 + v else v
}
val s = css.trim.replaceAll("#", "")
val l = s.length/3
if (l > 2 || l*3 != s.length) {
None
} else {
Try{
val res = s.grouped(l).map(cssColour).toSeq
(res(0), res(1), res(2))
}.toOption
}
}
The implementation would be cleaner if it returned Option[List[Int]] or even Try[List[Int]] to preserve the error in the case of failure.
If you're looking for conciseness, perhaps this solution will do the job (at the expense of efficiency—more on that later):
import scala.util.Try
def parseLongForm(rgb: String): Try[(Int, Int, Int)] =
Try {
rgb.replace("#", "").
grouped(2).toStream.filter(_.length == 2).
map(Integer.parseInt(_, 16)) match { case Stream(r, g, b) => (r, g, b) }
}
def parseShortForm(rgb: String): Try[(Int, Int, Int)] =
parseLongForm(rgb.flatMap(List.fill(2)(_)))
def parse(rgb: String): Option[(Int, Int, Int)] =
parseLongForm(rgb).orElse(parseShortForm(rgb)).toOption
In terms of conciseness, every function here is effectively a one-liner (if that's something you're looking for right now).
The core is the function parseLongForm, which attempts to parse the 6-character long form by:
removing the # character
grouping the characters in pairs
filtering out lone items (in case we have an odd number of characters)
parsing each pair
matching with the expected result to extract individual items
parseLongForm represents the possibility of failure with Try, which allows us to fail gracefully whenever parseInt or the pattern matching fails.
parse invokes parseLongForm and, if the result is a failure (orElse), invokes parseShortForm, which just tries the same approach after doubling each character.
It successfully passes the tests that you provided (kudos, that makes addressing the question much easier).
The main issue with this approach is that you would still try to parse the long form even if it can be clear from the beginning that it would not work. So, this is not recommended code if this could be a performance bottleneck for your use case. Another issue is that, although that's more or less hidden, we're using exceptions for flow control (which also hurts performance).
The nice things are conciseness and, I'd argue, readability (as I'd say that the code maps in a fairly straightforward fashion to the problem—but readability, of course, is by definition in the eye of the beholder).
You can find this solution on Scastie.

Resources