R cannot input quotation marks using Rcpp

Double quotation marks are not recognized when I use Rcpp; I get an "unexpected symbol" error.
The following is example code.
cppFunction("NumericVector attrs() {
NumericVector out = NumericVector::create(1,2,3);
out.names() = CharacterVector::create("xa","xb","xc");
return out;
}")
The quotation marks in "xa", "xb", and "xc" are the problem. The code was written using Microsoft Word and Notepad.

Try escaping the quotation marks:
cppFunction("NumericVector attrs() {
NumericVector out = NumericVector::create(1,2,3);
out.names() = CharacterVector::create(\"xa\",\"xb\",\"xc\");
return out;
}")
To generalize, you cannot include a quotation mark inside an R string delimited by that same kind of quotation mark without escaping it. You can, however, use single quotation marks inside a double-quoted string, or vice versa:
s1 <- "the 'cat' on the roof"
s2 <- 'the "cat" on the roof'
The latter approach might in fact be an easier solution to your issue with cppFunction, but I'll keep my original answer here because it addresses the issue itself.


read null terminated string from byte vector in julia

I have a vector of type UInt8 and fixed length 10. I think it contains a null-terminated string but when I do String(v) it shows the string + all of the zeros of the rest of the vector.
v = zeros(UInt8, 10)
v[1:5] = Vector{UInt8}("hello")
String(v)
the output is "hello\0\0\0\0\0".
Either I'm packing it wrong or reading it wrong. Any thoughts?
I use this snippet:
"""
nullstring(Vector{UInt8})
Interpret a vector as null terminated string.
"""
nullstring(x::Vector{UInt8}) = String(x[1:findfirst(==(0), x) - 1])
Although I bet there are faster ways to do this.
You can use unsafe_string: unsafe_string(pointer(v)). This skips the intermediate array copy that slicing makes, so it is very fast. But @laborg's solution is better in almost all cases, because it's safe.
If you want both safety and maximal performance, you have to write a manual function yourself:
function get_string(v::Vector{UInt8})
    # Find first zero
    zeropos = 0
    @inbounds for i in eachindex(v)
        iszero(v[i]) && (zeropos = i; break)
    end
    iszero(zeropos) && error("Not null-terminated")
    GC.@preserve v unsafe_string(pointer(v), zeropos - 1)
end
But eh, what are the odds you REALLY need it to be that fast.
You can avoid copying bytes and preserve safety with the following code:
function nullstring!(x::Vector{UInt8})
i = findfirst(iszero, x)
SubString(String(x),1,i-1)
end
Note that after calling it x will be empty and the returned value is a SubString rather than a String, but in many scenarios this does not matter. This code makes half as many allocations as the code by @laborg and is slightly faster (around 10-20%). The code by Jacob is still unbeatable though.

Julia string concatenation gives an array whose elements are broken into individual characters

I hope someone understands this. I have been trying to concatenate Julia strings for quite a while now, but I still have an issue. I have a loop in which I concatenate a string with a number from the loop and then add the new value to an array. Everything is fine when I print the value inside the loop, but when I print the array, all of its elements are split into individual characters.
My code is as below:
a = 1
for i in nums_loop
i_val = i[a]
append!(const_names, (string(x, string(a))))
println(string(x, string(a)))
a += 1
end
print(const_names)
The output is as below:
X1
X2
Any['X', '1', 'X', '2']
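This seems the easiest way: use push! rather than append! (append! treats a string as a collection and adds each of its characters separately). Initialize array_names with a placeholder string, then remove it with popfirst!. (It is bad practice to call the array constant if you are actually changing its contents.)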
array_names=[" "]
num_loops=2
for i=1:num_loops
push!(array_names, "X$i")
end
popfirst!(array_names)
println(array_names)
This gives me the result:
julia> println(array_names)
["X1", "X2"]

Read a string with single and doubles quotes

Just some summertime curiosity about strings in R. Let us say that I have two strings, x and y. As we know, single quotes must be enclosed in double quotes and vice versa.
x <- "a string with 'single' quotes"
y <- 'another one with "double" quotes'
paste0(x, y)
[1] "a string with 'single' quotesanother one with \"double\" quotes"
cat(x, y)
a string with 'single' quotes another one with "double" quotes
What if we have a string with single and double quotes too? I have tried this:
Backticks do not work (R triggers an error):
z <- `a string with 'single' quotes and with "double" quotes`
Use a \" instead of " and then use cat:
This works well but the problem is that users must add a backslash to every double quote.
z1 <- "a string with 'single' quotes and with \"double\" quotes"
What if we have a huge text file (a .txt, for example) with both types of quotes and we want to read it into R?
At this point a (silly) solution seems to be: work outside R, do some manipulation (like substituting every " with \"), and then read the result into R.
Is this a solution, or is there a better way inside R?
As an example, the linked .txt file contains just one line with this text:
a string with 'single' quotes and with \"double\" quotes
You may specify any alternate quoting characters as desired when reading text, e.g.
> p<-scan(what="character",quote="`")
1: `It is 'ambiguous' if "this is a new 'string' or "nested" in the 'first'", isn't it?`
2:
Read 1 item
> p
[1] "It is 'ambiguous' if \"this is a new 'string' or \"nested\" in the 'first'\", isn't it?"
Or, just read raw text, e.g. with readline as suggested by @rawr
> readline()
"It is 'ambiguous' if "this is a new 'string' or "nested" in the 'first'", isn't it?"
[1] "\"It is 'ambiguous' if \"this is a new 'string' or \"nested\" in the 'first'\", isn't it?\""

Pass a string value in a recursive bison rule

I'm having some issues with bison (again).
I'm trying to pass a string value through a "recursive rule" in my grammar file using $$,
but when I print the value I have passed, the output looks like a wrong reference ( AU�� ) instead of the value I wrote in my input file.
line: tok1 tok2
| tok1 tok2 tok3
{
int len=0;
len = strlen($1) + strlen($3) + 3;
char out[len];
strcpy(out,$1);
strcat(out," = ");
strcat(out,$3);
printf("out -> %s;\n",out);
$$ = out;
}
| line tok4
{
printf("line -> %s\n",$1);
}
Here I've shown a simplified part of the code.
Given the input tokens tok1 tok2 tok3, it should assign the out variable to $$ (with the printf I can see that out has the correct value in the first part of the rule).
When tok4 is matched next, I'm in the recursive part of the rule. But when I print the value of $1 (which should be equal to out, since I passed it through $$), I don't get the right output.
You cannot set:
$$ = out;
because the string that out refers to is just about to vanish into thin air, as soon as the block in which it was declared ends.
In order to get away with this, you need to malloc the storage for the new string.
Also, you need strlen($1) + strlen($3) + 4; because you need to leave room for the NUL terminator.
It's important to understand that C does not really have strings. It has pointers to char (char*), but those are really pointers. It has arrays (char []), but you cannot use an array as an aggregate. For example, in your code, out = $1 would be illegal, because you cannot assign to an array. (Also because $1 is a pointer, not an array, but that doesn't matter because any reference to an array, except in sizeof, is effectively reduced to a pointer.)
So when you say $$ = out, you are making $$ point to the storage represented by out, and that storage is just about to vanish. So that doesn't work. You can say $$ = $1, because $1 is also a pointer to char; that makes $$ and $1 point to the same character. (That's legal but it makes memory management more complicated. Also, you need to be careful with modifications.) Finally, you can say strcpy($$, out), but that relies on $$ already pointing to a string which is long enough to hold out, something which is highly unlikely, because what it means is to copy the storage pointed to by out into the location pointed to by $$.
Also, as I noted above, when you are using "string" functions in C, they all insist that the sequence of characters pointed to by their "string" arguments (i.e. the pointer-to-character arguments) must be terminated with a 0 character (that is, the character whose code is 0, not the character 0).
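To make the size arithmetic concrete, here is a minimal C sketch. The helper name joined_size is made up for illustration and is not part of the question's grammar; it only shows that strlen leaves out the terminating 0, so one extra byte has to be added by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes needed to hold "<lhs> = <rhs>"
 * as a C string, including the terminating 0 byte. */
size_t joined_size(const char *lhs, const char *rhs)
{
    assert(lhs != NULL && rhs != NULL);
    /* strlen() counts the characters before the 0 byte, not the byte itself,
     * so the terminator is added explicitly at the end. */
    return strlen(lhs) + strlen(" = ") + strlen(rhs) + 1;
}
```

For "x" and "42" this gives 1 + 3 + 2 + 1 = 7 bytes, one more than the six visible characters of "x = 42".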
If you're used to programming in languages which actually have a string datatype, all this might seem a bit weird. Practice makes perfect.
The bottom line is that what you need to do is to create a new region of storage large enough to contain your string, like this (I removed out because it's not necessary):
$$ = malloc(len + 1); // room for NUL
strcpy($$, $1);
strcat($$, " = ");
strcat($$, $3);
// You could replace the strcpy/strcat/strcat with:
// sprintf($$, "%s = %s", $1, $3)
Note that storing malloc'd data (including the result of strdup and asprintf) on the parser stack (that is, as $$) also implies the necessity to free it when you're done with it; otherwise, you have a memory leak.
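One way to package the allocation is a small helper, sketched here under the assumption that $1 and $3 are ordinary NUL-terminated char* values. The name join_assign is hypothetical, and snprintf stands in for the strcpy/strcat sequence.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated "<lhs> = <rhs>",
 * or NULL if the allocation fails. The caller must free() the result. */
char *join_assign(const char *lhs, const char *rhs)
{
    assert(lhs != NULL && rhs != NULL);
    /* 3 bytes for " = " plus 1 byte for the terminating 0. */
    size_t size = strlen(lhs) + strlen(rhs) + 4;
    char *out = malloc(size);
    if (out == NULL)
        return NULL;
    /* snprintf never writes more than size bytes and always terminates. */
    snprintf(out, size, "%s = %s", lhs, rhs);
    return out;
}
```

In the action this would read $$ = join_assign($1, $3); and whichever rule consumes the value later is responsible for calling free on it.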
I've solved it by changing the $$ = out; line into strcpy($$,out); and now it works properly.

Remove extraneous spaces with `gsub` for `print.xtable`

I am new to R development and have to modify some existing code. Specifically, I need to change a print() call so that it removes extraneous consecutive space characters.
I've found the sanitize.text.function parameter and have successfully passed my custom function to print() through it. It does what I need it to do. That code is as follows:
print(xtable(x,...),type="html",
sanitize.text.function = function(s) gsub(" {2,}", "", s),...)
Now what I am trying to do is extract the "anonymous" / "inline" function code into a named function like so...
clean <- function(s) { gsub(" {2,}", "", s) }
print(xtable(x,...),type="html",sanitize.text.function = clean(s),...)
However, when I execute this, I get the following:
Error in gsub(" {2,}", "", s) : object 's' not found
The desire to define a function is two-fold:
to create a reusable block of code that could be referenced in other places, and
to be able to add more gsub() or similar calls that may be needed.
For example,
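clean <- function(s) {
  # Keep the result of each step so the substitutions chain
  s <- gsub(" {2,}", "", s)
  # "\\1" is the backreference; a bare "\1" is an octal escape in an R string
  gsub(">(.*?:)", "<span style=float:left>\\1</span>", s)
}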
print(xtable(x,...),type="html",sanitize.text.function = clean(s),...)
sanitize.text.function expects a function, yet you pass the result of clean(s) rather than the function itself (the argument is evaluated, and s does not exist at that point, hence the error). So you can either use sanitize.text.function=clean or, if you need to re-map arguments, sanitize.text.function=function(x) clean(x), which is the anonymous (lambda) function construct you were looking for (the latter only makes sense for something more complex, obviously).
