I'm having some issues with Bison (again).
I'm trying to pass a string value through a recursive rule in my grammar file using $$,
but when I print the value I passed, the output looks like a dangling reference ( AU�� ) instead of the value I wrote in my input file.
line: tok1 tok2
| tok1 tok2 tok3
{
int len=0;
len = strlen($1) + strlen($3) + 3;
char out[len];
strcpy(out,$1);
strcat(out," = ");
strcat(out,$3);
printf("out -> %s;\n",out);
$$ = out;
}
| line tok4
{
printf("line -> %s\n",$1);
}
This is a simplified part of the code.
Given the input tokens tok1 tok2 tok3, the action should assign the out variable to $$ (the printf in the first action shows that out has the correct value).
When tok4 is matched next, I'm in the recursive part of the rule. But when I print $1 (which should be equal to out, since I passed it through $$), I don't get the right output.
You cannot set:
$$ = out;
because the string that out refers to is just about to vanish into thin air, as soon as the block in which it was declared ends.
In order to get away with this, you need to malloc the storage for the new string.
Also, you need strlen($1) + strlen($3) + 4; because you need to leave room for the NUL terminator.
It's important to understand that C does not really have strings. It has pointers to char (char*), but those are really pointers. It has arrays (char []), but you cannot use an array as an aggregate. For example, in your code, out = $1 would be illegal, because you cannot assign to an array. (Also because $1 is a pointer, not an array, but that doesn't matter because any reference to an array, except in sizeof, is effectively reduced to a pointer.)
So when you say $$ = out, you are making $$ point to the storage represented by out, and that storage is just about to vanish. So that doesn't work. You can say $$ = $1, because $1 is also a pointer to char; that makes $$ and $1 point to the same character. (That's legal but it makes memory management more complicated. Also, you need to be careful with modifications.) Finally, you can say strcpy($$, out), but that relies on $$ already pointing to a string which is long enough to hold out, something which is highly unlikely, because what it means is to copy the storage pointed to by out into the location pointed to by $$.
Also, as I noted above, when you are using "string" functions in C, they all insist that the sequence of characters pointed to by their "string" arguments (i.e. the pointer-to-character arguments) must be terminated with a 0 character (that is, the character whose code is 0, not the character 0).
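The sizing rule above can be made concrete with a small sketch. The helper name joined_size is hypothetical (it does not appear in the question); it only illustrates that strlen does not count the terminator, so one extra byte has to be added by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed to hold lhs + " = " + rhs,
 * including the terminating NUL that strlen() does not count. */
static size_t joined_size(const char *lhs, const char *rhs)
{
    assert(lhs != NULL && rhs != NULL);
    return strlen(lhs) + strlen(" = ") + strlen(rhs) + 1; /* +1 for '\0' */
}
```

For example, joined_size("a", "b") is 6: five visible characters plus the terminator. In the same way, sizeof("abc") is 4 while strlen("abc") is 3.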
If you're used to programming in languages which actually have a string datatype, all this might seem a bit weird. Practice makes perfect.
The bottom line is that what you need to do is to create a new region of storage large enough to contain your string, like this (I removed out because it's not necessary):
$$ = malloc(len + 1); // room for NUL
strcpy($$, $1);
strcat($$, " = ");
strcat($$, $3);
// You could replace the strcpy/strcat/strcat with:
// sprintf($$, "%s = %s", $1, $3)
Note that storing malloc'd data (including the result of strdup and asprintf) on the parser stack (that is, as $$) also means you have to free it when you're done with it; otherwise, you have a memory leak.
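Putting the allocation and the ownership rule together, here is a minimal sketch. The function join_assignment is a hypothetical helper, not part of the original grammar; it assumes both semantic values are valid NUL-terminated strings.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd "lhs = rhs" string, or NULL if
 * the allocation fails. The caller owns the result and must free() it. */
static char *join_assignment(const char *lhs, const char *rhs)
{
    assert(lhs != NULL && rhs != NULL);
    size_t size = strlen(lhs) + strlen(rhs) + 4; /* " = " plus the NUL */
    char *result = malloc(size);
    if (result != NULL)
        snprintf(result, size, "%s = %s", lhs, rhs);
    return result;
}
```

In the first action this would be used as $$ = join_assignment($1, $3); and the action for line tok4 would call free($1) once it has finished printing, since nothing else will release that storage.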
I've solved it by changing the $$ = out; line into strcpy($$,out); and now it works properly.
I have a vector of type UInt8 and fixed length 10. I think it contains a null-terminated string but when I do String(v) it shows the string + all of the zeros of the rest of the vector.
v = zeros(UInt8, 10)
v[1:5] = Vector{UInt8}("hello")
String(v)
the output is "hello\0\0\0\0\0".
Either I'm packing it wrong or reading it wrong. Any thoughts?
I use this snippet:
"""
nullstring(Vector{UInt8})
Interpret a vector as null terminated string.
"""
nullstring(x::Vector{UInt8}) = String(x[1:findfirst(==(0), x) - 1])
Although I bet there are faster ways to do this.
You can use unsafe_string: unsafe_string(pointer(v)). This skips the intermediate copy of the vector, so it is very fast. But @laborg's solution is better in almost all cases, because it's safe.
If you want both safety and maximal performance, you have to write a manual function yourself:
function get_string(v::Vector{UInt8})
    # Find first zero
    zeropos = 0
    @inbounds for i in eachindex(v)
        iszero(v[i]) && (zeropos = i; break)
    end
    iszero(zeropos) && error("Not null-terminated")
    GC.@preserve v unsafe_string(pointer(v), zeropos - 1)
end
But eh, what are the odds you REALLY need it to be that fast.
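The bounds-checked search in get_string has a direct counterpart in C, where fixed-size, zero-padded buffers like this usually originate. This is a minimal sketch with a hypothetical helper name, terminated_length; it relies only on the standard memchr function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length of the NUL-terminated string
 * stored in buf, or (size_t)-1 if no NUL occurs within the first size
 * bytes. memchr never reads past buf + size. */
static size_t terminated_length(const unsigned char *buf, size_t size)
{
    assert(buf != NULL);
    const unsigned char *nul = memchr(buf, 0, size);
    return nul != NULL ? (size_t)(nul - buf) : (size_t)-1;
}
```

For a 10-byte buffer holding "hello" followed by zeros, this returns 5. For a buffer with no zero byte at all it reports the failure instead of running off the end, which is the same guarantee the error("Not null-terminated") branch gives.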
You can avoid copying bytes and preserve safety with the following code:
function nullstring!(x::Vector{UInt8})
i = findfirst(iszero, x)
SubString(String(x),1,i-1)
end
Note that after calling it, x will be empty, and the returned value is a SubString rather than a String, but in many scenarios that does not matter. This code makes half as many allocations as the code by @laborg and is slightly faster (around 10-20%). The code by Jacob is still unbeatable though.
I am using IDL 8.4. I want to use the isa() function to check the types of the fields read by read_csv(). I want to use /number, /integer, /float and /string: some fields must be float, others must be integer, and others I don't care about. I can do it like this, but it is not very readable:
str = read_csv(filename, header=inheader)
; TODO check header
if not isa(str.(0), /integer) then stop
if not isa(str.(1), /number) then stop
if not isa(str.(2), /float) then stop
I am hoping I can do something like
expected_header = ['id', 'x', 'val']
expected_type = ['/integer', '/number', '/float']
str = read_csv(filename, header=inheader)
if not array_equal(strlowcase(inheader), expected_header) then stop
for i=0L, n_elements(expected_type)-1 do begin
if not isa(str.(i), expected_type[i]) then stop
endfor
The above doesn't work, because '/integer' is taken literally and I guess isa() is then looking for a named structure. How can I do something similar?
Ideally I want to pick the expected type based on the header read from the file, so that the script still works as long as the header names the expected fields.
EDIT:
My tentative solution is to write a wrapper for ISA(). Not very pretty, but it does what I wanted. If there is a cleaner solution, please let me know.
Also, read_csv is defined to return only long, long64, double or string, so I could write a function that tests within this limitation. But I wanted it to work in general so that I can reuse it for other similar cases.
function isa_generic,var,typ
; calls isa() http://www.exelisvis.com/docs/ISA.html with keyword
; if 'n', test /number
; if 'i', test /integer
; if 'f', test /float
; if 's', test /string
if typ eq 'n' then return, isa(var, /number)
if typ eq 'i' then return, isa(var, /integer)
if typ eq 'f' then return, isa(var, /float)
if typ eq 's' then return, isa(var, /string)
print, 'unexpected typename: ', typ
stop
end
IDL has some limited reflection abilities, which will do exactly what you want:
expected_types = ['integer', 'number', 'float']
expected_header = ['id', 'x', 'val']
str = read_csv(filename, header=inheader)
if ~array_equal(strlowcase(inheader), expected_header) then stop
foreach type, expected_types, index do begin
if ~isa(str.(index), _extra=create_struct(type, 1)) then stop
endforeach
It's debatable if this is really "easier to read" in your case, since there are only three cases to test. If there were 500 cases, it would be a lot cleaner than writing 500 slightly different lines.
This snippet uses some rather esoteric IDL features, so let me explain what's happening a bit:
expected_types is just a list of (string) keyword names in the order they should be used.
The foreach part iterates over expected_types, putting the keyword string into the type variable and the iteration count into index.
This is equivalent to using for index = 0, n_elements(expected_types) - 1 do and then using expected_types[index] instead of type, but the foreach loop is easier to read IMHO. Reference here.
_extra is a special keyword that can pass a structure as if it were a set of keywords. Each of the structure's tags is interpreted as a keyword. Reference here.
The create_struct function takes one or more pairs of (string) tag names and (any type) values, then returns a structure with those tag names and values. Reference here.
Finally, I replaced not (bitwise not) with ~ (logical not). This step, like foreach vs for, is not necessary in this instance, but can avoid headache when debugging some types of code, where the distinction matters.
--
Reflective abilities like these can do an awful lot, and come in super handy. They're work-horses in other languages, but IDL programmers don't seem to use them as much. Here's a quick list of common reflective features I use in IDL, with links to the documentation for each:
create_struct - Create a structure from (string) tag names and values.
n_tags - Get the number of tags in a structure.
_extra, _strict_extra, and _ref_extra - Pass keywords by structure or reference.
call_function - Call a function by its (string) name.
call_procedure - Call a procedure by its (string) name.
call_method - Call a method (of an object) by its (string) name.
execute - Run complete IDL commands stored in a string.
Note: Be very careful using the execute function. It will blindly execute any IDL statement you (or a user, file, web form, etc.) feed it. Never ever feed untrusted or web user input to the IDL execute function.
You can't access the keywords quite like that, but there is a typename parameter to ISA that might be useful. This is untested, but should work:
expected_header = ['id', 'x', 'val']
expected_type = ['int', 'long', 'float']
str = read_csv(filename, header=inheader)
if not array_equal(strlowcase(inheader), expected_header) then stop
for i = 0L, n_elements(expected_type) - 1L do begin
if not isa(str.(i), expected_type[i]) then stop
endfor
This code works:
void reverse(char *str)
{
if(*str)
{
reverse(str+1);
printf("%c", *str);
}
}
But if I change reverse(str+1) to reverse(++str), it doesn't print the first character.
In: Geeks
Out: skee
I don't know why.
Because ++str modifies the local pointer in every call, including the very first one. By the time that first call gets around to its printf, str already points at the second character, so the first character is never printed.
In the first case, str+1, str isn't being modified at all, so the very last printf just prints the first character.
Keep in mind that the prefix and postfix ++ actually change the value of the variable.
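++str increments str before the recursive call, so the printf that follows sees the advanced pointer. Note that str++ is not a fix either: it passes the unincremented pointer to the recursive call, which therefore never advances and recurses until the stack overflows. Use str + 1.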
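The difference is easier to see when the characters are collected in a buffer instead of printed. This is a sketch for illustration only; the function names and the out/pos parameters are hypothetical additions, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Appends the characters of str to out in reverse order.
 * *pos is the next free slot in out. */
static void reverse_plus_one(const char *str, char *out, size_t *pos)
{
    assert(str != NULL && out != NULL && pos != NULL);
    if (*str) {
        reverse_plus_one(str + 1, out, pos); /* str itself is unchanged */
        out[(*pos)++] = *str;                /* the character this call started with */
    }
}

/* Same function, but with ++str: the local pointer is advanced first. */
static void reverse_pre_increment(const char *str, char *out, size_t *pos)
{
    assert(str != NULL && out != NULL && pos != NULL);
    if (*str) {
        reverse_pre_increment(++str, out, pos); /* str now points one further */
        out[(*pos)++] = *str;                   /* the NEXT character, possibly '\0' */
    }
}
```

With the input "Geeks", the first version stores s, k, e, e, G. The second stores a NUL byte followed by s, k, e, e: the innermost call writes the terminator it has just stepped onto, and the outermost call writes the second character instead of the first.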
I have an array -
char name[256];
sprintf(name, "hello://cert=prv:netid=%d:tsid=%d:pid=%d\0", 1010,1200, 1300);
QString private_data_string = name;
At the last offset of this string, i.e. at the '\0', I try to do the following:
while(private_data_string.at(offset) != ':' &&
private_data_string.at(offset) != ';' &&
private_data_string.at(offset).isNull() == false)
The application aborts. It looks like the data pointer is also zero at the '\0' of the string. How can I fix this?
QString doesn't contain a terminating character at an accessible index, as you expect it to, which is why you hit the out-of-bounds assertion. This is the proper approach:
while(offset<private_data_string.length() &&
private_data_string.at(offset) != ':' &&
private_data_string.at(offset) != ';') {
// ...
}
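The loop above tests the offset against the length before it calls at(), and the short-circuit && guarantees that the index is never used when it is out of range. The same ordering applies to any length-delimited buffer. Here is a minimal sketch in plain C; field_end is a hypothetical helper and is not part of Qt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first ':' or ';' at or
 * after offset, or len if there is none. The length check comes first,
 * so data[i] is never read out of range. */
static size_t field_end(const char *data, size_t len, size_t offset)
{
    assert(data != NULL && offset <= len);
    size_t i = offset;
    while (i < len && data[i] != ':' && data[i] != ';')
        i++;
    return i;
}
```

For the 20-character text "netid=1010:tsid=1200", starting at offset 0 gives 10 (the position of the colon), and starting at offset 11 gives 20, the length, without ever touching the byte past the end.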
It looks like you are doing something strange, and the question itself may be the wrong one. You are asking how to fix your workaround for a problem you haven't described. Instead, explain what you are trying to do and then, as a bonus, how you tried to solve it.
You need to know several facts:
Writing \0 at the end of your string literal is not necessary. String literals are null-terminated by default. The literal "abc" actually contains 4 characters, including the terminating null character. Your string literal has 2 null characters at its end.
You have used the QString(const char *) constructor. It receives no information about the buffer's length, so QString reads characters from the buffer until it encounters the first null character. It doesn't matter how many null characters are actually at the end. The null character is interpreted as an end-of-buffer marker, not as part of the string.
When you have QString "abc", its size is 3 (it would be surprising to have another value). Null character is not a part of the string. QString::at function can be used for positions 0 <= position < size(). This is explicitly specified in the documentation. So it doesn't matter if QString's internal buffer is null-terminated or not. Either way, you don't have access to null terminator.
If you really want the null character to be part of your data, you should use QByteArray instead of QString. It lets you specify the buffer size on construction and can contain as many null characters as you want. However, when dealing with strings, that is usually not necessary.
You should use QString::arg instead of sprintf:
QString private_data_string =
QString("hello://cert=prv:netid=%1:tsid=%2:pid=%3")
.arg(netid).arg(tsid).arg(pid);
sprintf is unsafe and can overrun your fixed-size buffer if you're not careful. In C++ there's no good reason to use sprintf.
"A QString that has not been assigned to anything is null, i.e., both the length and data pointer is 0" - this has nothing to do with your situation because you have assigned a value to your string.
I work with files, and I need to know the length of the longest line in file X. Using Unix awk gives me the Int I'm looking for. But in Haskell, how can I get that value and save it to a variable?
I tried defining something with IO [Int] -> [Int]
maxline = do{system "awk ' { if ( length > x ) { x = length } }END{ print x }' filename";}
It doesn't work because:
Couldn't match expected type 'Int',against inferred type 'IO GHC.IO.Exception.ExitCode'
This is because the system action returns the exit status of the command you run, which cannot be converted to Int. You should use readProcess to get the command's output.
> readProcess "date" [] []
"Thu Feb 7 10:03:39 PST 2008\n"
Note that readProcess does not pass the command to the system shell: it runs it directly. The second parameter is where the command's arguments should go. So your example should be
readProcess "awk" [" { if ( length > x ) { x = length } }END{ print x }", "/home/basic/Desktop/li11112mp/textv"] ""
You can use readProcess to get another program's output. You will not be able to convert the resulting IO String into a pure String; however, you can lift functions that expect Strings into functions that expect IO Strings. My two favorite references for mucking about with IO (and various other monads) are sigfpe's excellent blog posts, You Could Have Invented Monads! (And Maybe You Already Have.) and The IO Monad for People who Simply Don't Care.
For this particular problem, I would strongly suggest looking into finding a pure-Haskell solution (that is, not calling out to awk). You might like readFile, lines, and maximumBy.