I work with files, and I need to know the length of the longest line in file X. Running awk on Unix gives me the Int I'm looking for, but how can I get that value in Haskell and save it to a variable?
I tried to define something with the type IO [Int] -> [Int]
maxline = do{system "awk ' { if ( length > x ) { x = length } }END{ print x }' filename";}
That doesn't work because:
Couldn't match expected type 'Int',against inferred type 'IO GHC.IO.Exception.ExitCode'
This is because the system action returns the exit status of the command you run, which cannot be converted to Int. You should use readProcess to get the command's output.
> readProcess "date" [] []
"Thu Feb 7 10:03:39 PST 2008\n"
Note that readProcess does not pass the command to the system shell: it runs it directly. The second parameter is where the command's arguments should go. So your example should be
readProcess "awk" [" { if ( length > x ) { x = length } }END{ print x }", "/home/basic/Desktop/li11112mp/textv"] ""
You can use readProcess to get another program's output. You will not be able to convert the resulting IO String into a pure String; however, you can lift functions that expect Strings into functions that expect IO Strings. My two favorite references for mucking about with IO (and various other monads) are sigfpe's excellent blog posts, You Could Have Invented Monads! (And Maybe You Already Have.) and The IO Monad for People who Simply Don't Care.
For this particular problem, I would strongly suggest looking into finding a pure-Haskell solution (that is, not calling out to awk). You might like readFile, lines, and maximumBy.
Well, this must be the most stupid and idiotic behavior I've seen from a programming language.
https://www.bfgroup.xyz/b2/manual/release/index.html says:
Syntactically, a Boost.Jam program consists of two kinds of
elements—keywords (which have a special meaning to Boost.Jam) and
literals. Consider this code:
a = b ;
which assigns the value b to the variable a. Here, = and ; are
keywords, while a and b are literals.
⚠ All syntax elements, even
keywords, must be separated by spaces. For example, omitting the space
character before ; will lead to a syntax error.
If you want to use a literal value that is the same as some keyword,
the value can be quoted:
a = "=" ;
OK, so far so good. So I have this in my Jamroot:
import path : basename ;
actions make_mytest_install
{
echo "make_mytest_install: MY_ROOT_PATH $(MY_ROOT_PATH) PWD $(PWD:E=not_set)" ;
epath = "$(MY_ROOT_PATH)/projects/mytest/bin/gcc-9/release/qt5client" ;
ename = basename ( $(epath) ) ;
echo "epath $(epath) ename $(ename)" ;
}
explicit install-gettext ;
make install-mytest : : @make_mytest_install ;
... and I try this:
bjam install-mytest
...updating 1 target...
Jamfile</home/USER/src/myproject>.make_mytest_install bin/install-mytest
make_mytest_install: MY_ROOT_PATH /home/USER/src/myproject PWD not_set
[ SHELL pstree -s -p 2720269 && echo PID 2720269 PWD /home/USER/src/myproject ]
/bin/sh: 13: epath: not found
/bin/sh: 14: Syntax error: "(" unexpected
.....
...failed Jamfile</home/USER/src/myproject>.make_mytest_install bin/install-mytest...
...failed updating 1 target...
Now - how come that the SIMPLEST assignment to a string, EXACTLY AS in the manual:
epath = "$(MY_ROOT_PATH)/projects/mytest/bin/gcc-9/release/qt5client" ;
... fails, and this variable cannot be found anymore?
What is the logic in this? How the hell is this supposed to work? I would get it if MY_ROOT_PATH was undefined - but the echo before it, shows that it is not? What is this lunacy?
So I cannot believe I'm asking something this trivial, but:
How do you assign a string to a variable in bjam language?
Well, the error gives somewhat of a hint: /bin/sh: -> so apparently inside actions, it is sh that runs - then again, if it was really sh I could have assigned variables, but I can't. So best I could do, was to remove the assignments OUT of actions:
import path : basename ;
epath = "$(MY_ROOT_PATH)/projects/mytest/bin/gcc-9/release/qt5client" ;
# ename = basename ( $(epath) ) ; # nope, causes target install-mytest to not be found :(
# calling a shell for basename works - but adds a damn NEWLINE at end!?!?!?!
ename = [ SHELL "basename $(epath)" ] ;
actions make_mytest_install
{
echo "make_mytest_install: MY_ROOT_PATH $(MY_ROOT_PATH) PWD $(PWD:E=not_set)" ;
echo "epath $(epath) ename $(ename)" ;
}
explicit install-mytest ;
make install-mytest : : @make_mytest_install ;
So, assignment kind of passes, but you still can't get the basename ?!
I still don't understand, who thought this kind of variable management is a good idea ... I don't even understand, how people managed to build stuff with this system
I am currently learning SML, but I have one question that I cannot find an answer for. I have googled but still have not found anything.
This is my code:
fun diamond(n) =
if(n=1) then (
print("*")
) else (
print("*")
diamond(n-1)
)
diamond(5);
That does not work. I want the code to print as many * characters as the number n, and I want to do that with recursion, but I don't understand how.
I get an error when I try to run the code. This is the error:
Standard ML of New Jersey v110.78 [built: Thu Aug 20 19:23:18 2015]
[opening a4_p2.sml] a4_p2.sml:8.5-9.17 Error: operator is not a
function [tycon mismatch] operator: unit in expression:
(print "*") diamond /usr/local/bin/sml: Fatal error -- Uncaught exception Error with 0 raised at
../compiler/TopLevel/interact/evalloop.sml:66.19-66.27
Thank you
You can do side effects in ML by using ';'
It will evaluate whatever is before the ';' and discard its result.
fun diamond(n) =
if(n=1)
then (print "*"; 1)
else (print "*"; diamond(n-1));
diamond(5);
The reason for the error is that ML is a strongly typed language: although you don't need to specify types explicitly, it infers them at compile time. For this reason, every expression, including if ... else, needs to evaluate to a single unambiguous type.
If you were allowed to do the following:
if(n=1)
then 1
else print "*";
then the compiler would get a different type for the then and else branches.
With the then branch the function's type would be int -> int, whereas with the else branch it would be int -> unit
Such a dichotomy is not allowed under a strongly typed language.
As you need to evaluate to a singular type, you will understand that ML does not support the execution of a block of instructions as we commonly see in other paradigms which transposed to ML naively would render something like this:
....
if(n=1)
then (print "1"
print "2"
)
else (print "3"
diamond(n-1)
)
...
because what type would the then branch evaluate to? int -> unit? Then what about the other print statement? An expression has to return a single result (even if it is a compound one), so that would not make sense. What about int -> unit * unit? No problem with that, except that syntactically you failed to communicate a tuple to the compiler.
For this reason, the following WOULD work:
fun diamond(n) =
if(n=1)
then (print "a", 1) (* A tuple of the type unit * int *)
else diamond(n-1);
diamond(5);
As in this case you have a function of type int -> unit * int.
So in order to satisfy the requirement of the paradigm of strongly typed functional programming where we strive for building mechanisms that evaluate to one result-type, we thus need to communicate to the compiler that certain statements are to be executed as instructions and are not to be incorporated under the typing of the function under consideration.
For this reason, you use ';' to communicate to the compiler to simply evaluate that statement and discard its result from being incorporated under the type evaluation of the function.
As far as your actual objective is concerned, following is a better way of writing the function, diamond as type int -> string:
fun diamond(n) =
if(n=1)
then "*"
else "*" ^ diamond(n-1);
print( diamond(5) );
The earlier version, which prints from inside the function using ';', is more suited to debugging.
I am trying to learn Erlang and I am working on the practice problems Erlang has on the site. One of them is:
Write the function time:swedish_date() which returns a string containing the date in swedish YYMMDD format:
time:swedish_date()
"080901"
My function:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
string:substr((integer_to_list(YYYY, 3,4)++pad_string(integer_to_list(MM))++pad_string(integer_to_list(DD)).
pad_string(String) ->
if
length(String) == 1 -> '0' ++ String;
true -> String
end.
I'm getting the following errors when compiled.
demo.erl:6: syntax error before: '.'
demo.erl:2: function swedish_date/0 undefined
demo.erl:9: Warning: function pad_string/1 is unused
error
How do I fix this?
After fixing your compilation errors, you're still facing runtime errors. Since you're trying to learn Erlang, it's instructive to look at your approach and see if it can be improved, and fix those runtime errors along the way.
First let's look at swedish_date/0:
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
Why convert the tuple to a list? Since you use the list elements individually and never use the list as a whole, the conversion serves no purpose. You can instead just pattern-match the returned tuple:
{YYYY,MM,DD} = date(),
Next, you're calling string:substr/1, which doesn't exist:
string:substr((integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD))).
The string:substr/2,3 functions both take a starting position, and the 3-arity version also takes a length. You don't need either, and can avoid string:substr entirely and instead just return the assembled string:
integer_to_list(YYYY,3,4) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Whoops, this is still not right: there is no such function integer_to_list/3, so just replace that first call with integer_to_list/1:
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
Next, let's look at pad_string/1:
pad_string(String) ->
if
length(String) == 1 -> '0' ++ String;
true -> String
end.
There's a runtime error here because '0' is an atom and you're attempting to append String, which is a list, to it. The error looks like this:
** exception error: bad argument
in operator ++/2
called as '0' ++ "8"
Instead of just fixing that directly, let's consider what pad_string/1 does: it adds a leading 0 character if the string is a single digit. Instead of using if to check for this condition — if isn't used that often in Erlang code — use pattern matching:
pad_string([D]) ->
[$0,D];
pad_string(S) ->
S.
The first clause matches a single-element list, and returns a new list with the element D preceded with $0, which is the character constant for the character 0. The second clause matches all other arguments and just returns whatever is passed in.
Here's the full version with all changes:
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
{YYYY,MM,DD} = date(),
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
pad_string([D]) ->
[$0,D];
pad_string(S) ->
S.
But a simpler approach would be to use the io_lib:format/2 function to just format the desired string directly:
swedish_date() ->
io_lib:format("~w~2..0w~2..0w", tuple_to_list(date())).
First, note that we're back to calling tuple_to_list(date()). This is because the second argument for io_lib:format/2 must be a list. Its first argument is a format string, which in our case says to expect three arguments, formatting each as an Erlang term, and formatting the 2nd and 3rd arguments with a width of 2 and 0-padded.
But there's still one more step to address, because if we run the io_lib:format/2 version we get:
1> demo:swedish_date().
["2015",["0",56],"29"]
Whoa, what's that? It's simply a deep list, where each element of the list is itself a list. To get the format we want, we can flatten that list:
swedish_date() ->
lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
Executing this version gives us what we want:
2> demo:swedish_date().
"20150829"
Find the final full version of the code below.
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
lists:flatten(io_lib:format("~w~2..0w~2..0w", tuple_to_list(date()))).
UPDATE: @Pascal comments that the year should be printed as 2 digits rather than 4. We can achieve this by passing the date list through a list comprehension:
swedish_date() ->
DateVals = [D rem 100 || D <- tuple_to_list(date())],
lists:flatten(io_lib:format("~w~2..0w~2..0w", DateVals)).
This applies the rem remainder operator to each of the list elements returned by tuple_to_list(date()). The operation is needless for month and day but I think it's cleaner than extracting the year and processing it individually. The result:
3> demo:swedish_date().
"150829"
There are a few issues here:
You are missing a parenthesis at the end of line 6.
You are trying to call integer_to_list/3 when Erlang only defines integer_to_list/1,2.
This will work (it also drops the string:substr call, which has no 1-arity form, and uses the string "0" instead of the atom '0'):
-module(demo).
-export([swedish_date/0]).
swedish_date() ->
[YYYY,MM,DD] = tuple_to_list(date()),
integer_to_list(YYYY) ++
pad_string(integer_to_list(MM)) ++
pad_string(integer_to_list(DD)).
pad_string(String) ->
if
length(String) == 1 -> "0" ++ String;
true -> String
end.
In addition to the parenthesis error on line 6, you also have an error on line 10, where you use the form '0' instead of "0", so you define an atom rather than a string.
I understand you are doing this for educational purposes, but I encourage you to dig into the Erlang libraries; it is something you will have to do. For a common problem like this, functions already exist that help you:
swedish_date() ->
{YYYY,MM,DD} = date(), % not useful to transform into list
lists:flatten(io_lib:format("~2.10.0B~2.10.0B~2.10.0B",[YYYY rem 100,MM,DD])).
% ~X.Y.ZB means: format an integer in base Y, print X characters, use Z for padding
I am using IDL 8.4. I want to use the isa() function to check the types of the fields read by read_csv(). I want to use /number, /integer, /float and /string, because some fields must be float, others must be integer, and for others I don't care. I can do it like this, but it is not very readable:
str = read_csv(filename, header=inheader)
; TODO check header
if not isa(str.(0), /integer) then stop
if not isa(str.(1), /number) then stop
if not isa(str.(2), /float) then stop
I am hoping I can do something like
expected_header = ['id', 'x', 'val']
expected_type = ['/integer', '/number', '/float']
str = read_csv(filename, header=inheader)
if not array_equal(strlowcase(inheader), expected_header) then stop
for i=0l,n_elements(expected_type)-1 do begin
if not isa(str.(i), expected_type[i]) then stop
endfor
The above doesn't work, as '/integer' is taken literally and I guess isa() is looking for a named structure. How can I do something similar?
Ideally I want to pick the expected type based on the header read from the file, so that the script still works as long as the header specifies the expected fields.
EDIT:
My tentative solution is to write a wrapper for ISA(). Not very pretty, but it does what I wanted... if there is a cleaner solution, please let me know.
Also, read_csv is defined to return only one of long, long64, double and string, so I could write a function that tests with this limitation. But I wanted it to work in general so that I can reuse it for other similar cases.
function isa_generic,var,typ
; calls isa() http://www.exelisvis.com/docs/ISA.html with keyword
; if 'n', test /number
; if 'i', test /integer
; if 'f', test /float
; if 's', test /string
if typ eq 'n' then return, isa(var, /number)
if typ eq 'i' then return, isa(var, /integer)
if typ eq 'f' then return, isa(var, /float)
if typ eq 's' then return, isa(var, /string)
print, 'unexpected typename: ', typ
stop
end
IDL has some limited reflection abilities, which will do exactly what you want:
expected_types = ['integer', 'number', 'float']
expected_header = ['id', 'x', 'val']
str = read_csv(filename, header=inheader)
if ~array_equal(strlowcase(inheader), expected_header) then stop
foreach type, expected_types, index do begin
if ~isa(str.(index), _extra=create_struct(type, 1)) then stop
endforeach
It's debatable if this is really "easier to read" in your case, since there are only three cases to test. If there were 500 cases, it would be a lot cleaner than writing 500 slightly different lines.
This snippet uses some rather esoteric IDL features, so let me explain what's happening:
expected_types is just a list of (string) keyword names in the order they should be used.
The foreach part iterates over expected_types, putting the keyword string into the type variable and the iteration count into index.
This is equivalent to using for index = 0, n_elements(expected_types) - 1 do and then using expected_types[index] instead of type, but the foreach loop is easier to read IMHO.
_extra is a special keyword that can pass a structure as if it were a set of keywords. Each of the structure's tags is interpreted as a keyword.
The create_struct function takes one or more pairs of (string) tag names and (any type) values, then returns a structure with those tag names and values.
Finally, I replaced not (bitwise not) with ~ (logical not). This step, like foreach vs for, is not necessary in this instance, but can avoid headache when debugging some types of code, where the distinction matters.
--
Reflective abilities like these can do an awful lot, and come in super handy. They're work-horses in other languages, but IDL programmers don't seem to use them as much. Here's a quick list of common reflective features I use in IDL:
create_struct - Create a structure from (string) tag names and values.
n_tags - Get the number of tags in a structure.
_extra, _strict_extra, and _ref_extra - Pass keywords by structure or reference.
call_function - Call a function by its (string) name.
call_procedure - Call a procedure by its (string) name.
call_method - Call a method (of an object) by its (string) name.
execute - Run complete IDL commands stored in a string.
Note: Be very careful using the execute function. It will blindly execute any IDL statement you (or a user, file, web form, etc.) feed it. Never ever feed untrusted or web user input to the IDL execute function.
You can't access the keywords quite like that, but there is a typename parameter to ISA that might be useful. This is untested, but should work:
expected_header = ['id', 'x', 'val']
expected_type = ['int', 'long', 'float']
str = read_csv(filename, header=inheader)
if not array_equal(strlowcase(inheader), expected_header) then stop
for i = 0L, n_elements(expected_type) - 1L do begin
if not isa(str.(i), expected_type[i]) then stop
endfor
I'm having some issues with bison (again).
I'm trying to pass a string value through a "recursive rule" in my grammar file using $$,
but when I print the value I passed, the output looks like a wrong reference ( AU�� ) instead of the value I wrote in my input file.
line: tok1 tok2
| tok1 tok2 tok3
{
int len=0;
len = strlen($1) + strlen($3) + 3;
char out[len];
strcpy(out,$1);
strcat(out," = ");
strcat(out,$3);
printf("out -> %s;\n",out);
$$ = out;
}
| line tok4
{
printf("line -> %s\n",$1);
}
Here I've reported a simplified part of the code.
Given the input tokens tok1 tok2 tok3, it should assign the out variable to $$ (the printf shows that out has the correct value in the first part of the rule).
When tok4 is matched next, I'm in the recursive part of the rule. But when I print the $1 value (which should be equal to out, since I passed it through $$), I don't get the right output.
You cannot set:
$$ = out;
because the string that out refers to is just about to vanish into thin air, as soon as the block in which it was declared ends.
In order to get away with this, you need to malloc the storage for the new string.
Also, you need strlen($1) + strlen($3) + 4; because you need to leave room for the NUL terminator.
It's important to understand that C does not really have strings. It has pointers to char (char*), but those are really pointers. It has arrays (char []), but you cannot use an array as an aggregate. For example, in your code, out = $1 would be illegal, because you cannot assign to an array. (Also because $1 is a pointer, not an array, but that doesn't matter because any reference to an array, except in sizeof, is effectively reduced to a pointer.)
So when you say $$ = out, you are making $$ point to the storage represented by out, and that storage is just about to vanish. So that doesn't work. You can say $$ = $1, because $1 is also a pointer to char; that makes $$ and $1 point to the same character. (That's legal but it makes memory management more complicated. Also, you need to be careful with modifications.) Finally, you can say strcpy($$, out), but that relies on $$ already pointing to a string which is long enough to hold out, something which is highly unlikely, because what it means is to copy the storage pointed to by out into the location pointed to by $$.
Also, as I noted above, when you are using "string" functions in C, they all insist that the sequence of characters pointed to by their "string" arguments (i.e. the pointer-to-character arguments) must be terminated with a 0 character (that is, the character whose code is 0, not the character 0).
If you're used to programming in languages which actually have a string datatype, all this might seem a bit weird. Practice makes perfect.
The bottom line is that what you need to do is to create a new region of storage large enough to contain your string, like this (I removed out because it's not necessary):
$$ = malloc(len + 1); // room for NUL
strcpy($$, $1);
strcat($$, " = ");
strcat($$, $3);
// You could replace the strcpy/strcat/strcat with:
// sprintf($$, "%s = %s", $1, $3)
Note that storing malloc'd data (including the result of strdup and asprintf) on the parser stack (that is, as $$) also implies the necessity to free it when you're done with it; otherwise, you have a memory leak.
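As a rough sketch of the allocation described above, the string-building could be moved into a small helper that lives in the grammar file's prologue. The name join_assignment is hypothetical (it is not part of bison or the original grammar), and the sketch assumes both semantic values are valid NUL-terminated strings:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of bison: builds "<lhs> = <rhs>" in
 * freshly malloc'd storage so the result outlives the action block.
 * Returns NULL if allocation fails. The caller must free() the result. */
char *join_assignment(const char *lhs, const char *rhs)
{
    assert(lhs != NULL && rhs != NULL);

    /* 3 bytes for " = " plus 1 byte for the terminating NUL */
    size_t len = strlen(lhs) + strlen(rhs) + 3 + 1;

    char *out = malloc(len);
    if (out == NULL)
        return NULL;

    /* snprintf never writes more than len bytes, including the NUL */
    snprintf(out, len, "%s = %s", lhs, rhs);
    return out;
}
```

In the action this might be used as $$ = join_assignment($1, $3);, with the rule that consumes the value calling free($1) once it has printed or copied it.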
I've solved it by changing the $$ = out; line into strcpy($$,out); and now it works properly.