In Julia, why is @printf a macro instead of a function?

In Julia, the syntax to print a formatted string is as follows:
@printf("Hello %d\n", 5)
Why is @printf a macro instead of a function? Is it so that it can accept a varying number of arguments?

Taking a variable number of arguments is not a problem for normal Julia functions [1]. @printf is a macro so that it can parse and interpret the format string at compile time and generate custom code for that specific format string. People may not realize that C's printf function re-parses and re-interprets the format string each time you call printf. The fact that it's as fast as it is represents a minor miracle of insane pointer programming. Seriously, just look at your nearest libc's printf implementation. It's completely nuts.
Julia uses a different approach: @printf is a macro which translates format strings into efficient code specific to that format specification. If you think about it, a printf-style format string is really just a way to express a function that takes a fixed number and type of arguments and prints them in a particular way. Note that I said that the format string is a function, not printf itself, which is conceptually a function generator, turning formats into formatters. The fact that this is all crammed into a run-time function in C is a bit of a mismatch, due to that being the only reasonable option in C. In fact, because of this, until very recently, it was rather easy to shoot yourself in the foot by passing the wrong number or type of arguments to C's printf. This is only better now because compilers have been special-cased to understand the semantics of printf formats.
In theory, Julia's @printf can be made faster than C since it generates custom code, but in practice, I had a hard enough time matching C, let alone beating it. But I think that's due to the current design of our I/O system and how I'm using it, not an inherent limitation. The I/O stuff is due for an overhaul though, and when that happens, we might actually be able to beat C at formatted printing by leveraging the fact that @printf is a macro.
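To make the "format string is a function, printf is a function generator" idea concrete, here is a rough sketch in Python rather than Julia. It is not how @printf is implemented: compile_format is a hypothetical name, it only understands %d, %s, %f and %%, and the work happens when compile_format is called rather than at compile time. The point is only that the format is parsed once and the returned formatter never looks at the format string again.

```python
import re
from typing import Callable, List

# Conversion specifiers this sketch understands (a tiny subset of printf's).
_CONVERTERS = {
    "d": lambda v: str(int(v)),
    "s": str,
    "f": lambda v: f"{float(v):.6f}",
}

def compile_format(fmt: str) -> Callable[..., str]:
    """Parse fmt once and return a formatter specialised to it (hypothetical helper)."""
    parts: List[object] = []   # literal strings interleaved with converter callables
    pos = 0
    for m in re.finditer(r"%([dsf%])", fmt):
        if m.start() > pos:
            parts.append(fmt[pos:m.start()])
        spec = m.group(1)
        # "%%" becomes a literal percent sign; everything else consumes an argument.
        parts.append("%" if spec == "%" else _CONVERTERS[spec])
        pos = m.end()
    if pos < len(fmt):
        parts.append(fmt[pos:])
    n_args = sum(1 for p in parts if callable(p))

    def formatter(*args) -> str:
        # The argument count is fixed by the format, so it can be checked up front.
        if len(args) != n_args:
            raise TypeError(f"format expects {n_args} arguments, got {len(args)}")
        it = iter(args)
        return "".join(p(next(it)) if callable(p) else p for p in parts)

    return formatter

hello = compile_format("Hello %d\n")   # parsing happens here, once
print(hello(5), end="")                # no format parsing on this call
```

A macro goes one step further than this closure: the generated code is spliced into the call site before the program runs, so argument-count mistakes can be reported before any printing happens.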

It is for performance. The @printf macro takes a constant format string (e.g. "Hello %d\n") and generates optimized code for that string.

Related

Why use macros in Julia?

I was reading the documentation on macros and ran into the following under the "Hold up: why macros?" section. The reasoning given for using macros is as follows:
Macros are necessary because they execute when code is parsed, therefore, macros allow the programmer to generate and include fragments of customized code before the full program is run.
This leads me to wonder why someone would want to "generate and include fragments of customized code before the full program is run". Can someone provide context as to why this would be beneficial, and/or other good use cases for macros?
Let me give you my view on macros.
A macro basically is a code -> code function. It takes code (a Julia expression) as input and spits out code (a different Julia expression).
Why is this useful? It has multiple purposes:
compile time copy-and-paste: You don't have to write the same piece of code multiple times but instead can define a short macro that writes it for you wherever you put it. (example)
domain specific language (DSL): You can create special syntax that, after the macro's code -> code transform, is replaced by pure Julia constructs. This is used in many packages to define special syntax, for example here and here.
code generation: Imagine you want to write a really long piece of code which, although being long, is very simple because it has some kind of pattern that repeats itself rather trivially. Writing that code by hand can be a pain (or even practically impossible). A macro can programmatically generate the code for you. One example is for-loop unrolling (see here and here). But even the @time macro isn't doing much more than just putting a bunch of Base.time_ns() function calls around the provided Julia expression.
special string parsing: If you type the literal 3.2 in Julia it will be parsed and interpreted as a Float64. Now, imagine you want to supply a number literally that goes beyond Float64 precision but would fit into a BigFloat. Typing big(3.123124812498124812498) won't work, because the literal number is first interpreted as a Float64 and then handed to the big function. Instead you need a way to tell Julia at parse time that this should become a BigFloat. This is handled by the @big_str "3.2" macro which (for convenience) can also be written as big"3.2". The latter is just syntactic sugar.
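The literal-precision pitfall in the last item is not specific to Julia. As a loose analogy only (Python has no macros, and Decimal is not BigFloat), the same thing happens when a float literal is handed to Python's decimal module: the literal is rounded to a double before the constructor ever sees it, whereas a string reaches the constructor with all its digits intact.

```python
from decimal import Decimal

# The float literal is rounded to binary double precision before Decimal sees it,
# just as big(3.123124812498124812498) receives an already-rounded Float64.
via_float = Decimal(3.123124812498124812498)

# The string is passed through unparsed, so every digit survives.
# This plays the role of big"3.123124812498124812498".
via_string = Decimal("3.123124812498124812498")

print(via_float)
print(via_string)
print(via_float == via_string)
```

The difference is that the Julia string macro does this conversion at parse time, while the Python constructor does it each time the line runs.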
There might be many more applications of macros, but those are the most important to me.
Let me end by referencing Steven G. Johnson's great talk at JuliaCon 2019:
Most of the time, don't do metaprogramming :)

Is long double useful in ANSI C?

There is a data type in the C89 (ANSI C) standard called long double, but there are no mathematical functions in <math.h> that support long double. For example, the sin function accepts a double argument.
C99 supports mathematical functions for long double.
My question is, when there are no mathematical functions that support long double in ANSI C, is long double useful?
Yes, "long double" is absolutely useful if you wish to compute an expression with more than double precision.
An interesting side question is "What exactly IS 'long double'"?
The answer is platform- and/or compiler dependent:
http://en.wikipedia.org/wiki/Long_double
Just because math.h doesn't support something doesn't mean you can't make it yourself.
The type existing is a good thing, because it means there is a cross-platform way to request something with at least the precision of a double. This couldn't be done if it wasn't in the language somewhere (your best bet would be to hack something together with a struct or array of longs / doubles).
The functions are just for convenience; sometimes a built-in processor instruction for sin can be used, but sometimes not, and instead the sin function simply contains an algorithm to produce the answer, or look it up, using standard operations.
You could copy the sinl functions for your target platform from C99 to C89 if you wanted. There's a big list of implementations here: http://sourceware.org/git/?p=glibc.git;a=tree;f=sysdeps/ieee754;hb=HEAD
Or just stick to C99.

Program to mimic scanf() using system calls

As the title says, I am trying out a problem from last year that asks me to write a program that works the same as scanf().
Ubuntu:
Here is my code:
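#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = 0;                              /* file descriptor 0 is standard input */
    char buf[20];
    ssize_t n = read(fd, buf, sizeof buf);   /* reads at most 20 bytes */
    if (n < 0)
        return 1;                            /* read failed */
    /* read() does not null-terminate the buffer, so print exactly n bytes */
    printf("%.*s", (int)n, buf);
    return 0;
}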
Now my program does not work exactly the same.
How do I make it store both integer and character values, since my code only reads character strings?
Also, how do I make it accept input of any length (it is limited to 20 characters in this case)?
Doing this job thoroughly is a non-trivial exercise.
What you show does not emulate scanf("%s", buffer); very well. There are at least two problems:
You limit the input to 20 characters.
You do not stop reading at the first white space character, leaving it and other characters behind to be read next time.
Note that the system calls cannot provide an 'unget' functionality; that has to be provided by the FILE * type. With file streams, you are guaranteed one character of pushback. I recently did some empirical research on the limits, finding that the number of pushed-back characters ranged from 1 (AIX, HP-UX) to 4 (Solaris) to 'big', meaning up to 4 KiB, possibly more, on Linux and Mac OS X (BSD). Fortunately, scanf() only requires one character of pushback. (Well, that's the usual claim; I'm not sure whether that's feasible when distinguishing between "1.23e+f" and "1.23e+1"; the first needs three characters of lookahead, it seems to me, before it can tell that the e+f is not part of the number.)
If you are writing a plug-in replacement for scanf(), you are going to need to use the <stdarg.h> mechanism. Fortunately, all the arguments to scanf() after the format string are data pointers, and all data pointers are the same size. This simplifies some aspects of the code. However, you will be parsing the scan format string (a non-trivial exercise in its own right; see the recent discussion of print format string parsing) and then arranging to make the appropriate conversions and assignments.
Unless you have unusually stringent conditions imposed upon you, assume that you will use the character-level Standard I/O library functions such as getchar(), getc() and ungetc(). If you can't even use them, then write your own variants of them. Be aware that full integration with the rest of the I/O functions is tricky - things like fseek() complicate matters, and ensuring that pushed-back characters are properly consumed is also not entirely trivial.
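To show the shape of "write your own variants" of getc() and ungetc() on top of the read system call, here is a minimal sketch, in Python for brevity (os.read is a thin wrapper over read(2)). PushbackReader, scan_word and scan_int are hypothetical names, the conversions are far simpler than real %s and %d (no field widths, no overflow handling), and only one byte of pushback is kept.

```python
import os

_SPACE = b" \t\n\r\v\f"
_DIGITS = b"0123456789"

class PushbackReader:
    """Buffered byte reader over a file descriptor with one byte of pushback."""

    def __init__(self, fd: int = 0, bufsize: int = 4096):
        self.fd, self.bufsize = fd, bufsize
        self.buf, self.pos = b"", 0
        self.pushed = None                  # at most one pushed-back byte

    def getc(self) -> int:
        """Return the next byte as an int, or -1 at end of input."""
        if self.pushed is not None:
            c, self.pushed = self.pushed, None
            return c
        if self.pos >= len(self.buf):
            self.buf, self.pos = os.read(self.fd, self.bufsize), 0
            if not self.buf:
                return -1
        c = self.buf[self.pos]
        self.pos += 1
        return c

    def ungetc(self, c: int) -> None:
        if c != -1:
            self.pushed = c

def _skip_space(r: PushbackReader) -> int:
    c = r.getc()
    while c != -1 and c in _SPACE:
        c = r.getc()
    return c

def scan_word(r: PushbackReader):
    """Rough analogue of %s: skip white space, read up to the next white space."""
    c = _skip_space(r)
    out = bytearray()
    while c != -1 and c not in _SPACE:
        out.append(c)
        c = r.getc()
    r.ungetc(c)                             # leave the delimiter for the next call
    return out.decode() if out else None

def scan_int(r: PushbackReader):
    """Rough analogue of %d: optional sign followed by decimal digits."""
    c = _skip_space(r)
    text = bytearray()
    if c in (ord("+"), ord("-")):
        text.append(c)
        c = r.getc()
    while c != -1 and c in _DIGITS:
        text.append(c)
        c = r.getc()
    r.ungetc(c)
    # A lone sign cannot be un-read with a single byte of pushback; it is lost here.
    return int(text.decode()) if text[-1:].isdigit() else None

# Demo on a pipe instead of the terminal so the example does not block.
rd, wr = os.pipe()
os.write(wr, b"  hello 42\n")
os.close(wr)
reader = PushbackReader(rd)
print(scan_word(reader), scan_int(reader))
```

A drop-in scanf() would layer format-string parsing and the <stdarg.h>-style argument handling on top of primitives like these.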

What, if any, is wrong with this approach to declarative I/O

I'm not sure exactly how much this falls under 'programming' as opposed to 'programming language design'. But the issue is this:
Say, for sake of simplicity we have two 'special' lists/arrays/vectors/whatever we just call 'ports' for simplicity, one called stdIn and another stdOut. These conceptually represent respectively
All the user input given to the program in the duration of the program
All the output written to the terminal during the duration of the program
In Haskell-inspired pseudocode, it should then be possible to create this wholly declarative program:
let stdOut = ["please input a number",
              "and please input another number",
              "The product of both numbers is: " ++ stdIn[0] * stdIn[1]]
Which would do the expected, ask for two numbers, and print their product. The trick being that stdOut represents the list of strings written to the terminal at the completion of the program, and stdIn the list of input strings. Type errors and the fact that there needs to be some safeguard to only print the next line after a new line has been entered left aside here for simplicity's sake, it's probably easy enough to solve that.
So, before I go off to implement this idea, are there any pitfalls to it that I overlooked? I'm not aware of a similar construct already existing, so it'd be naïve not to consider that there may be an obvious pitfall I overlooked.
Otherwise, I know that of course:
let stdOut = [stdIn[50],"Hello, World!"]
Would be an error if these results need to be interwoven in a similar fashion as above.
A similar approach was used in early versions of Haskell, except that the elements of the stdin and stdout channels were not strings but generic IO 'actions'--in fact, input and output were generalized to 'response' and 'request'. As long as both channels are lazy (i.e. they are actually 'iterators' or 'enumerators'), the runtime can simply walk the request channel, act on each request and tack appropriate responses onto the response channel. Unfortunately, the system was very hard to use, so it was scrapped in favor of monadic IO. See these papers:
Hudak, P., and Sundaresh, R. On the expressiveness of purely-functional I/O systems. Tech. Rep. YALEU/DCS/RR-665, Department of Computer Science, Yale University, Mar. 1989.
Peyton Jones, S. Tackling the Awkward Squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell. In Engineering theories of software construction, 2002, pp. 47--96.
The approach you're describing sounds like "Dialogs." In their award-winning 1993 paper Imperative Functional Programming, Phil Wadler and Simon Peyton Jones give some examples where dialogs really don't work very well, and they explain why monadic I/O is better.
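For a feel of how the stream-based ("dialog") style behaves, here is a small simulation in Python, with generators standing in for lazy lists. It is only a sketch of the idea described in the answers above, not the old Haskell API: the request tuples, multiply, impatient and run_dialog are all made-up names, and for simplicity only "get" requests produce a response.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, Tuple

Request = Tuple[str, str]   # ("put", text) writes a line; ("get", "") asks for one

def multiply(responses: Iterator[str]) -> Iterator[Request]:
    """The program: a lazy stream of requests that may inspect earlier responses."""
    yield ("put", "please input a number")
    yield ("get", "")
    a = int(next(responses))
    yield ("put", "and please input another number")
    yield ("get", "")
    b = int(next(responses))
    yield ("put", f"The product of both numbers is: {a * b}")

def impatient(responses: Iterator[str]) -> Iterator[Request]:
    """Broken on purpose: looks at a response before issuing the request for it."""
    a = int(next(responses))
    yield ("get", "")
    yield ("put", str(a))

def run_dialog(program: Callable, user_input: Iterable[str]) -> List[str]:
    """The runtime: walk the request stream and tack responses onto the other stream."""
    pending = deque()

    def responses() -> Iterator[str]:
        while True:
            yield pending.popleft()   # IndexError if the program runs ahead of its requests

    lines = iter(user_input)
    output = []
    for kind, text in program(responses()):
        if kind == "put":
            output.append(text)
        else:
            pending.append(next(lines))
    return output

print(run_dialog(multiply, ["6", "7"]))
```

Nothing in the program's type stops it from consuming a response too early, as impatient does; keeping the two streams in step is entirely the programmer's job, which is one of the usability problems the answers above attribute to this style.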
I don't see how you will weave them considering this example compared to your own:
let stdOut = ["Welcome to the program which multiplies.",
              "please input a number",
              "and please input another number",
              "The product of both numbers is: " ++ stdIn[0] * stdIn[1]]
Should the program prompt for the number represented by stdIn[0] after outputting one line (as in your example) or two lines? If the index 0 represents the 0th input from stdin, then it seems something similar to:
let stdOut = ["Welcome to the program which multiplies.",
              "please input a number",
              some_annotation(stdIn[0]),
              "and please input another number",
              some_annotation(stdIn[1]),
              "The product of both numbers is: " ++ stdIn[0] * stdIn[1]]
will be required in order to coordinate the timing of output and input.
I like your idea. Replace some_annotation with your preference, perhaps something akin to "synchronize"? I couldn't come up with the incisive word for it.
This approach seems to be the "most obvious" way to add I/O to a pure λ-calculus, and other people have mentioned that something along those lines has been tried in Haskell and Miranda.
However, I am aware of a language, not based on a λ-calculus, that still uses a very similar system:
How to handle input and output in a language without side effects? In a certain sense, input and output aren't side effects; they are, so to speak, front- and back-effects. (...) [A program is] a function from the space of possible inputs to the space of possible outputs.
Input and output streams are represented as lists of natural numbers from 0 to 255, each corresponding to one byte. End-of-file is represented by the value 256, not by end of list. (This is because it is often easier to deal with EOF as a character than as a special case. Nevertheless, I wonder if it wouldn't be better to use end-of-list.)
(...)
It's not difficult to write interactive programs (...) [but] doing so is, technically speaking, a sin. (...) In a referentially transparent language, anything not explicitly synchronized is fair game for evaluation in any order whatsoever, at the run-time system's discretion.
(...) The most obvious way of writing this particular program is to cons together the "Hello, [name]!" string in an expression which is conditioned on receipt of a newline. If you do this you are safe, because there's no way for any evaluator to prove in advance that the user will ever type a newline.
(...)
So there's no practical problem with interactive software. Nevertheless, there's something unpleasant about the way the second case is prevented. A referentially transparent program should not have to rely on lazy evaluation in order to work properly.
How to escape this moral dilemma? The hard way is to switch to a more sophisticated I/O system, perhaps based on Haskell's, in which input and output are explicitly synchronized. I'm rather disinclined to do this, as I much prefer the simplicity of the current system. The easy way out is to write batch programs which happen to work well interactively. This is mainly just a matter of not prompting the user.
Perhaps you would enjoy doing some programming in Lazy K?

Efficiency of stack-based expression evaluation for math parsing

I have to write, for academic purposes, an application that plots user-input expressions like: f(x) = 1 - exp(3^(5*ln(cosx)) + x)
The approach I've chosen to write the parser is to convert the expression into RPN with the shunting-yard algorithm, treating primitive functions like "cos" as unary operators. This means the function written above would be converted into a series of tokens like:
1, 3, 5, x, cos, ln, *, ^, x, +, exp, -
The problem is that to plot the function I have to evaluate it LOTS of times, so applying the stack evaluation algorithm for each input value would be very inefficient.
How can I solve this? Do I have to forget the RPN idea?
How much is "LOTS of times"? A million?
What kind of functions could be input? Can we assume they are continuous?
Did you try measuring how well your code performs?
(Sorry, started off with questions!)
You could try one of the two approaches (or both) described briefly below (there are probably many more):
1) Parse Trees.
You could create a Parse Tree. Then do what most compilers do to optimize expressions, constant folding, common subexpression elimination (which you could achieve by linking together the common expression subtrees and caching the result), etc.
Then you could use lazy evaluation techniques to avoid whole subtrees. For instance if you have a tree
    *
   / \
  A   B
where A evaluates to 0, you could completely avoid evaluating B as you know the result is 0. With RPN you would lose out on the lazy evaluation.
2) Interpolation
Assuming your function is continuous, you could approximate your function to a high degree of accuracy using Polynomial Interpolation. This way you can do the complicated calculation of the function a few times (based on the degree of polynomial you choose), and then do fast polynomial calculations for the rest of the time.
To create the initial set of data, you could just use approach 1 or just stick to using your RPN, as you would only be generating a few values.
So if you use Interpolation, you could keep your RPN...
Hope that helps!
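A minimal sketch of the parse-tree approach (option 1 above), in Python: constant folding done once up front, plus the "skip B when A is 0" shortcut during evaluation. Num, Var, BinOp, fold and evaluate are illustrative names, only binary operators are handled, and the zero shortcut deliberately ignores the case where the skipped subtree would have produced NaN, infinity or an error.

```python
import math
from dataclasses import dataclass
from typing import Dict, Union

@dataclass(frozen=True)
class Num:
    value: float

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"

Node = Union[Num, Var, BinOp]

_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "^": math.pow,
}

def fold(node: Node) -> Node:
    """Constant folding: collapse every subtree that contains no variables into a Num."""
    if isinstance(node, BinOp):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(_OPS[node.op](left.value, right.value))
        return BinOp(node.op, left, right)
    return node

def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Evaluate the tree for one set of variable values."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    left = evaluate(node.left, env)
    if node.op == "*" and left == 0:
        return 0.0                      # A is 0, so the whole B subtree is never visited
    return _OPS[node.op](left, evaluate(node.right, env))

# (2 * 3) + x is folded to 6 + x once, then evaluated for many x values.
tree = fold(BinOp("+", BinOp("*", Num(2), Num(3)), Var("x")))
ys = [evaluate(tree, {"x": x / 10}) for x in range(100)]
```

Folding pays off because it is done once per expression, while evaluate runs once per plotted point.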
Why reinvent the wheel? Use a fast scripting language instead.
Integrating something like Lua into your code will take very little time, and the result will be very fast.
You'll usually be able to byte-compile your expression, and that should result in code that runs very fast, certainly fast enough for simple 1D graphs.
I recommend Lua as it's fast and integrates with C/C++ more easily than any other scripting language. Another good option would be Python, but while it's better known, I found it trickier to integrate.
Why not keep around a parse tree (I use "tree" loosely, in your case it's a sequence of operations), and mark input variables accordingly? (e.g. for inputs x, y, z, etc. annotate "x" with 0 to signify the first input variable, "y" with 1 to signify the 2nd input variable, etc.)
That way you can parse the expression once, keep the parse tree, take in an array of inputs, and apply the parse tree to evaluate.
If you're worrying about the performance aspects of the evaluation step (vs. the parsing step), I don't think you'd do much better unless you get into vectorizing (applying your parse tree on a vector of inputs at once) or hard-coding the operations into a fixed function.
What I do is use the shunting-yard algorithm to produce the RPN. I then "compile" the RPN into a tokenised form that can be executed (interpretively) repeatedly without re-parsing the expression.
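One possible reading of "compile the RPN into a tokenised form", sketched in Python: every token is classified and resolved to a function object once, so the per-point loop does no string comparisons or table lookups. compile_rpn and the opcode constants are hypothetical names, the only variable is x, and the token list for the question's f(x) = 1 - exp(3^(5*ln(cos x)) + x) is written out by hand here.

```python
import math
from typing import Callable, List, Sequence, Tuple

UNARY = {"cos": math.cos, "sin": math.sin, "ln": math.log, "exp": math.exp}
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "^": math.pow,
}

PUSH_CONST, PUSH_X, CALL1, CALL2 = range(4)   # opcodes of the tokenised form

def compile_rpn(tokens: Sequence[str]) -> Callable[[float], float]:
    """Resolve each RPN token once and return a function of x."""
    program: List[Tuple[int, object]] = []
    for tok in tokens:
        if tok == "x":
            program.append((PUSH_X, None))
        elif tok in UNARY:
            program.append((CALL1, UNARY[tok]))
        elif tok in BINARY:
            program.append((CALL2, BINARY[tok]))
        else:
            program.append((PUSH_CONST, float(tok)))   # parsed once, not per point

    def f(x: float) -> float:
        stack: List[float] = []
        for opcode, payload in program:
            if opcode == PUSH_CONST:
                stack.append(payload)
            elif opcode == PUSH_X:
                stack.append(x)
            elif opcode == CALL1:
                stack.append(payload(stack.pop()))
            else:
                b = stack.pop()
                a = stack.pop()
                stack.append(payload(a, b))
        return stack.pop()

    return f

# RPN for f(x) = 1 - exp(3^(5*ln(cos x)) + x), assuming the usual precedences.
f = compile_rpn("1 3 5 x cos ln * ^ x + exp -".split())
points = [(i / 10, f(i / 10)) for i in range(-10, 11)]   # cos x > 0 on [-1, 1]
```

This still uses a stack at run time; it only removes the repeated parsing and token classification.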
Michael Anderson suggested Lua. If you want to try Lua for just this task, see my ae library.
Inefficient in what sense? There's machine time and programmer time. Is there a standard for how fast it needs to run with a particular level of complexity? Is it more important to finish the assignment and move on to the next one (perfectionists sometimes never finish)?
All those steps have to happen for each input value. Yes, you could have a heuristic that scans the list of operations and cleans it up a bit. Yes, you could compile some of it down to assembly instead of calling +, * etc. as high level functions. You can compare vectorization (doing all the +'s then all the *'s etc, with a vector of values) to doing the whole procedure for one value at a time. But do you need to?
I mean, what do you think happens if you plot a function in gnuplot or Mathematica?
Your simple interpretation of RPN should work just fine, especially since it contains
math library functions like cos, exp, and ^ (pow, involving logs)
symbol table lookup
Hopefully, your symbol table (with variables like x in it) will be short and simple.
The library functions will most likely be your biggest time-takers, so unless your interpreter is poorly written, it will not be a problem.
If, however, you really gotta go for speed, you could translate the expression into C code, compile and link it into a dll on-the-fly and load it (takes about a second). That, plus memoized versions of the math functions, could give you the best performance.
P.S. For parsing, your syntax is pretty vanilla, so a simple recursive-descent parser (about a page of code, O(n) same as shunting-yard) should work just fine. In fact, you might just be able to compute the result as you parse (if math functions are taking most of the time), and not bother with parse trees, RPN, any of that stuff.
I think this RPN based library can serve the purpose: http://expressionoasis.vedantatree.com/
I used it in one of my calculator projects and it works well. It is small and simple, but extensible.
One optimization would be to replace the stack with an array of values and implement the evaluator as a three-address machine, where each operation loads from two (or one) locations and saves to a third. This can make for very tight code:
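/* Opcodes carry an OP_ prefix so they cannot clash with div(), cos(), etc. */
typedef enum {
    OP_ADD, OP_SUB, OP_MUL, OP_DIV,
    OP_COS, OP_SIN, OP_TAN
    /* ... */
} OpCode;

typedef struct {
    OpCode op;
    int a, b, d;   /* source slots a and b, destination slot d */
} Op;

void go(const Op *ops, int n, float *v) {
    for (int i = 0; i < n; i++) {
        switch (ops[i].op) {
        case OP_ADD: v[ops[i].d] = v[ops[i].a] + v[ops[i].b]; break;
        case OP_SUB: v[ops[i].d] = v[ops[i].a] - v[ops[i].b]; break;
        case OP_MUL: v[ops[i].d] = v[ops[i].a] * v[ops[i].b]; break;
        case OP_DIV: v[ops[i].d] = v[ops[i].a] / v[ops[i].b]; break;
        /* ... */
        default: break;
        }
    }
}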
The conversion from RPN to 3-address should be easy as 3-address is a generalization.
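The conversion can be sketched as running the usual RPN stack algorithm at compile time, but pushing slot numbers instead of values. The following is a rough Python illustration under my own assumptions: rpn_to_tac and run_tac are made-up names, slot 0 is reserved for x, every constant and every intermediate result gets a fresh slot (no slot reuse), and unary operations simply repeat their operand in the b field.

```python
import math
from typing import List, Sequence, Tuple

UNARY = {"cos": math.cos, "sin": math.sin, "ln": math.log, "exp": math.exp}
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "^": math.pow,
}

TacOp = Tuple[str, int, int, int]   # (operation, source a, source b, destination d)

def rpn_to_tac(tokens: Sequence[str]) -> Tuple[List[TacOp], List[float], int]:
    """Translate RPN into three-address ops plus the initial slot array."""
    slots: List[float] = [0.0]      # slot 0 is reserved for the variable x
    ops: List[TacOp] = []
    stack: List[int] = []           # compile-time stack of slot indices, not values
    for tok in tokens:
        if tok == "x":
            stack.append(0)
        elif tok in BINARY:
            b = stack.pop()
            a = stack.pop()
            slots.append(0.0)       # fresh temporary for the result
            ops.append((tok, a, b, len(slots) - 1))
            stack.append(len(slots) - 1)
        elif tok in UNARY:
            a = stack.pop()
            slots.append(0.0)
            ops.append((tok, a, a, len(slots) - 1))
            stack.append(len(slots) - 1)
        else:
            slots.append(float(tok))          # constants are preloaded into slots
            stack.append(len(slots) - 1)
    return ops, slots, stack.pop()            # last index is where the result lands

def run_tac(ops: List[TacOp], slots: List[float], result: int, x: float) -> float:
    """Execute the three-address program for one value of x; no run-time stack."""
    v = list(slots)
    v[0] = x
    for name, a, b, d in ops:
        v[d] = BINARY[name](v[a], v[b]) if name in BINARY else UNARY[name](v[a])
    return v[result]

ops, slots, result = rpn_to_tac("x 2 * 1 +".split())
print(ops)
print(run_tac(ops, slots, result, 3.0))
```

In a C version the string operation names would become enum values and the dictionary lookups a switch, as in the struct above.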
