Count the number of statements per line in C/C++ - count

Given a C program (could be a C++, although for now, i'd stick to C), I want to count the number of statements for every line of code (excluding comments etc of course)
I have been writing a parser to do that - but clearly i keep encountering code that kinda fails.
so if a code line has "i= 0; i++; i--;" on a single line, i want my parser to return 3 for this line. If i have "if (x) { x++}; else x--;" on a single line, then it should return 3 (if, x++, x--). Are there tools that already do this (does pycparser provide an option to return the number of statements in a given line?)

Related

Is it possible to break 1 line of code into multiple in atom IDE, just like in Rstudio

I am new to using Julia and the atom IDE, and I was wondering if it was possible to somehow just press enter and have the computer run 1 line of code spread over multiple lines, and still have it recognize that it is still the same 1 line of code, just like in Rstudio? (I want this for readability purposes only)
what I mean is something like:
println("Hello
world")
and be able to highlight and run this script without receiving an error message.
Yes. The method differs based on what the line of code contains, or specifically, where you want to break the line.
For a string like you've posted in the question, you have to precede the newline with a \ character, to inform Julia that you're using this newline only for readability, and don't want it to be included in the actual string. (Note: I'll be illustrating these with the command-line REPL that has the julia> prompt, but the same principles apply in the Atom/VS Code based IDE setups too).
julia> println("Hello \
world")
Hello world
Outside of strings, if the current line isn't complete, Julia automatically looks for the continuation in the next line. This means that to break a line into multiple ones, you have to leave all but the final line incomplete:
julia> 1 +
2 +
3 *
4
15
julia> DEPOT_PATH |>
first |>
readdir
16-element Vector{String}:
"artifacts"
"bin"
⋮
julia> 1,
2,
3
(1, 2, 3)
In some cases when you don't have a convenient binary operator to leave hanging like the above, you may have to start your line with an opening parenthesis, so that Julia will know that it's a continuous statement until the closing parenthesis is encountered.
You also have multi-line Strings denoted by """:
julia> println("""This
is
a multi-line \
text""")
This
is
a multi-line text

Julia print statement not working in certain cases

I've written a prime-generating function generatePrimes (full code here) that takes input bound::Int64 and returns a Vector{Int64} of all primes up to bound. After the function definition, I have the following code:
println("Generating primes...")
println("Last prime: ", generatePrimes(10^7)[end])
println("Primes generated.")
which prints, unexpectedly,
Generating primes...
9999991
Primes generated.
This output misses the "Last prime: " segment of the second print statement. The output does work as expected for smaller inputs; any input at least up to 10^6, but somehow fails for 10^7. I've tried several workarounds for this (e.g. assigning the returned value or converting it to a string before calling it in a print statement, combining the print statements, et cetera) and discovered some other weird behaviour: if the "Last prime", is removed from the second print statement, for input 10^7, the last prime doesn't print at all and all I get is a blank line between the first and third print statements. These issues are probably related, and I can't seem to find anything online about why some print statements wouldn't work in Julia.
Thanks so much for any clarification!
Edit: Per DNF's suggestion, following are some reductions to this issue:
Removing the first and last print statements doesn't change anything -- a blank line is always printed in the case I outlined and each of the cases below.
println(generatePrimes(10^7)[end]) # output: empty line
Calling the function and storing the last index in a variable before calling println doesn't change anything either; the cases below work exactly the same either way.
lastPrime::Int = generatePrimes(10^7)[end]
println(lastPrime) # output: empty line
If I call the function in whatever form immediately before a println, an empty line is printed regardless of what's inside the println.
lastPrime::Int = generatePrimes(10^7)[end]
println("This doesn't print") # output: empty line
println("This does print") # output: This does print
If I call the function (or print the pre-generated-and-stored function result) inside a println, anything before the function call (that's also inside the println) isn't printed. The 9999991 and anything else there may be after the function call is printed only if there is something else inside the println before the function call.
# Example 1
println(generatePrimes(10^7)[end]) # output: empty line
# Example 2
println("This first part doesn't print", generatePrimes(10^7)[end]) # output: 9999991
# Example 3
println("This first part doesn't print", generatePrimes(10^7)[end], " prints") # output: 9999991 prints
# Example 4
println(generatePrimes(10^7)[end], "prime doesn't print") # output: prime doesn't print
I could probably list twenty different variations of this same thing, but that probably wouldn't make things any clearer. In every single case version of this issue I've seen so far, the issue only manifests if there's that function call somewhere; println prints large integers just fine. That said, please let me know if anyone feels like they need more info. Thanks so much!
Most likely you are running this code from Atom Juno which recently has some issues with buffering standard output (already reported by others and I also sometimes have this problem).
One thing you can try to do is to flush your standard output
flush(stdout)
Like with any unstable bug restarting Atom Juno also seems to help.
I had the same issue. For me, changing the terminal renderer (File -> Settings -> Packages -> julia-client -> Terminal Options) from webgl to canvas (see pic below) seems to solve the issue.
change terminal renderer
I've also encountered this problem many times. (First time, it was triggered after using the debugger. It is probably unrelated but I have been using Julia+Juno for 2 weeks prior to this issue.)
In my case, the code before the println statement needed to have multiple dictionary assignation (with new keys) in order to trigger the behavior.
I also confirmed that the same code ran in Command Prompt (with same Julia interpreter) prints fine. Any hints about how to further investigate this will be appreciated.
I temporarily solve this issue by printing to stderr, thinking that this stream has more stringent flush mechanism: println(stderr, "hello!")

return value of if statement in r

So, I'm brushing up on how to work with data frames in R and I came across this little bit of code from https://cloud.r-project.org/web/packages/data.table/vignettes/datatable-intro.html:
input <- if (file.exists("flights14.csv")) {
"flights14.csv"
} else {
"https://raw.githubusercontent.com/Rdatatable/data.table/master/vignettes/flights14.csv"
}
Apparently, this assigns the strings (character vectors?) in the if and else statements to input based on the conditional. How is this working? It seems like magic. I am hoping to find somewhere in the official R documentation that explains this.
From other languages I would have just done:
if (file.exists("flights14.csv")) {
input <- "flights14.csv"
} else {
input <- "https://raw.githubusercontent.com/Rdatatable/data.table/master/vignettes/flights14.csv"
}
or in R there is ifelse which also seems designed to do exactly this, but somehow that first example also works. I can memorize that this works but I'm wondering if I'm missing the opportunity to understand the bigger picture about how R works.
From the documentation on the ?Control help page under "Value"
if returns the value of the expression evaluated, or NULL invisibly if none was (which may happen if there is no else).
So the if statement is kind of like a function that returns a value. The value that's returned is the result of either evaulating the if or the then block. When you have a block in R (code between {}), the brackets are also like a function that just return the value of the last expression evaluated in the block. And a string literal is a valid expression that returns itself
So these are the same
x <- "hello"
x <- {"hello"}
x <- {"dropped"; "hello"}
x <- if(TRUE) {"hello"}
x <- if(TRUE) {"dropped"; "hello"}
x <- if(TRUE) {"hello"} else {"dropped"}
And you only really need blocks {} with if/else statements when you have more than one expression to run or when spanning multiple lines. So you could also do
x <- if(TRUE) "hello" else "dropped"
x <- if(FALSE) "dropped" else "hello"
These all store "hello" in x
You are not really missing anything about the "big picture" in R. The R if function is atypical compared both to other languages as well as to R's typical behavior. Unlike most functions in R which do require assignment of their output to a "symbol", i.e a proper R name, if allows assignments that occur within its consequent or alternative code blocks to occur within the global environment. Most functions would return only the final evaluation, while anything else that occurred inside the function body would be garbage collected.
The other common atypical function is for. R for-loops only
retain these interior assignments and always return NULL. The R Language Definition calls these atypical R functions "control structures". See section 3.3. On my machine (and I suspect most Linux boxes) that document is installed at: http://127.0.0.1:10731/help/doc/manual/R-lang.html#Control-structures. If you are on another OS then there is probably a pulldown Help menu in your IDE that will have a pointer to it. Thew help document calls them "control flow constructs" and the help page is at ?Control. Note that it is necessary to quote these terms when you wnat to access that help page using one of those names since they are "reserved words". So you would need ?'if' rather than typing ?if. The other reserved words are described in the ?Reserved page.
?Control
?'if' ; ?'for'
?Reserved
# When you just type:
?if # and hit <return>
# you will see a "+"-sign which indicateds an incomplete expression.
# you nthen need to hit <escape> to get back to a regular R interaction.
In R, functions don't need explicit return. If not specified the last line of the function is automatically returned. Consider this example :
a <- 5
b <- 1
result <- if(a == 5) {
a <- a + 1
b <- b + 1
a
} else {b}
result
#[1] 6
The last line in if block was saved in result. Similarly, in your case the string values are "returned" implicitly.

R: Enriched debugging for linear code chains

I am trying to figure out if it is possible, with a sane amount of programming, to create a certain debugging function by using R's metaprogramming features.
Suppose I have a block of code, such that each line uses as all or part of its input the output from thee line before -- the sort of code you might build with pipes (though no pipe is used here).
{
f1(args1) -> out1
f2(out1, args2) -> out2
f3(out2, args3) -> out3
...
fn(out<n-1>, args<n>) -> out<n>
}
Where for example it might be that:
f1 <- function(first_arg, second_arg, ...){my_body_code},
and you call f1 in the block as:
f1(second_arg = 1:5, list(a1 ="A", a2 =1), abc = letters[1:3], fav = foo_foo)
where foo_foo is an object defined in the calling environment of f1.
I would like a function I could wrap around my block that would, for each line of code, create an entry in a list. Each entry would be named (line1, line2) and each line entry would have a sub-entry for each argument and for the function output. the argument entries would consist, first, of the name of the formal, to which the actual argument is matched, second, the expression or name supplied to that argument if there is one (and a placeholder if the argument is just a constant), and third, the value of that expression as if it were immediately forced on entry into the function. (I'd rather have the value as of the moment the promise is first kept, but that seems to me like a much harder problem, and the two values will most often be the same).
All the arguments assigned to the ... (if any) would go in a dots = list() sublist, with entries named if they have names and appropriately labeled (..1, ..2, etc.) if they are assigned positionally. The last element of each line sublist would be the name of the output and its value.
The point of this is to create a fairly complete record of the operation of the block of code. I think of this as analogous to an elaborated version of purrr::safely that is not confined to iteration and keeps a more detailed record of each step, and indeed if a function exits with an error you would want the error message in the list entry as well as as much of the matched arguments as could be had before the error was produced.
It seems to me like this would be very useful in debugging linear code like this. This lets you do things that are difficult using just the RStudio debugger. For instance, it lets you trace code backwards. I may not know that the value in out2 is incorrect until after I have seen some later output. Single-stepping does not keep intermediate values unless you insert a bunch of extra code to do so. In addition, this keeps the information you need to track down matching errors that occur before promises are even created. By the time you see output that results from such errors via single-stepping, the matching information has likely evaporated.
I have actually written code that takes a piped function and eliminates the pipes to put it in this format, just using text manipulation. (Indeed, it was John Mount's "Bizarro pipe" that got me thinking of this). And if I, or we, or you, can figure out how to do this, I would hope to make a serious run on a second version where each function calls the next, supplying it with arguments internally rather than externally -- like a traceback where you get the passed argument values as well as the function name and and formals. Other languages have debugging environments like that (e.g. GDB), and I've been wishing for one for R for at least five years, maybe 10, and this seems like a step toward it.
Just issue the trace shown for each function that you want to trace.
f <- function(x, y) {
z <- x + y
z
}
trace(f, exit = quote(print(returnValue())))
f(1,2)
giving the following which shows the function name, the input and output. (The last 3 is from the function itself.)
Tracing f(1, 2) on exit
[1] 3
[1] 3

What is the rule behind instruction count in Intel PIN?

I wanted to count instructions in simple recursive fibo function O(2^n). I succeded to do so with bubble sort and matrix multiplication, but in this case it seemed like instruction count ignored my fibo function. Here is the code used for instrumentation:
// Insert a call at the entry point of a routine to increment the call count
RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_rtnCount), IARG_END);
// For each instruction of the routine
for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
{
// Insert a call to docount to increment the instruction counter for this rtn
INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_icount), IARG_END);
}
I started to wonder what's the difference between this program and the previous ones and my first thought was: here I'm not using an array.
This is what I realised after some manual tests:
a = 5; // instruction ignored by PIN and
// pretty much everything not using array
fibo[1] = 1 // instruction counted properly
a = fibo[1] // instruction ignored by PIN
So it seems like only instructions counted are writes to the memory (that's what I assume). After I changed my fibo function to this it works:
long fibonacciNumber(int n, long *fiboNumbers)
{
if (n < 2) {
fiboNumbers[n] = n;
return n;
}
fiboNumbers[n] = fiboNumbers[n-1] + fiboNumbers[n-2];
return fibonacciNumber(n - 1, fiboNumbers) + fibonacciNumber(n - 2, fiboNumbers);
}
But I would like to count instructions also for programs that aren't written by me. Is there a way to count all type of instrunctions? Is there any particular reason why only this instructions are counted? Any help appreciated.
//Edit
I used disassembly option in Visual Studio to check how it looks and it still makes no sense for me. I can't find the reason why only assingment to array is interpreted by PIN as instruction.
instruction_comparison
This exceeded all my expectations, counted as 2 instructions:
even 2 instructions, not one
PIN, like other low-level profiling and analysis tools, measures individual instructions, low-level orders like "add these two registers" or "load a value from that memory address". The sequence of instructions which a program comprises are generally produced from a high-level language like C++ through a compiler. An individual line of C++ code might be transformed into exactly one instruction, but it's also common for a line to translate to several instructions or even to zero instructions; and the instructions for a line of code may be interleaved with those of other instructions.
Your compiler can output an assembly-language file for your source code, showing what instructions were produced for which lines of code. (For GCC and Clang, this is done with the -S flag.) Note that reading the assembly code output from a compiler is not the best way to learn assembly. Also, I would point you to godbolt.org, a very convenient tool for analyzing assembly output.

Resources