Debugging within a namespace - r

I frequently need to debug a function in one of my R packages, and I find it convenient to add test code and print statements throughout the code. But when the function is a method inside a package, running tests from the console will use the old version stored in the package, not the new test version. I often resort to something like cat *.r > /tmp/package.r followed by source('/tmp/package.r') to override all the functions, which lets the test version take priority. But this doesn't work when I have .Fortran or similar calls within the package.
Is there an elegant way to debug with function overrides within the correct version of a local package?

Regardless of your IDE, you can reload your package under development with devtools:
devtools::load_all("path/to/your/package/directory")
This should load it into your R session (RStudio has buttons and keyboard shortcuts for this too).
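A sketch of the resulting edit-reload-test loop (the path and function name below are placeholders): load_all() sources the package's R code and also builds and loads its compiled (e.g. Fortran/C) code, so native calls keep working.
library(devtools)
load_all("~/path/to/mypackage")   # reload the in-development package, compiled code included
mypackage_fun(42)                 # placeholder call: exercises the freshly loaded version
# edit the source, run load_all() again, and re-test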

This is an extension of my comment above. As said in the comment, check out this guide for more detail.
To inspect calls dynamically, you can edit functions and add code at run time with trace(func, edit = TRUE). However, editing code on the fly is not the recommended methodology in any programming language; it is better to use a proper debugger. Luckily, debugging in R is simpler than in many other languages. The most notable tools for debugging in R are
debug(func) or debugonce(func)
trace
browser
I'll ignore trace's main usage, and only pass over it briefly in conjunction with browser.
Take this small example code
x <- 3
f <- function(x){
  x <- x ** 2
  x
}
Now if we wanted to "go into" this function and inspect what happens we can use the debug method simply by doing
debug(f) # alt: debugonce(f)
f(x)
and the following shows in the console
debugging in: f(x)
debug at #1: {
x <- x^2
x
}
We see two things: the original call on the first line, and the function being debugged, including a function line number (#1) and the function body (potentially truncated). In addition, the prompt has changed to Browse[n] (where n is a number), indicating we are in the debugger.
At this point nothing has run, so in the "environment" tab in RStudio we can see x | x. Note that x is still a promise (its value is delayed until used or changed). If we execute x we get the value [1] 3 and we see the environment change to x | 3 (it has been forced, and is no longer a promise).
Inside the debugger we can use the following commands to "walk through" the function
n to move to the "next line"
s to move to the "next line", or if the current call is a function, "step" into the function (debug this function)
f to move forward until the next break (with the added utility that if you are in a loop you stop at the end of the loop).
c to move until next break point or end of function (without breaking at end of loops).
Q to exit immediately
If you type n, for example, you will see
debug at #2: x <- x^2
printed in the console. This indicates the line that will be executed next. Note the value of x in the environment and run n again; the value changes from x | 3 to x | 9 and
debug at #3: x
is printed. As this is the last line, pressing n again will exit the function and print
exiting from: f(x)
[1] 9
Once you're done debugging you can run undebug(f) to remove the breakpoint and stop the debugger from activating.
This is a simple function, easy to debug, but the idea is the same for more complex functions. If you are in a long loop you can use f to skip to the end of the loop, similar to pressing n a bunch of times. Note that if you hit an error at any point the debugger will exit automatically when the error occurs, and you'll have to walk back to that point again, or alternatively use browser.
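As a side note (base R, not specific to the example above), options(error = recover) offers a post-mortem alternative: after an error, R lists the call stack and lets you browse any frame, instead of stepping forward to the failure again. A minimal sketch:
options(error = recover)  # on error, R lists the active calls and lets you browse a chosen frame
f("a")                    # errors inside f (non-numeric argument); pick f's frame to inspect x
options(error = NULL)     # restore the default error behaviour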
In the case where you have a function like
f2 <- function(x){
  x <- x + 2
  f(x)
}
you can further step into the nested function call f(x) by using the s command when the debugger prints the line
debug at #3: f(x)
or by using debug(f2) and debug(f) in conjunction. Both will give the same result (try it out!).
Now in many cases you might hit a bug or have to debug through many lines (potentially thousands). Here you might have some idea where you want to start, and this might not be the start of the function. In that case you can use browser(). This basically sets a breakpoint: whenever browser() is hit, execution stops and the debugger starts (similar to debug(f) and calling f(x), but at a specific point). Try for example
f3 <- function(x){
  x1 <- f(x)
  browser()
  x2 <- f2(x)
  c(x1, x2)
}
f3(x)
and you'll see
Called from: f3(x)
printed (if you have run undebug(f2) and undebug(f) first).
Let's say it is not your function but a function within a namespace; then we can even add the breakpoint ourselves at run time. Try for example calling
trace(f3, edit = TRUE)
and you will see an editing window pop up. Simply add browser() at the desired spot and save. This edits the function within the namespace. The edit will be reverted once R is closed, or alternatively you can remove it with another call to trace(f3, edit = TRUE).
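If hand-editing feels fragile, trace() can also insert the browser() call programmatically at a given step of the function body. A sketch for a function living in a package namespace ("pkg" and "fun" below are placeholders for your package and function):
trace("fun", tracer = browser, at = 3, where = asNamespace("pkg"))  # pause just before step 3 of fun's body
pkg::fun(x)                                                         # placeholder call; drops you into Browse[1]>
untrace("fun", where = asNamespace("pkg"))                          # remove the trace when done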

Related

Julia print statement not working in certain cases

I've written a prime-generating function generatePrimes (full code here) that takes input bound::Int64 and returns a Vector{Int64} of all primes up to bound. After the function definition, I have the following code:
println("Generating primes...")
println("Last prime: ", generatePrimes(10^7)[end])
println("Primes generated.")
which prints, unexpectedly,
Generating primes...
9999991
Primes generated.
This output misses the "Last prime: " segment of the second print statement. The output does work as expected for smaller inputs (any input at least up to 10^6), but somehow fails for 10^7. I've tried several workarounds for this (e.g. assigning the returned value or converting it to a string before calling it in a print statement, combining the print statements, et cetera) and discovered some other weird behaviour: if the "Last prime: " is removed from the second print statement, then for input 10^7 the last prime doesn't print at all and all I get is a blank line between the first and third print statements. These issues are probably related, and I can't seem to find anything online about why some print statements wouldn't work in Julia.
Thanks so much for any clarification!
Edit: Per DNF's suggestion, following are some reductions to this issue:
Removing the first and last print statements doesn't change anything -- a blank line is always printed in the case I outlined and each of the cases below.
println(generatePrimes(10^7)[end]) # output: empty line
Calling the function and storing the last index in a variable before calling println doesn't change anything either; the cases below work exactly the same either way.
lastPrime::Int = generatePrimes(10^7)[end]
println(lastPrime) # output: empty line
If I call the function in whatever form immediately before a println, an empty line is printed regardless of what's inside the println.
lastPrime::Int = generatePrimes(10^7)[end]
println("This doesn't print") # output: empty line
println("This does print") # output: This does print
If I call the function (or print the pre-generated-and-stored function result) inside a println, anything before the function call (that's also inside the println) isn't printed. The 9999991 and anything else there may be after the function call is printed only if there is something else inside the println before the function call.
# Example 1
println(generatePrimes(10^7)[end]) # output: empty line
# Example 2
println("This first part doesn't print", generatePrimes(10^7)[end]) # output: 9999991
# Example 3
println("This first part doesn't print", generatePrimes(10^7)[end], " prints") # output: 9999991 prints
# Example 4
println(generatePrimes(10^7)[end], "prime doesn't print") # output: prime doesn't print
I could probably list twenty different variations of this same thing, but that probably wouldn't make things any clearer. In every single case version of this issue I've seen so far, the issue only manifests if there's that function call somewhere; println prints large integers just fine. That said, please let me know if anyone feels like they need more info. Thanks so much!
Most likely you are running this code from Atom Juno which recently has some issues with buffering standard output (already reported by others and I also sometimes have this problem).
One thing you can try to do is to flush your standard output
flush(stdout)
Like with any unstable bug restarting Atom Juno also seems to help.
I had the same issue. For me, changing the terminal renderer (File -> Settings -> Packages -> julia-client -> Terminal Options) from webgl to canvas (see pic below) seems to solve the issue.
change terminal renderer
I've also encountered this problem many times. (First time, it was triggered after using the debugger. It is probably unrelated but I have been using Julia+Juno for 2 weeks prior to this issue.)
In my case, the code before the println statement needed to have multiple dictionary assignments (with new keys) in order to trigger the behavior.
I also confirmed that the same code ran in Command Prompt (with same Julia interpreter) prints fine. Any hints about how to further investigate this will be appreciated.
I temporarily solve this issue by printing to stderr, thinking that this stream has a more stringent flush mechanism: println(stderr, "hello!")

R: Enriched debugging for linear code chains

I am trying to figure out if it is possible, with a sane amount of programming, to create a certain debugging function by using R's metaprogramming features.
Suppose I have a block of code such that each line uses, as all or part of its input, the output from the line before -- the sort of code you might build with pipes (though no pipe is used here).
{
f1(args1) -> out1
f2(out1, args2) -> out2
f3(out2, args3) -> out3
...
fn(out<n-1>, args<n>) -> out<n>
}
Where for example it might be that:
f1 <- function(first_arg, second_arg, ...){my_body_code},
and you call f1 in the block as:
f1(second_arg = 1:5, list(a1 ="A", a2 =1), abc = letters[1:3], fav = foo_foo)
where foo_foo is an object defined in the calling environment of f1.
I would like a function I could wrap around my block that would, for each line of code, create an entry in a list. Each entry would be named (line1, line2) and each line entry would have a sub-entry for each argument and for the function output. The argument entries would consist, first, of the name of the formal to which the actual argument is matched; second, the expression or name supplied to that argument if there is one (and a placeholder if the argument is just a constant); and third, the value of that expression as if it were immediately forced on entry into the function. (I'd rather have the value as of the moment the promise is first kept, but that seems to me like a much harder problem, and the two values will most often be the same.)
All the arguments assigned to the ... (if any) would go in a dots = list() sublist, with entries named if they have names and appropriately labeled (..1, ..2, etc.) if they are assigned positionally. The last element of each line sublist would be the name of the output and its value.
The point of this is to create a fairly complete record of the operation of the block of code. I think of this as analogous to an elaborated version of purrr::safely that is not confined to iteration and keeps a more detailed record of each step, and indeed if a function exits with an error you would want the error message in the list entry as well as as much of the matched arguments as could be had before the error was produced.
It seems to me like this would be very useful in debugging linear code like this. This lets you do things that are difficult using just the RStudio debugger. For instance, it lets you trace code backwards. I may not know that the value in out2 is incorrect until after I have seen some later output. Single-stepping does not keep intermediate values unless you insert a bunch of extra code to do so. In addition, this keeps the information you need to track down matching errors that occur before promises are even created. By the time you see output that results from such errors via single-stepping, the matching information has likely evaporated.
I have actually written code that takes a piped function and eliminates the pipes to put it in this format, just using text manipulation. (Indeed, it was John Mount's "Bizarro pipe" that got me thinking of this.) And if I, or we, or you, can figure out how to do this, I would hope to make a serious run at a second version where each function calls the next, supplying it with arguments internally rather than externally -- like a traceback where you get the passed argument values as well as the function name and formals. Other languages have debugging environments like that (e.g. GDB), and I've been wishing for one for R for at least five years, maybe 10, and this seems like a step toward it.
Just issue the trace call shown below for each function that you want to trace.
f <- function(x, y) {
  z <- x + y
  z
}
trace(f, exit = quote(print(returnValue())))
f(1,2)
giving the following, which shows the function name, the input and the output. (The last 3 is the value of f(1, 2) itself, auto-printed at the console.)
Tracing f(1, 2) on exit
[1] 3
[1] 3
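Going a step further toward the record the question describes, the same exit hook can append to a list instead of printing. A minimal sketch (the name trace_log is made up; it captures each traced call's local values and return value, not the full matched-argument detail asked for):
trace_log <- list()
trace(f, exit = quote(
  trace_log[[length(trace_log) + 1]] <<- list(
    locals = as.list(environment()),  # argument and local values at exit
    result = returnValue()            # the value about to be returned
  )
))
f(1, 2)
str(trace_log)   # inspect the recorded entries
untrace(f)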

How to quit/exit from file included in the terminal

What can I do within a file "example.jl" to exit/return from a call to include() in the command line
julia> include("example.jl")
without exiting julia itself. quit() will just terminate julia itself.
Edit: For me this would be useful while interactively developing code, for example to include a test file and return from the execution to the julia prompt when a certain condition is met, or to only run the tests I am currently working on without reorganizing the code too much.
I'm not quite sure what you're looking to do, but it sounds like you might be better off writing your code as a function, and use a return to exit. You could even call the function in the include.
Kristoffer will not love it, but
stop(text="Stop.") = throw(StopException(text))

struct StopException{T}
    S::T
end

function Base.showerror(io::IO, ex::StopException, bt; backtrace=true)
    Base.with_output_color(get(io, :color, false) ? :green : :nothing, io) do io
        showerror(io, ex.S)
    end
end
will give a nice, less alarming message than just throwing an error.
julia> stop("Stopped. Reason: Converged.")
ERROR: "Stopped. Reason: Converged."
Source: https://discourse.julialang.org/t/a-julia-equivalent-to-rs-stop/36568/12
You have a latent need for a debugging workflow in Julia. If you use Revise.jl and Rebugger.jl you can do exactly what you are asking for.
You can put in a breakpoint and step into code that is in an included file.
If you include a file from the julia prompt that you want tracked by Revise.jl, you need to use includet() instead.
The keyboard shortcuts in Rebugger let you iterate and inspect variables and modify code and rerun it from within an included file with real values.
Revise lets you reload functions and modules without needing to restart a julia session to pick up the changes.
https://timholy.github.io/Rebugger.jl/stable/
https://timholy.github.io/Revise.jl/stable/
The combination is very powerful and is described deeply by Tim Holy.
https://www.youtube.com/watch?v=SU0SmQnnGys
https://youtu.be/KuM0AGaN09s?t=515
Note that there are some limitations with Revise, such as that it doesn't reset global variables: if you are using some global count or similar, it won't reset it for the next run-through or when you go back into it. Also it isn't great with runtests.jl and the Test package, so as you develop with Revise, move the code into your runtests.jl when you are done.
Also the Juno IDE (Atom + uber-juno package) has good support for code inspection and running line by line and the debugging has gotten some good support lately. I've used Rebugger from the julia prompt more than from the Juno IDE.
Hope that helps.
@DanielArndt is right.
Just create a dummy function in your include file and put all the code inside it (except other function definitions and the variable declarations, which are placed before it). Then you can use return wherever you wish. Variables that are only used in the local context can stay inside the dummy function. Finally, call the new function at the end of the file.
Suppose that the previous code is:
function func1(...)
....
end
function func2(...)
....
end
var1 = valor1
var2 = valor2
localVar = valor3
1st code part
# I want exit here!
2nd code part
Your code will look like this:
var1 = valor1
var2 = valor2
function func1(...)
....
end
function func2(...)
....
end
function dummy()
localVar = valor3
1st code part
return # it's the last running line!
2nd code part
end
dummy()
Another possibility is to place the top-level variables inside the function with a global prefix.
function dummy()
global var1 = valor1
global var2 = valor2
...
end
Those global variables can be used inside the auxiliary functions (static scope) and outside in the REPL.
Another variant only declares the variables as global; their later use is unrestricted:
function dummy()
global var1, var2
...
end

How to find the assigned value to a variable inside a function on console in R

Suppose I have a small function in R
getsum <- function(a, b){
  c <- a+b
}
Now when I run this function, it runs normally. But my question is: can I check the value assigned to c on the console? I know that:
I can print its value within the function, which will be reflected on the console
I can return this value via the return keyword.
I don't want either of these. My question is specifically whether I can check the value of variables inside a function from the console. I tried functionname::variablename, but it is not working.
So there are a couple of answers I could give here, I'm not positive what you need it for.
(EDIT: oh, and FYI, I would avoid using "c" as a variable name; that's already a function, and the ambiguity is bad. R will compensate by looking up the value in your current environment, but that's a stopgap measure.)
The easiest would be to assign some variable:
x <- NULL
at the global environment level, and then, somewhere, during the function:
x <<- c
the double arrow (superassignment) walks up the enclosing environments and assigns the value there -- here, the global environment.
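A complete version of this idea applied to the question's function might look like this (a sketch; the name captured is made up, and c is added as the last line so the sum is also returned):
captured <- NULL
getsum <- function(a, b){
  c <- a + b
  captured <<- c   # copy the local value out to the global environment
  c
}
getsum(2, 3)
captured
# [1] 5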
However, this is a bit dangerous, as it's going to run as far up as it can to assign that variable. If you're after debugging, then I recommend adding:
browser()
inside the code somewhere, it will stop execution during the function and then you can run whatever you like inside the function environment.
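For instance, a small sketch with browser() dropped into the question's function (nothing here beyond base R):
getsum <- function(a, b){
  c <- a + b
  browser()   # execution pauses here; at the Browse[1]> prompt, type print(c) to inspect it
  c
}
getsum(2, 3)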
If you want to debug your code, just use RStudio (https://www.rstudio.com/), an IDE with debugging capabilities. In RStudio you can set a breakpoint inside your function and check the values of internal variables in the right-hand panels.

Can't add constant to vector in R

I don't know what is happening, but I can't seem to add a constant to a vector. For example, typing in the console c(1,2,3,4)+5 returns 15 instead of (6,7,8,9). What am I doing wrong?
Thank you for your help.
Someone.... probably you ... has redefined the "+" function. It's easy to do:
> `+` <- function(x,y) sum(x,y)
> c(1,2,3,4)+5
[1] 15
It's easy to fix. Just use rm():
> rm(`+`)
> c(1,2,3,4)+5
[1] 6 7 8 9
EDIT: The comments (which raised the alternate possibility that c had instead been redefined as sum) prompt me to add information about how to examine and recover from the alternative possibilities. To determine which of the two functions in the expression c(1,2,3,4) + 5 was the culprit, type their names (with backticks enclosing +) and check whether you get the proper definition:
> `+`
function (e1, e2) .Primitive("+")
> c
function (..., recursive = FALSE) .Primitive("c")
Using rm on the culprit (the one that doesn't match above) remains the quickest solution. Using a global rm is an in-session brainwipe:
rm(list=ls())
# all user defined objects, including user-defined functions will be removed
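A less destructive check is utils::find(), which lists every environment on the search path defining a given name, so a user-level copy shows up ahead of package:base (a small sketch):
find("+")
# [1] ".GlobalEnv"   "package:base"    (the global copy is masking the primitive)
rm(`+`)
find("+")
# [1] "package:base"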
The advice to quit and restart would not work in some situations. If you quit-with-save, the current function definitions would be preserved. If you had earlier quit-with-save from a session where the redefinition occurred, then not saving in this session would not fix the problem either. The results of the prior session are held in a file named ".RData", and this file is invisible for both Mac and Windows users because the OS file viewer (Mac's Finder.app or MS's Windows Explorer) will not display file names that begin with a dot. I suspect that Linux users get to see them by default, since using ls in a Terminal session will show them. (It's easy to find ways to change that behavior on a Mac, and that is the way I run my device.) Deleting the .RData file is helpful in this instance, as well as in the situation where your R session crashes on startup.
