Does utils:::fixInNamespace make permanent changes? R

I am using a 3rd party package for a script, and I want to tweak one of its functions so that one of the variables created by the function is assigned to the global environment.
I did this previously by doing
fixInNamespace("the_function","the_namespace","namespace:::the_function")
And it opened a pop-up window where I could add my one line of code
assign("global_variable", "variable", envir = .GlobalEnv)
It worked like a charm, I could then write the rest of my script to use this newly formed variable.
I have tried to run the code again one day later and it can't find the global variable, and if I run
namespace:::the_function
It shows me the function code without my edit. Why has it reverted to its previous form? Is fixInNamespace not permanent?
Thanks,
Ryan

No, it's not permanent. It will last until the namespace is loaded again. Normally namespaces stay loaded for an entire R session, so your change will last for the session. (It is possible to unload a namespace without quitting R; in that case your change would be lost as soon as the namespace was unloaded.) In any case, the next time the package is loaded, it will be the original version of the namespace.
There are a couple of ways to make your change permanent, but doing this is not a good idea. One way is to call assignInNamespace from your startup code (see ?Startup for possibilities). Another is to edit the R source code, and build your own custom copy of R.
Neither of these is a good idea in the long term. Some future version of R might change the function you've modified, and then you'll end up with a modified obsolete version.
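As a rough sketch of the first option, startup code such as an .Rprofile could re-apply the patch in every session. The names the_namespace, the_function and global_variable are the placeholders from the question, not real objects. The wrapper below only copies the function's return value to the global environment; capturing a variable that is internal to the function would need an edited copy of the function body instead.

```r
# Placeholder names from the question; replace them with the real package and function.
pkg <- "the_namespace"
fn  <- "the_function"

# Only patch when the package can actually be loaded.
if (requireNamespace(pkg, quietly = TRUE)) {
  original <- get(fn, envir = asNamespace(pkg))

  # Wrapper that calls the original and also copies the result to the global environment.
  patched <- function(...) {
    result <- original(...)
    assign("global_variable", result, envir = .GlobalEnv)
    result
  }

  # utils is not attached yet while .Rprofile runs, hence the explicit prefix.
  utils::assignInNamespace(fn, patched, ns = pkg)
  patch_applied <- TRUE
} else {
  patch_applied <- FALSE
}
```

Because the patch is re-applied at startup, it has the same drawback described above: it keeps overriding the function even after the package author changes it.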

Related

Why doesn't RStudio clear its Global Environment when something goes wrong?

Using R Version 4.0.2 and RStudio Version 1.3.1056
This is honestly one of the strangest features I've seen in RStudio, and I suppose there's probably a good reason for it to be there, but I'm currently not seeing it and I feel that it can lead to a lot of issues of misleading data.
Basically, to my understanding, when you create and open an R project in RStudio, RStudio creates a Session with a Global Environment.
Every time you run something, the result is added to the Global Environment. I assume it is stored as a cached value.
However, I've encountered situations where this feature leads to:
1. Outdated/wrong values being shown in my tests.
2. Cases where a function stops working altogether after changing 1 piece of code, executing the new code, then undoing the change.
3. Functions "bleeding into" other files without importing/sourcing them.
Cases 1 and 2 obviously lead to a lot of issues while testing. If you try to run a test like
test <- someFunction()
test #to display the value of the test
If the code is correct, the test will execute and the results of test will be stored in the Global Environment.
However, if you then proceed to break the code and run the test again, since test already has a value stored in Global Environment, that old value will print, even though the function failed and thus didn't return anything. Of course if you go above on the console feed, you might run into a line after test <- someFunction() saying "someFunction failed for X reason", but I still think it's both pretty misleading and not very intuitive. Sometimes the result of a function is really large and it's complicated to scroll all the way up the console to see if the code exited with an error, whereas other IDEs would simply immediately tell you at the end of the console that the code failed, and not print the old and outdated value.
Example: Running the proper code.
Running the code after having changed is.na to the non-existent is.not.na.
Notice how it's still printing the old value belonging to the previous version of the function.
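The behaviour described above can be reproduced in a few lines. This is a minimal sketch in which someFunction is a hypothetical stand-in for the function under test, and tryCatch is used only so the script keeps running after the error:

```r
# Hypothetical stand-in for the function under test.
someFunction <- function() sum(!is.na(c(1, NA, 3)))

test <- someFunction()   # succeeds, so `test` is created

# Break the function as in the example: is.not.na() does not exist.
someFunction <- function() sum(is.not.na(c(1, NA, 3)))

# The call fails before the assignment happens, so `test` keeps its old value.
outcome <- tryCatch({
  test <- someFunction()
  "ok"
}, error = function(e) paste("failed:", conditionMessage(e)))

outcome
stale_value <- test   # still the value from the first, successful run
stale_value

# Removing the old value before re-running turns a stale result into an obvious error.
rm(test)
exists("test", inherits = FALSE)
```

The old value survives because the error aborts the expression before the assignment is carried out; nothing is being cached.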
The third case can also lead to misleading scenarios.
If you execute a function in any file within your session, the function is stored in the Global Environment. This allows you to call the function from any other file, even if you haven't added a source statement at the top to load the file containing that function.
Once again this can lead to cases where you inadvertently change/add a new function on file B without running it, then try to invoke the function from file A and you get unexpected results because you were actually invoking the old/outdated function, and the Global Environment has no idea about the changes to the old one or the new function.
All of these issues are rather easy to fix, but I think that's a bit beyond the point. Why is this a feature in general? Why isn't the Global Environment emptied upon errors in execution? I know that you can manually empty the GE whenever you want, but it seems odd to me that the IDE doesn't do it on its own, or, to my knowledge, that it doesn't provide you with an option for it to do it.
I can imagine that it provides some benefit at runtime, but is it really that significant that it can justify these behaviors?

How to make a package set up protected variables in R?

I am trying to create an R package mypckg with a function createShinyApp. This function should create a directory structure ready to use as a shiny app at some location. In this newly created shiny app, I have some variables which should be accessed from within the shiny app, but not by the user directly (to prevent a user from accidentally overwriting them). The reason for these variables to exist (I know one should not use global variables) is that the shiny app is treating a text corpus and I want to avoid passing (and hence copying) it between the many functions, because this could lead to exhaustion of the memory. Somebody using mypckg should be able to set these variables and later use createShinyApp.
My ideas so far are:
I make mypckg::createShinyApp save the protected variables in a protectedVariables.rds file and get the shinyApp to load the variables from this file into a new environment. I am not very experienced with environments so I could not get this to work properly yet because the creation of a new environment is not working upon running a shiny app so far.
I make mypckg::createShinyApp save the protected variables in a protectedVariables.rds file and get the shinyApp to load the variables from this file into the options. Thereafter I would access the variables and set the variables with options() and getOption.
What are the advantages and disadvantages of these ideas and are there yet simpler and more elegant ways of achieving my goal?
It's a little bit difficult to understand the situation without seeing a concrete example of the kind of variable and context you're using it in, but I'll try to answer.
First of all: In R, it's very very difficult to achieve 100% protection of a variable. Even in shiny, the authors of shiny tried putting up a lot of mechanisms to disallow some variables from getting overwritten by users (like the input variable for example), and while it does make it much harder, you should know that it's impossible, or at least extremely difficult, to actually block all ways of changing a variable.
With that disclaimer out of the way, I assume that you'd be happy with something that prevents people from accidentally overwriting the variable, but if they go purposely out of their way to do it, then so be it. In that case, you can certainly read the variables from an RDS file like you suggest (with the caveat that the user can of course overwrite that file). You can also use a global package-level variable -- usually talking about global variables is bad, but in the context of a package it's a very common thing to do.
For example, you can define in a globals.R file in your package:
.mypkgenv <- new.env(parent = emptyenv())
.mypkgenv$var1 <- "some variable"
.mypkgenv$var2 <- "some other variable"
And the shiny app can access these using
mypckg:::.mypkgenv$var1
This is just one way; there are other ways too.
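To guard against accidental overwrites, the bindings in such an environment can also be locked. The following is a minimal sketch, assuming hypothetical accessor functions set_protected and get_protected that are not part of the original answer. Since unlockBinding can undo the lock, this only protects against accidents, in line with the caveat above.

```r
# Private environment, as in the globals.R example (names are illustrative).
.mypkgenv <- new.env(parent = emptyenv())

# Hypothetical setter: stores a value, then locks the binding.
set_protected <- function(name, value) {
  assign(name, value, envir = .mypkgenv)
  lockBinding(name, .mypkgenv)   # later assignments to this name now fail
  invisible(NULL)
}

# Hypothetical getter: reads the value without looking in any other environment.
get_protected <- function(name) get(name, envir = .mypkgenv, inherits = FALSE)

set_protected("corpus", c("first document", "second document"))
get_protected("corpus")

# An accidental overwrite raises an error instead of silently replacing the data.
overwrite <- tryCatch({
  assign("corpus", "oops", envir = .mypkgenv)
  "overwritten"
}, error = function(e) "blocked")
overwrite
```

Environments are reference objects, so functions that read the corpus through the environment do not receive their own copy of it.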

Include library calls in functions?

Is it good practice to include every library I need to execute a function within that function?
For example, my file global.r contains several functions I need for a shiny app. Currently I have all needed packages at the top of the file. When I'm switching projects/copying these functions, I have to load the packages/include them in the new code. If the library calls were inside each function instead, all needed packages would travel with that function. Of course I would have to check all functions in a new R session, but I think this could help in the long run.
When I tried to load a package twice, R did not load it again; it only checked that it was already loaded. My main question is whether restructuring my code in this way would slow down my functions.
I only saw that practice once, library calls inside functions, so I'm not sure.
As one of the commenters suggests, you should avoid loading packages within a function since
The function now has a global effect - as a general rule, this is something to avoid.
There is a very small performance hit.
The first point is the big one. As with most optimisation, only worry about the second point if it's an issue.
Now that we've established the principle, what are the possible solutions?
In small projects, I have a file called packages.R that includes all the library calls I need. This is sourced at the top of my analysis script. BTW, all my functions are in a file called func.R. This workflow was stolen/adapted from a previous SO question.
If you're only importing a single function, you could use the :: trick, e.g. package::funcA(...). That way you avoid attaching the package to the search path (its namespace is still loaded on first use).
For larger projects, I create an R package that handles all necessary imports. The benefit of creating a package is detailed in this answer on structuring large R projects.
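The :: approach can be sketched as follows. The function names middle_value and title_case are made up for illustration; stats and tools ship with R, so the example runs without installing anything.

```r
# Namespace-qualified call: no library() call, nothing attached to the search path.
middle_value <- function(x) stats::median(x, na.rm = TRUE)

# For an optional dependency, check availability and fail with a clear message.
title_case <- function(x) {
  if (!requireNamespace("tools", quietly = TRUE)) {
    stop("Package 'tools' is needed for title_case(); please install it.")
  }
  tools::toTitleCase(x)
}

middle_value(c(1, 3, 2, NA))
title_case("structuring large projects")
```

Functions written this way can be copied between projects without carrying library calls along, and they leave the caller's search path unchanged.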

R is there a way to dynamically update a function as you are building it

I am very new to R and I am using RStudio. I am building a new "user-defined" function, which entails a huge amount of trial and error. Every time I make the slightest change to the function, I need to select the entire function and press Ctrl+Enter in order to "commit" the function to the workspace.
I am hoping there is a better way of doing it, perhaps in a separate window that automatically "commits" when I save.
I am coming from Matlab and am used to just saving the function after which it is already "committed".
Ctrl+Shift+P re-runs the previously executed region, so you won't have to highlight your function again. This will work unless you have executed something else in the interim.
If you want to run some part of your code in RStudio, you simply have to use Ctrl+Enter. If the code were run every time you saved it, it could have very bad effects. Imagine that you have a huge script that runs for a long time and uses a lot of computer resources: you would have to kill R to stop the script every time you saved it!
What you could do is save the script in an external file and then call it from your main script using source("some_directory/myscript.R").
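A small sketch of that workflow, using a temporary file in place of some_directory/myscript.R and a made-up function add_one: each call to source() re-evaluates the file, so the latest saved definition replaces the old one.

```r
# Temporary file standing in for the external script.
script <- file.path(tempdir(), "myscript.R")

# First version of the (hypothetical) function, saved to disk and loaded.
writeLines("add_one <- function(x) x + 1", script)
source(script)
first_result <- add_one(1)

# Edit and save the file, then source it again to pick up the new definition.
writeLines("add_one <- function(x) x + 100", script)
source(script)
second_result <- add_one(1)

c(first_result, second_result)
```

RStudio also offers a "Source on Save" checkbox in the editor toolbar, which runs source() on the file each time it is saved.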

Is there an Rstudio keyboard shortcut to open up the file that contains the source code to a function you've written?

I have multiple packages of my own making that I usually have loaded into my R sessions, as well as various functions specific to a small project stored in various utils files. Say I know the name of a function but want to open the particular file housing that function, for reading & debugging purposes. In PyCharm, for example, you can just select the name of that function and press Ctrl+B. Is there any sort of keyboard shortcut or function to find (and ideally automatically open) the file / line that contains the definition of my function of interest?
Thanks!
If you are within a package then F2 will navigate to the source file of functions defined within that package (it would be nice if you could also go to other packages but that doesn't work yet). You can also use Ctrl+. to do a typeahead search of all functions in the package (and navigate from the list).
The only solution that I am aware of is that you can select a function name in RStudio (it is actually enough to place the cursor somewhere inside the function name) and then press F2. This will open up a tab called Source Viewer in the source pane, where you can look at the function definition. It does not, however, open the file where the function was defined. This means that you cannot edit and save the function.
I don't know for sure that there is no functionality to open the file where the function is defined, but I have good reason to suspect that there is not. When you source a script, the R expressions in that script are evaluated. If it contains function (or variable) definitions, these are stored in memory and will be available in the R session for further use. These R objects do not know where the code that defined them is stored (or whether they were just defined from the command line), so I don't see an immediate way for RStudio to know where to look for the file containing the definition and open it.
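One qualification: when code is sourced with source references kept (keep.source = TRUE), R attaches a srcref attribute to each function, and utils::getSrcFilename can read the file name back from it. The sketch below assumes a hypothetical utility file my_utils.R created in a temporary directory. Functions defined without source references, or inside installed packages built without them, will not carry this information.

```r
# Hypothetical utility file containing one function definition.
script <- file.path(tempdir(), "my_utils.R")
writeLines(c("# hypothetical utility file",
             "where_am_i <- function() \"here\""), script)

# Keep source references so the function remembers where it was defined.
source(script, keep.source = TRUE)

# File name only, recovered from the function's srcref attribute.
src_name <- utils::getSrcFilename(where_am_i)
src_name

# The path as recorded when the file was sourced.
utils::getSrcFilename(where_am_i, full.names = TRUE)
```

In an interactive session the recorded path could then be passed to file.edit() to open the file.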
In case you are looking for shortcut to access file, press CTRL key and double click the file name to get it opened.
