source() doesn't work ("node stack overflow") - r

I have the following few lines of code in my R script called assign1.R:
(u <- c(1, 1, 0, 1, 0)) # a)
u[3] # b)
ones_u <- which(u == 1) # c)
ones_u
source("assign1.R")
Only, the source() function does not work. R shows me the following error message:
Error in match(x, table, nomatch = 0L) : node stack overflow
Error during wrapup: node stack overflow
What is the problem?

I didn't get exactly the same error you did, but I was able to get something pretty similar with a trivial example:
writeLines("source('badsource.R')",con="badsource.R")
source("badsource.R")
## Error in guess(ll) : node stack overflow
As one of the comments above states, the file you're sourcing is trying to source() itself.
This is how you would test for that possibility from within R, without just opening the file in a text editor (which is a much more sensible approach):
grepl("source('badsource.R')",readLines("badsource.R"),fixed=TRUE) ## TRUE
(obviously you should fill in the name of your assignment file here ...)
It feels like you should have noticed this yourself, but I'm answering anyway because the problem is delightfully recursive ...

You are sourcing the file that you are already in, so the script keeps calling itself until the stack overflows. Delete that source() line. source() is for running code that lives in another R file; if everything the script needs is already in this one file, there is nothing to source.

Related

Error: object 'skim_without_charts' not found [duplicate]

I got the error message:
Error: object 'x' not found
Or a more complex version like
Error in mean(x) :
error in evaluating the argument 'x' in selecting a method for function 'mean': Error: object 'x' not found
What does this mean?
The error means that R could not find the variable mentioned in the error message.
The easiest way to reproduce the error is to type the name of a variable that doesn't exist. (If you've defined x already, use a different variable name.)
x
## Error: object 'x' not found
The more complex version of the error has the same cause: calling a function when x does not exist.
mean(x)
## Error in mean(x) :
## error in evaluating the argument 'x' in selecting a method for function 'mean': Error: object 'x' not found
Once the variable has been defined, the error will not occur.
x <- 1:5
x
## [1] 1 2 3 4 5
mean(x)
## [1] 3
You can check to see if a variable exists using ls or exists.
ls() # lists all the variables that have been defined
exists("x") # returns TRUE or FALSE, depending upon whether x has been defined.
Errors like this can occur when you are using non-standard evaluation. For example, when using subset, the error will occur if a column name is not present in the data frame to subset.
d <- data.frame(a = rnorm(5))
subset(d, b > 0)
## Error in eval(expr, envir, enclos) : object 'b' not found
The error can also occur if you use custom evaluation.
get("var", "package:stats") #returns the var function
get("var", "package:utils")
## Error in get("var", "package:utils") : object 'var' not found
In the second case, the var function cannot be found when R looks in the utils package's environment because utils is further down the search list than stats.
In more advanced use cases, you may wish to read:
The Scope section of the CRAN manual Intro to R and demo(scoping)
The Non-standard evaluation chapter of Advanced R
While executing multiple lines of code in R, you need to first select all the lines of code and then click on "Run".
This error usually comes up when we don't select our statements and click on "Run".
Let's discuss why an "object not found" error can be thrown in R, in addition to explaining what it means. The meaning is straightforward: as far as the R interpreter is concerned, the variable in question has not been defined yet. If you can see the object in your code, there are several possible reasons why this is happening:
Check the spelling of your declarations. R is case-sensitive, so if you mistyped even one letter, or used upper case instead of lower case in a later call, the name won't match your original declaration and this error will occur.
Are you getting this error in a notebook or markdown document? You may simply need to re-run an earlier cell that has your declarations before running the current cell where you are calling the variable.
Are you trying to knit your R Markdown document, and the variable works fine when you run the cells but not when you knit? If so, examine the snippet below for a possible side effect that triggers this error:
{r sourceDataProb1, echo=F, eval=F}
# some code here
The above snippet is the header of an R Markdown chunk. With eval set to FALSE, the chunk is not run when the document is knitted, so any variable it defines does not exist for later chunks, and knitting fails with this error. In my case, I had set both echo and eval to FALSE because I did not want the code or its results to show in the HTML I was generating. But since the variable was used in later chunks, this caused an error during knitting. Toggling these flags between TRUE and FALSE will quickly establish whether this is the source of your error when knitting an R Markdown document from RStudio.
Lastly: did you remove the variable or clear it from memory after declaring it?
rm() removes the variable
hitting the broom icon in the Environment pane of RStudio clears everything in the current working environment
ls() can help you see what is active right now to look for a missing declaration.
exists("x") - as mentioned by another poster, can help you test a specific value in an environment with a very lengthy list of active variables
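The case-sensitivity and rm() points in the list above can be sketched in a few lines. This is a minimal illustration, not taken from any of the answers; the variable name total_sales is hypothetical.

```r
# Hypothetical variable, defined in lower case
total_sales <- c(10, 20, 30)

# Same letters, different case: R treats Total_Sales as a separate, undefined name
msg <- tryCatch(
  mean(Total_Sales),
  error = function(e) conditionMessage(e)
)
print(msg)

# exists() confirms which spelling is actually defined
defined_lower <- exists("total_sales")
defined_upper <- exists("Total_Sales")

# After rm(), the original name is gone as well
rm(total_sales)
defined_after_rm <- exists("total_sales")
```

Wrapping the call in tryCatch() only keeps the script running so the message can be inspected; at the console the error would simply be printed.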
I had a similar problem with R-studio. When I tried to do my plots, this message was showing up.
Eventually I realised that the reason behind this was that my "window" for the plots was too small, and I had to make it bigger to "fit" all the plots inside!
Hope to help
I'm going to add this on here even though it's not a new question as it comes quite highly in the search results for the error:
As mentioned above regarding checking syntax: if you're using dplyr, make sure each line of the pipeline above the error ends with a %>% pipe. Otherwise the result of something like a select statement won't be passed down into the next part of the code block.

Gzip error when reading R data files into julia

I'm getting an error from gzip when reading an R data file. I'm trying to use the approach described here: Reading and writing RData files in Julia.
Here's a minimal example. In R, I run the following script:
var1 <- matrix( runif(9), 3, 3 )
save( var1, file='~/temp/file1.rda')
Then in julia:
using DataFrames
x = read_rda("~/temp/file1.rda")
This returns:
ERROR: GZip.GZError(-1,"gzopen failed")
in gzopen at /home/squipbar/.julia/v0.4/GZip/src/GZip.jl:250
in gzopen at /home/squipbar/.julia/v0.4/GZip/src/GZip.jl:265
in read_rda at /home/squipbar/.julia/v0.4/DataFrames/src/RDA.jl:418
I don't think that I'm doing anything dumb. The closest I've found to this error online is in the RDatasets github issues, here: https://github.com/johnmyleswhite/RDatasets.jl/issues/32
So perhaps this is somehow related to RDatasets? Suggestions very welcome.
As you found, tilde expansion is not automatic. You can use expanduser() to expand to the full file name.
julia> expanduser("~/Desktop")
"/Users/mycomputer/Desktop"
Ok, I figured this one out. It's the expansion of "~" in the location. The following works:
using DataFrames
x = read_rda("/home/squipbar/temp/file1.rda")
So I guess I learnt two things here: 1) The error message for read_rda is not that helpful, a File not found message would have saved me a lot of time, and 2) that you can't use ~ in this case (is this a general thing in Julia?)
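On the R side, the path can be checked before it is handed to Julia. A minimal sketch, assuming the file was saved with the path used in the question; path.expand() is a base R function that shows what ~ resolves to on the current machine.

```r
# Path as written in the save() call from the question
rda_path <- "~/temp/file1.rda"

# path.expand() replaces a leading ~ with the user's home directory
# (where the platform supports tilde expansion)
full_path <- path.expand(rda_path)
print(full_path)
```

The printed string is the one to pass to read_rda() in Julia if you prefer not to call expanduser() there.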

Running as.Node from data.tree package in R

I'm trying to use the as.Node function from the data.tree library in R to visualize a set of media server log data as a tree. I've subset the original data frame by month and year, so that I can run one month's worth of data at a time. My function code for turning the data into a tree, and then printing it out as a .csv, is as follows:
treetrimmer2 <- function(x, y) {
urimodel <- as.Node(x)
uridf <- ToDataFrameTree(urimodel, "level", "count")
uridf <- filter(uridf, level <= y, count != 0)
filename <- paste(x$year[1], x$month[1], ".csv", sep="")
write.csv(uridf, file = filename, fileEncoding = "CP1252")
}
Some months finish without any issue. Other months, however, give me the following error (and traceback):
Error in (function () : unused argument (quote(<environment>))
7 (function ()
{
c(self$parent$path, self$name)
})(quote(<environment>))
6 self$AddChildNode(child)
5 mynode$AddChild(path)
4 FromDataFrameTable(x, pathName, pathDelimiter, colLevels, na.rm)
3 as.Node.data.frame(x)
2 as.Node(x) at media_visualizer.R#63
1 treetrimmer2(uricut$`2015.06`, 5)
Can anyone give me some guidance on what 'unused argument (quote())' means? I've tried googling it, and found that in some cases, it means that a function or term has already been defined in another context. But I'm still too novice to understand what that means here.
I'm running rStudio 0.99.896 and R 3.2.4 on Mac OS 10.11.5. I would share my data set, except that it is pretty massive, and I'm not sure which lines are causing the problem...
I can't claim credit for this; Christoph Glur (see the comments on the main post) figured it out. But it might be useful for others to share the cause, and my solution:
The problem is that a few of the log files contain one of the data.tree package's reserved words, in this case, "path". The format of the lines was "/something/something/path/something/something.jpg", so that data.tree read "path" as an independent word. There were other instances of "path" as part of a larger word, e.g., "pathString" or "pathTo", that didn't cause the bug.
Once he'd figured it out, my solution was to run the following command on all of the log files in Terminal:
sed -i '' 's/\/path\//\/spath\//' *.log
I'm still a novice, but as I understand it, what that means is "find and replace, in place, instances of "/path/" with "/spath/" in all of the .log files." I don't actually care about that one word, path vs. spath (which is gibberish), so changing it didn't matter. And now the as.Node() function runs properly on the data set.
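The same replacement could probably be done inside R before calling as.Node(), instead of editing the log files. This is a rough sketch under assumptions: the uris vector below is made-up sample data, and in the real data frame the replacement would be applied to whichever column holds the path strings.

```r
# Hypothetical sample of URI paths like those described in the log files
uris <- c(
  "/media/2015/path/images/cat.jpg",  # "path" as a standalone segment
  "/media/pathTo/clip.mp4",           # "path" inside a longer word
  "/media/pathString/a.jpg"
)

# fixed = TRUE treats the pattern as a literal string, so only a whole
# "/path/" segment is rewritten
cleaned <- gsub("/path/", "/spath/", uris, fixed = TRUE)
print(cleaned)
```

Note that gsub() replaces every occurrence in each string, whereas the sed command above (without the g flag) replaces only the first occurrence per line; sub() would mirror that behaviour.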
Thank you, Christoph!

R Parallelisation Error unserialize(socklist[[n]])

In a nutshell I am trying to parallelise my whole script over dates using Snow and adply but continually get the below error.
Error in unserialize(socklist[[n]]) : error reading from connection
In addition: Warning messages:
1: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’
2: <anonymous>: ... may be used in an incorrect context: ‘.fun(piece, ...)’
I have set up the parallelisation process in the following way:
Cores = detectCores(all.tests = FALSE, logical = TRUE)
cl = makeCluster(Cores, type="SOCK")
registerDoSNOW(cl)
clusterExport(cl, c("Var1","Var2","Var3","Var4"), envir = environment())
exposureDaily <- adply(.data = dateSeries,.margins = 1,.fun = MainCalcFunction,
.expand = TRUE, Var1, Var2, Var3,
Var4,.parallel = TRUE)
stopCluster(cl)
Where dateSeries might look something like
> dateSeries
marketDate
1 2016-04-22
2 2016-04-26
MainCalcFunction is a very long script with multiple of my own functions contained within it. As the script is so long reproducing it wouldn't be practical, and a hypothetical small function would defeat the purpose as I have already got this methodology to work with other smaller functions. I can say that within MainCalcFunction I call all my libraries, necessary functions, and a file containing all other variables aside from those exported above so that I don't have to export a long list of libraries and other objects.
MainCalcFunction can run successfully in its entirety over 2 dates using adply but not parallelisation, which tells me that it is not a bug in the code that is causing the parallelisation to fail.
Initially I thought (from experience) that the parallelisation over dates was failing because there was another function within the code that utilised parallelisation, however I have subsequently rebuilt the whole code to make sure that there was no such function.
I have pored over the script with a fine-tooth comb to see if there was any place where I accidentally didn't export something that I needed, and I can't find anything.
Some ideas as to what could be causing the code to fail are:
The use of various option valuation functions in fOptions and rquantlib
The use of type sock
I am aware of this question already asked and also this question, and while the first question has helped me, it hasn't yet helped solve the problem. (Note: that may be because I haven't used it correctly, having mainly used loginfo("text") to track where the code is. Potentially, there is a way to change that such that I log warning and/or error messages instead?)
Please let me know if there is any other information I can provide to help in solving this. I would be so appreciative if someone could provide some guidance, as the code takes close to 40 minutes to run for a day and I need to run it for close to a year, therefore parallelisation is essential!
EDIT
I have tried to implement the suggestion in the first question included above by utilising the outfile option. Given I am using Windows, I have done this by including the following lines before the exporting of the key objects and running MainCalcFunction :
reportLogName <- paste("logout_parallel.txt", sep="")
addHandler(writeToFile,
file = paste(Save_directory,reportLogName, sep="" ),
level='DEBUG')
with(getLogger(), names(handlers))
loginfo(paste("Starting log file", getwd()))
mc<-detectCores()
cl<-makeCluster(mc, outfile="")
registerDoParallel(cl)
Similarly, at the beginning of MainCalcFunction, after having sourced my libraries and functions I have included the following to print to file:
reportLogName <- paste(testDate,"_logout.txt", sep="")
addHandler(writeToFile,
file = paste(Save_directory,reportLogName, sep="" ),
level='DEBUG')
with(getLogger(), names(handlers))
loginfo(paste("Starting test function ",getwd(), sep = ""))
In the MainCalcFunction function I have then put loginfo("text") statements at key junctures to inform me of where the code is at.
This has resulted in some text files being available after the code fails due to the aforementioned error. However, these text files provide no more information on the cause of the error aside from at what point. This is despite having a tryCatch statement embedded in MainCalcFunction where at the end, on any instance of error I have added the line logerror(e)
I am posting this answer in case it helps anyone else with a similar problem in the future.
Essentially, the error unserialize(socklist[[n]]) doesn't tell you a lot, so to solve it it's a matter of narrowing down the issue.
Firstly, be absolutely sure the code runs over several dates in non-parallel with no errors
Ensure the parallelisation is set up correctly. There are some obvious initial errors that many other questions respond to, e.g., hidden parallelisation inside the code which means parallelisation is occurring twice.
Once you are sure that there is no problem with the code and the parallelisation is set up correctly, start narrowing down. The issue is likely (unless something has been missed above) something in the code which isn't a problem when it is run in serial, but becomes a problem when run in parallel. The easiest way to narrow down is by setting outfile = "Log.txt" in whichever makeCluster function you use, e.g., cl<-makeCluster(cores-1, outfile="Log.txt"). Then add as many print("Point in code") statements to your function as you need to narrow down where the issue is occurring.
In my case, the problem was the line jj = closeAllConnections(). This line works fine in non-parallel but breaks the code when in parallel. I suspect it has something to do with the function closing all connections including socket connections that are required for the parallelisation.
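The narrowing-down steps above can be sketched with the parallel package that ships with R. This is only an illustration under assumptions: traced_square is a made-up stand-in for MainCalcFunction, the log goes to a temporary file rather than "Log.txt", and two workers are used regardless of the number of cores.

```r
library(parallel)

# Worker output (print(), messages, errors) is redirected to this file
log_file <- tempfile(fileext = ".txt")
cl <- makeCluster(2, outfile = log_file)

# Hypothetical stand-in for the long function, with trace points
traced_square <- function(x) {
  print(paste("start", x))  # trace point before the work
  y <- x^2
  print(paste("done", x))   # trace point after the work
  y
}

# Always stop the cluster, even if a worker fails
result <- tryCatch(
  parLapply(cl, 1:4, traced_square),
  finally = stopCluster(cl)
)
print(unlist(result))

# After the run, open log_file to see the last trace point each worker reached
```

When a worker dies, the last trace point written to the log shows roughly where it happened, which is how a line such as closeAllConnections() can be located.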
Try running using plain R instead of running in RStudio.

How to preserve changes to function with fix() between R sessions?

If I edit a function with R v2.14.0 using fix(), those fixes are applied during the session.
For example, I might make the following edit to get a white background in a hive plot:
> library(HiveR)
> fix(plotHive)
... :%s/black/white/g
... :w
... :q
> plotHive(myHiveData)
I then get a white background in the hive plot, as expected.
But if I quit and reopen R, I have lost those changes, and the plot has a black background again.
How do I preserve the edits I make with fix() between R sessions?
EDIT
If I source() the modified plotHive() function, I get the following error:
> modifiedPlotHive <- source("modifiedPlotHive.R")
Error in source("modifiedPlotHive.R") :
modifiedPlotHive.R:1160:1: unexpected '<'
1159: }
1160: <
^
In addition: Warning message:
In readLines(file) : incomplete final line found on 'modifiedPlotHive.R'
The final line in the modified plotHive() function is:
<environment: namespace:HiveR>
If I remove this line before source()-ing, then the function no longer works.
Sorry I missed this when it came out, but the latest version of HiveR (0.2-1, available on CRAN) has an option to control the background color. Bryan
Here's the safer way of doing what you want, referenced by #joran.
The sink/source pair is fine for dealing with R code files. But saving to text files and then reading back in other types of objects can strip them of important attributes, especially those relating to environments. That's what you just experienced.
The save/load pair stores objects in R's own binary format, so is much less liable to lose important information/environments attached to functions.
In this example, I define a personal version of ls, which differs from the base function in that it by default lists objects that start with a dot/period:
my_ls <- ls
fix(my_ls)
# 1) On the first line, change 'all.names=FALSE' to 'all.names=TRUE'
# 2) Say "Yes", I want to save the changes
save("my_ls", file="my_ls.Rdata")
# Then, in a later session, test that it works
load("my_ls.Rdata")
.TrysToHide <- 99
my_ls()
# [1] ".TrysToHide" "my_ls"
One more note: it's much cleaner to give your modified function a name of its own. To really edit a packaged function, and have the changes persist, you'd need to edit the sources and recompile the package. But if you do that, beware, as you may well break the function for other packaged functions that depend on it.
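The difference between a text round trip and a binary one can be seen with a small closure. This is a sketch with made-up names (make_counter, n_calls); it uses dump()/source() for the text route and saveRDS()/readRDS() for the binary route, which behaves like save()/load() for a single object.

```r
# A function whose enclosing environment carries state (hypothetical example)
make_counter <- function() {
  n_calls <- 0
  function() {
    n_calls <<- n_calls + 1
    n_calls
  }
}
counter <- make_counter()
counter()  # state is now 1

# Text round trip: only the function's source code is written out
txt_file <- tempfile(fileext = ".R")
dump("counter", file = txt_file)
env_txt <- new.env()
source(txt_file, local = env_txt)
text_result <- tryCatch(
  env_txt$counter(),
  error = function(e) conditionMessage(e)  # n_calls no longer exists
)

# Binary round trip: the enclosing environment is stored with the function
rds_file <- tempfile(fileext = ".rds")
saveRDS(counter, rds_file)
restored <- readRDS(rds_file)
binary_result <- restored()  # continues from the saved state
```

The text version fails because n_calls lived in the function's environment, not in its code; the binary version picks up where the saved counter left off.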
There are a couple of options:
Save your workspace before quitting and load it again when you reopen R.
Save the modified function to script file and source it:
sink("modified_plotHive.r")
plotHive
sink()
In the next session:
plotHive <- source("modified_plotHive.r")$value  # source() returns a list; the function itself is in $value
HTH

Resources