I'm running Spark in 'standalone' mode on a local machine in Docker containers. I have a master and two workers, each running in its own Docker container. In each of the containers the path /opt/spark-data is mapped to the same local directory on the host.
I'm connecting to the Spark master from R using sparklyr, and I can do a few things, for example, loading data into Spark using sparklyr::copy_to.
However, I cannot get sparklyr::spark_read_csv to work. The data I'm trying to load is in the local directory that is mapped in the containers. When attaching to the running containers I can see that the file I'm trying to load does exist in each of the 3 containers, in the local (to the container) path /opt/spark-data.
This is an example for the code I'm using:
xx_csv <- spark_read_csv(
  sc,
  name = "xx1_csv",
  path = "file:///opt/spark-data/data-csv"
)
data-csv is a directory containing a single CSV file. I've also tried specifying the full path, including the file name.
When I'm calling the above code, I'm getting an exception:
Error: org.apache.spark.sql.AnalysisException: Path does not exist: file:/opt/spark-data/data-csv;
I've also tried with different numbers of / in the path argument, but to no avail.
The documentation for spark_read_csv says that
path: The path to the file. Needs to be accessible from the
cluster. Supports the ‘"hdfs://"’, ‘"s3a://"’ and ‘"file://"’
protocols.
My naive expectation is that if, when attaching to the container, I can see the file in the container file system, then it is "accessible from the cluster", so I don't understand why I'm getting the error. All the directories and files in the path are owned by root and are readable by everyone.
What am I missing?
Try without "file://", and with \\ as the path separator if you are a Windows user.
I wanted to make internally sharing/locally launching a shiny app developed with the {golem} framework a little more robust.
Hence, I used the renv package and installed the shiny app as a local package into a project folder.
I proceeded as follows (thanks @Kat for the suggestion):
initialize renv using renv::init(bare = TRUE)
renv::install("my_local_package")
renv::snapshot(type = "all")
renv::isolate()
Writing a launch file consisting of:
library(golempackage)
renv::restore()
golempackage::run_app(options = list(launch.browser = TRUE))
Share folder.
However, when launching the shiny app on a different computer (or a docker testing environment), I get the following error caused by the package bslib. Same happens when I delete my cache:
An error has occurred!
File attachments must exist: 'C:/Users/XYZ/AppData/Local/R/cache/R/renv/cache/v5/.../bslib/lib/bs3/assets/fonts'
Note: this error even occurs if I set the cache to be project-local and share it inside the project folder.
However, the error message then references the project-local cache instead of the global one. Unfortunately, it is still an absolute path, which throws an error for other users.
This is all very odd, and I have not the slightest idea why it occurs.
I would like to avoid removing bslib.
As far as I can see, the error is coming from the sass package, e.g.
https://github.com/rstudio/sass/blob/f7a954027447dd0b9826ec01c7084c89a6e64fcc/R/layers.R#L442-L443
While I don't know exactly what's going on, you could probably use the R debugger to check why that's failing. (Does the referenced folder exist in both cases? Are you expecting renv to be using the cache in the second case?)
I'm trying to run two things: first, I'm creating a PDF with 4x5, ending with dev.off(), and then trying to create a new graph. However, after starting the second plot, I get:
Error in gzfile(file, "wb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "wb") :
cannot open compressed file '/var/folders/n9/pw_dz8d13j3gb2xgqb6rfnz00000gn/T/RtmpTfm1Ur/rs-graphics-822a1c83-b3fd-46c3-8028-4e0778f91d0c/4db4b438-ac35-403b-b791-e781baba152c.snapshot', probable reason 'No such file or directory'
Graphics error: Error in gzfile(file, "wb") : cannot open the connection
What is this error? The working directory is one I have read/write access to, and my hard drive isn't full.
Also, I'm using RStudio.
This is a bit late, but for anyone coming here for help: I got this error when I was trying to write a file from RStudio and my destination file path was very long. I realized this could be the problem because when I wrote the file to another location with a shorter name and tried to copy it into my original destination, Windows gave me an error saying "File path too long". You might need to save the original file to another location with a shorter absolute path.
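A quick way to check whether path length could be the culprit is to measure the absolute path before writing. This is only a sketch: path_length_report is a hypothetical helper, and the limit of 260 characters is an assumption based on the classic Windows path limit, which may not apply to your system or configuration.

```r
# Hypothetical helper: report how long the absolute form of a path is.
# The default limit of 260 is an assumption (classic Windows path limit).
path_length_report <- function(path, limit = 260L) {
  # Resolve the parent directory; mustWork = FALSE avoids an error if it is missing
  parent <- normalizePath(dirname(path), winslash = "/", mustWork = FALSE)
  full <- file.path(parent, basename(path))
  n <- nchar(full)
  list(path = full, n_chars = n, too_long = n >= limit)
}

report <- path_length_report(file.path(tempdir(), "df.rds"))
report$n_chars   # number of characters in the resolved path
report$too_long  # TRUE if the path reaches the assumed limit
```

If too_long is TRUE, writing to a shorter location first, as described above, is the simplest thing to try.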
Maybe you should look here. At the end it says
Note:
The most common reason for failure is lack of write permission in the current directory. For save.image and for saving at the end of a session this will shown by messages like
Error in gzfile(file, "wb") : unable to open connection
In addition: Warning message:
In gzfile(file, "wb") :
cannot open compressed file '.RDataTmp',
probable reason 'Permission denied'
So, as a quick check, run getwd() and look at where your working directory is set. If you're trying to save your document to a place that is not under your current working directory (or that does not exist), R will throw this error.
At the end of your error message, it says probable reason 'No such file or directory'
Graphics error: Error in gzfile(file, "wb") : cannot open the connection
My diagnosis would simply be that R is trying to save your item in the wrong place, and RStudio is not able to find that location.
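The checks suggested in this answer can be sketched in a few lines of base R. The target path below is hypothetical, and note that file.access() is documented as not fully reliable on some file systems (notably on Windows), so treat its result as a hint rather than proof.

```r
# Where is R currently working?
wd <- getwd()

# file.access() returns 0 on success and -1 on failure; mode = 2 tests write permission
can_write <- unname(file.access(wd, mode = 2) == 0)

# The directory that will hold the output must already exist
target <- file.path(wd, "output", "plot-data.rds")  # hypothetical target path
target_dir_exists <- dir.exists(dirname(target))

can_write
target_dir_exists
```

If either value is FALSE, the "cannot open the connection" error is the expected outcome of trying to write to target.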
This burned me, so hopefully it saves someone else some toil. The classifiers loaded just fine on OS X, but on the Linux deployment system they failed with the error listed in the question. The issue was that the file on disk was named abc.RData, but the code read modelAbc <- readRDS(file="abc.Rdata"). The difference between the uppercase and lowercase D in the .RData vs .Rdata extension makes the call fail on Linux, where file names are case-sensitive. It is not very noticeable, so check your extensions for case.
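A case mismatch like this can be detected by comparing names with case ignored. The sketch below uses a hypothetical helper, find_case_insensitive, and a temporary directory purely for demonstration.

```r
# Hypothetical helper: list files in the same directory whose names match
# the requested file name when upper/lower case is ignored.
find_case_insensitive <- function(path) {
  candidates <- list.files(dirname(path), all.files = TRUE)
  candidates[tolower(candidates) == tolower(basename(path))]
}

# Demo: the file on disk is "abc.RData" ...
demo_dir <- file.path(tempdir(), "case-demo")
dir.create(demo_dir, showWarnings = FALSE)
saveRDS(1:3, file.path(demo_dir, "abc.RData"))

# ... but the code asks for "abc.Rdata"
wanted <- file.path(demo_dir, "abc.Rdata")
matches <- find_case_insensitive(wanted)
matches  # the actual on-disk spelling
```

If matches contains a name that differs from basename(wanted), the code and the file system disagree about case.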
You may not have permission to save files in the directory.
In RStudio, get your working directory with getwd().
Then go to that directory in Linux and check its owner with ls -l.
You can then change the owner of the directory with chown -R username directoryname.
Note that you must be root to do this.
Problem resolved by specifying full file path:
saveRDS(df,'C:\\users\\matt\\desktop\\code\\df.Rdata')
I faced this issue lately. Try turning off your anti-virus and then building the package; it might help. It worked for me. Anti-virus software often blocks the required permissions, and you can avoid that by disabling it for a short time just before building a package.
I was trying to save an RDS file to my local Dropbox folder so it syncs with my Dropbox.
I figured out that I got the same error because I was trying to write into a folder that did not exist yet. It looks like saveRDS cannot create a new folder, but it can add files to existing folders. So I changed the path to put the file into an existing folder, and it worked!
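If the file really has to go into a new folder, the folder can be created first. A minimal sketch, using a hypothetical path under tempdir():

```r
out_file <- file.path(tempdir(), "new-folder", "nested", "df.rds")  # hypothetical path

# saveRDS() does not create missing parent directories, so create them up front
out_dir <- dirname(out_file)
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

df <- data.frame(x = 1:3)
saveRDS(df, out_file)

# Read it back to confirm the write succeeded
restored <- readRDS(out_file)
```

recursive = TRUE makes dir.create() build every missing level of the path, not just the last one.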
In my case it was Windows Defender that was preventing RStudio from writing any file to the hard drive. You need to either turn Controlled Folder Access off or add RStudio to the exclusion list.
I also had this problem when working with RStudio and R Markdown. I was getting this error message and had an annoying number of fatal errors which closed RStudio. My issue was that I was working off a network drive, and either the name was too long, as in @AHedge's answer above, or my network firewalls were giving me trouble. For the moment, I have moved my working files to my desktop and things seem to be working fine. Not sure what this means for my file management over time.
I just want to add more clarity (scenarios from my experience) to what M Beausoleil mentioned.
When you are using a shared working directory and try to overwrite RDS files that already exist there and were written by another user, you get this error.
As some people have already noted, deleting the existing RDS files or changing the working directory works. It's not magic. It works because you are then writing a new RDS file rather than trying to overwrite the old ones.
I ran into the same problem after I installed a new version of RStudio.
An R Markdown file that I had created with the old version of RStudio showed the same problem.
When I used ggplot() to draw a picture, the error was as follows:
Warning in gzfile(file, "wb") :
cannot open compressed file 'I:/Rlearning/.Rproj.user/shared/notebooks/58A1385C-PCA作图/1/2C15461A183AC56C/cco192gb0pow1_t\_rs_rdf_32004888ecb.rdf', probable reason 'No such file or directory'
Error in gzfile(file, "wb") : cannot open the connection
Solution:
Create a new R Markdown file.
Delete all the code in it.
Copy your old R Markdown code into it.
I had the same problem. For me, it was caused by not having enough disk space on the drive where RStudio was installed. Freeing up space fixed it.
The reason for the error is that your user name contains Chinese characters. Create a new folder with an English name in the Users directory, for example "DavidSmith". Then create three nested folders ("AppData", "Local", "Temp") so that the directory C:\Users\DavidSmith\AppData\Local\Temp exists.
In the Advanced system settings, change the environment variables TMP and TEMP to C:\Users\DavidSmith\AppData\Local\Temp and save them.
After the modification, open RStudio and try again.
Note: TMP and TEMP are modified under the user variables.
I just ran into this problem after changing my system locale.
Check your locale using Sys.getlocale().
Change it to an appropriate one using Sys.setlocale("LC_ALL","ENG") (replace "ENG" with the appropriate locale).
I can't say with certainty which locale would be appropriate, but it seems it should be consistent with the OS default.
Hope this helps!
I had this error because of an invalid character in the filename to be used to save the file, in my case "/" (there are many such characters that cannot be used in a filename). I removed the character and it was solved.
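One way to guard against this is to clean the file name before using it. The sketch below uses a hypothetical helper, sanitize_filename; the character set it replaces is the one disallowed in Windows file names, which is an assumption about where the file will be written (on Unix-like systems, "/" is the main one that matters).

```r
# Hypothetical helper: replace characters that are not allowed in Windows
# file names (this set includes "/", which is also invalid on Unix-like systems).
sanitize_filename <- function(name, replacement = "_") {
  gsub("[<>:\"/\\\\|?*]", replacement, name)
}

safe_name <- sanitize_filename("results 2020/01.rds")
safe_name
```

Apply it to the file name only, not to the full path, otherwise the directory separators are replaced as well.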
In my case, I received the error "Error in gzfile(file, "wb") : cannot open the connection" when trying to exit R in the Anaconda Prompt and save the workspace image. I am using Windows 10 and R-3.5.2. To fix it, I went to the Program Files folder, right-clicked the R folder, and selected Properties. On the Security tab, in the "Group or user names" box, I selected Users and clicked Edit. In "Permissions for Users", I checked Full control and Modify and saved the changes. After that I was able to save the workspace image.
I have another instance of this error which seems to be new (or at least not listed here or here): apparently it's not OK to save a file with the name aux.RData. I guess it's a reserved filename.
x <- rnorm(9000)
save(x, file = "aux.RData")
Error in gzfile(file, "wb") : cannot open the connection
Also: Warning message:
In gzfile(file, "wb") :
cannot open compressed file 'aux.RData', probable reason 'No such file or directory'
But when I change the filename, it saves with no problem:
save(x, file = "aux_file.RData")
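The guess above fits how Windows behaves: AUX is one of the device names Windows reserves, and such names remain reserved when an extension is added. A small sketch to check a file name against that list follows; is_reserved_name is a hypothetical helper, and the check only matters when the file is written on Windows.

```r
# Device names reserved by Windows (case-insensitive, with or without extension)
reserved <- c("CON", "PRN", "AUX", "NUL", paste0("COM", 1:9), paste0("LPT", 1:9))

# Hypothetical helper: TRUE if the file name's stem is a reserved device name
is_reserved_name <- function(path) {
  stem <- sub("\\..*$", "", basename(path))  # drop everything from the first dot on
  toupper(stem) %in% reserved
}

is_reserved_name("aux.RData")       # reserved stem "aux"
is_reserved_name("aux_file.RData")  # stem "aux_file" is not on the list
```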
Haven't seen this case in the other answers:
if this seems to happen all the time, and to be very persistent when it does happen, check the default directory in your file handling software connection.
In my case FileZilla was logging on to my DigitalOcean droplet as "root" and whenever I used FileZilla to create a directory it was setting write permissions to "root", whereas my RStudio on the same droplet read/wrote as "My_Name". Anytime I set something up in FZ (e.g. large imported files, renamed or copied) the permissions would switch and I'd get this error.
If this is what is causing frequent error messages it can be solved instantly with chown -R My_Name directoryname but in the longer run, if you are going to be using your file handler to define and create a lot of directories, it will pay to create a connection whose default name is the same name you use for RStudio.
In my case, when it happened first, months ago, the solution here worked.
But recently it came back, constantly. What solved it this time was changing the anti-virus. I have not just Windows Defender but also a second anti-virus, the same one both times. I ended up uninstalling it and installing another anti-virus. After this, the problem did not happen again.
After several days trying to solve this same error (Windows 10 and R), I tried saving my file (file.RData) on the D drive instead of the C drive (where I had always been working and where R is installed), and it worked without problems: my file was saved in D:/Users. Every time I tried to save it on the C drive, I got "Permission denied".
save(Myfile, file="D:/Users/Myfile.RData")
I encountered this same issue when trying to save an Rds file from an R Markdown file. Changing my relative file path to an absolute file path worked for me.
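To see which absolute location a relative path actually resolves to, you can build it from the current working directory. A minimal sketch with a hypothetical relative path:

```r
rel_path <- file.path("output", "model.rds")  # hypothetical relative path

# A relative path is resolved against the current working directory
abs_path <- file.path(normalizePath(getwd(), winslash = "/"), rel_path)
abs_path
```

Printing abs_path from inside the document shows where the write is really going, which may not be the folder you expect.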
In my case, this error occurred because the file that I wanted to overwrite was read-only (for whatever reason; I didn't set it myself). I right-clicked the file's name in the folder and unchecked the read-only property. After that it worked.