jupyter notebook uploading a 10 MB source data file - jupyter-notebook

I have a question about uploading a 10 MB source data file.
I tried multiple ways to upload it: the original version, a zipped version, and a txt version.
However, every time I click the uploaded data source file, I see the following error message:
out of memory.
I need your advice on how to resolve this.

Related

open a OneDrive file with R

I'm trying to create a Shiny app that reads an online OneDrive xlsx file and displays some things, but for the moment I'm unable to read the OneDrive xlsx file. I have already explored Microsoft365R: I can connect to my OneDrive and I can even open the file, but what that does is open a tab in Chrome with the Excel file.
I need the file in the local environment of R, because the Shiny app must be deployed on a web server and every time the app runs it has to read the updated file.
library(Microsoft365R)
odb <- get_business_onedrive()
odb$open_file("lcursos.xlsx")
Also, this is a business account, so I have to provide the username and key to access each file; that is why using the plain URL doesn't work, it returns Error 403 FORBIDDEN.
Any ideas?
Thank you so much!
Use the download_file() method to download the file to your local machine:
odb$download_file("lcursos.xlsx")
You can set the location of the download with the dest argument. Once it's downloaded, open it with the xlsx reader package of your choice. I suggest either openxlsx or readxl.
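For example, a minimal sketch of downloading the workbook and then reading it with readxl (the dest path and the cursos variable name are just illustrations):
library(Microsoft365R)
library(readxl)
odb <- get_business_onedrive()
odb$download_file("lcursos.xlsx", dest = "lcursos.xlsx")  # save a local copy of the workbook
cursos <- read_xlsx("lcursos.xlsx")                       # read the sheet into a data frame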
Note that if your file is password protected, your options are limited. See this question for possible solutions.

unzip in google colab corrupts after first read

I have uploaded an 11 GB images zip file to my Google Drive and am trying to unzip it in Colab for processing. The first time it processes properly. After the Colab session is closed and restarted the next day, the same unzip command fails, saying the zip file is corrupted. So I had to remove the corrupted zip file, upload the original zip file again, and use it in Colab. Again, the first time it works perfectly fine, but the second time it fails again. Every time, uploading the 11 GB file to Google Drive takes a lot of time and uses a lot of bandwidth.
I am using !unzip '/content/drive/My Drive/CheXpert-v1.0-small.zip' to unzip.
The second time it also starts unzipping, but after a few records it throws a read error on a specific image, which is different every time. If I restart the unzip again, it gives offset errors without unzipping any image.
Is there any way to fix this problem, so that I can unzip successfully any number of times?
Thanks in advance for your help.

How to load the actual .RData file, that is just called .RData (the compressed file that gets saved from a session)

Similar questions (but not the question I have) were about loading a file that someone had saved as somefilename.RData. I am trying to do something different.
What I am trying to do is load the actual .RData file that gets saved from an R session. The context is that I am using 2 different computers and am trying to download the .RData file from one computer and then load this same .RData file on a different computer in RStudio.
When I download the .RData file it shows up without the "." (e.g., it shows up as RData). When I try to rename it to ".RData", Windows will not allow me to do so.
Is there a way to do what I am trying to do?
Thanks!
After playing around with this, I was able to load the file (even though it was called "RData" and not ".RData") by using RStudio: going to Session > Load Workspace... and then navigating to that file. I had used File > Open File..., which did not work.
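The same thing can also be done from the R console with load(), which does not care what the workspace file is named (the path below is only a placeholder):
load("C:/Users/me/Downloads/RData")  # restore all objects saved in the workspace image
ls()                                 # confirm the objects are now in the session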

How to upload CSV files to GitHub repo and use them as data for my R scripts

I'm currently doing a project that uses R to process some large csv files that are saved in my local directory linked to my repo.
So far, I managed to create the R project and commit and push R scripts into the repo with no problem.
However, the scripts read in the data from the csv files saved in my local directory, so the code takes the form
df <- read.csv("mylocaldirectorylink")
However, this is not helpful if my partner and I, working on the same project, have to change that path to our own local directory every time we pull from the repo. So I was thinking that maybe we can upload the csv files to the GitHub repo and let the R script refer directly to the csv files online.
So my questions are:
Why can't I upload csv files to GitHub? It keeps saying that my file is too large.
If I can upload the csv files, how do I read the data from these csv files?
Firstly, it's generally a bad idea to store data on GitHub, especially if it's large. If you want to keep it somewhere on the Internet, you can use, say, Dataverse and then access your data via a URL (through the API), or Google Drive, as Jake Kaupp suggested.
Now back to your question. If your data doesn't change, I would just use relative paths to the CSV instead of absolute ones. In other words, instead of
df <- read.csv("C:/folder/subfolder/data.csv")
I would use
df <- read.csv("../data.csv")
If you are working with an R project, then the initial working directory is inside the project folder. You can check it with getwd(). This working directory moves with the R project. Just agree with your colleague that the data file should sit in the same folder that contains the R project folder.
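As a quick sanity check that the relative path resolves from inside the project (the paths shown here are illustrative):
getwd()                      # e.g. "C:/projects/myproject" when the .Rproj file is open
file.exists("../data.csv")   # TRUE if data.csv sits next to the project folder
df <- read.csv("../data.csv")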
This was for a Python script.
You can track csv files by editing your .gitignore file.
OR
You can add the csv files to your GitHub repo so that they can be used by others.
I did so by following these steps:
1. Check out the branch on github.com.
2. Go to the folder where you want to keep the csv files.
3. Use the "Add file" option in the top right area.
4. Upload the csv files and commit the changes to the same branch or to a new branch.
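Once a csv is in the repo, one way for an R script to read it directly is via the file's raw URL (the URL below is only a placeholder; use the "Raw" link of your own file):
df <- read.csv("https://raw.githubusercontent.com/your-user/your-repo/main/data.csv")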

csv files in opencpu

If I put a very small csv file in my GitHub directory so that it gets copied to /ocpu/github/username/projectname/www/, will I be able to access the contents of the csv for use in an R function? I tried to ajax the file, but I get a 404 error even though I can see the csv file sitting in the www directory of my local server. I need the csv to be on the server as a static file rather than being uploaded by a function. Thanks
You should be able to access them like any other file. Can you post an example that shows what you are doing and what error you are getting?
That said, if you just want to use this data in your R functions, it is better to include it in the R package as an actual data file. Also see section 1.1.6 of Writing R Extensions. An example is the mapapp package, which includes a dataset called countryExData. Also see the live app.
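As a rough sketch of that approach (the file and object names below are made up, not taken from the mapapp package): save the csv contents as an .rda file under the package's data/ directory and load it by name.
mydata <- read.csv("mydata.csv")        # hypothetical csv kept in the package source
save(mydata, file = "data/mydata.rda")  # store it as package data
# your R functions (or an OpenCPU call) can then load the dataset with
data(mydata, package = "yourpackage")   # hypothetical package name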

Resources