I noticed that even after clearing the environment, clearing the workspace and uninstalling R, old variables still show up and I can't get rid of them.
Here is how I launch my database:
#rm(list=ls(all=TRUE))
stim<-read.table(file.choose(),header=T)
attach(stim)
names(stim)
summary(stim)
str(stim$emotionT2)
names(stim)
I tried removing the "attach(stim)" line, but then none of the newly imported dataset works.
How can I completely clear all data to make sure that I am really testing the newly imported one?
Delete any .RData files in your working directory (your home directory, if you aren't sure). If you want to be careful, just move them/rename them rather than deleting.
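If you would rather rename than delete, here is a minimal sketch of that step. The helper name stash_rdata and the .RData.bak backup name are my own choices for illustration, not anything R provides.

```r
# Hypothetical helper: move a saved workspace image out of the way
# instead of deleting it, so it can still be restored by hand later.
stash_rdata <- function(dir = getwd()) {
  image  <- file.path(dir, ".RData")
  backup <- file.path(dir, ".RData.bak")  # assumed backup name
  if (!file.exists(image)) {
    message("No .RData found in ", dir)
    return(invisible(FALSE))
  }
  # file.rename() returns TRUE when the rename succeeded.
  invisible(file.rename(image, backup))
}
```

Call stash_rdata() from the directory R starts in, then restart R. The old objects should no longer be restored, since R only looks for a file named .RData at startup.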
This isn't a major issue, but I still thought I would ask.
I've been cleaning some data for a project at work, and there's a point at the process where I save all of the individual files I've cleaned as a CSV in long format. I noticed that with some of the files that if I open them, some cells that SHOULD have data appear blank. If I use the "Clear All Formats" option, the data appears. It reads into R just fine and it hasn't caused any issues, but I still think it's weird.
Has anyone else run into this and if so, was there a way to resolve this without going through each column? The files I'm cleaning start out with all sorts of formatting, so I'm curious if that could be the cause. I thought that a CSV doesn't save formats though, so I'm a little confused.
Again, not the biggest deal but slightly annoying and I'll get questions about it if my colleagues ever take a look at these files.
The data is proprietary, and I'm not exactly sure how I would share it, but I'm using a pretty straightforward write_csv(data,"path.csv")
I think I figured out the solution to this issue, and I wanted to share in case anyone else runs into this.
I'm using a Windows computer, which needed an update. That got me thinking, and I realized I also needed to update my version of RStudio. I'm not sure what caused the issue, but when I re-run those files, it appears to be resolved.
When closing RStudio at the end of an R session, I am asked via a dialog box: "Save workspace image to [working directory] ?"
What does that mean? If I choose to save the workspace image, where is it saved? I always choose not to save the workspace image, are there any disadvantages to save it?
I looked on Stack Overflow but did not find any posts explaining what the question means. I only found a question about how to disable the prompt (with no simple answers...): How to disable "Save workspace image?" prompt in R?
What does that mean?
It means that R saves a list of objects in your global environment (i.e. where your normal work happens) into a file. When R next loads, this list is by default restored (at least partially — there are cases where it won’t work).
A consequence is that restarting R does not give you a clean slate. Instead, your workspace is cluttered with existing stuff, which is generally not what you want. People then resort to all kinds of hacks to try to clean their workspace. But none of these hacks are reliable, and none are necessary if you simply don’t save/restore your workspace.
If I choose to save the workspace image, where is it saved?
R creates a (hidden) file called .RData in your current working directory.
I always choose not to save the workspace image, are there any disadvantages to save it?
The advantage is that, under some circumstances, you avoid recomputing results when you continue your work later. However, there are other, better ways of achieving this. On the flip side, starting R without a clean slate has many disadvantages: Any new analysis you now start won’t be in a clean room, and it won’t be reproducible when executed again.
So you are doing the right thing by not saving the workspace! It’s one of the rules of creating reproducible R code. For more information, I recommend Jenny Bryan’s article on using R with a project-oriented workflow.
But having to manually reject saving the workspace every time is annoying and error-prone. You can disable the dialog box in the RStudio options.
The workspace will include any of your saved objects e.g. dataframes, matrices, functions etc.
Saving it into your working directory will allow you to load this back in next time you open up RStudio so you can continue exactly where you left off. No real disadvantage if you can recreate everything from your script next time and if your script doesn't take a long time to run.
The only thing I have to add here is that you should consider seriously that some people may be working on ongoing projects, i.e. things that aren't accomplished in one day and thus must save their workspace image so as to not start from the beginning again.
I think the best practice is: it's OK to save your workspace, but your code only really works if you can clear your entire workspace and then rerun it completely with no errors!
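One way to check this without touching your current session is to run the script in a brand-new R process. This is a rough sketch, not the only way to do it. run_fresh is a made-up helper name, and it assumes Rscript sits in R's own bin directory, which is the case for a standard install.

```r
# Hypothetical helper: run a script in a fresh R process and report
# whether it finished without errors.
run_fresh <- function(script) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # --vanilla skips .RData, .Rprofile and .Renviron, so nothing from
  # an earlier session is restored before the script runs.
  status <- system2(rscript, c("--vanilla", shQuote(script)))
  status == 0
}
```

If run_fresh("analysis.R") returns TRUE (analysis.R being a placeholder for your own script), the script does not depend on leftovers from a saved workspace.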
Working Directory Cannot Change
It's saying that there's an error in my code, but I've tried it multiple times with countless variations on the code (I wiped my past attempts, sorry) and it refuses to change the working directory. It won't change to other folders either, so it's not just this one. What's the issue?
This probably means that the directory you want to change to does not exist. From the image I think you are using Windows, in which case the proper path to the directory would look like this:
setwd("C:/Users/$USER$/Desktop/r-novice-inflammation")
Change the $USER$ to your own username and it should work.
On Windows, absolute paths start with the drive letter. The easiest way to find the proper path to a directory is, in my opinion, to right-click on the folder and look for the "Location" in its properties. The RStudio IDE also has a menu for changing the working directory, which may be easier than using vanilla R.
The exception is setwd("~") which links to the Documents folder of your current user (i.e. C:/Users/$USER$/Documents). Based on the comments I realised that other commands such as setwd("..") (i.e. one folder up in the hierarchy) can be combined with ~ which explains what you are doing. In this case the following works for me:
setwd("~/../Desktop/")
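To see which path R actually resolves before it tries to change directory, a small wrapper can help. This is only a sketch, and safe_setwd is a name I made up.

```r
# Hypothetical helper: expand the path, check that it exists, and only
# then change the working directory.
safe_setwd <- function(path) {
  # normalizePath() expands "~" and ".." so the error message shows
  # the real location R was looking for.
  full <- normalizePath(path, winslash = "/", mustWork = FALSE)
  if (!dir.exists(full)) {
    stop("Directory does not exist: ", full, call. = FALSE)
  }
  # setwd() returns the previous working directory invisibly.
  invisible(setwd(full))
}
```

With this, safe_setwd("~/../Desktop/") either changes the directory or tells you the full path it could not find, which is usually enough to spot a typo or a wrong username.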
I'm looking to make my code available to others to run, and they need the correct csv files to run my code.
Once they have git cloned my repo, they need to get the data
so I currently have:
u = 'https://someURL/data/RegularSeasonCompactResults.csv'
download.file(u,'RegularSeasonCompactResults.csv')
data = read.csv('RegularSeasonCompactResults.csv')
However, if the user runs this the second time, it will re-download the file, even though that is not necessary.
This seems like it could be a recurring problem for people, so I'm wondering if there is a built-in solution to this?
Wrap it with if(!file.exists("RegularSeasonCompactResults.csv")){ ... }
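Spelled out, that could look like the sketch below. fetch_if_missing is a hypothetical helper name, and the URL in the usage comment is the placeholder from the question.

```r
# Hypothetical helper: download a file only if it is not already on disk.
fetch_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    return(invisible(FALSE))  # nothing downloaded
  }
  # mode = "wb" keeps the file byte-for-byte identical on Windows.
  download.file(url, destfile, mode = "wb")
  invisible(TRUE)
}

# Usage with the placeholder URL from the question:
# u <- 'https://someURL/data/RegularSeasonCompactResults.csv'
# fetch_if_missing(u, 'RegularSeasonCompactResults.csv')
```

Note that this only checks whether the file exists, not whether it is complete or up to date. If a download is interrupted, delete the partial file before running it again.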
After reading this question I attempted to clean out my workspace and found that each time I opened R all the original items I had recently removed were restored. I then checked .RData and found that it had not been modified in a few weeks even though I repeatedly saved the workspace image. How often is .RData updated and how can I change when .RData is updated so that it reflects more recent changes?
It gets modified if and when you
use save.image()
use q() and answer yes
Otherwise it does not get changed.
My personal preference is to explicitly load and save data I want to cache across sessions or for further analysis.
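As a sketch of what I mean by explicit loading and saving: store each expensive result in its own file with saveRDS() and read it back with readRDS(). The helper name load_or_compute and the file name in the usage comment are made up for illustration.

```r
# Hypothetical helper: reuse a cached result if it exists, otherwise
# compute it once and store it for later sessions.
load_or_compute <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()     # compute is a function with no arguments
  saveRDS(result, path)   # one object per file, named by you
  result
}

# Usage (file name is an assumption):
# fit <- load_or_compute("fit.rds", function() lm(dist ~ speed, data = cars))
```

Unlike .RData, nothing is restored behind your back: an object only comes back into the session when the script asks for it by name.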