Viewing more than 1000 rows in RStudio

In RStudio, when you use the View() function, it only allows you to see up to 1000 rows. Is there any way to see more than that? I know it is possible to subset the view and see rows 1000-2000, for example, but I would like to be able to see rows 1-2000. The best I could find was a comment from about a year ago saying that it wasn't possible at the time but that they were planning on fixing this.
Here's an example (note: I'm guessing you will have to run this in RStudio).
rstudio <- (1:2000)
View(rstudio)

The View command is specifically for the little helper window. You can easily view the full value in the actual console window. If you want the same layout, use cbind.
cbind(rstudio)
which in fact will even give you the same nice row-numbering setup
And if that's too cumbersome
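pview <- function(x, rows=100) {
  # Short enough: print everything in one column
  if (length(x) <= rows) {
    print(cbind(x))
  } else {
    # Otherwise print only the first and last rows/2 values
    print(cbind(head(x, rows/2)))
    print(cbind(tail(x, rows/2)))
  }
}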
pview(rstudio, 1998)
You will need to clean that up to get the row names to line up.

You can raise the limit on how much print() shows in the console by changing the max.print option, for instance:
options(max.print=5000)
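As a small sketch of the option above: max.print limits what print() writes to the console (it is not a View() setting), and options() returns the previous values so they can be restored afterwards. The variable names old and raised are only for illustration.

```r
# Raise the console print limit and remember the previous setting.
old <- options(max.print = 5000)

# Confirm the new limit (illustrative variable name).
raised <- getOption("max.print")

# ... print the long vector or data frame here ...

# Restore whatever limit was in place before.
options(old)
```

Restoring the old value keeps the larger limit from leaking into the rest of the session.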

Related

Problem with dplyr causing additional info to appear

Using dplyr in R (Microsoft R Open 3.5.3 to be precise). I'm having a slight problem with dplyr whereby I'm sometimes seeing lots of additional information in the data frame I create. For example, for these lines of code:
claims_frame_2 <- left_join(claims_frame,
select(new_policy_frame, c(Lookup_Key_4, Exposure_Year, RowName)),
by = c("Accident_Year" = "Exposure_Year", "Lookup_Key_4" = "Lookup_Key_4")
)
claims_frame_3 <- claims_frame_2 %>% group_by(Claim.Number) %>% filter(RowName == max(RowName))
No problem with the left_join command, but when I do the second command (group by/filter), the data structure of the claims_frame_3 object is different to that of the claims_frame_2 object. Seems to suddenly have lots of attributes (something I know little about) attached to the RowName field. See the attached photo.
Does anyone know why this happens and how I can stop it?
I had hoped to put together a small chunk of reproducible code that demonstrated this happening, but so far I haven't been successful. I will continue. In the meantime, I'm hoping someone might see this code (from a real project) and immediately know why this is happening!
Grateful for any advice.
Thanks
Alan

Customize existing function in R

I want to change a condition within the function psych::polychoric in R.
Specifically, I want to increase the limit on the number of different realizations of a variable from 8 to 10 on line 77 of the code.
I can manually increase the limit by calling
trace(polychoric, edit=TRUE)
Since the script is meant for reproduction purposes for a paper of mine, I want to make handling as smooth as possible by avoiding manual editing.
Is there a way to edit the function with a piece of code,
e.g. by replacing if (nvalues > 8) by if (nvalues > 10) in the code by another function?
Any suggestions would be much appreciated.
Find the location in the function that you want to change:
as.list(body(psych::polychoric))
Change the function (note that trace() inserts the expression before step 11; it does not replace code that is already there, so verify the result):
trace(psych::polychoric, quote(nvalues > 10), at=11)
Check to see that you changed what you wanted to change:
trace(psych::polychoric, edit=TRUE)
Set the function back to the original:
untrace(psych::polychoric)
-----
Seems like fix may be easier for you to implement for this task
fix(polychoric)
opens a pane that you can change the code in - change and hit save.
This makes a copy of the function in your global environment. You can check this by comparing the two versions: trace(polychoric, edit = T) will show nvalues > 10, and trace(psych::polychoric, edit = T) will show nvalues > 8. The next time you reload psych you will be using the original function. It's a bit of a manual hack, but hopefully it works for this one-off situation.
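For a scripted edit with no manual step, one possible approach is to rewrite the function body as text and parse it back. The sketch below does this on a hypothetical stand-in function, check_values, not on psych::polychoric itself, and it assumes the condition appears in the deparsed source exactly as nvalues > 8. The modified copy lives wherever you assign it, so, as with fix(), code inside the package would still call the original.

```r
# Hypothetical stand-in for a package function with a hard-coded limit.
check_values <- function(x) {
  nvalues <- length(unique(x))
  if (nvalues > 8) stop("too many values")
  nvalues
}

# Turn the body into text, swap the literal condition, and parse it back.
src <- deparse(body(check_values))
src <- gsub("nvalues > 8", "nvalues > 10", src, fixed = TRUE)
body(check_values) <- parse(text = src)[[1]]

# Nine distinct values used to fail and now pass.
check_values(1:9)
```

Because the substitution is plain text matching, it is worth checking with grepl() that the pattern actually occurs before relying on it.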

Execute multiple sets of lines from another R file

I asked this before, but maybe I didn't ask precisely enough.
I want to run other, quite long R files from my master R file. At first glance that's easy to accomplish with source().
The point is, they are so long that I don't want to run all of them, just certain parts. Someone on my former post showed me this hidden gem, but both run from point A to point B.
What I want is to run another file from my file, starting at line x, running to line x+z, skipping a certain number of rows, and then continuing to run the same file from line y to y+z.
The solution in the link I attached works great, but I can't skip rows (this coding is above my skill) without creating another function and setting more start and end points.
Is it possible to call something like source(df.R, excludeLine(1:6, 20, 30:end))?
Slightly modifying this excellent answer should work:
sourcePartial <- function(fn,startTag1='#from here1',endTag1='#to here1', startTag2='#from here2',endTag2='#to here2') {
lines <- scan(fn, what=character(), sep="\n", quiet=TRUE)
st1<-grep(startTag1,lines)
en1<-grep(endTag1,lines)
st2<-grep(startTag2,lines)
en2<-grep(endTag2,lines)
tc <- textConnection(lines[c((st1+1):(en1-1),(st2+1):(en2-1))])
source(tc)
close(tc)
}
But really, just have a go yourself next time and you might learn...
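If tagging the file is not an option, a line-number variant is also possible. This is only a sketch: sourceLines and its arguments are hypothetical names, not an existing function, and it assumes the kept lines still form complete R expressions once the others are dropped.

```r
# Hypothetical helper: run only the chosen line numbers of an R file.
sourceLines <- function(fn, keep, envir = globalenv()) {
  lines <- readLines(fn)
  # Ignore requested line numbers that fall outside the file.
  keep <- keep[keep >= 1 & keep <= length(lines)]
  tc <- textConnection(lines[keep])
  on.exit(close(tc))
  # Evaluate the kept lines in the requested environment.
  source(tc, local = envir)
}
```

With this, something like sourceLines("df.R", c(7:19, 21:29)) would skip lines 1 to 6, line 20, and everything from line 30 on.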

Editing or Viewing data frame in R Console

I can see the entire data frame in the console. Is there any way, or any function, to view a data frame in the R console with Excel-like editing, so that I can edit the data manually?
You can use edit(), which has an S3 method for class 'data.frame':
edit(name, factor.mode = c("character", "numeric"),
edit.row.names = any(row.names(name) != 1:nrow(name)), ...)
Example:
edit(your_dataframe)
You can read about it in detail on the help page (?edit.data.frame).
You really can use edit() or View().
But if your dataset isn't too big and you prefer to use Excel, you can use the function below:
library(xlsx)
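view.excel <- function(inputDF, nrows=5000) {
  # is.data.frame() also accepts subclasses such as tibbles
  if (!is.data.frame(inputDF)) {
    stop("ERROR: <inputDF> class is not \"data.frame\"")
  }
  # Truncate to the first nrows rows unless nrows is -1 (show everything)
  if (nrows != -1 && nrow(inputDF) > nrows) {
    inputDF <- inputDF[1:nrows, ]
  }
  tempPath <- tempfile(fileext='.xlsx')
  write.xlsx(inputDF, tempPath)
  # 'open' is the macOS launcher; other systems need a different command
  system(paste0('open ', tempPath))
  return(invisible(tempPath))
}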
I've defined this function to help me with some tasks in R...
Basically, you only need to pass a data frame to the function as a parameter. By default the function displays a maximum of 5000 rows (you can set the parameter nrows = -1 to view all the rows, but it may be slow).
This function opens your data frame in Excel and returns the path where your temporary view was saved. If you want to save and load your temporary view after changing something directly in Excel, you can load your data frame again with:
# Open a view in Excel
tempPath <- view.excel(initialDF, nrows=-1)
# Load the file of the Excel view into the new data frame modifiedDF
# (read.xlsx from the xlsx package needs a sheet index or sheet name)
modifiedDF <- read.xlsx(tempPath, sheetIndex=1)
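This function may work on Linux, Windows or Mac, though the 'open' command it uses to launch the file is the macOS one; other systems may need a different launcher (for example xdg-open on Linux or shell.exec() on Windows).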
You can view the dataframe with View():
View(df)
As @David Arenburg says, you can also open your dataframe in an editable view, but be warned this is slow:
edit(df)
For updates/changes to affect the dataframe use:
df <- edit(df)
Since a lot of people are using (and developing in) RStudio and Shiny nowadays, things have become far more convenient for R users.
You should look at the rhandsontable package.
There is also a very nice Shiny implementation of rhandsontable, from a blog I stumbled upon: https://stla.github.io/stlapblog/posts/shiny_editTable.html. It's not using the console, but it is super slick anyway.
(A few years later) This may be worth trying if you use RStudio. It seems to support all data types. I have not used it extensively, but it has helped me occasionally:
https://cran.r-project.org/web/packages/editData/README.html
It shows an editing dialog by default. If your dataframe is big you can browse to http://127.0.0.1:7212 while the dialog is being shown, to get a resizable editing view.
You can view and edit a dataframe using the fix() function:
# Open the mtcars dataframe for editing:
fix(mtcars)
# Edit and close.
# This produces the same result:
mtcars <- edit(mtcars)
# But it is a longer command to write.

Is there a built-in function for sampling a large delimited data set?

I have a few large data files I'd like to sample when loading into R. I can load the entire data set, but it's really too large to work with. sample does roughly the right thing, but I'd like to take random samples of the input while reading it.
I can imagine how to build that with a loop and readline and what-not but surely this has been done hundreds of times.
Is there something in CRAN or even base that can do this?
You can do that in one line of code using sqldf. See part 6e of example 6 on the sqldf home page.
No pre-built facilities. Best approach would be to use a database management program. (Seems as though this was addressed in either SO or Rhelp in the last week.)
Take a look at: Read csv from specific row, and especially note Grothendieck's comments. I consider him a "class A wizaRd". He's got first-hand experience with sqldf. (The author, IIRC.)
And another "huge files" problem with a Grothendieck solution that succeeded:
R: how to rbind two huge data-frames without running out of memory
I wrote the following function that does close to what I want:
readBigBz2 <- function(fn, sample_size=1000) {
  f <- bzfile(fn, "r")
  rv <- c()
  repeat {
    # Read the next chunk of up to sample_size lines
    lines <- readLines(f, sample_size)
    if (length(lines) == 0) break
    # Keep one randomly chosen line from each chunk
    rv <- append(rv, sample(lines, 1))
  }
  close(f)
  rv
}
I may want to go with sqldf in the long-term, but this is a pretty efficient way of sampling the file itself. I just don't quite know how to wrap that around a connection for read.csv or similar.
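One way the sampled lines might be handed to read.csv is through its text argument, which accepts a character vector of lines. The sketch below is an assumption-laden variant of the function above: sampleCsv and chunk_size are hypothetical names, the input is an already open connection, and the first line is assumed to be a header that should always be kept.

```r
# Hypothetical helper: sample one line per chunk from an open connection,
# keep the header line, and parse the result with read.csv.
sampleCsv <- function(con, chunk_size = 1000, ...) {
  header <- readLines(con, 1)
  picked <- character()
  repeat {
    lines <- readLines(con, chunk_size)
    if (length(lines) == 0) break
    # Choose one line of this chunk by index.
    picked <- c(picked, lines[sample.int(length(lines), 1)])
  }
  # read.csv parses the character vector as if it were the file's lines.
  read.csv(text = c(header, picked), ...)
}
```

For a compressed file, the connection could be opened with bzfile(fn, "r") as in the function above and closed by the caller afterwards. Like the original, this takes one row per chunk, so it is not a uniform sample when the last chunk is shorter than the others.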
