This is my first post so I will try to be specific.
I have imported a few .csv files and I am trying to combine them.
When I inspect each individual data frame as I import it, I can open it in the RStudio View window and the data looks correct.
However, once I combine the data frames using Master <- do.call("rbind", list(DF1, DF2, DF3, DF4)) and try to view the Master table, I get the following message:
Error in if (nchar(col_min_c) >= 16 || grepl("e", col_min_c, fixed = TRUE) || : missing value where TRUE/FALSE needed
However, when I view all the original data frames I am able to see them with no problem.
If I use utils::View(Master) I am able to see the data frame.
So I am not sure where this issue comes from.
These are the packages I am loading:
require(data.table)
require(dplyr)
require(sqldf)
require(ggplot2)
require(stringr)
require(reshape2)
require(bit64)
Thanks for any help you can provide
I was able to get around this issue by transforming my table via:
Master<-sqldf("SELECT * FROM 'Master'")
I hope this helps others who come across a similar issue in the future.
I was also able to view the table if I removed the NA values from a long numeric column (19 characters) on the far left-hand side of the table (column 1).
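For anyone hitting the same thing, here is a hedged sketch of that idea with made-up data (not my real table): it assumes the troublesome column is a 19-digit ID stored as integer64 (via bit64) that contains NAs, and converts such columns to character before calling View().
library(bit64)

# Made-up reproduction: a 19-digit ID column stored as integer64 that contains NAs
DF1 <- data.frame(id = as.integer64(c("1234567890123456789", NA)), x = 1:2)
DF2 <- data.frame(id = as.integer64(c(NA, "9876543210987654321")), x = 3:4)
Master <- do.call("rbind", list(DF1, DF2))

# Workaround: convert integer64 columns to character before View(),
# which seems to sidestep the failing nchar()/NA check in RStudio's viewer
is_i64 <- vapply(Master, inherits, logical(1), what = "integer64")
Master[is_i64] <- lapply(Master[is_i64], as.character)
View(Master)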
All I need to do for this simple assignment, which I have been able to do with no problem before, is take the mean of a column from a .csv dataset uploaded to R.
Here is the code I have:
library(readr)
X1888B_PSet03_Dataset1 <- read_csv("Downloads/188B_PSet03_Dataset1.csv")
View(X1888B_PSet03_Dataset1)
mean(FillerAcc)
and this is where I get the error message
object 'FillerAcc' not found
Meanwhile, there is literally a column in my data called FillerAcc.
Reference the column through the data frame, like this:
mean(X1888B_PSet03_Dataset1$FillerAcc)
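If you'd rather not type the long object name every time, with() evaluates the expression inside the data frame; and if the column happens to contain missing values, pass na.rm = TRUE. Both lines below just reuse the names from the question:
# Same result without repeating the data-frame name:
with(X1888B_PSet03_Dataset1, mean(FillerAcc))

# If FillerAcc contains NAs, drop them explicitly:
mean(X1888B_PSet03_Dataset1$FillerAcc, na.rm = TRUE)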
I'm using an R script within Power Query to do some data transformations and return a scaled table.
My R code is like this:
# 'dataset' holds the input data for this script
library(dplyr)
df_normal <- log(dataset + 1) %>%
  select(c(2:4)) %>%
  scale
df_normal <- cbind(dataset[, c(1)], df_normal)
output <- df_normal
It does seem odd that this fails to return anything. A quick glance online turned up a three-minute YouTube video that uses the same method you are using. Searching a little further, one comes across the Microsoft documentation, which gives a possible reason for the issue.
When preparing and running an R script in Power BI Desktop, there are a few limitations:
Only data frames are imported, so make sure the data you want to import to Power BI is represented in a data frame
Columns that are typed as Complex and Vector are not imported, and are replaced with error values in the created table
These seem like the most obvious reasons. Assuming there are no complex columns in your dataset, I'd say the former is the likely cause. A quick recreation of your dataset shows that scale() turns it into a matrix-class object. cbind() keeps that class, so output ends up as a matrix rather than a data.frame.
> dataset <- as.data.frame(abs(matrix(rnorm(1000), ncol = 4)))
> class(dataset)
[1] "data.frame"
> library(dplyr)
> df_normal <- log(dataset + 1) %>%
+   select(c(2:4)) %>%
+   scale
> class(df_normal)
[1] "matrix"
> df_normal <- cbind(dataset[, 1], df_normal)
> output <- df_normal
> class(output)
[1] "matrix"
A simple fix would then seem to be adding output <- as.data.frame(output), which is in line with the Power BI documentation. It might also need a return-like statement at the end; adding a final line that simply states output should take care of that.
Edit
For clarification, I believe the following edited version of your script should return the data you expect:
# 'dataset' holds the input data for this script
library(dplyr)
df_normal <- log(dataset + 1) %>%
  select(c(2:4)) %>%
  scale
df_normal <- cbind(dataset[, c(1)], df_normal)
output <- as.data.frame(df_normal)
# output   # uncomment this final 'output' line if an explicit last statement is still needed
I use SparkR to run a SQL query and get a SparkR data frame, like the following:
data = sql(sql_query)
And I can get the dimension of data by using dim(data)
However, when I try to take a look at the data using head(data), it fails and gives me the following error:
java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.TimestampWritable cannot be cast to org.apache.hadoop.io.IntWritable
I tried the SQL query in Hive directly and it runs without any problem. The weird thing here is that I can get the dimensions but cannot get the head.
Any idea?
Try View(head(data, 20)). Also, it would be helpful if you could give a bit more detail in your question.
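dim() only triggers a row/column count, while head() actually fetches and deserializes rows, which is probably why the count succeeds but the fetch hits the TimestampWritable/IntWritable mismatch. One thing worth trying (just a sketch; the table and column names below are made up, not from your job) is to cast the suspect timestamp column explicitly in the query and check the schema before fetching:
library(SparkR)

# Hypothetical table/column names for illustration only
data <- sql("SELECT id, CAST(event_time AS string) AS event_time FROM events")

printSchema(data)   # confirm the types SparkR has inferred
head(data, 20)      # fetches (and deserializes) only the first 20 rows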
I'm brand new to R and am having difficulty with something very basic. I'm importing data from an Excel-exported .csv file like this:
data1 <- read.csv(file.choose(), header=TRUE)
When I try to look at the data in the table by column, R doesn't recognize the column headers as objects. This is what it looks like:
summary(Square.Feet)
Error in summary(Square.Feet) : object 'Square.Feet' not found
I need to run a regression and I'm having the same problem. Any help would be much appreciated.
Yes, it does recognize them; you just have to tell R which data frame the column belongs to:
summary(data1$Square.Feet)
where data1 is the name of your data frame, and the name of the variable goes after the dollar sign.
Hope it helps
UPDATE
As suggested below, you can use the following:
data1 <- read.csv(file.choose(), header=TRUE)
attach(data1)
This way, by attaching the data frame, you avoid writing the name of the dataset every time, so we go from
summary(data1$Square.Feet)
To this point after attaching the data:
summary(Square.Feet)
However, I do NOT recommend doing this, because if you load other datasets you can easily make a mess of things, as it's quite common for variables in different datasets to have the same names, among other major problems (thanks to Ben Bolker for pointing to several discussions of why attach() is best avoided).
If you want a summary of all the data fields, then
summary(data1)
Or you can use the with() helper function:
with(data1, summary(Square.Feet))
I can see the entire data frame in the console. Is there any way, or any function, to view the data frame in the R console (with editing similar to Excel) so that I can edit the data manually?
You can use edit(); its S3 method for class 'data.frame' is:
edit(name, factor.mode = c("character", "numeric"),
     edit.row.names = any(row.names(name) != 1:nrow(name)), ...)
Example:
edit(your_dataframe)
You can go through the details in the help page, ?edit.data.frame.
You really can use edit() or View().
But if your dataset isn't too big and you prefer to work in Excel, you can use the function below:
library(xlsx)
view.excel <- function(inputDF, nrows = 5000) {
  # Only data frames are supported
  if (!is.data.frame(inputDF)) {
    stop("ERROR: <inputDF> class is not \"data.frame\"")
  }
  # Truncate to the first `nrows` rows unless nrows = -1 (show everything)
  if (nrow(inputDF) > nrows && nrows != -1) {
    inputDF <- inputDF[1:nrows, ]
  }
  # Write to a temporary .xlsx file and open it with the default application
  tempPath <- tempfile(fileext = '.xlsx')
  write.xlsx(inputDF, tempPath)
  system(paste0('open ', tempPath))  # 'open' is the macOS launcher
  return(invisible(tempPath))
}
I've defined this function to help me with some tasks in R.
Basically, you only need to pass a data frame to the function as a parameter. By default the function displays at most 5000 rows (you can set nrows = -1 to view all rows, but it may be slow).
The function opens your data frame in Excel and returns the path where the temporary view was saved. If you want to save your changes and load them back after editing in Excel, you can reload the data frame with:
# Open a view in excel
tempPath <- view.excel(initialDF, nrows=-1)
# Load the file of the Excel View in the new DataFrame modifiedDF
modifiedDF <- read.xlsx(tempPath, sheetIndex = 1)
As written (it relies on the macOS open command), this function works on a Mac; on Windows or Linux you would need to swap that call for shell.exec() or xdg-open, respectively.
You can view the dataframe with View():
View(df)
As #David Arenburg says, you can also open your dataframe in an editable view, but be warned this is slow:
edit(df)
For updates/changes to affect the dataframe use:
df <- edit(df)
Since a lot of people are using (and developing in) RStudio and Shiny nowadays, things have become far more convenient for R users.
You should look at the rhandsontable package.
There is also a very nice Shiny implementation of rhandsontable, from a blog I stumbled upon: https://stla.github.io/stlapblog/posts/shiny_editTable.html. It doesn't use the console, but it is super slick all the same.
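If you just want to try the idea without reading the whole post, here is a minimal, self-contained sketch (my own toy example, not the blog's code): rhandsontable() renders the data frame as an editable grid and hot_to_r() converts the edited widget back into an R data frame.
library(shiny)
library(rhandsontable)

ui <- fluidPage(
  rHandsontableOutput("table"),
  verbatimTextOutput("summary")
)

server <- function(input, output, session) {
  # Render mtcars as an editable, spreadsheet-like grid
  output$table <- renderRHandsontable(rhandsontable(mtcars))

  # Convert the (possibly edited) widget back into a data frame
  output$summary <- renderPrint({
    req(input$table)
    summary(hot_to_r(input$table))
  })
}

shinyApp(ui, server)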
(A few years later) This may be worth trying if you use RStudio: it seems to support all data types. I did not use it extensively, but it has helped me occasionally:
https://cran.r-project.org/web/packages/editData/README.html
It shows an editing dialog by default. If your dataframe is big you can browse to http://127.0.0.1:7212 while the dialog is being shown, to get a resizable editing view.
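For reference, the basic usage (as I understand it from the package README; I have only used it casually) looks like this:
library(editData)

# Opens the editing dialog (a Shiny gadget); the edited data frame is
# returned when you close the dialog with the save button.
edited <- editData(mtcars)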
You can view and edit a data frame using the fix() function:
# Open the mtcars dataframe for editing:
fix(mtcars)
# Edit and close.
# This produces the same result:
mtcars <- edit(mtcars)
# But it is a longer command to write.