Error: must rename columns with a valid subscript vector - r

I'm just trying to import a Kaggle data set to practice R on, and it's being a nightmare.
I'm trying to rename the columns in my data frame, but I keep getting errors.
library(tidyverse)
library(dplyr)
library(ggplot2)
library(tibble)
library(janitor)
food_advs<- read.csv("CAERS_ASCII_2004_2017Q2.csv")
food_df <- data.frame(food_advs)
food_df %>% rename(food_df, Product = PRI_Reported.Brand.Product.Name, Industry = PRI_FDA.Industry.Name, Person_age = CI_Age.at.Adverse.Event, Gender = CI_Gender, Outcomes = AEC_One.Row.Outcomes, Symptoms = SYM_One.Row.Coded.Symptoms)
> food_df %>% rename(food_df, "Product" = "PRI_Reported.Brand.Product.Name", "Industry" = "PRI_FDA.Industry.Name", "Person_age" = "CI_Age.at.Adverse.Event", "Gender" = "CI_Gender", "Outcomes" = "AEC_One.Row.Outcomes", "Symptoms" = "SYM_One.Row.Coded.Symptoms")
Error: Must rename columns with a valid subscript vector.
x Subscript has the wrong type `data.frame<
RA_Report.. : integer
RA_CAERS.Created.Date : character
AEC_Event.Start.Date : character
PRI_Product.Role : character
PRI_Reported.Brand.Product.Name: character
PRI_FDA.Industry.Code : integer
PRI_FDA.Industry.Name : character
CI_Age.at.Adverse.Event : integer
CI_Age.Unit : character
CI_Gender : character
AEC_One.Row.Outcomes : character
SYM_One.Row.Coded.Symptoms : character
>`.
i It must be numeric or character.
Run `rlang::last_error()` to see where the error occurred.

Try the following:
food_df %>%
rename(Product = PRI_Reported.Brand.Product.Name,
Industry = PRI_FDA.Industry.Name,
Person_age = CI_Age.at.Adverse.Event,
Gender = CI_Gender,
Outcomes = AEC_One.Row.Outcomes,
Symptoms = SYM_One.Row.Coded.Symptoms
)
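Your mistake is in how you use %>%: it is redundant to write rename(data, ...) when data %>% already precedes the call. The pipe passes food_df as the first argument, so the extra food_df inside rename() is treated as a column selection, which is why the error complains that the subscript has type data.frame.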
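To make the point about the pipe concrete, here is a minimal sketch on a made-up two-column data frame (toy_df and its values are hypothetical stand-ins for food_df; only the column names come from the question). It shows that the piped call and the direct call are the same thing, and that the result has to be assigned to keep the new names:

```r
library(dplyr)

# Hypothetical toy data standing in for food_df
toy_df <- data.frame(
  CI_Gender = c("Female", "Male"),
  CI_Age.at.Adverse.Event = c(34L, 51L)
)

# The pipe supplies toy_df as the first argument of rename()
piped <- toy_df %>%
  rename(Gender = CI_Gender, Person_age = CI_Age.at.Adverse.Event)

# Equivalent call without the pipe
direct <- rename(toy_df, Gender = CI_Gender, Person_age = CI_Age.at.Adverse.Event)

identical(piped, direct)
names(piped)
```

Note that rename() returns a new data frame, so without an assignment such as piped <- ... the original object keeps its old column names.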

Related

Making a function that builds a dataframe

I'm trying to write a function that builds a data frame and returns it. The new data frame is made of columns taken from another data frame that I have, called metadata, in addition to some additional data that I want to control by passing TRUE or FALSE when calling the function.
Here is what I did:
make_data = function(metric, use_additions = FALSE){
data = data.frame(my_metric = metadata[['metric']], gender = metadata$Gender ,
age = as.numeric(metadata$Age) , use_additions = t(additional_data))
data = data %>% dplyr::select(my_metric, everything())
return(data)
}
data = make_data(CR, FALSE)
I want to pass a different metric value each time, while all other features stay the same. Here, for example, I called the function with metric as CR, which is the name of the column I want from metadata. The argument I want to control is use_additions: sometimes I want to add it and sometimes I don't.
metadata and additional_data have exactly the same row names and the same number of rows. It's just a matter of adding the data or not.
I get these errors:
Error in data.frame(metric = metadata[["metric"]], gender = metadata$Gender, :
arguments imply differing number of rows: 0, 1523
In addition: Warning message:
In data.frame(metric = metadata[["metric"]], gender = metadata$Gender, :
Error in data.frame(my_metric = metadata[["metric"]], gender = metadata$Gender, :
arguments imply differing number of rows: 0, 1523
I've tried several ways to do this, with '' and without, and using $, but none of these worked. For example, when I type metric = metadata[[metric]] I get this:
Error in (function(x, i, exact) if (is.matrix(i)) as.matrix(x)[[i]] else .subset2(x, :
object 'CR' not found
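Pass the column name as a string and index with it:
make_data = function(colname, use_additions = FALSE){
# colname is a string; [[ ]] extracts that column as a plain vector,
# so the new column is actually named my_metric
data = data.frame(my_metric = metadata[[colname]], gender = metadata$Gender ,
age = as.numeric(metadata$Age))
if (use_additions) data$use_additions=additional_data
return(data)
}
data = make_data("CR", FALSE)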
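The "differing number of rows: 0, 1523" error in the question can be traced to how [[ ]] treats its argument. A small sketch, using a made-up three-row metadata (the values are hypothetical; only the column names CR and Gender come from the question), illustrates the difference between the literal string "metric" and a variable holding the column name:

```r
# Hypothetical miniature version of the asker's metadata
metadata <- data.frame(
  CR = c(0.1, 0.2, 0.3),
  Gender = c("F", "M", "F")
)

metric <- "CR"  # the column name, passed as a string

# Looks for a column literally called "metric"; there is none, so this is NULL
literal <- metadata[["metric"]]

# Uses the value stored in metric, i.e. the column "CR"
by_value <- metadata[[metric]]

is.null(literal)
length(by_value)
```

A NULL column has length 0, which is where the "0" in the error message comes from. The "object 'CR' not found" error is the other side of the same issue: CR without quotes is evaluated as a variable name rather than as a string.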

I keep getting error on this, can someone help me

delta_gamma = 0.05
MRW.data <- MRW.data %>%
mutate(ln.gdp.85 = log(gdp.85),
ln.gdp.60 = log(gdp.60),
ln.gdp.growth = ln.gdp.85 - ln.gdp.60,
ln.inv.gdp = log(Inv.gdp/100),
Non.oil = factor(Non.oil),
intermediate = factor(intermediate),
OECD= factor(OECD),
ln.ndg = log(pop.growth /100 + delta_gamma)) %>%
select(country, ln.gdp.85, ln.gdp.60, ln.inv.gdp, Non.oil, intermediate, OECD, ln.ndg, ln.school, gdp.growth, ln.gdp.growth)
skim(MRW.data)
Error in select():
! Must subset columns with a valid subscript vector.
x Can't convert from <double> to <integer> due to loss of precision.
Backtrace:
... %>% ...
rlang::cnd_signal(x)
Error in select(., country, ln.gdp.85, ln.gdp.60, ln.inv.gdp, Non.oil, :
x Can't convert from <double> to <integer> due to loss of precision.

Background color changing R

How can I change the background color of just the header (the column names) of a dataset in R?
My dataset is named "logous.df".
I found something like this, but it does not work:
logous.df(data.frame) %>%
row_spec(0, background = "yellow")
I need to see the color in the data frame saved as an RDS table.
Another thing: I created it as a matrix and then converted it to a data frame (I have to do it this way), and my leading zeroes are gone. How can I get them back? When I tried the following operations on the data frame, I got errors:
# none of these worked
formatC(logous.df, width = 4, format = "d", flag = "0")
sprintf("%04.0f", logous.df)
str_pad(logous.df, 4, pad = "0")
stringr::str_pad(logous.df, 4, side = "left", pad = 0)
and the errors:
Warning message:
In stri_pad_left(string, width, pad = pad) :
argument is not an atomic vector; coercing
Error in sprintf("%04.0f", logous.df) :
'list' object cannot be coerced to type 'double'
Error in storage.mode(x) <- "integer" :
'list' object cannot be coerced to type 'integer'
In addition: Warning message:
In formatC(logous.df, width = 4, format = "d", flag = "0") :
class of 'x' was discarded
row_spec (from the kableExtra package) requires a knitr::kable object, so you should first create this object from your dataset before using the function:
logous.df %>% data.frame() %>% knitr::kable() %>% row_spec(0,background="yellow")
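As for the leading zeroes, the errors in the question come from passing the whole data frame to functions that expect an atomic vector. A minimal sketch, assuming every column holds whole numbers and a width of 4 is wanted (the toy logous.df below is made up for illustration), is to pad column by column:

```r
# Hypothetical stand-in for logous.df with whole-number columns
logous.df <- data.frame(
  code = c(7, 42, 1234),
  other = c(1, 20, 300)
)

# Apply sprintf() to each column separately; padded[] keeps the data frame shape
padded <- logous.df
padded[] <- lapply(logous.df, function(col) sprintf("%04d", as.integer(col)))

padded$code
class(padded)
```

The padded columns are character vectors, since a numeric column cannot store leading zeroes.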

Error replacing a column with other values data frames R

I'm trying to replace the default values I set in a data frame with calculated ones, but I get an error that I don't understand, since as far as I know I have no factors.
Here is the code:
nb_agences_iris <- agences %>%
group_by(CODE_IRIS) %>%
summarise(nb_agences = n()) %>%
arrange(CODE_IRIS)
int <- data.frame("CODE_IRIS" = as.character(intersect(typo$X0, nb_agences_iris$CODE_IRIS)))
typo$nb_agences <- as.character(rep(0, nrow(typo)))
typo[int$CODE_IRIS,]$nb_agences <- as.character(nb_agences_iris[int$CODE_IRIS,]$nb_agences)
And I get the following error:
Error in Summary.factor(1:734, na.rm = FALSE) :
‘max’ not meaningful for factors
In addition: Warning message:
In Ops.factor(i, 0L) : ‘>=’ not meaningful for factors
Thanks in advance for your help.

Error in as(x, class(k)) : no method or default for coercing “NULL” to “data.frame”

I am currently facing the error shown below, which is related to NULL values being coerced to a data frame. The data set does contain nulls; however, I have tried both the is.na() and is.null() functions to replace the null values with something else. The data is stored on HDFS in pig.hive format. I have also included the code below. The code works fine if I remove v[,25] from the key.
Code:
AM = c("AN");
UK = c("PP");
sample.map <- function(k,v){
key <- data.frame(acc = v[!which(is.na(v[,1])),1],
year = substr(v[!which(is.na(v[,1])),2],1,4),
month = substr(v[!which(is.na(v[,1])),2],5,6))
value <- data.frame(v[,3],count=1)
keyval(key,value)
}
sample.reduce <- function(key,v){
AT <- sum(v[which(v[,1] %in% AM=="TRUE"),2])
UnknownT <- sum(v[which(v[,1] %in% UK=="TRUE"),2])
Total <- AT + UnknownT
d <- data.frame(AT,UnknownT,Total)
keyval(key,d)
}
out <- mapreduce(input = "/user/hduser/input",
output = "/user/hduser/output",
input.format = make.input.format("pig.hive", sep = "\u0001"),
output.format = make.output.format("csv", sep = ","),
map = sample.map,
reduce = sample.reduce)
Error:
Warning in asMethod(object) : NAs introduced by coercion
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) : data length is not a multiple of split variable
Warning in rmr.split(x, x, FALSE, keep.rownames = FALSE) : number of items to replace is not a multiple of replacement length
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) : data length is not a multiple of split variable
Warning in rmr.split(v, ind, lossy = lossy, keep.rownames = TRUE) : number of items to replace is not a multiple of replacement length
Error in as(x, class(k)) :
no method or default for coercing “NULL” to “data.frame”
Calls: <Anonymous> ... apply.reduce -> c.keyval -> reduce.keyval -> lapply -> FUN -> as
No traceback available
UPDATE
I have added the sample data and edited the code above. Hope this helps!
Sample Data:
NULL,"2014-03-14","PP"
345689202,"2014-03-14","AN"
234539390,"2014-03-14","PP"
123125444,"2014-03-14","AN"
NULL,"2014-03-14","AN"
901828393,"2014-03-14","AN"
There are some issues with as() which have been identified recently. I don't see why as() can't handle this by default, but you can define an S4 coerce method for the NULL-to-data.frame conversion that calls as.data.frame():
setMethod("coerce",c("NULL","data.frame"), function(from, to, strict=TRUE) as.data.frame(from))
[1] "coerce"
as(NULL,"data.frame")
data frame with 0 columns and 0 rows
