I am trying to check a column in my dataset that contains only character values, such as: "1","2","12","NAME1","NAME2",...
I want to pick out the non-numeric values and change them to 99. This is what I have attempted so far:
install.packages("stringi")
library(stringi)
stacked_data$NewCol=ifelse(stri_detect_fixed(stacked_data$OldCol,"NAME")==TRUE,99,stacked_data)
I get this error message when I run this code:
Error in table(stacked_data$NewCol) :
attempt to make a table with >= 2^31 elements
Can someone help point me in the right direction? Any help would be appreciated! Thank you!
One easier option is
i1 <- is.na(as.numeric(df1$col))
df1$col[i1] <- 99
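To make the idea above concrete, here is a minimal sketch on made-up data (the data frame `df1` and the column name `new` are only for illustration). It assumes that every value which fails to convert to a number should become 99. Note that genuine NA entries in the column would also be turned into 99.

```r
# Toy data modelled on the values in the question (illustrative only)
df1 <- data.frame(col = c("1", "2", "12", "NAME1", "NAME2"),
                  stringsAsFactors = FALSE)

# as.numeric() returns NA for strings that are not numbers;
# suppressWarnings() hides the "NAs introduced by coercion" warning
num <- suppressWarnings(as.numeric(df1$col))

# Keep the numeric values, replace everything else with 99
df1$new <- ifelse(is.na(num), 99, num)
print(df1)
```

Writing to a new column keeps the original strings around, which makes it easy to check which rows were replaced.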
I am a total beginner in R and I have to write the lines of code shown in the attached image. When writing the highlighted line, I get an "undefined columns" error. What is causing that? When I try to insert a value before the i+1, for example crash_data[HMSP, i+1], I get an "invalid type (NULL) for variable" error. HMSP is the name of one of the columns in the data set I imported.
Does anyone have a solution for this? I would appreciate it so much, thank you.
Lines of Code in R I am following
Equation (1)
Link to dataset: https://drive.google.com/file/d/1qDvDJg5IORAlhNJH-e1ILco38S5zI_Qz/view?usp=sharing
crash_data<-read.csv("Dummycsv.csv", header=T)
Residuals_expanded_model<-matrix(0,311,51)
for (i in 1:51) {
Residuals_expanded_model[,i] <- resid(lm(crash_data[,i+1] ~
crash_data[,53]+ crash_data[,54]+ crash_data[,55]+ crash_data[,56]+
crash_data[,57]))
}
firm_spe_week_retur<-log(matrix(1,311,51)+Residuals_expanded_model)
write.csv(firm_spe_week_retur,"returns_W.csv")
Is HMSP the actual name, or a variable storing the name? If the former, you must use "HMSP". Also, the square brackets refer to x[row, col]
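As a rough illustration of the point about quoting and x[row, col], here is a small sketch on an invented data frame (`demo` and its `OTHER` column are hypothetical, not taken from the linked dataset):

```r
# Hypothetical stand-in for the imported data
demo <- data.frame(HMSP = c(0.1, 0.2, 0.3), OTHER = c(1, 2, 3))

# A column name goes in the column position, as a quoted string
by_name <- demo[, "HMSP"]

# If the name is stored in a variable, use the variable unquoted
col_name <- "HMSP"
by_variable <- demo[, col_name]

# Position-based indexing gives the same column
by_position <- demo[, 1]
```

Writing demo[HMSP, 2] instead would put an undefined object called HMSP in the row position, which is not what is wanted here.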
The problem is that there are NA values in your matrix and thus there are missing residuals, i.e. you try to replace the 311 elements in your matrix with only 309 values from your model: which(is.na(crash_data), arr.ind = TRUE)
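One possible way around the length mismatch, sketched here on invented data rather than the real file, is to fit with na.action = na.exclude. With that setting resid() pads the rows that were dropped with NA, so the result has one value per input row. Whether NA residuals are acceptable for the later log() step is something you would need to decide for your own data.

```r
# Made-up data with missing values in both the response and the predictor
d <- data.frame(y = c(1.2, NA, 0.7, 1.9, 2.4, 3.1),
                x = c(1, 2, 3, NA, 5, 6))

# Default behaviour (na.omit): incomplete rows are dropped,
# so there are fewer residuals than rows
fit_default <- lm(y ~ x, data = d)
length(resid(fit_default))

# na.exclude: residuals are padded with NA for the dropped rows
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)
length(resid(fit_exclude))
```

In the loop from the question, the same argument could be passed to lm() so that each column of residuals has 311 entries.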
I am trying to obtain Bitcoin data from yahoo finance using the following code:
getSymbols("BTC-USD",from= "2020-01-01",to="2020-12-31",warnings=FALSE,auto.assign = TRUE)
BTC-USD=BTC-USD[,"BTC-USD.Adjusted"]
However, I get the following error:
Warning message:
BTC-USD contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
How can I fix this?
Thanks.
The first problem is that you are assigning to an invalid symbol: - is the subtraction operator, so BTC-USD is parsed as BTC minus USD. Use _ instead, or, if you really want the -, wrap the symbol in backticks.
Then you can use is.na to find the NA values and replace them with 0.
library(quantmod)
getSymbols("BTC-USD",from= "2020-01-01",to="2020-12-31",warnings=FALSE,auto.assign = TRUE)
BTC_USD <- `BTC-USD`[,"BTC-USD.Adjusted"]
BTC_USD[is.na(BTC_USD)] <- 0
BTC_USD[100:110,]
# BTC-USD.Adjusted
#2020-04-09 7302.089
#2020-04-10 6865.493
#2020-04-11 6859.083
#2020-04-12 6971.092
#2020-04-13 6845.038
#2020-04-14 6842.428
#2020-04-15 6642.110
#2020-04-16 7116.804
#2020-04-17 0.000
#2020-04-18 7257.665
#2020-04-19 7189.425
A better plan is probably to just remove the NA rows instead of replacing them with 0:
BTC_USD <- BTC_USD[!is.na(BTC_USD),]
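To see why dropping is usually safer than zero-filling, here is a small sketch on a plain numeric vector (three toy values, not the real series; the names `prices`, `zero_filled` and `dropped` are made up for the example):

```r
# Toy prices with one missing observation
prices <- c(7116.804, NA, 7257.665)

# Option 1: replace the NA with 0 (creates an artificial price of zero)
zero_filled <- prices
zero_filled[is.na(zero_filled)] <- 0

# Option 2: drop the NA
dropped <- prices[!is.na(prices)]

# The zero pulls summary statistics down sharply
mean(zero_filled)
mean(dropped)
```

A zero price would also distort any returns computed from the series, which is another reason to prefer removing the row.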
So I know this has been asked before, but from what I've searched I can't really find an answer to my problem. I should also add I'm relatively new to R (and any type of coding at all) so when it comes to fixing problems in code I'm not too sure what I'm looking for.
My code is:
education_ge <- data.frame(matrix(ncol=2, nrow=1))
colnames(education_ge) <- c("Education","Genetic.Engineering")
for (i in 1:nrow(survey))
if (survey[i,12]=="Bachelors")
education_ge$Education <- survey[i,12]
To give more info, 'survey' is a data frame with 12 columns and 26 rows, and the 12th column, 'Education', is a factor which has levels such as 'Bachelors', 'Masters', 'Doctorate' etc.
This is the error as it appears in R:
for (i in 1:nrow(survey))
if (survey[i,12]=="Bachelors")
education_ge$Education <- survey[i,12]
Error in if (survey[i, 12] == "Bachelors") education_ge$Education <- survey[i, :
missing value where TRUE/FALSE needed
Any help would be greatly appreciated!
If you just want to ignore any records with missing values and get on with your analysis, try inserting this at the beginning:
survey <- survey[ complete.cases(survey), ]
It basically finds the indexes of all the rows where there are no NAs anywhere, and then subsets survey to have only those rows.
For more information on subsetting, try reading this chapter: http://adv-r.had.co.nz/Subsetting.html
The command:
sapply(survey,function (x) sum(is.na(x)))
will show you how many NAs you have in each column. That might help your data cleaning.
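Here is a minimal sketch of the above on an invented data frame (`survey_demo` is hypothetical, not your survey). It also shows where the error comes from: comparing NA with a string gives NA, and if() cannot work with NA.

```r
# Small made-up survey with missing values
survey_demo <- data.frame(Age = c(25, NA, 41),
                          Education = c("Bachelors", NA, "Masters"),
                          stringsAsFactors = FALSE)

# The source of the error: this comparison is NA, not TRUE or FALSE
NA == "Bachelors"

# Count the NAs in each column
na_counts <- sapply(survey_demo, function(x) sum(is.na(x)))
print(na_counts)

# Keep only the rows with no NA in any column
clean <- survey_demo[complete.cases(survey_demo), ]
print(clean)
```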
You can try this:
sub<-subset(survey,survey$Education=="Bachelors")
education_ge$Education<-sub$Education
Let me know if this helps.
I have a dataframe in which I want to add a new column which is the product of two other columns, divided by 100.
The command I'm trying to use is:
fulldata$Conpolls <- fulldata$Conprct/100 * fulldata$Total.seats
for which I receive:
Error: unexpected input in "full.data$Conpolls <- fulldata$Conprct /100 * fulldata$Total.seats"
When I try to break up the process in 2 steps as:
fulldata$Conpolls <- fulldata$Conprct * fulldata$Total.seats
I get the error:
non-numeric argument to binary operator.
Any tips or help from experienced users greatly appreciated!
Veerendra Gadekar's answer should be correct if all the columns are numeric values.
If the columns with which you are doing the operations are not guaranteed to be numeric, you may turn them into numeric values with as.numeric(). It should look like this:
fulldata$Conpolls <- (as.numeric(fulldata$Conprct) * as.numeric(fulldata$Total.seats))/100
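One caveat worth checking first: if a column turns out to be a factor, as.numeric() returns the internal level codes rather than the numbers you see printed. The sketch below uses a made-up data frame (`demo`, with invented values) to show how to inspect the classes and convert via as.character().

```r
# Invented data: one factor column and one character column
demo <- data.frame(Conprct = factor(c("40", "25", "10")),
                   Total.seats = c("650", "650", "650"),
                   stringsAsFactors = FALSE)

# Check what each column actually is
sapply(demo, class)

# On a factor this gives the level codes, not the values
as.numeric(demo$Conprct)

# Going through as.character() recovers the printed values
conprct_num <- as.numeric(as.character(demo$Conprct))
demo$Conpolls <- conprct_num * as.numeric(demo$Total.seats) / 100
print(demo)
```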
fulldata$Conpolls <- (fulldata$Conprct * fulldata$Total.seats)/100
This doesn't answer the question, but it is the proper syntax for this kind of arithmetic. And yes, as mentioned in the comments, you should check the class of the objects you are using to find out what is wrong.
I have a matrix of 1's and 0's. I want to replace all the 1's by an identifier code for that row (the identifier code is given in row 2)
I tried:
dax[,2:109] <- replace(dax[,2:109],dax[,2:109]==1,dax[2,])
but this isn't working right. I've tried to set up a loop, but I've had no success so far.
I'm new to R. Any help is appreciated
This may do it for you, although it'd be nice to get more details from you.
for(j in 2:109) dax[dax[,j]==1,j] <- dax[2,j]
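If you would rather avoid the loop, a vectorised version is possible. This is only a sketch on a tiny made-up matrix (`m` stands in for the relevant columns of dax), and it assumes, like the loop above, that row 2 holds the identifier for each column and that there are no NA values.

```r
# Toy matrix: row 2 holds the identifier codes, other rows hold 0/1
m <- rbind(c(1,   0,   1),
           c(101, 102, 103),
           c(0,   1,   1))

ids <- m[2, ]   # one identifier per column
hit <- m == 1   # logical matrix marking the cells to replace

# col(m) gives the column index of every cell, so ids[col(m)[hit]]
# picks the identifier that belongs to each marked cell
m[hit] <- ids[col(m)[hit]]
print(m)
```

Both the logical subset and the replacement values are taken in the same column-by-column order, which is why they line up.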