I am trying to work with the below data frame and what I am trying to do is to compare the values from columns:
#create a sample data frame
df<-data.frame(
item=c("a","b","c","d"),
price_today=c(1,2,3,4,5),
price_yesterday=c(1,2,3,4,5)
If values from column price_today are the same compared with values from column price_yesterday, then print okay or else print not okay, and the printed result will be shown as a new variable in the data frame.
May I know how should I go about the ifelse part here?
Many thanks for your help and have a good day.
Modified Questions:
Hi all, so now what if the df becomes like this:
#create a sample data frame (modified)
df<-data.frame(
item=c("a","a","c","d"),
price_today=c(1,"",3,"XYZ",5),
price_yesterday=c(1,2,3,4,5)
Now it contains both blank value and non-numerical values in column price_today. And instead of a,b,c,d in column item, it becomes a,a,c,d in column item. I have been trying to do the following:
Sort column item by "a" and I have the below code:
df_1<-df[df$item=="a",]
After df_1 is filtered, then again sort price_today, by removing blank and non-numerical values with codes below:
df_1<-df[!is.numeric(df_1$price_today),]
I am able to filter out by "a" in column item, however, with the second filter, it then returns with the original df, may I know what did I do wrong here?
Million thanks for your help and have a good day/night.
I am working on data set for analysis.
I have to remove column name of single variable of data frame in r.
I have used colnames(df)<-NULL function,it removes all colnames of dataframe
The excepted output is:
I don't really get the expected output, but maybe this is what you're looking for
colnames(df)[i] <- NA
'i' is the column location (for example 1,2,3 etc)
I have 10 topics. For each topic name I have a results_topic_df data frame. In this data frame there are 2 columns: index, which is a name of another data frame and var_name, which is a name of a variable inside the corresponding data frame (indicated by index).
What I want to do is to take the corresponding original data frame (whos name is indicated by results_topic_df$index), look at the value of results_topic_df$var_name in the same row, go to the original data frame and copy the relevant variable to a data frame named container_df.
Eventually I will have container_df having only the selected variables from all the data frames that appear in results_topic_df.
I want to repeat this procedure for each one of the 10 topics.
I have tried to do this with a loop but because my data frames' names change, I got really confused with all the combinations of assign(),paste0(), and eval(). Is there a simpler way to accomplish my goal? Thanks.
I have a data frame (originally from a CSV file) with the columns NAME and YEAR. I have extracted a sample from this data frame of the first ten entries like so:
sample<-df(1:10,)
I want to know the frequency of the values in the NAME column so I input the following:
as.data.frame(table(sample$NAME))
This counts the frequency in the sample correctly but also includes every name from the original data frame in the 'Var1' column (all with a Freq of 0).
The same thing happens if I use unique(sample$NAME) as well: it lists the names from the sample along with all of the names from the original data frame as well.
What am I doing wrong?
This could be a case of unused level in the 'NAME' factor column. We can use droplevels or call factor again to remove those unused levels.
as.data.frame(table(droplevels(sample$NAME)))
Or
as.data.frame(table(factor(sample$NAME)))
I have a data frame which I will call "abs.data" that contains 265 columns (variables). I have another data frame which I will call "corr.abs" that contains updated data on a subset of the columns in "abs.data". Both data frames have an equal number of rows, n=551. I need to replace the columns in "abs.data" with the correct observations in "corr.abs" where the column names match. I have tried the following
abs.samps <- colnames(abs.data) #vector of column names in abs. data
corr.abs.samps <- colnames(corr.abs) #vector of column names in corr.abs
abs.data[,which(abs.samps %in% corr.abs.samps==TRUE)] <- corr.abs[,which(corr.abs.samps %in% abs.samps==TRUE)] #replace columns in abs.data with correct observations in corr.abs where the column names are the same
When I run the left and right side of the last line of code R pulls the right columns, but it fails to replace the columns in abs.data with the correct data in corr.abs. Any ideas why?
you can find the common column names using
comm_col <- intersect(colnames(abs.samps), colnames(corr.abs))
eg. you find X2 is the common column
you can first drop the columns, in this case X2 from abs.samps that you do not want using subset
x<-subset(abs.samps, select = -X2)
then you can just add the new column (eg. column name X2)to the new data frame
y<-cbind(corr.abs$X2,x)